Program outputting 2^n instead of intended

Program outputting 2^n instead of intended - c

I was recently trying to make a small program in c to find the nth fibannaci number. For some reason, when I run it it instead does the calculation of 2^n and returns that, I have asked around a bit but no one seems to have been able to determine why. I was hoping someone may be able to help me figure it out.
float wat(int n){
int a = 0x3fcf1bbd, b = 0x3f1e377a, c = 0x807fffff, d = 0x400f1bbd;
int e = (((a >> 23) + n) << 23) | (a & c);
int f = (((b >> 23) + n) << 23) | (b & c);
return ((*(float*)&e) + (*(float*)&f))/(*(float*)&d);
}

Your code is a standards violating, unportable hack that might have been meaningful 20 years ago in very special situations and when floating point hardware was magnitudes slower than the rest of the cpu. It's completely meaningless today and asking someone today to debug it for you is like asking for help to insulate your house with asbestos. We don't do things this way anymore, for good reasons.
It can all be written in correct, portable floating point operations like this:
#include <math.h>
float
wat(int n)
{
float a = 0x1.9e377ap+0;
float b = -0x1.3c6ef4p-1;
float d = 0x1.1e377ap+1;
return (ldexpf(a, n) - ldexpf(b, n)) / d;
}
This does exactly the same thing, but without disgusting hacks. Of course it won't do anything useful because adding n to the exponent of X doesn't do X^n, it does X*2^n. So your calculation ends up being:
s = sqrt(5)
(2^n * (1 + s)/2 - 2^n * (1 - s)/2)/s =
(2^n/2 * ((1 + s) - (1 - s)))/s =
2^n

Related

I am not sure why my code is producing this arithmetic error?

Hey everybody I am working on a program in c that tells you the least number of coins needed for any given amount of money. I have a program written that works for for every amount I have tested except for $4.20.
Here is my code:
#include <cs50.h>
#include <stdio.h>
#include <math.h>
int main(void)
{
float f;
int n, x, y, z, q, s, d, t;
do {
printf("How much change do you need?\n");
f = GetFloat();
} while(f <= 0);
{
n = (f * 100);
}
q = (n / 25);
x = (n % 25);
y = (x / 10);
z = (x % 10);
s = (z / 5);
d = (z % 5);
t = (q + y + s + d);
{
printf("%d\n" ,t);
}
}
The strange thing is when I input 4.20 the output is 22 instead of 18 (16 quarters and 2 dimes). I did some sleuthing and found that the problem is with my variable x. When I input 4.2, x gives me 19 and not 20 like it should. I tried other cases that I thought should have produced the same problem like 5.2 and 1.2 but it worked correctly in those cases. It might be a rounding issue but I would think that same error would also happen with those similar values.
Does anyone have an idea about why this might be happening?
PS I am fairly new to coding and I haven't gotten much formal instruction so I also welcome tips on better indentation and formatting if you see anything obvious.

IEEE 754 floating point is often slightly imprecise, and casting will truncate, not round. What's likely happening is that 4.20 * 100 evaluates to 419.999999999999994 (exact number is immaterial, point is, it's not quite 420), and the conversion to int drops the decimal portion, producing 419.
The simple approach is to just do:
n = f * 100 + 0.5;
or you can use a proper function:
n = round(f * 100);
If the number is "almost" exact, either one will be fine, you'd only get discrepancies when someone passed non-integer cents ("4.195" or the like), and if you're using float for monetary values, you've already accepted precision issues in the margins; if you want exact numbers, you'd use the decimal formats that have fixed precision for decimal values, and are intended for financial calculations.

Try this: Provides up to 2 digit precision.
//float f
double f
f *= 1000;
f = floor(f); /* optional */
f /= 10;
f = floor(f); /* optional */
n = f;

How to use Modulo efficiently?

Im doing a (for myself) very complex task, where i have to calculate the largest possible number of sequences when given a number n of segments.
I found out that the Catalan Number represents this sequences, and i got it to work for n<=32. The results i get should be calculated mod 1.000.000.007. The problem i have is that "q" and "p" get to big for a long long int and i can't just mod 1.000.000.007 before dividing "q" and "p" because i would get a different result.
My question is, is there a really efficient way to solve my problem, or do i have to think about storing the values differently?
My limitations are the following:
- stdio.h/iostream only
- only Integers
- n<=20.000.000
- n>=2
#include <stdio.h>
long long cat(long long l, long long m, long long n);
int main(){
long long n = 0;
long long val;
scanf("%lld", &n);
val = cat(1, 1, n / 2);
printf("%lld", (val));
return 0;
}
long long cat(long long q, long long p, long long n){
if (n == 0) {
return (q / p) % 1000000007;
}
else {
q *= 4 * n - 2;
}
p *= (n + 1);
return cat(q, p, n - 1);
}

To solve this efficiently, you'll want to use modular arithmetic, with modular inverses substituting for division.
It's simple to prove that, in the absence of overflow, (a * b) % c == ((a % c) * b) % c. If we were just multiplying, we could take results mod 1000000007 at every step and always stay within the bounds of a 64-bit integer. The problem is division. (a / b) % c does not necessarily equal ((a % c) / b) % c.
To solve the problem with division, we use modular inverses. For integers a and c with c prime and a % c != 0, we can always find an integer b such that a * b % c == 1. This means we can use multiplication as division. For any integer d divisible by a, (d * b) % c == (d / a) % c. This means that ((d % c) * b) % c == (d / a) % c, so we can reduce intermediate results mod c without screwing up our ability to divide.
The number we want to calculate is of the form (x1 * x2 * x3 * ...) / (y1 * y2 * y3 * ...) % 1000000007. We can instead compute x = x1 % 1000000007 * x2 % 1000000007 * x3 % 1000000007 ... and y = y1 % 1000000007 * y2 % 1000000007 * y3 % 1000000007 ..., then compute the modular inverse z of y using the extended Euclidean algorithm and return (x * z) % 1000000007.

If you're using gcc or clang and a 64-bit target, there exists a __int128 type. This gives you extra bits to work with, but obviously only to a point.
Most likely the easiest way to deal with this kind of issue is to use a "bignum" library, i.e. a library that deals with representing and doing arithmetic on arbitrarily large numbers. The arguably most popular open source example is libgmp - you should be able to get your algorithm going quite easily with that. It's also tuned to high performance standards.
Obviously you can reimplement this yourself, by representing your numbers as e.g. arrays of integers of a certain size. You'll have to implement algorithms for doing basic arithmetic such as +, -, *, /, % yourself. If you want to do this as a learning experience that's fine, but there's no shame in using libgmp if you just want to focus on the algorithm you're trying to implement.

what (r+1 + (r >> 8)) >> 8 does?

In some old C/C++ graphics related code, that I have to port to Java and JavaScript I found this:
b = (b+1 + (b >> 8)) >> 8; // very fast
Where b is short int for blue, and same code is seen for r and b (red & blue). The comment is not helpful.
I cannot figure out what it does, apart from obvious shifting and adding. I can port without understanding, I just ask out of curiosity.

y = ( x + 1 + (x>>8) ) >> 8 // very fast
This is a fixed-point approximation of division by 255. Conceptually, this is useful for normalizing calculations based on pixel values such that 255 (typically the maximum pixel value) maps to exactly 1.
It is described as very fast because fully general integer division is a relatively slow operation on many CPUs -- although it is possible that your compiler would make a similar optimization for you if it can deduce the input constraints.
This works based on the idea that 257/(256*256) is a very close approximation of 1/255, and that x*257/256 can be formulated as x+(x>>8). The +1 is rounding support which allows the formula to exactly match the integer division x/255 for all values of x in [0..65534].
Some algebra on the inner portion may make things a bit more clear...
x*257/256
= (x*256+x)/256
= x + x/256
= x + (x>>8)
There is more discussion here: How to do alpha blend fast? and here: Division via Multiplication
By the way, if you want round-to-nearest, and your CPU can do fast multiplies, the following is accurate for all uint16_t dividend values -- actually [0..(2^16)+126].
y = ((x+128)*257)>>16 // divide by 255 with round-to-nearest for x in [0..65662]

Looks like it is meant to check if blue (or red or green) is fully used. It evaluates to 1, when b is 255, and is 0 for all lower values.

A common use case of when you'd want to use a formula that's more accurate than 257/256 is when you have to combine a lot of alpha values together for each pixel. As one example, when doing image shrinking, you need to combine 4 alphas for each source pixel contributing to the destination, and then combine all the source pixels contributing to the destination.
I posted an infinitely accurate bit twiddling version of /255 but it was rejected without reason. So I'll add that I implement alpha blending hardware for a living, I write real time graphics code and game engines for a living, and I've published articles on this topic in conferences like MICRO, so I really know what I'm talking about. And it might be useful or at least entertaining for people to understand the more accurate formula that is EXACTLY 1/255:
Version 1: x = (x + (x >> 8)) >> 8
- no constant added, won't satisfy (x * 255) / 255 = x, but will look fine in most cases.
Version 2: x = (x + (x >> 8) + 1) >> 8
- WILL satisfy (x * 255) / 255 = x for integers, but won't hit correct integer values for all alphas
Version 3: (simple integer rounding):
(x + (x >> 8) + 128) >> 8
- Won't hit correct integer values for all alphas, but will on average be closer than Version 2 at the same cost.
Version 4: Infinitely accurate version, to any level of precision desired, for any number of composite alphas: (useful for image resizing, rotation, etc.):
[(x + (x >> 8)) >> 8] + [ ( (x & 255) + (x >> 8) ) >> 8]
Why is version 4 infinitely accurate?
Because 1/255 = 1/256 + 1/65536 + 1/256^3 + 1/256^4 + ...
The simplest expression above (version 1) doesn't handle rounding, but it also doesn't handle the carries that occur from this infinite number of identical sum columns. The new term added above determines the carry out (0 or 1) from this infinite number of base 256 digits. By adding it, you are getting the same result as if you added all the infinite addends. At which point you can round by adding a half bit to whatever accuracy point you want.
Not needed for the OP perhaps, but people should know that you don't need to approximate at all. The formula above is actually more accurate than double precision floating point.
As for speed: In hardware, this method is faster than even a single (full width) add. In software, you have to consider throughput vs latency. In latency, it may still be faster than a narrow multiply (definitely faster than a full width multiply), but in the OP context, you can unroll many pixels at once, and since modern multiply units are pipelined, you are still OK. In translation to Java, you probably have no narrow multiplies, so this could still be faster, but need to check.
WRT the one person who said "why not use the built in OS capabilities for alpha blitting?": If you already have a substantial graphical code base in that OS, this might be a fine option. If not, you're looking at hundreds to thousands as many lines of code to leverage the OS version - code that's far harder to write and debug than this code. And in the end, the OS code you have isn't portable at all, while this code can be used anywhere.

I suspect that it is trying to do the following:
boolean isBFullyOn = false;
if (b == 0xff) {
isBFullyOn = true;
}
Back in the days of slow processors; smart bit-shifting tricks like the above could be faster than the obvious if-then-else logic. It avoids a jump statement which was costly.
It probably also sets an overflow flag in the processor which was used for some latter logic. This is all highly dependant upon the target processor.
And also on my part speculative!!

Is value of b+1 + b/256, this calculation divided by 256.
In that way, using bit shift the compiler tranlte using CPU level shift instruction, instead of using FPU or library division functions.

b = (b + (b >> 8)) >> 8; is basically b = b *257/256 .
I would consider +1 being an ugly hack of the -0.5 mean reduce caused by the inner >>8.
I would write it as b = (b + 128 + ((b +128)>> 8)) >> 8; instead.

Running this test code:
public void test() {
Set<Integer> results = new HashSet<Integer>();
// short int ranges between -32767 and 32767
for (int i = -32767; i <= 32767; i++) {
int b = (i + 1 + (i >> 8)) >> 8;
if (!results.contains(b)) {
System.out.println(i + " -> " + b);
results.add(b);
}
}
}
Produces all possible values between -129 and 128. However, if you are working with 8-bit colours (0 - 255) then the only possible outputs are 0 (for 0 - 254) and 1 (for 255) so it is likely that it is attempting the function #kaykay posted.

Take the average of two signed numbers in C

Let us say we have x and y and both are signed integers in C, how do we find the most accurate mean value between the two?
I would prefer a solution that does not take advantage of any machine/compiler/toolchain specific workings.
The best I have come up with is:(a / 2) + (b / 2) + !!(a % 2) * !!(b %2) Is there a solution that is more accurate? Faster? Simpler?
What if we know if one is larger than the other a priori?
Thanks.
D
Editor's Note: Please note that the OP expects answers that are not subject to integer overflow when input values are close to the maximum absolute bounds of the C int type. This was not stated in the original question, but is important when giving an answer.

After accept answer (4 yr)
I would expect the function int average_int(int a, int b) to:
1. Work over the entire range of [INT_MIN..INT_MAX] for all combinations of a and b.
2. Have the same result as (a+b)/2, as if using wider math.
When int2x exists, #Santiago Alessandri approach works well.
int avgSS(int a, int b) {
return (int) ( ((int2x) a + b) / 2);
}
Otherwise a variation on #AProgrammer:
Note: wider math is not needed.
int avgC(int a, int b) {
if ((a < 0) == (b < 0)) { // a,b same sign
return a/2 + b/2 + (a%2 + b%2)/2;
}
return (a+b)/2;
}
A solution with more tests, but without %
All below solutions "worked" to within 1 of (a+b)/2 when overflow did not occur, but I was hoping to find one that matched (a+b)/2 for all int.
#Santiago Alessandri Solution works as long as the range of int is narrower than the range of long long - which is usually the case.
((long long)a + (long long)b) / 2
#AProgrammer, the accepted answer, fails about 1/4 of the time to match (a+b)/2. Example inputs like a == 1, b == -2
a/2 + b/2 + (a%2 + b%2)/2
#Guy Sirton, Solution fails about 1/8 of the time to match (a+b)/2. Example inputs like a == 1, b == 0
int sgeq = ((a<0)==(b<0));
int avg = ((!sgeq)*(a+b)+sgeq*(b-a))/2 + sgeq*a;
#R.., Solution fails about 1/4 of the time to match (a+b)/2. Example inputs like a == 1, b == 1
return (a-(a|b)+b)/2+(a|b)/2;
#MatthewD, now deleted solution fails about 5/6 of the time to match (a+b)/2. Example inputs like a == 1, b == -2
unsigned diff;
signed mean;
if (a > b) {
diff = a - b;
mean = b + (diff >> 1);
} else {
diff = b - a;
mean = a + (diff >> 1);
}

If (a^b)<=0 you can just use (a+b)/2 without fear of overflow.
Otherwise, try (a-(a|b)+b)/2+(a|b)/2. -(a|b) is at least as large in magnitude as both a and b and has the opposite sign, so this avoids the overflow.
I did this quickly off the top of my head so there might be some stupid errors. Note that there are no machine-specific hacks here. All behavior is completely determined by the C standard and the fact that it requires twos-complement, ones-complement, or sign-magnitude representation of signed values and specifies that the bitwise operators work on the bit-by-bit representation. Nope, the relative magnitude of a|b depends on the representation...
Edit: You could also use a+(b-a)/2 when they have the same sign. Note that this will give a bias towards a. You can reverse it and get a bias towards b. My solution above, on the other hand, gives bias towards zero if I'm not mistaken.
Another try: One standard approach is (a&b)+(a^b)/2. In twos complement it works regardless of the signs, but I believe it also works in ones complement or sign-magnitude if a and b have the same sign. Care to check it?

Edit: version fixed by #chux - Reinstate Monica:
if ((a < 0) == (b < 0)) { // a,b same sign
return a/2 + b/2 + (a%2 + b%2)/2;
} else {
return (a+b)/2;
}
Original answer (I'd have deleted it if it hadn't been accepted).
a/2 + b/2 + (a%2 + b%2)/2
Seems the simplest one fitting the bill of no assumption on implementation characteristics (it has a dependency on C99 which specifying the result of / as "truncated toward 0" while it was implementation dependent for C90).
It has the advantage of having no test (and thus no costly jumps) and all divisions/remainder are by 2 so the use of bit twiddling techniques by the compiler is possible.

For unsigned integers the average is the floor of (x+y)/2. But the same fails for signed integers. This formula fails for integers whose sum is an odd -ve number as their floor is one less than their average.
You can read up more at Hacker's Delight in section 2.5
The code to calculate average of 2 signed integers without overflow is
int t = (a & b) + ((a ^ b) >> 1)
unsigned t_u = (unsigned)t
int avg = t + ( (t_u >> 31 ) & (a ^ b) )
I have checked it's correctness using Z3 SMT solver

Just a few observations that may help:
"Most accurate" isn't necessarily unique with integers. E.g. for 1 and 4, 2 and 3 are an equally "most accurate" answer. Mathematically (not C integers):
(a+b)/2 = a+(b-a)/2 = b+(a-b)/2
Let's try breaking this down:
If sign(a)!=sign(b) then a+b will will not overflow. This case can be determined by comparing the most significant bit in a two's complement representation.
If sign(a)==sign(b) then if a is greater than b, (a-b) will not overflow. Otherwise (b-a) will not overflow. EDIT: Actually neither will overflow.
What are you trying to optimize exactly? Different processor architectures may have different optimal solutions. For example, in your code replacing the multiplication with an AND may improve performance. Also in a two's complement architecture you can simply (a & b & 1).
I'm just going to throw some code out, not looking too fast but perhaps someone can use and improve:
int sgeq = ((a<0)==(b<0));
int avg = ((!sgeq)*(a+b)+sgeq*(b-a))/2 + sgeq*a

I would do this, convert both to long long(64 bit signed integers) add them up, this won't overflow and then divide the result by 2:
((long long)a + (long long)b) / 2
If you want the decimal part, store it as a double.
It is important to note that the result will fit in a 32 bit integer.
If you are using the highest-rank integer, then you can use:
((double)a + (double)b) / 2

This answer fits to any number of integers:
int[] array = { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
decimal avg = 0;
for (int i = 0; i < array.Length; i++){
avg = (array[i] - avg) / (i+1) + avg;
}
expects avg == 5.0 for this test

implementation of rand()

I am writing some embedded code in C and need to use the rand() function. Unfortunately, rand() is not supported in the library for the controller. I need a simple implementation that is fast, but more importantly has little space overhead, that produces relatively high-quality random numbers. Does anyone know which algorithm to use or sample code?
EDIT: It's for image processing, so "relatively high quality" means decent cycle length and good uniform properties.

Check out this collection of random number generators from George Marsaglia. He's a leading expert in random number generation, so I'd be confident using anything he recommends. The generators in that list are tiny, some requiring only a couple unsigned longs as state.
Marsaglia's generators are definitely "high quality" by your standards of long period and good uniform distribution. They pass stringent statistical tests, though they wouldn't do for cryptography.

Use the C code for LFSR113 from L'écuyer:
unsigned int lfsr113_Bits (void)
{
static unsigned int z1 = 12345, z2 = 12345, z3 = 12345, z4 = 12345;
unsigned int b;
b = ((z1 << 6) ^ z1) >> 13;
z1 = ((z1 & 4294967294U) << 18) ^ b;
b = ((z2 << 2) ^ z2) >> 27;
z2 = ((z2 & 4294967288U) << 2) ^ b;
b = ((z3 << 13) ^ z3) >> 21;
z3 = ((z3 & 4294967280U) << 7) ^ b;
b = ((z4 << 3) ^ z4) >> 12;
z4 = ((z4 & 4294967168U) << 13) ^ b;
return (z1 ^ z2 ^ z3 ^ z4);
}
Very high quality and fast. Do NOT use rand() for anything.
It is worse than useless.

Here is a link to a ANSI C implementation of a few random number generators.

I've made a collection of random number generators, "simplerandom", that are compact and suitable for embedded systems. The collection is available in C and Python.
I've looked around for a bunch of simple and decent ones I could find, and put them together in a small package. They include several Marsaglia generators (KISS, MWC, SHR3), and a couple of L'Ecuyer LFSR ones.
All the generators return an unsigned 32-bit integer, and typically have a state made of 1 to 4 32-bit unsigned integers.
Interestingly, I found a few issues with the Marsaglia generators, and I've tried to fix/improve all those issues. Those issues were:
SHR3 generator (component of Marsaglia's 1999 KISS generator) was broken.
MWC low 16 bits have only an approx 229.1 period. So I made a slightly improved MWC, which gives the low 16 bits a 259.3 period, which is the overall period of this generator.
I uncovered a few issues with seeding, and tried to make robust seeding (initialisation) procedures, so they won't break if you give them a "bad" seed value.

I recommend the academic paper Two Fast Implementations of the Minimal Standard Random Number Generator by David Carta. You can find free PDF through Google. The original paper on the Minimal Standard Random Number Generator is also worth reading.
Carta's code gives fast, high-quality random numbers on 32-bit machines. For a more thorough evaluation, see the paper.

Mersenne twister
A bit from Wikipedia:
It was designed to have a period of 219937 − 1 (the creators of the algorithm proved this property). In practice, there is little reason to use a larger period, as most applications do not require 219937 unique combinations (219937 is approximately 4.3 × 106001; this is many orders of magnitude larger than the estimated number of particles in the observable universe, which is 1080).
It has a very high order of dimensional equidistribution (see linear congruential generator). This implies that there is negligible serial correlation between successive values in the output sequence.
It passes numerous tests for statistical randomness, including the Diehard tests. It passes most, but not all, of the even more stringent TestU01 Crush randomness tests.
source code for many languages available on the link.

I'd take one from the GNU C library, the source is available to browse online.
http://qa.coreboot.org/docs/libpayload/rand_8c-source.html
But if you have any concern at all about the quality of the random numbers, you should probably look at more carefully written mathematically libraries. It's a big subject and the standard rand implementations aren't highly thought of by experts.
Here's another possibility: http://www.boost.org/doc/libs/1_39_0/libs/random/index.html
(If you find you have too many options, you could always pick one at random.)

I found this: Simple Random Number Generation, by John D. Cook.
It should be easy to adapt to C, given that it's only a few lines of code.
Edit: and you could clarify what you mean by "relatively high-quality". Are you generating encryption keys for nuclear launch codes, or random numbers for a game of poker?

Better yet, use multiple linear feedback shift registers combine them together.
Assuming that sizeof(unsigned) == 4:
unsigned t1 = 0, t2 = 0;
unsigned random()
{
unsigned b;
b = t1 ^ (t1 >> 2) ^ (t1 >> 6) ^ (t1 >> 7);
t1 = (t1 >> 1) | (~b << 31);
b = (t2 << 1) ^ (t2 << 2) ^ (t1 << 3) ^ (t2 << 4);
t2 = (t2 << 1) | (~b >> 31);
return t1 ^ t2;
}

The standard solution is to use a linear feedback shift register.

There is one simple RNG named KISS, it is one random number generator according to three numbers.
/* Implementation of a 32-bit KISS generator which uses no multiply instructions */
static unsigned int x=123456789,y=234567891,z=345678912,w=456789123,c=0;
unsigned int JKISS32() {
int t;
y ^= (y<<5); y ^= (y>>7); y ^= (y<<22);
t = z+w+c; z = w; c = t < 0; w = t&2147483647;
x += 1411392427;
return x + y + w;
}
Also there is one web site to test RNG http://www.phy.duke.edu/~rgb/General/dieharder.php

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight

Program outputting 2^n instead of intended - c

Related

I am not sure why my code is producing this arithmetic error?

How to use Modulo efficiently?

what (r+1 + (r >> 8)) >> 8 does?

Take the average of two signed numbers in C

implementation of rand()

Categories

Resources