I came across this code in parC.
I can't understand in which scenario (the yellow mark) it can happen?
I mean when does the result of division can round up when the value type is Int?
I know nothing about parC, but as far as C is considered, you're right: integer division of positive values is defined to truncate the fraction, that is to round towards zero.
However, there is another bug in the code: if the array is too long, the midpoint may be calculated incorrectly.
To be precise, if initial to is greater by 2 or more than a half of maximum value representable in int type (INT_MAX), then one of recursive call will get fr, to values both greater than (INT_MAX/2) and the expression (to+fr) will cause an arithmetic overflow. The result of (to+fr)/2 will be less than fr/2.
To avoid this, expression like fr + (to - fr)/2 is recommended instead of (to + fr)/2.
EDIT
There are also bugs in the description! See the image:
The (4,5) → (4, 5) recursion happens one level earlier than authors indicate, so one subtree should not actually appear in the graph. Additionally, if the program gets stuck in the red-arrow loop, then the (6,5) recursion will never happen – the process will never reach that branch.
Interesting, authors apparently overlooked yet another loop appearing in their drawing;
recursion (1,2) → (1,2) appears even earlier than those two mentioned above:
As a side note, I can't imagine how they obtained two different results of partitioning the same range:
(4,5) → (4,5) + (6,5) and (4,5) → (4,5) + (5,5)
(see green frames). Possibly they were so absorbed with forcing the idea of up-rounding that they neglected any reasoning about all other aspects of the problem.
Based on this single example I would recommend to put the book in a trash bin.
You are right, integer division for positive operands in C always rounded to floor of the division, check this question for more details. I'm not familiar with parC, but it's said to be full C++ language with some additions from other languages, so described task seems to be incorrect.
If there are concerns left, you always have opportunity to check task directly: setup environment for parC, type implementation of rsum() and execute rsum(A, 1, 5).
You might be reading a text which is older than 18 years.
In older versions of the C standard, division with negative operands could get rounded in an implementation-defined way: either downwards or upwards. Meaning that -3/2 could give you either -1 or -2 depending on compiler.
This was recognized as a design flaw in the language and was corrected with the C99 standard. Nowadays, C always uses "truncation towards zero" no matter compiler.
Related
I've occasionally noticed some C code insisting on using 0 - x to get the additive complement of x, rather than writing -x. Now, I suppose these are not equivalent for types smaller in size than int (edit: Nope, apparently equivalent even then), but otherwise - is there some benefit to the former rather than the latter form?
tl;dr: 0-x is useful for scrubbing the sign of floating-point zero.
(As #Deduplicator points out in a comment:)
Many of us tend to forget that, in floating-point types, we have both a "positive zero" and a "negative zero" value - flipping the sign bit on and off leaves the same mantissa and exponent. Read more on this here.
Well, it turns out that the two expressions behave differently on positive-signed zero, and the same on negative-signed zero, as per the following:
value of x
value of 0-x
value of -x
-.0
0
0
0
0
-.0
See this on Coliru.
So, when x is of a floating-point type,
If you want to "forget the sign of zero", use 0-x.
If you want to "keep the sign of zero", use x.
For integer types it shouldn't matter.
On the other hand, as #NateEldredge points out the expressions should be equivalent on small integer types, due to integer promotion - -x translates into a promotion of x into an int, then applying the minus sign.
There is no technical reason to do this today. At least not with integers. And at least not in a way that a sane (according to some arbitrary definition) coder would use. Sure, it could be the case that it causes a cast. I'm actually not 100% sure, but in that case I would use an explicit cast instead to clearly communicate the intention.
As M.M pointed out, there were reasons in the K&R time, when =- was equivalent to -=. This had the effect that x=-y was equivalent to x=x-y instead of x=0-y. This was undesirable effect, so the feature was removed.
Today, the reason would be readability. Especially if you're writing a mathematical formula and want to point out that a parameter is zero. One example would be the distance formula. The distance from (x,y) to origo is sqrt(pow(0-x, 2), pow(0-y, 2))
What exactly is the reason behind this peculiar interface of nextafter (and nexttoward) functions? We specify the direction by specifying the value we want to move toward.
At the first sight it feels as if something non-obvious is hidden behind this idea. In my (naive) opinion the first choice for such functions would be something like a pair of single-parameter functions nextafter(a)/nextbefore(a). The next choice would be a two-parameter function nextafter(a, dir) in which the direction dir is specified explicitly (-1 and +1, some standard enum, etc.).
But instead we have to specify a value we want to move toward. Hence a number of questions
(A vague one). There might be some clever idea or idiomatic pattern that is so valuable that it influenced this choice of interface in these standard functions. Is there?
What if decide to just blindly use -DBL_MAX and +DBL_MAX as the second argument for nextafter to specify the negative and positive direction respectively. Are there any pitfalls in doing so?
(A refinement of 2). If I know for sure that b is [slightly] greater than a, is there any reason to prefer nextafter(a, b) over nextafter(a, DBL_MAX)? E.g. is there a chance of better performance for nextafter(a, b) version?
Is nextafter generally a heavy operation? I know that it is implementation-dependent. But, again, assuming an implementation that is based in IEEE 754 representations, is it fairly "difficult" to find the adjacent floating-point value?
With IEEE-754 binary floating point representations, if both arguments of nextafter are finite and the two arguments are not equal, then the result can be computed by either adding one to or subtracting one from the representation of the number reinterpreted as an unsigned integer [Note 1]. The (slight) complexity results from correctly dealing with the corner cases which do not meet those preconditions, but in general you'll find that it is extremely fast.
Aside from NaNs, the only thing that matters about the second argument is whether it is greater than, less than, or equal to the first argument.
The interface basically provides additional clarity for the corner case results, but it is also sometimes useful. In particular, the usage nextafter(x, 0), which truncates regardless of sign, is often convenient. You can also take advantage of the fact that nextafter(x, x); is x to clamp the result at an arbitrary value.
The difference between nextafter and nexttowards is that the latter allows you to use the larger dynamic range of long double; again, that helps with certain corner cases.
Strictly speaking, if the first argument is a zero of some sign and the other argument is a valid non-zero number of the opposite sign, then the argument needs to have its sign bit flipped before the increment. But it seemed too much legalese to add that to the list, and it is still hardly a complicated transform.
I had recently an interview, where I failed and was finally told having not enough experience to work for them.
The position was embedded C software developer. Target platform was some kind of very simple 32-bit architecture, those processor does not support floating-point numbers and their operations. Therefore double and float numbers cannot be used.
The task was to develop a C routine for this architecture. This takes one integer and returns whether or not that is a Fibonacci number. However, from the memory only an additional 1K temporary space is allowed to use during the execution. That means: even if I simulate very great integers, I can't just build up the sequence and interate through.
As far as I know, a positive integer is a exactly then a Fibonacci number if one of
(5n ^ 2) + 4
or
(5n ^ 2) − 4
is a perfect square. Therefore I responded the question: it is simple, since the routine must determine whether or not that is the case.
They responded then: on the current target architecture no floating-point-like operations are supported, therefore no square root numbers can be retrieved by using the stdlib's sqrt function. It was also mentioned that basic operations like division and modulus may also not work because of the architecture's limitations.
Then I said, okay, we may build an array with the square numbers till 256. Then we could iterate through and compare them to the numbers given by the formulas (see above). They said: this is a bad approach, even if it would work. Therefore they did not accept that answer.
Finally I gave up. Since I had no other ideas. I asked, what would be the solution: they said, it won't be told; but advised me to try to look for it myself. My first approach (the 2 formula) should be the key, but the square root may be done alternatively.
I googled at home a lot, but never found any "alternative" square root counter algorithms. Everywhere was permitted to use floating numbers.
For operations like division and modulus, the so-called "integer-division" may be used. But what is to be used for square root?
Even if I failed the interview test, this is a very interesting topic for me, to work on architectures where no floating-point operations are allowed.
Therefore my questions:
How can floating numbers simulated (if only integers are allowed to use)?
What would be a possible soultion in C for that mentioned problem? Code examples are welcome.
The point of this type of interview is to see how you approach new problems. If you happen to already know the answer, that is undoubtedly to your credit but it doesn't really answer the question. What's interesting to the interviewer is watching you grapple with the issues.
For this reason, it is common that an interviewer will add additional constraints, trying to take you out of your comfort zone and seeing how you cope.
I think it's great that you knew that fact about recognising Fibonacci numbers. I wouldn't have known it without consulting Wikipedia. It's an interesting fact but does it actually help solve the problem?
Apparently, it would be necessary to compute 5n²±4, compute the square roots, and then verify that one of them is an integer. With access to a floating point implementation with sufficient precision, this would not be too complicated. But how much precision is that? If n can be an arbitrary 32-bit signed number, then n² is obviously not going to fit into 32 bits. In fact, 5n²+4 could be as big as 65 bits, not including a sign bit. That's far beyond the precision of a double (normally 52 bits) and even of a long double, if available. So computing the precise square root will be problematic.
Of course, we don't actually need a precise computation. We can start with an approximation, square it, and see if it is either four more or four less than 5n². And it's easy to see how to compute a good guess: it will very close to n×√5. By using a good precomputed approximation of √5, we can easily do this computation without the need for floating point, without division, and without a sqrt function. (If the approximation isn't accurate, we might need to adjust the result up or down, but that's easy to do using the identity (n+1)² = n²+2n+1; once we have n², we can compute (n+1)² with only addition.
We still need to solve the problem of precision, so we'll need some way of dealing with 66-bit integers. But we only need to implement addition and multiplication of positive integers, is considerably simpler than a full-fledged bignum package. Indeed, if we can prove that our square root estimation is close enough, we could safely do the verification modulo 2³¹.
So the analytic solution can be made to work, but before diving into it, we should ask whether it's the best solution. One very common caregory of suboptimal programming is clinging desperately to the first idea you come up with even when as its complications become increasingly evident. That will be one of the things the interviewer wants to know about you: how flexible are you when presented with new information or new requirements.
So what other ways are there to know if n is a Fibonacci number. One interesting fact is that if n is Fib(k), then k is the floor of logφ(k×√5 + 0.5). Since logφ is easily computed from log2, which in turn can be approximated by a simple bitwise operation, we could try finding an approximation of k and verifying it using the classic O(log k) recursion for computing Fib(k). None of the above involved numbers bigger than the capacity of a 32-bit signed type.
Even more simply, we could just run through the Fibonacci series in a loop, checking to see if we hit the target number. Only 47 loops are necessary. Alternatively, these 47 numbers could be precalculated and searched with binary search, using far less than the 1k bytes you are allowed.
It is unlikely an interviewer for a programming position would be testing for knowledge of a specific property of the Fibonacci sequence. Thus, unless they present the property to be tested, they are examining the candidate’s approaches to problems of this nature and their general knowledge of algorithms. Notably, the notion to iterate through a table of squares is a poor response on several fronts:
At a minimum, binary search should be the first thought for table look-up. Some calculated look-up approaches could also be proposed for discussion, such as using find-first-set-bit instruction to index into a table.
Hashing might be another idea worth considering, especially since an efficient customized hash might be constructed.
Once we have decided to use a table, it is likely a direct table of Fibonacci numbers would be more useful than a table of squares.
In some situations, one generally uses a large enough integer value to represent infinity. I usually use the largest representable positive/negative integer. That usually yields more code, since you need to check if one of the operands is infinity before virtually all arithmetic operations in order to avoid overflows. Sometimes it would be desirable to have saturated integer arithmetic. For that reason, some people use smaller values for infinity, that can be added or multiplied several times without overflow. What intrigues me is the fact that it's extremely common to see (specially in programming competitions):
const int INF = 0x3f3f3f3f;
Why is that number special? It's binary representation is:
00111111001111110011111100111111
I don't see any specially interesting property here. I see it's easy to type, but if that was the reason, almost anything would do (0x3e3e3e3e, 0x2f2f2f2f, etc). It can be added once without overflow, which allows for:
a = min(INF, b + c);
But all the other constants would do, then. Googling only shows me a lot of code snippets that use that constant, but no explanations or comments.
Can anyone spot it?
I found some evidence about this here (original content in Chinese); the basic idea is that 0x7fffffff is problematic since it's already "the top" of the range of 4-byte signed ints; so, adding anything to it results in negative numbers; 0x3f3f3f3f, instead:
is still quite big (same order of magnitude of 0x7fffffff);
has a lot of headroom; if you say that the valid range of integers is limited to numbers below it, you can add any "valid positive number" to it and still get an infinite (i.e. something >=INF). Even INF+INF doesn't overflow. This allows to keep it always "under control":
a+=b;
if(a>INF)
a=INF;
is a repetition of equal bytes, which means you can easily memset stuff to INF;
also, as #Jörg W Mittag noticed above, it has a nice ASCII representation, that allows both to spot it on the fly looking at memory dumps, and to write it directly in memory.
I may or may not be one of the earliest discoverers of 0x3f3f3f3f. I published a Romanian article about it in 2004 (http://www.infoarena.ro/12-ponturi-pentru-programatorii-cc #9), but I've been using this value since 2002 at least for programming competitions.
There are two reasons for it:
0x3f3f3f3f + 0x3f3f3f3f doesn't overflow int32. For this some use 100000000 (one billion).
one can set an array of ints to infinity by doing memset(array, 0x3f, sizeof(array))
0x3f3f3f3f is the ASCII representation of the string ????.
Krugle finds 48 instances of that constant in its entire database. 46 of those instances are in a Java project, where it is used as a bitmask for some graphics manipulation.
1 project is an operating system, where it is used to represent an unknown ACPI device.
1 project is again a bitmask for Java graphics.
So, in all of the projects indexed by Krugle, it is used 47 times because of its bitpattern, once because of its ASCII interpretation, and not a single time as a representation of infinity.
If I want to check that positive float A is less than the inverse square of another positive float B (in C99), could something go wrong if B is very small?
I could imagine checking it like
if(A<1/(B*B))
But if B is small enough, would this possibly result in infinity? If that were to happen, would the code still work correctly in all situations?
In a similar vein, I might do
if(1/A>B*B)
... which might be slightly better because B*B might be zero if B is small (is this true?)
Finally, a solution that I can't imagine being wrong is
if(sqrt(1/A)>B)
which I don't think would ever result in zero division, but still might be problematic if A is close to zero.
So basically, my questions are:
Can 1/X ever be infinity if X is greater than zero (but small)?
Can X*X ever be zero if X is greater than zero?
Will comparisons with infinity work the way I would expect them to?
EDIT: for those of you who are wondering, I ended up doing
if(B*A*B<1)
I did it in that order as it is visually unambiguous which multiplication occurs first.
If you want to handle the entire range of possible values of A and B, then you need to be a little bit careful, but this really isn't too complicated.
The suggestion of using a*b*b < 1. is a good one; if b is so tiny that a*b*b underflows to zero, then a is necessarily smaller than 1./(b*b). Conversely, if b is so large that a*b*b overflows to infinity, then the condition will (correctly) not be satisfied. (Potatoswatter correctly points out in a comment on another post that this does not work properly if you write it b*b*a, because b*b might overflow to infinity even when the condition should be true, if a happens to be denormal. However, in C, multiplication associates left-to-right, so that is not an issue if you write it a*b*b and your platform adheres to a reasonable numerics model.)
Because you know a priori that a and b are both positive numbers, there is no way for a*b*b to generate a NaN, so you needn't worry about that condition. Overflow and underflow are the only possible misbehaviors, and we have accounted for them already. If you needed to support the case where a or b might be zero or infinity, then you would need to be somewhat more careful.
To answer your direct questions: (answers assume IEEE-754 arithmetic)
Can 1/X ever be infinity if X is greater than zero (but small)?
Yes! If x is a small positive denormal value, then 1/x can overflow and produce infinity. For example, in double precision in the default rounding mode, 1 / 0x1.0p-1024 will overflow.
Can X*X ever be zero if X is greater than zero?
Yes! In double precision in the default rounding mode, all values of x smaller than 0x1.0p-538 (thats 2**-578 in the C99 hex format) or so have this property.
Will comparisons with infinity work the way I would expect them to?
Yes! This is one of the best features of IEEE-754.
OK, reposting as an answer.
Try using arithmetically equivalent comparison like if ( A*B*B < 1. ). You might get in trouble with really big numbers though.
Take a careful look at the IEEE 754 for your corner cases.
You want to avoid divisions so the trick is to modify the equation. You can multiply both sides of your first equation by (b*b) to get:
b*b*a < 1.0
This won't have any divisions so should be ok.
Division per se isn't so bad. However, standard IEEE 754 FP types allow for a greater negative negative range of exponents than positive, due to denormalized numbers. For example, float ranges from 1.4×10-45 to 3.4×10-38, so you cannot take the inverse of 2×10-44.
Therefore, as Jeremy suggests, start by multiplying A by B, where one has a positive exponent and the other has a negative exponent, to avoid overflow.
This is why A*B*B<1 is the proper answer.