I am having some trouble determining space and time complexities. For example, if I have a tree that has a branching factor b and will have at most a depth d, how can I calculate the time and space complexities? I know they are O(b^d) and O(bd), but my problem is how to get to those values.
Time
All the nodes in the tree have to be generated once at some point, and the assumption is that it costs a constant time c to generate a node (the constant can vary per node; just pick c to be the largest such time). The order is determined by the algorithm and ensures that nodes don't have to be repeatedly expanded.
nodes        b=2                    b=3
b^0           *                      *
            /   \                /   |   \
b^1       *       *          *       *       *
         / \     / \       / | \   / | \   / | \
b^2     *   *   *   *      * * *   * * *   * * *
            ...                     ...
As you can see in the figure, it costs c*b^0 to generate the first level - exactly c. The next level in the tree will contain b^1 nodes, and it costs c*b^1 = c*b to generate the second level. For the third level there will again be b nodes for every node in the second level, which means b*b^1 = b^2 nodes and a cost of c*b^2.
At the deepest level of the tree, at depth d, there will be b^d nodes, so the work at that level is c*b^d. The total amount of work done to this point is c*b^0 + c*b^1 + ... + c*b^d. For the complexity we only look at the fastest-rising term and drop the constant, so we get:
O(c + c*b + ... + c*b^d) = O(c*b^d) = O(b^d).
In essence: the time is a function f(d) = SUM(i=0..d){c*b^i}, and O(f(d)) = O(b^d).
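To make the geometric growth concrete, here is a small sketch (not part of the original answer; b and d are arbitrary example values) that counts the nodes generated per level:

#include <stdio.h>

int main(void) {
    const int b = 3, d = 4;        /* example branching factor and depth */
    long long level = 1;           /* b^0 nodes at the root level */
    long long total = 0;
    for (int i = 0; i <= d; i++) {
        total += level;            /* generating level i costs c * b^i */
        printf("level %d: %lld nodes, %lld generated so far\n", i, level, total);
        level *= b;
    }
    /* total = (b^(d+1) - 1) / (b - 1), which is dominated by the b^d term */
    return 0;
}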
Space
The figure shows the algorithm at different stages for b=3. * indicates currently expanded nodes, ? indicates unknown nodes, and + indicates nodes whose score has been fully calculated.
branching factor b = 3                                space
    *             *             *             *         b
  / | \         / | \         / | \         / | \
*   ?   ?     *   ?   ?     +   *   ?     +   +   *     b
  / | \         / | \         / | \         / | \
*   ?   ?     +   +   *     +   *   ?     +   +   *     b
  / | \         / | \         / | \         / | \
*   ?   ?     +   *   ?     +   *   ?     +   +   *     b
In order to calculate the score of a node, you expand the node, pick a child and recursively expand it until you reach a leaf node at depth d. Once a child node is fully calculated, you move on to the next child node. Once all b child nodes are calculated, the parent's score is calculated from the children, and at that point the child nodes can be removed from storage. This is illustrated in the figure above, where the algorithm is shown at 4 different stages.
At any time you have one path expanded, and you need c*b storage to store all the child nodes at every level along it. Here again the assumption is that you need a constant amount of space per node. The key is that any subtree can be summarised by its root. Since the maximal length of a path is d, you will need at most c*b*d space. As above we can drop constant terms and we get O(c*b*d) = O(b*d).
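As a concrete illustration of that space bound, here is a minimal sketch (my own, not the answer's algorithm): a depth-first evaluation where each recursion level keeps only the scores of its b children, so at most b*d values are alive at once. The leaf_score function and the use of max as the "score" are placeholder assumptions.

#include <stdio.h>

#define B 3   /* branching factor (example value) */
#define D 4   /* maximum depth   (example value) */

/* Hypothetical leaf evaluation: anything deterministic will do for the sketch. */
static int leaf_score(int path_id) { return path_id % 7; }

/* Returns the score of a node. Only B ints per recursion level are alive,
   and the recursion never goes deeper than D, so space is O(B*D). */
static int score(int depth, int path_id) {
    if (depth == D)
        return leaf_score(path_id);
    int child[B];                       /* the c*b storage for this level */
    for (int i = 0; i < B; i++)
        child[i] = score(depth + 1, path_id * B + i);
    int best = child[0];                /* summarise the subtree by its root */
    for (int i = 1; i < B; i++)
        if (child[i] > best) best = child[i];
    return best;
}

int main(void) {
    printf("root score: %d\n", score(0, 0));
    return 0;
}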
Space complexity amounts to "how much memory will I need to allocate for this algorithm".
Time complexity amounts to "how long will it take to execute" (in an abstract sense).
A tree with branching factor b and depth d will have one node at its zeroth level, b nodes at its first level, b*b = b^2 nodes at its second level, b^2 * b = b^3 at its third level, etc. In those four levels (depth 3) it has 1 + b + b^2 + b^3 nodes. In terms of complexity we only keep the highest-order term and usually drop any multiplying constants. So you end up with O(b^d) for the space complexity.
Now for time complexity, what you're counting is not the number of nodes, but rather the number of loops or recursive calls your algorithm will take to complete (worst case).
I'm going to go out on a limb and assume you're talking about IDDFS. The explanation of where the O(b^d) and O(bd) come from is nicely explained in this wiki article.
Related
It is my first time posting, but I'll start by apologizing in advance if this question has been asked before.
I have been struggling with how to implement a 3rd order polynomial formula in C because of either extremely small values or larger-than-32-bit results (on a 16-bit MCU).
I use different values, but as an example I would like to compute "Y" in the formula:
Y = ax^3 + bx^2 + cx + d = 0.00000012*(1024^3) + 0.000034*(1024^2) + 0.056*(1024) + 789.10
I need to use a 2^32 scale ("base32") to get a meaningful integer value for "a": 0.00000012 * 2^32 ~= 515.
If I compute 1024^3 (10-bit ADC full scale) then I already get a very large value, 1,073,741,824.
I tried splitting the formula up into "terms A, B, C, and D", but I am not sure how to merge them together because of the different resolution of each term and the limitations of my 16-bit MCU:
u16_TermA = fnBase32(0.00000012) * AdcMax * AdcMax * AdcMax;
u16_TermB = fnBase24(0.000034) * AdcMax * AdcMax;
u16_TermC = fnBase16(0.056) * AdcMax;
u16_TermD = fnBase04(789.10);
u16_Y = u16_TermA + u16_TermB + u16_TermC + u16_TermD;
/* AdcMax is a variable 0-1024; u16_Y needs to be 16bit */
I'd appreciate any help on the matter and on how best to implement this style of computations in C.
Cheers and thanks in advance!
One step toward improvement:
ax^3 + bx^2 + cx + d --> ((a*x + b)*x + c)*x + d
It is numerically more stable, tends to provide more accurate answers near the zeros of the function, and is less likely to overflow in intermediate calculations.
Second idea: consider scaling the coefficients, if they maintain their approximate relative values as given in the question.
N = 1024; // Some power of 2
aa = a*N*N*N
bb = b*N*N
cc = c*N
y = ((aa*x/N + bb)*x/N + cc)*x/N + d
where /N is done quickly with a shift.
With a judicious selection of N (maybe 2**14 for high precision while avoiding 32-bit overflow), the entire computation might be satisfactorily done using only integer math.
As aa*x/N is just a*x*N*N, I think a scale of 2**16 works well.
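A minimal sketch of the scaled-Horner idea in C, using N = 1024 and the coefficients from the question. The rounded integer constants (129, 36, 57, 789) and the int32_t intermediates are my assumptions, not part of the answer:

/* aa = a*N*N*N, bb = b*N*N, cc = c*N, rounded to integers:
   0.00000012*N^3 ~= 129, 0.000034*N^2 ~= 36, 0.056*N ~= 57 */
#include <stdint.h>
#include <stdio.h>

#define N 1024                 /* scale factor, a power of 2, so /N is a shift */

int main(void) {
    const int32_t aa = 129, bb = 36, cc = 57, dd = 789;

    for (uint16_t x = 0; x <= 1024; x += 256) {
        /* all intermediates stay well inside 32 bits for x <= 1024 */
        int32_t y = ((aa * x / N + bb) * x / N + cc) * x / N + dd;
        printf("x = %4u  y ~= %ld\n", (unsigned)x, (long)y);   /* y fits in 16 bits here */
    }
    return 0;
}

At x = 1024 this prints y ~= 1011, close to the floating-point value of the question's polynomial at that point.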
Third idea:
In addition to scaling, often such cubic equations can be re-written as
// alpha is a power of 2
y = (x-root1)*(x-root2)*(x-root3)*scale/alpha
Rather than a,b,c, use the roots of the equation. This is very satisfactory if the genesis of the equation was some sort of curve fitting.
Unfortunately, OP's equation has a complex root pair.
x1 = -1885.50539
x2 = 801.08603 + i * 1686.95936
x3 = 801.08603 - i * 1686.95936
... in which case code could use
B = -(x1 + x2);
C = x1 * x2;
y = (x-x1)*(x*x + B*x + C)*scale/alpha
I want to compress 50,000 bytes as well as possible; they are partly sorted.
That means there are 256 increasing runs of bytes like 0,0,3,4,5,6,6,9,....,250.
Furthermore there are ca. 630,000,000 inversions. I calculated the inversions by using bubble sort and counting the decreasing pairs.
Each byte value is present equally often.
The run lengths differ by about 20%.
Currently I use delta compression for the runs and Huffman coding, and I get ca. 14,000 bytes out. How is it possible to compress it further?
Example Link: http://pastebin.com/raw/we57yZUw
The file contains the bytes, comma separated, in a list. No encoding.
Short answer: you've already come impressively close to the theoretical limit. You can't do much better unless there is some information about these sequences that you haven't shared with us. And by "much better": you can't actually get down to 14k, let alone 10k.
Let's start by looking at how many such sequences there could be.
We can turn your sequence into a non-decreasing sequence in the range 0-65,535 by adding 256 times the number of completed runs to each number. That is, instead of "wrapping around", you just keep climbing. This is a one-to-one correspondence, though admittedly some (actually a lot) of the non-decreasing sequences will not correspond to sequences that you are looking to encode. Please note that fact; I will return to it later.
How many non-decreasing sequences of length 50,000 in the range 0-65,535 are there? Well, those sequences are in a one-to-one correspondence with sequences of 115,535 bits, 50,000 of which are zero. The correspondence is that you replace each 0 bit with a count of how many 1 bits happened before it to get the numbers of the sequence.
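A tiny sketch of that correspondence on a toy input (my own illustration, not part of the counting argument): a non-decreasing sequence v of length n with values in 0..m maps to a string of n zeros and m ones, where the i-th zero is preceded by exactly v[i] ones.

#include <stdio.h>

int main(void) {
    /* toy non-decreasing sequence with values in 0..m */
    const int v[] = {0, 2, 2, 3, 5};
    const int n = sizeof v / sizeof v[0];
    const int m = 5;

    /* encode: before each 0 bit, emit enough 1 bits to "climb" up to v[i] */
    int ones = 0;
    printf("bits: ");
    for (int i = 0; i < n; i++) {
        while (ones < v[i]) { putchar('1'); ones++; }
        putchar('0');                           /* the i-th zero encodes v[i] */
    }
    while (ones < m) { putchar('1'); ones++; }  /* trailing ones up to m */
    printf("   (n = %d zeros, m = %d ones)\n", n, m);
    return 0;
}

The output has exactly n + m bits, which is where the "115,535 bits, 50,000 of which are zero" figure comes from.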
The number of sequences of 115,535 bits, 50,000 of which are zero, is 115,535 choose 50,000. Which is 115535! / (50000! * 65535!). This is a rather large number, but we can estimate its log using Stirling's formula that log_2(n!) = n log_2(n) - n * log_2(e) + log_2(pi * n / 2) + O(log(n)). To that end:
log_2(115535) = 16.8179704401675
log_2(65535) = 15.9999779860527
log_2(50000) = 15.6096404744368
log_2(pi * 115535 / 2) = 17.4694665696399
log_2(pi * 65535 / 2) = 16.6514741155252
log_2(pi * 50000 / 2) = 16.2611366039092
log_2(e) = 1.44269504088896
And now, please double check for calculation mistakes,
log_2(115535 choose 50000)
= log_2(115535!) - log_2(65535!) - log_2(50000!)
= 115535 * log_2(115535) - 115535 * log_2(e) + log_2(pi * 115535 / 2) + O(log(115535))
- (65535 * log_2(65535) - 65535 * log_2(e) + log_2(pi * 65535 / 2) + O(log(65535)))
- (50000 * log_2(50000) - 50000 * log_2(e) + log_2(pi * 50000 / 2) + O(log(50000)))
= 115535 * 16.8179704401675 - 115535 * 1.44269504088896 + 17.4694665696399 + O(log(115535))
- (65535 * 15.9999779860527 - 65535 * 1.44269504088896 + 16.6514741155252 + O(log(115535)))
- (50000 * 15.6096404744368 - 50000 * 1.44269504088896 + 16.2611366039092 + O(log(115535)))
= 1776399.91272222 - 954028.189285421 - 708363.532813996 + O(16.8179704401675)
= 114008.190622803 + O(16.8179704401675)
If you wish to expand a couple of terms of the Stirling series, you'll find that 114008.190622803 is off from the true log by something less than 1.
Therefore you require somewhere around 114008 bits to specify one such sequence, and that takes 14251 bytes. Which means that if you're managing to compress it to the 14k range, you are very close to the theoretical limit. Unless you have more information about your sequences than you have provided, you can't do better.
BUT I'm cheating. You specified that the length of the runs varied by no more than about 20%. I'm including ones where the length of the runs not only varies, but even where some runs can be skipped entirely! How big a deal is that?
Well from a simulation that I just ran, a run gets within 10% of the median over 50% of the time. We have 256 runs. Their lengths are not actually independent, but long runs are correlated with short ones and vice versa, so with odds better than 1/2^256, we will randomly satisfy the length requirement. This means that the length condition saves no more than 32 bytes off of the theoretical best. Given that we're talking around 14k of data already, this isn't a significant improvement.
I want to implement A* and I looked to Wikipedia for a reference.
It looks like it can fail in the following case. Consider three nodes, A, B, and C.
START -> A -> C -> GOAL
  |      ^
  \-> B -/
The path costs are:
START -> A : 10
START -> B : 1
B -> A : 1
A -> C : 100
C -> GOAL : 100
Clearly the solution is START -> B -> A -> C -> GOAL but what happens if the heuristic lets us expand A before expanding B?
Our heuristic costs are as follows (note these are all underestimates)
A -> GOAL : 10
B -> GOAL : 50
When A is expanded, the true cost to C will turn out to be higher than B's heuristic cost, and so B will be expanded before C.
Fine, right?
The problem I see is that when we expand B and replace the datum "A comes from START with cost 10" with "A comes from B with cost 2", we aren't also updating "C comes from A with cost 110" to "C comes from A with cost 102". There is nothing in Wikipedia's pseudocode that looks like it will forward-propagate the cheaper path. Now imagine another node D which can reach C with cost 105; it will erroneously override "C comes from A with cost 110".
Am I reading this wrong or does Wikipedia need to be fixed?
If you are using graph search, i.e. you remember which nodes you visit and don't allow revisiting them, then your heuristic is not consistent. The article says that for a heuristic to be consistent, the following needs to hold:
h(x) <= d(x, y) + h(y) for all adjacent nodes x, y
In your case the assumption h(B) = 50 is inconsistent, as d(B, A) + h(A) = 1 + 10 = 11 < 50. Hence your heuristic is inconsistent and A* wouldn't work in this case, as you rightly noticed and as is also mentioned in the wikipedia article: http://en.wikipedia.org/wiki/A%2a_search_algorithm#Properties.
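For illustration only (not from the answer), here is a small check of that condition over the question's edges. h(START), h(C) and h(GOAL) are not given in the question, so 0 is used as a placeholder assumption:

#include <stdio.h>

enum { START, A, B, C, GOAL, N };

int main(void) {
    const char *name[N] = {"START", "A", "B", "C", "GOAL"};
    /* edges from the question: from, to, cost */
    const int edge[][3] = {
        {START, A, 10}, {START, B, 1}, {B, A, 1}, {A, C, 100}, {C, GOAL, 100}
    };
    /* heuristic from the question; missing values assumed 0 */
    const int h[N] = {0, 10, 50, 0, 0};

    /* report every edge x -> y that violates h(x) <= d(x, y) + h(y) */
    for (unsigned i = 0; i < sizeof edge / sizeof edge[0]; i++) {
        int x = edge[i][0], y = edge[i][1], d = edge[i][2];
        if (h[x] > d + h[y])
            printf("inconsistent: h(%s)=%d > d+h(%s)=%d\n",
                   name[x], h[x], name[y], d + h[y]);
    }
    return 0;
}

Running this reports exactly one violation, the B -> A edge, which is the inconsistency discussed above.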
If you are using tree search, i.e. you allow the algorithm to revisit the nodes, the following will happen:
Add A and B to the queue, score(A) = 10 + 10 = 20, score(B) = 1 + 50 = 51.
Pick A from queue as it has smallest score. Add C to the queue with score(C) = 10 + 100 + h(C).
Pick B from the queue as it is now the smallest. Add A to the queue with score(A) = 2 + 10 = 12.
Pick A from the queue as it is now again the smallest. Notice that we are using the tree search algorithm, so we can revisit nodes. Add C to the queue with score(C) = 1 + 1 + 100 + h(C).
Now we have 2 elements in the queue, C via A with score 110 + h(C) and C via B and A with score 102 + h(C), so we pick the correct path to C via B and A.
The wikipedia pseudocode is the first case, i.e. graph search. And they indeed state right under the pseudocode that:
Remark: the above pseudocode assumes that the heuristic function is monotonic (or consistent, see below), which is a frequent case in many practical problems, such as the Shortest Distance Path in road networks. However, if the assumption is not true, nodes in the closed set may be rediscovered and their cost improved. In other words, the closed set can be omitted (yielding a tree search algorithm) if a solution is guaranteed to exist, or if the algorithm is adapted so that new nodes are added to the open set only if they have a lower f value than at any previous iteration.
An RPC server receives millions of requests a day. Each request i takes processing time Ti to process. We want to find the 65th percentile processing time (when processing times are sorted by value in increasing order) at any moment. We cannot store the processing times of all past requests, as the number of requests is very large. So the answer need not be the exact 65th percentile; you can give an approximate answer, i.e. a processing time that will be around the exact 65th percentile.
Hint: it has something to do with how a histogram (i.e. an overview) is stored for very large data without storing all of the data.
Take one day's data. Use it to figure out what size to make your buckets (say one day's data shows that the vast majority (95%?) of your data is within 0.5 seconds of 1 second; ridiculous values, but hang in).
To get 65th percentile, you'll want at least 20 buckets in that range, but be generous, and make it 80. So you divide your 1 second window (-0.5 seconds to +0.5 seconds) into 80 buckets by making each 1/80th of a second wide.
Each bucket is 1/80th of 1 second wide. Bucket 0 covers (center - deviation) = (1 - 0.5) = 0.5 s up to 0.5 + 1/80 s. Bucket 1 covers 0.5 + 1/80 s up to 0.5 + 2/80 s, and so on.
For every value, find out which bucket it falls in, and increment a counter for that bucket.
To find 65th percentile, get the total count, and walk the buckets from zero until you get to 65% of that total.
Whenever you want to reset, set the counters all to zero.
If you always want to have good data available, keep two of these, and alternate resetting them, using the one you reset least recently as having more useful data.
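A minimal sketch of this histogram approach; the 0.5 s to 1.5 s window, the 80 buckets and the synthetic "requests" are illustrative assumptions, not part of the answer:

#include <stdio.h>

#define NBUCKETS 80
static const double lo = 0.5, hi = 1.5;          /* window chosen from one day's data */
static long bucket[NBUCKETS];
static long total;

static void record(double t) {                   /* one processing time */
    int k = (int)((t - lo) / (hi - lo) * NBUCKETS);
    if (k < 0) k = 0;
    if (k >= NBUCKETS) k = NBUCKETS - 1;         /* clamp outliers into the edge buckets */
    bucket[k]++;
    total++;
}

static double percentile(double p) {             /* p in 0..1, e.g. 0.65 */
    long need = (long)(p * total), seen = 0;
    for (int k = 0; k < NBUCKETS; k++) {
        seen += bucket[k];
        if (seen >= need)                        /* report the bucket's upper edge */
            return lo + (k + 1) * (hi - lo) / NBUCKETS;
    }
    return hi;
}

int main(void) {
    for (int i = 0; i < 10000; i++)              /* fake "requests" */
        record(lo + (i % 1000) / 1000.0);
    printf("~65th percentile: %.3f s\n", percentile(0.65));
    return 0;
}

Resetting, or keeping two alternating copies as suggested above, just means zeroing bucket[] and total.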
Use an updown filter:
if q < x:
    q += .01 * (x - q)   # up a little
else:
    q += .005 * (x - q)  # down a little
Here a quantile estimator q tracks the x stream,
moving a little towards each x.
If both factors were .01, it would move up as often as down,
tracking the 50th percentile.
With .01 up, .005 down, it floats up to the 67th percentile;
in general, it tracks the up / (up + down)-th percentile.
Bigger up/down factors track faster but noisier --
you'll have to experiment on your real data.
(I have no idea how to analyze updowns, would appreciate a link.)
The updown() below works on long vectors X, Q in order to plot them:
#!/usr/bin/env python
import sys
import numpy as np
import pylab as pl

def updown( X, Q, up=.01, down=.01 ):
    """ updown filter: running ~ up / (up + down) th percentile
        here vecs X in, Q out to plot
    """
    q = X[0]
    for j, x in np.ndenumerate(X):
        if q < x:
            q += up * (x - q)    # up a little
        else:
            q += down * (x - q)  # down a little
        Q[j] = q
    return q

#...............................................................................
if __name__ == "__main__":
    N = 1000
    up = .01
    down = .005
    plot = 0
    seed = 1
    exec( "\n".join( sys.argv[1:] ))  # python this.py N= up= down=
    np.random.seed(seed)
    np.set_printoptions( 2, threshold=100, suppress=True )  # .2f
    title = "updown random.exponential: N %d up %.2g down %.2g" % (N, up, down)
    print( title )
    X = np.random.exponential( size=N )
    Q = np.zeros(N)
    updown( X, Q, up=up, down=down )
    # M = np.zeros(N)
    # updown( X, M, up=up, down=up )
    print( "last 10 Q:", Q[-10:] )
    if plot:
        fig = pl.figure( figsize=(8,3) )
        pl.title(title)
        x = np.arange(N)
        pl.plot( x, X, "," )
        pl.plot( x, Q )
        pl.ylim( 0, 2 )
        png = "updown.png"
        print( "writing", png, file=sys.stderr )
        pl.savefig( png )
        pl.show()
An easier way to get the value that represents a given percentile of a list or array is the scoreatpercentile function in the scipy.stats module.
>>> import scipy.stats as ss
>>> ss.scoreatpercentile(v, 65)
There's a sibling, percentileofscore, that returns the percentile given the value.
You will need to store a running sum and a total count.
Then check out standard deviation calculations.
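One way to read this (an interpretation, not something the answer spells out): keep a running count, sum and sum of squares, and estimate the 65th percentile as mean + z * stddev with z ~ 0.385, which only holds if the processing times are roughly normally distributed.

#include <math.h>
#include <stdio.h>

static double n, sum, sumsq;

static void record(double t) { n++; sum += t; sumsq += t * t; }

static double approx_p65(void) {
    double mean = sum / n;
    double var  = sumsq / n - mean * mean;       /* running variance */
    return mean + 0.385 * sqrt(var > 0 ? var : 0);
}

int main(void) {
    /* fake, roughly uniform data; for non-normal data the estimate is biased */
    for (int i = 1; i <= 1000; i++) record(i / 1000.0);
    printf("~65th percentile: %.3f\n", approx_p65());
    return 0;
}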
I need to find 2 elements in an unsorted array such that the difference between them is less than or equal to (Maximum - Minimum)/(number of elements in the array).
In O(n).
I know the max and min values.
Can anyone think of something?
Thank you!
Step 1: Use Bucket Sort. Don't sort the individual buckets.
Should be pretty obvious what to do from here, and how to size the buckets.
Number of buckets = 2n.
Bucket k holds values in the range min + k*((max-min)/2n) <= value < min + (k+1)*((max-min)/2n), for 0 <= k < 2n.
Range of each bucket = (max-min)/2n.
Assign each element to a bucket. Don't sort inside the buckets.
If any bucket has more than 1 element, the maximum possible difference between them is (max-min)/2n. Hence you have your answer.
If any two consecutive buckets have more than zero elements each, the maximum difference between them is ((max-min)/2n)*2 = (max-min)/n. Hence you have your answer.
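A sketch of this bucketing idea; the array contents are my example, and it assumes max > min and that a qualifying pair actually exists:

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    double a[] = {7.0, 1.1, 4.2, 9.0, 1.15, 6.3};
    int n = sizeof a / sizeof a[0];

    double min = a[0], max = a[0];
    for (int i = 1; i < n; i++) {
        if (a[i] < min) min = a[i];
        if (a[i] > max) max = a[i];
    }

    int nb = 2 * n;                      /* 2n buckets, each (max-min)/2n wide */
    int *idx = malloc(nb * sizeof *idx); /* index of one element per bucket */
    for (int k = 0; k < nb; k++) idx[k] = -1;

    for (int i = 0; i < n; i++) {
        int k = (int)((a[i] - min) / (max - min) * nb);
        if (k == nb) k = nb - 1;         /* max lands in the last bucket */
        if (idx[k] >= 0) {               /* same bucket: diff <= (max-min)/2n */
            printf("pair: %g %g\n", a[idx[k]], a[i]);
            free(idx);
            return 0;
        }
        idx[k] = i;
    }
    for (int k = 0; k + 1 < nb; k++) {   /* adjacent buckets: diff <= (max-min)/n */
        if (idx[k] >= 0 && idx[k + 1] >= 0) {
            printf("pair: %g %g\n", a[idx[k]], a[idx[k + 1]]);
            break;
        }
    }
    free(idx);
    return 0;
}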
The correct question should be:
In an array A = [a_0, a_1, ..., a_{n-1}], find two elements a, b such that |a - b| <= (M - m)/n, where M = max(A) and m = min(A).
The solution I'll suggest uses quickSelect, with a time complexity of O(n) in expectation; its actual worst case is O(n^2). This is a tradeoff: most of the time it is O(n), and it only demands O(1) extra space (if quickSelect is implemented iteratively and my pseudo-code is implemented with a while loop instead of recursion).
main idea:
At each iteration we find the median using quickSelect. If |max - medianValue| > |min - medianValue|, we know that we should search the left side of the array: both sides contain the same number of elements, but the median value is closer to the minimum, so there should be elements with a smaller difference between them on that side. Otherwise we should search the right side.
Each time we do that, the median becomes the new maximum or minimum of the sub-array.
We continue the search, each time halving the array's size.
proof of runtime in expectation:
Assume an iteration over k elements takes c*k + d time in expectation. The array size halves on every iteration, so over the roughly log_2(n) iterations the total expected work is at most:
(c*n + d) + (c*n/2 + d) + (c*n/4 + d) + ... <= c*n*(1 + 1/2 + 1/4 + ...) + d*log_2(n) <= 2*c*n + d*log_2(n)
meaning we have O(n) in expectation.
pseudo-code using recursion:
run(arr):
    M = max(arr)
    m = min(arr)
    return findPairBelowAverageDiff(arr, 0, arr.length, m, M)

findPairBelowAverageDiff(arr, start, end, min, max):
    if start + 1 < end:
        medianPos = start + (end - start) / 2
        // move the median of arr[start..end) to medianPos
        quickSelect(arr, start, medianPos, end)
        if max - arr[medianPos] > arr[medianPos] - min:
            return findPairBelowAverageDiff(arr, start, medianPos, min, arr[medianPos])
        else:
            return findPairBelowAverageDiff(arr, medianPos, end, arr[medianPos], max)
    else:
        return (arr[start], arr[start + 1])