The most efficient way to guess the number in range of x? - theory

I have a task where I need to guess a number in a range. For example, let's say n is in the range 1 … 10^8. What is the most efficient algorithm to guess the number?

Let's use the guessing game as an example of the divide-and-conquer approach, where each guess halves the set of numbers that are still possible. To play, one person (player A) chooses a random number from n to m, and another person (player B) has to guess player A's number within "x" turns. After each guess, player A tells player B whether the chosen number is higher or lower than the guess. Divide and conquer tells you whether the number can be guaranteed within the given x turns, and gives the maximum number of guesses you will need to find the number.
Using the divide-and-conquer algorithm:
Let's look at the guessing game example from before, but this time we will use the divide-and-conquer algorithm to solve it.
Player A: I am thinking of a number from 1 to 100. Can you guess my number within 7 turns?
Player B: Sure, in 7 turns or fewer. Is your number 100/2 = 50?
Player A: Nope, guess a higher number.
Player B: Okay, is your number (100 + 50) / 2 = 75?
Player A: Nope, guess a higher number.
Player B: Okay, is your number (100 + 75) / 2 = 87.5, rounded up to 88?
Player A: Yes, congratulations, you guessed my number in 3 turns. You win!
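The halving strategy in the dialogue can be sketched in Python as below. This is a minimal illustration, not code from the original answer: the names `guess_number` and `max_turns` are made up here, and the sketch assumes the range is inclusive and that each wrong guess is excluded from the remaining range. Under those assumptions the worst case for a range of N numbers is floor(log2(N)) + 1 guesses, which works out to 7 for 1 to 100 and 27 for 1 to 10^8.

```python
def guess_number(secret, low, high):
    """Binary-search for `secret` in [low, high]; return (guess, turns used)."""
    turns = 0
    while low <= high:
        guess = (low + high) // 2      # midpoint of the remaining range
        turns += 1
        if guess == secret:
            return guess, turns
        if guess < secret:
            low = guess + 1            # "guess a higher number"
        else:
            high = guess - 1           # "guess a lower number"
    raise ValueError("secret is outside the given range")


def max_turns(count):
    """Worst-case number of guesses for a range holding `count` numbers."""
    # For count >= 1, bit_length() equals floor(log2(count)) + 1.
    return count.bit_length()


print(guess_number(88, 1, 100))   # same guesses as the dialogue: 50, 75, 88
print(max_turns(100), max_turns(10**8))
```

The first three guesses for the secret 88 are 50, 75 and 88, matching the dialogue.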

Related

calculate time complexity of given function in C

I need to calculate the time complexity of the f3 function:
My problem is that I can't work out how many times I can apply the sqrt() function to n until the result is lower than 1: n^(0.5^k) < 1
I can assume that the time complexity of sqrt() is O(1).
Any ideas how I can get the value of k out of n^(0.5^k) < 1? If I manage that, then I think evaluating the sum of the series n/2, (n^0.5)/2, (n^(0.5^2))/2, ... would be easier.
I will show the lower and upper bound.
First we compute the cost of g3.
Take for example, n = 2^16.
How many iterations we make in the for loop?
i=2^0, i=2^1, i=2^2, i=2^3... < 2^16
More or less, that would be 16 steps. So the cost of g3 is O(log(n)).
Now let's try to compute f3. Since it uses g3 inside the loop, the cost goes as follows:
log(n) + log(n^(1/2)) + log(n^(1/4)) + log(n^(1/8)) + ...
That's for sure greater than log(n), so we could take log(n) as the lower bound.
Now, in order to compute the upper bound we have to think, how many iterations does the loop do?
Take again 2^16 as an example:
2^16, (2^16)^(1/2), (2^16)^(1/4), (2^16)^(1/8), (2^16)^(1/16),
That turns out to be:
2^16, 2^8, 2^4, 2^2, 2^1
And in the next iteration we would stop because sqrt(2) rounds to 1.
So in general, if n = 2^(2^k), we make about k iterations. That's log(log(n)). Each iteration costs at most log(n), so we can take log(n)*log(log(n)) as the upper bound.
There is probably a tighter bound, but this should be pretty accurate.
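The counting argument above can be checked numerically. The sketch below is an assumption-laden illustration: the question's f3 and g3 are not shown, so `repeated_sqrt_cost` is a hypothetical stand-in that charges log2(n) per iteration (the cost the answer assigns to g3) and replaces n by its integer square root until it reaches 1. Because the per-iteration costs at least halve each time, their sum stays below 2*log2(n), which suggests that under these assumptions the log(n) lower bound is already tight up to a constant factor.

```python
from math import isqrt, log2


def repeated_sqrt_cost(n):
    """Return (iterations, summed log2 cost) for n, sqrt(n), sqrt(sqrt(n)), ..."""
    steps = 0
    total = 0.0
    while n > 1:
        steps += 1
        total += log2(n)   # assumed cost of one g3 call on the current n
        n = isqrt(n)       # integer square root, rounding down
    return steps, total


steps, total = repeated_sqrt_cost(2**16)
print(steps, total, 2 * log2(2**16))
```

For n = 2^16 the loop visits 2^16, 2^8, 2^4, 2^2 and 2^1, the same sequence listed in the answer.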

smallest subset of array whose sum is equal to key. Condition : Values can be used any number of times

I was asked this question in an interview.
Given a list of 'N' coins, their values being in an array A[], return the minimum number of coins required to sum to 'S' (you can use as many coins as you want). If it's not possible to sum to 'S', return -1.
Note that the same coin can be used multiple times.
Example:
Input #00:
Coin denominations: { 1,3,5 }
Required sum (S): 11
Output #00:
3
Explanation:
The minimum number of coins required is 3: 5 + 5 + 1 = 11.
Is there any better approach than sorting the array and starting from both ends?
This is the change-making problem.
A simple greedy approach, which you seem to be thinking of, won't always produce an optimal result. If you elaborate a bit on what exactly you mean by starting from both ends, I might be able to come up with a counter-example.
It has a dynamic programming approach, taken from here:
Let C[m] be the minimum number of coins of denominations d1,d2,...,dk needed to make change for m amount. In the optimal solution to making change for m amount, there must exist some first coin di, where di <= m. Furthermore, the remaining coins in the solution must themselves be the optimal solution to making change for m - di.
Thus, if di is the first coin in the optimal solution to making change for m amount, then C[m] = 1 + C[m - di], i.e. one di coin plus C[m - di] coins to optimally make change for m - di amount. We don't know which coin di is the first coin; however, we may check all k such possibilities (subject to the constraint that di <= m), and the value of the optimal solution must correspond to the minimum value of 1 + C[m - di], by definition.
Furthermore, when making change for 0, the value of the optimal solution is clearly 0 coins. We thus have the following recurrence.
C[p] = 0 if p = 0
C[p] = min(i: di <= p) {1 + C[p - di]} if p > 0
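A bottom-up version of this recurrence might look like the following Python sketch. The function name `min_coins` is made up for illustration, and the sketch assumes positive integer denominations and treats a coin equal to the remaining amount as usable. It returns -1 when the amount cannot be formed, as the question asks.

```python
def min_coins(coins, target):
    """Minimum number of coins (with repetition) summing to target, or -1."""
    INF = float("inf")
    # best[p] holds C[p]; C[0] = 0 and everything else starts as unreachable.
    best = [0] + [INF] * target
    for amount in range(1, target + 1):
        for coin in coins:
            if coin <= amount and best[amount - coin] + 1 < best[amount]:
                best[amount] = best[amount - coin] + 1
    return best[target] if best[target] != INF else -1


print(min_coins([1, 3, 5], 11))   # the example from the question
print(min_coins([1, 3, 4], 6))    # a case where largest-coin-first greedy is not optimal
```

The running time is O(S * N) for a target S and N denominations. With coins {1, 3, 4} and target 6, taking the largest coin first gives 4 + 1 + 1 (three coins), while the table finds 3 + 3 (two coins).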
Pathfinding algorithms (Dijkstra, A*, meet in the middle, etc.) could also be suitable for this, on a graph like this, where nodes are amounts and each edge adds one coin:
            0
         1/ | \5
         /  |3 \
        /   |   \
       1    3    5
    1/ |\5 1/|  ...
    /  |3 \/ |3
   /   |  /\ |
  2    4     6
....
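Since every edge in that graph costs exactly one coin, a plain breadth-first search already finds the shortest path from 0 to S. The following is a rough sketch of that idea, not the answerer's code: `min_coins_bfs` is a hypothetical name, and it assumes positive integer denominations.

```python
from collections import deque


def min_coins_bfs(coins, target):
    """Fewest coins reaching `target`, found by BFS over reachable amounts."""
    if target == 0:
        return 0
    seen = {0}
    queue = deque([(0, 0)])            # (amount reached, coins used so far)
    while queue:
        amount, used = queue.popleft()
        for coin in coins:
            nxt = amount + coin        # follow the edge labelled `coin`
            if nxt == target:
                return used + 1
            if nxt < target and nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, used + 1))
    return -1                          # target is not reachable


print(min_coins_bfs([1, 3, 5], 11))
```

Each amount is enqueued at most once, so the work is bounded by S times the number of denominations.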
Another way is recursive bisection. Say we cannot get the sum S with one coin: we then try to get the pairs of amounts (S/2, S/2)...(S-1, 1) recursively until we find a suitable coin or reach S=1.

Find the Element Occurring b times in an array of size n*k+b

Description
Given an Array of size (n*k+b) where n elements occur k times and one element occurs b times, in other words there are n+1 distinct Elements. Given that 0 < b < k find the element occurring b times.
My Attempted solutions
The obvious solution is to use hashing, but it will not work if the numbers are very large. Complexity is O(n).
Using a map to store the frequency of each element and then traversing the map to find the element occurring b times. As maps are implemented as height-balanced trees, the complexity will be O(n log n).
Both of my solutions were accepted, but the interviewer wanted a linear solution without hashing. The hint he gave was to make the height of the tree in which the frequencies are stored constant, but I have not been able to figure out the correct solution yet.
I want to know how to solve this problem in linear time without hashing?
EDIT:
Sample:
Input: n=2 b=2 k=3
Array: 2 2 2 3 3 3 1 1
Output: 1
I assume:
The elements of the array are comparable.
We know the values of n and k beforehand.
A solution O(n*k+b) is good enough.
Let the number occurring only b times be S. We are trying to find S in an array of size n*k+b.
Recursive step: Find the median element of the current array slice in linear time and partition the slice around it, as in quicksort. Let the median element be M.
After the recursive step you have an array where all elements smaller than M occur to the left of the first occurrence of M. All M elements are next to each other, and all elements larger than M are to the right of all occurrences of M.
Look at the index of the leftmost M and work out whether S<M or S>=M. Recurse on either the left slice or the right slice.
So you are doing a quicksort but descending into only one part of each division. You will recurse O(log N) times, but each time with 1/2, 1/4, 1/8, ... of the original array's size, so the total time will still be O(N), where N = n*k+b is the array length.
Clarification: Let's say n=20 and k=10. Then there are 21 distinct elements in the array, 20 of which occur 10 times and the last of which occurs, let's say, 7 times. I find the median element; let's say it is 1111. If S<1111, then the index of the leftmost occurrence of 1111 will be less than 11*10. If S>=1111, then the index will be equal to 11*10.
Full example: n = 4. k = 3. Array = {1,2,3,4,5,1,2,3,4,5,1,2,3,5}
After the first recursive step I find that the median element is 3 and the array is something like: {1,2,1,2,1,2,3,3,3,5,4,5,5,4}. There are 6 elements to the left of 3. 6 is a multiple of k=3, so each element there must occur 3 times. So S>=3. Recurse on the right side, and so on.
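The partition-and-descend idea can be sketched in Python as follows. This is a simplified variant under stated assumptions rather than the exact procedure above: `find_b_times` is a made-up name, the pivot is chosen at random instead of by a true linear-time median (so the linear bound holds in expectation only), and the partition builds new lists instead of working in place. It also checks the pivot's own count directly, which is a small shortcut on top of the "is the left size a multiple of k" test.

```python
import random


def find_b_times(arr, k):
    """Return the element whose count is not k, given all others occur k times."""
    items = list(arr)
    while items:
        pivot = random.choice(items)           # stand-in for a median pivot
        less = [x for x in items if x < pivot]
        greater = [x for x in items if x > pivot]
        equal_count = len(items) - len(less) - len(greater)
        if equal_count != k:
            return pivot                       # the pivot itself occurs b times
        # A side that contains S has a size that is not a multiple of k.
        items = less if len(less) % k != 0 else greater
    raise ValueError("no element with a count different from k")


print(find_b_times([1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3, 5], 3))
print(find_b_times([2, 2, 2, 3, 3, 3, 1, 1], 3))
```

Only comparisons are used, so this works for any comparable elements, not just small integers.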
An idea using cyclic groups.
To find the i-th bit of the answer, follow this procedure:
Count how many numbers in the array have the i-th bit set, and store the result as cnt.
If cnt % k is non-zero, then the i-th bit of the answer is set. Otherwise it is clear.
To find the whole number, repeat the above for every bit.
This solution is technically O((n*k+b)*log max N), where max N is the maximal value in the array, but because the number of bits is usually constant, this solution is linear in the array size.
No hashing is needed, and memory usage is O(log k * log max N).
Example implementation:
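from functools import reduce
from random import sample, shuffle

BITS = 10  # test values stay below 2**10 = 1024

def generate_test_data(n, k, b):
    # Draw n + 1 distinct values so the b-times element differs from the others.
    values = sample(range(1001), n + 1)
    k_rep, b_rep = values[:n], values[n:]
    numbers = k_rep * k + b_rep * b
    shuffle(numbers)
    print("k_rep:", k_rep)
    print("b_rep:", b_rep)
    return numbers

def solve(data, k):
    # cnts[i] counts how many numbers have bit i set.
    cnts = [0] * BITS
    for number in data:
        for i in range(BITS):
            cnts[i] += number >> i & 1
    # Rebuild the answer from the most significant bit down.
    return reduce(lambda a, c: 2 * a + (c % k > 0), reversed(cnts), 0)

# solve() must receive the same k that was used to generate the data.
print("Answer:", solve(generate_test_data(10, 15, 13), 15))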
In order to have a constant-height B-tree containing n distinct elements, with height h constant, you need z=n^(1/h) children per node: h=log_z(n), thus h=log(n)/log(z), thus log(z)=log(n)/h, thus z=e^(log(n)/h), thus z=n^(1/h).
Example: with n=1000000 and h=10, z=3.98, that is z=4.
The time to reach a node in that case is O(h.log(z)). Assuming h and z to be "constant" (since N=n.k, log(z)=log(n^(1/h))=log((N/k)^(1/h))=ct by properly choosing h based on k), you can then say that O(h.log(z))=O(1)... This is a bit far-fetched, but maybe that was the kind of thing the interviewer wanted to hear?
UPDATE: this one uses hashing, so it's not a good answer :(
In Python this would be linear time (set will remove the duplicates; // keeps the result an integer):
result = (sum(set(arr))*k - sum(arr)) // (k - b)
If 'k' is even and 'b' is odd, then XOR will do. :)
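The XOR remark works because XOR-ing a value with itself an even number of times cancels it out, while an odd number of copies leaves one copy behind. A minimal sketch, assuming integer elements, an even k and an odd b (the name `find_by_xor` is made up for illustration):

```python
from functools import reduce
from operator import xor


def find_by_xor(arr):
    """XOR of all elements; equals S when k is even and b is odd."""
    return reduce(xor, arr, 0)


# k = 2, b = 1: 4 and 7 appear twice, 9 appears once.
print(find_by_xor([4, 7, 9, 7, 4]))
# k = 4, b = 3: 5 appears four times, 6 appears three times.
print(find_by_xor([5, 6, 5, 6, 5, 6, 5]))
```

This uses O(1) extra memory and a single pass, but it does not apply to other parities of k and b.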

How to select one of n objects at random without knowing n at first?

For concreteness, how would you read lines of text, and select and print one random line, when you don't know the number of lines in advance?
Yes, this is a problem from Programming Pearls that I got confused by.
The solution chooses the 1st element, then selects the second with probability 1/2, the third with 1/3, and so forth.
An algorithm:
i = 0
while more input lines
    with probability 1.0/++i
        choice = this input line
print choice
Suppose the final choice is the 3rd element. The probability is
1 x 1/2 x 1/3 x 3/4 x ... x (n-2)/(n-1) x (n-1)/n == 1/(2n)? But 1/n should be correct.
Your algorithm is correct, but the analysis is not. You can prove it by induction. Loosely: It works for N = 1 of course. If it works up to N-1, then what happens at N? The chance that the Nth element is chosen and overwrites the last choice is 1/N -- good. The chance that it isn't is (N-1)/N. In which case the choice from the previous step is used. But at that point all elements had an 1/(N-1) chance of being chosen. Now it's 1/N. Good.
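The pseudocode from the question translates to Python roughly as below. This is a sketch for illustration: `pick_random_line` and its `rng` parameter are names introduced here, and any iterable of lines (such as an open file) is assumed as input. The simulation at the end is only a sanity check of the 1/N claim, not a proof.

```python
import random
from collections import Counter


def pick_random_line(lines, rng=random):
    """Return one uniformly random item from an iterable of unknown length."""
    choice = None
    for i, line in enumerate(lines, start=1):
        # Replace the current choice with probability 1/i.
        if rng.random() < 1.0 / i:
            choice = line
    return choice


rng = random.Random(0)
counts = Counter(pick_random_line(["a", "b", "c"], rng) for _ in range(30000))
print(counts)   # each line should be picked roughly 10000 times
```

The first line is always taken initially because `rng.random()` is strictly less than 1.0, and an empty input yields `None`.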
Your calculation is wrong:
Suppose the final choice is the 3rd element, the probability is
1 x 1/2 x 1/3 x 3/4 x ... x (n-2)/(n-1) x (n-1)/n
The real probability is:
(1 x 1/2 + 1 x 1/2) x 1/3 x 3/4 x ... x (n-2)/(n-1) x (n-1)/n == 1/n
since either you choose 2 or you don't choose 2 (choosing 2 has a probability of 1/2, and not choosing 2 also has a probability of 1/2).
Read 1
Read 2
50 % chance of either, keep one, discard one.
Read 3
(we should have either 1 and 3 or 2 and 3).
50% chance of either of the lines, discard the other.
Keep working with a 50% chance all the way through the file, this leaves you with 2 lines.
Take a 50/50 on either of the lines and you have a random line.
The odds were even for the whole file.
This is not truly random, as you are more likely to choose a line at the end of the file. You need to know the number of lines to make it random. (50% of the time you get the last line!)

Information Gain and Entropy

I recently read this question regarding information gain and entropy. I think I have a semi-decent grasp on the main idea, but I'm curious as to what to do in situations such as the following:
If we have a bag of 7 coins, 1 of which is heavier than the others, and 1 of which is lighter than the others, and we know the heavier coin + the lighter coin is the same as 2 normal coins, what is the information gain associated with picking two random coins and weighing them against each other?
Our goal here is to identify the two odd coins. I've been thinking this problem over for a while, and can't frame it correctly in a decision tree, or any other way for that matter. Any help?
EDIT: I understand the formula for entropy and the formula for information gain. What I don't understand is how to frame this problem in a decision tree format.
EDIT 2: Here is where I'm at so far:
Assuming we pick two coins and they both end up weighing the same, we can assume our new chances of picking H+L come out to 1/5 * 1/4 = 1/20 , easy enough.
Assuming we pick two coins and the left side is heavier. There are three different cases where this can occur:
HM: Which gives us 1/2 chance of picking H and a 1/4 chance of picking L: 1/8
HL: 1/2 chance of picking high, 1/1 chance of picking low: 1/1
ML: 1/2 chance of picking low, 1/4 chance of picking high: 1/8
However, the odds of us picking HM are 1/7 * 5/6 which is 5/42
The odds of us picking HL are 1/7 * 1/6 which is 1/42
And the odds of us picking ML are 1/7 * 5/6 which is 5/42
If we weight the overall probabilities with these odds, we are given:
(1/8) * (5/42) + (1/1) * (1/42) + (1/8) * (5/42) = 3/56.
The same holds true for option B.
option A = 3/56
option B = 3/56
option C = 1/20
However, option C should be weighted heavier because there is a 5/7 * 4/6 chance to pick two mediums. So I'm assuming from here I weight THOSE odds.
I am pretty sure I've messed up somewhere along the way, but I think I'm on the right path!
EDIT 3: More stuff.
Assuming the scale is unbalanced, the odds are (10/11) that only one of the coins is the H or L coin, and (1/11) that both coins are H/L
Therefore we can conclude:
(10 / 11) * (1/2 * 1/5) and
(1 / 11) * (1/2)
EDIT 4: Going to go ahead and say that it is a total 4/42 increase.
You can construct a decision tree from information-gain considerations, but that's not the question you posted, which is only to compute the information gain (presumably the expected information gain;-) from one "information extraction move" -- picking two random coins and weighing them against each other. To construct the decision tree, you need to know what moves are affordable from the initial state (presumably the general rule is: you can pick two sets of N coins, N < 4, and weigh them against each other -- and that's the only kind of move, parametric over N), the expected information gain from each, and that gives you the first leg of the decision tree (the move with highest expected information gain); then you do the same process for each of the possible results of that move, and so on down.
So do you need help to compute that expected information gain for each of the three allowable values of N, only for N==1, or can you try doing it yourself? If the third possibility obtains, then that would maximize the amount of learning you get from the exercise -- which after all IS the key purpose of homework. So why don't you try, edit your question to show how you proceeded and what you got, and we'll be happy to confirm you got it right, or try and help correct any misunderstanding your procedure might reveal!
Edit: trying to give some hints rather than serving the OP the ready-cooked solution on a platter;-). Call the coins H (for heavy), L (for light), and M (for medium -- five of those). When you pick 2 coins at random you can get (out of 7 * 6 == 42 possibilities including order) HL, LH (one each), HM, MH, LM, ML (5 each), MM (5 * 4 == 20 cases) -- 2 plus 20 plus 20 is 42, check. In the weighing you get 3 possible results, call them A (left heavier), B (right heavier), C (equal weight). HL, HM, and ML, 11 cases, will be A; LH, MH, and LM, 11 cases, will be B; MM, 20 cases, will be C. So A and B aren't really distinguishable (which one is left, which one is right, is basically arbitrary!), so we have 22 cases where the weights will be different, 20 where they will be equal -- it's a good sign that the cases giving each result are in pretty close numbers!
So now consider how many (equiprobable) possibilities existed a priori, and how many a posteriori, for each of the experiment's results. You're tasked to pick the H and L coins. If you did it at random before the experiment, what would be your chances? 1 in 7 for the random pick of the H; given that succeeds, 1 in 6 for the pick of the L -- overall 1 in 42.
After the experiment, how are you doing? If C, you can rule out those two coins and you're left with a mystery H, a mystery L, and three Ms -- so if you picked at random you'd have 1 in 5 to pick H, if successful 1 in 4 to pick L, overall 1 in 20 -- your success chances have slightly more than doubled. It's trickier to see "what next" for the A (and equivalently B) cases because there are several, as listed above (and, less obviously, not equiprobable...), but obviously you won't pick the known-lighter coin for H (and vice versa) and if you pick one of the 5 unweighed coins for H (or L) only one of the weighed coins is a candidate for the other role (L or H respectively). Ignoring for simplicity the "non equiprobable" issue (which is really kind of tricky) can you compute what your chances of guessing (with a random pick not inconsistent with the experiment's result) would be...?
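One way to frame the expected information gain, offered here as a sketch rather than as the answerer's intended solution, is to enumerate the 42 equally likely (H, L) assignments, group them by the result of weighing two fixed coins, and compare the entropy before and after. The names `weigh` and `expected_information_gain` are made up for this illustration, coins are numbered 0 to 6, and coins 0 and 1 are assumed to be the pair on the scale (by symmetry any pair gives the same numbers).

```python
from collections import Counter
from itertools import permutations
from math import log2


def weigh(heavy, light, left=0, right=1):
    """Result of weighing coin `left` against coin `right`: 'A', 'B' or 'C'."""
    def weight(coin):
        return 1 if coin == heavy else -1 if coin == light else 0
    diff = weight(left) - weight(right)
    return "A" if diff > 0 else "B" if diff < 0 else "C"


def expected_information_gain(num_coins=7):
    # Every ordered pair (heavy, light) of distinct coins is one hypothesis.
    hypotheses = list(permutations(range(num_coins), 2))
    outcomes = Counter(weigh(h, l) for h, l in hypotheses)
    total = len(hypotheses)
    prior = log2(total)                                  # entropy before weighing
    posterior = sum(c / total * log2(c) for c in outcomes.values())
    return outcomes, prior - posterior                   # gain in bits


outcomes, gain = expected_information_gain()
print(dict(outcomes), gain)
```

The outcome counts reproduce the 11 / 11 / 20 split described above. Because each hypothesis determines the result of the weighing, the expected gain equals the entropy of that three-way outcome distribution.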

Resources