I have been reading the skip list code in Algorithms in C, Parts 1-4, and inserting a new value into a skip list is more complex than I expected. During an insert, the code should ensure that the new value is inserted at level i with probability 1/2^i, and this is implemented by the code below:
static int Rand()
{
    int i, j = 0;
    uint32_t t = rand();
    for (i = 1, j = 2; i < lg_n_max; i ++, j += j)
        if (t > RANDMAX / j)
            break;
    if (i > lg_n)
        lg_n = i;
    return i;
}
I don't understand how the Rand function ensures this. Can you explain it to me? Thank you.
Presumably RANDMAX is intended to be RAND_MAX.
Neglecting rounding issues, half the return values of rand are above RAND_MAX / 2, and therefore half the time, the loop exits with i = 1.
If the loop continues, it updates i to 2 and j to 4. Of the remaining return values (those at most RAND_MAX / 2), half are above RAND_MAX / 4, so, one-quarter of the time, the loop exits with i = 2.
Further iterations continue in the same manner, each iteration exiting with a portion of return values that is half the previous, until the lg_n_max limit is reached.
Thus, neglecting rounding issues and the final limit, the routine returns 1 half the time, 2 one-quarter of the time, 3 one-eighth of the time, and so on.
lg_n is not defined in the routine. It appears to be a record of the greatest value returned by the routine so far.
Many thanks to Eric Postpischil for his answer; I now understand how the probability is ensured. Here is how I would explain it in more detail:
t is a random value between 0 and RANDMAX. Assume the loop runs 2 times without breaking. Passing the first iteration means that t is not greater than RANDMAX/2^1, i.e. t falls into the range from 0 to RANDMAX/2; the probability of this is 1/2. For the second iteration, remember that t is already known to be in the range (0, RANDMAX/2^1). Passing it means that t is not greater than RANDMAX/2^2, i.e. t falls into the range from 0 to RANDMAX/2^2; the probability of this is also 1/2, because the range (0, RANDMAX/2^2) is only half of the range established by the first iteration. Note that this is a conditional probability: it depends on the outcome of the first iteration, so the probability of passing both iterations is 1/2 * 1/2 = 1/4.
In short, every iteration multiplies the probability from the previous iteration by 1/2.
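The halving argument can be checked by enumerating every possible draw instead of sampling. The sketch below is only an illustration: it copies the loop from Rand() but takes the draw t, the maximum value rand_max, and the level cap lg_n_max as parameters, and the names level_for and count_levels are hypothetical.

```c
#include <assert.h>

/* Same loop as Rand() in the question, but the random draw t is passed in
   so that every possible draw can be enumerated. rand_max and lg_n_max are
   parameters here; that is an assumption made for illustration. */
static int level_for(unsigned long t, unsigned long rand_max, int lg_n_max)
{
    int i;
    unsigned long j;

    assert(lg_n_max >= 1);
    for (i = 1, j = 2; i < lg_n_max; i++, j += j)
        if (t > rand_max / j)
            break;
    return i;
}

/* counts[k] receives the number of draws in 0..rand_max that yield level k.
   counts must have room for lg_n_max + 1 entries; rand_max must be smaller
   than the largest unsigned long so the loop terminates. */
static void count_levels(unsigned long rand_max, int lg_n_max,
                         unsigned long *counts)
{
    unsigned long t;
    int k;

    for (k = 0; k <= lg_n_max; k++)
        counts[k] = 0;
    for (t = 0; t <= rand_max; t++)
        counts[level_for(t, rand_max, lg_n_max)]++;
}
```

Calling count_levels(32767, 5, counts) gives 16384, 8192, 4096, 2048 and 2048 draws for levels 1 to 5, out of 32768 possible draws. Each level gets half of what the previous one got, except the last, which absorbs everything left over once the cap is reached.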
The Minimum Size Subarray Sum problem:
given an array of n positive integers and a positive integer s, find the minimal length of a subarray of which the sum ≥ s. If there isn't one, return 0 instead.
For example, given the array [2,3,1,2,4,3] and s = 7,
the subarray [4,3] has the minimal length under the problem constraint.
The following is my solution:
public int minSubArrayLen(int s, int[] nums) {
    long sum = 0;
    int a = 0;
    if (nums.length < 1)
        return 0;
    Arrays.sort(nums);
    for (int i = nums.length-1; i >= 0; i--) {
        sum += nums[i];
        a++;
        if (sum>=s)
            break;
    }
    if (sum < s) {
        return 0;
    }
    return a;
}
This solution was not accepted because it did not pass the following test case:
697439
[5334,6299,4199,9663,8945,3566,9509,3124,6026,6250,7475,5420,9201,9501,38,5897,4411,6638,9845,161,9563,8854,3731,5564,5331,4294,3275,1972,1521,2377,3701,6462,6778,187,9778,758,550,7510,6225,8691,3666,4622,9722,8011,7247,575,5431,4777,4032,8682,5888,8047,3562,9462,6501,7855,505,4675,6973,493,1374,3227,1244,7364,2298,3244,8627,5102,6375,8653,1820,3857,7195,7830,4461,7821,5037,2918,4279,2791,1500,9858,6915,5156,970,1471,5296,1688,578,7266,4182,1430,4985,5730,7941,3880,607,8776,1348,2974,1094,6733,5177,4975,5421,8190,8255,9112,8651,2797,335,8677,3754,893,1818,8479,5875,1695,8295,7993,7037,8546,7906,4102,7279,1407,2462,4425,2148,2925,3903,5447,5893,3534,3663,8307,8679,8474,1202,3474,2961,1149,7451,4279,7875,5692,6186,8109,7763,7798,2250,2969,7974,9781,7741,4914,5446,1861,8914,2544,5683,8952,6745,4870,1848,7887,6448,7873,128,3281,794,1965,7036,8094,1211,9450,6981,4244,2418,8610,8681,2402,2904,7712,3252,5029,3004,5526,6965,8866,2764,600,631,9075,2631,3411,2737,2328,652,494,6556,9391,4517,8934,8892,4561,9331,1386,4636,9627,5435,9272,110,413,9706,5470,5008,1706,7045,9648,7505,6968,7509,3120,7869,6776,6434,7994,5441,288,492,1617,3274,7019,5575,6664,6056,7069,1996,9581,3103,9266,2554,7471,4251,4320,4749,649,2617,3018,4332,415,2243,1924,69,5902,3602,2925,6542,345,4657,9034,8977,6799,8397,1187,3678,4921,6518,851,6941,6920,259,4503,2637,7438,3893,5042,8552,6661,5043,9555,9095,4123,142,1446,8047,6234,1199,8848,5656,1910,3430,2843,8043,9156,7838,2332,9634,2410,2958,3431,4270,1420,4227,7712,6648,1607,1575,3741,1493,7770,3018,5398,6215,8601,6244,7551,2587,2254,3607,1147,5184,9173,8680,8610,1597,1763,7914,3441,7006,1318,7044,7267,8206,9684,4814,9748,4497,2239]
The expected answer is 132 but my output was 80.
Does anyone have any idea what went wrong with my algorithm/code?
I will simply explain the flaw in the logic rather than giving the correct logic to handle the problem statement.
You are taking the numbers in one specific order (largest first, after sorting) and adding them up for the comparison. The sum can just as easily be reached by numbers taken in a different order, which your code never considers.
For example, take [2,3,1,2,4,3] and s = 7.
Based on your logic:
Step 1 -> Sort the numbers and you get [1,2,2,3,3,4].
Step 2 -> You pick the last 2 numbers (3,4) to get your sum of 7.
Let's change the sum to 8.
From Step 2 -> You get 4+3+3 = 10, so you break out of the loop. After this step you return a = 3.
The flaw here is that 4+3+1 also makes 8, something your logic skips.
In the same way, 3+3+2 is also a possible way to reach 8.
Sorting the array is the first flaw in the logic itself. A subarray is a run of adjacent elements in the existing arrangement; sorting changes that arrangement, so the numbers you add up are generally no longer adjacent in the original array and you will not get the expected answer.
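For comparison, here is a minimal sketch of one common way to respect adjacency: a sliding window over the unsorted array. It is written in C rather than the Java of the question, the name min_sub_array_len is hypothetical, and it relies on the problem's guarantee that every element is positive. It is not necessarily the solution the judge expects.

```c
#include <assert.h>
#include <stddef.h>

/* Returns the length of the shortest run of adjacent elements in
   nums[0..n-1] whose sum is at least s, or 0 if no such run exists.
   Assumes every element is positive, as the problem statement says. */
static size_t min_sub_array_len(long s, const int *nums, size_t n)
{
    size_t best = 0;   /* 0 means "no window found yet" */
    size_t left = 0;   /* index of the first element in the window */
    size_t right;
    long sum = 0;      /* sum of nums[left..right] */

    assert(s > 0);
    for (right = 0; right < n; right++) {
        sum += nums[right];
        /* Shrink from the left for as long as the window still reaches s. */
        while (sum >= s) {
            size_t len = right - left + 1;
            if (best == 0 || len < best)
                best = len;
            sum -= nums[left++];
        }
    }
    return best;
}
```

For the array [2,3,1,2,4,3] this returns 2 when s = 7 (the window [4,3]) and 3 when s = 8 (the window [2,4,3]). The array is never reordered, so every window it measures is a genuine subarray.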
I'm writing some code to use random numbers to create a bell curve.
The basic approach is as follows:
Create an array of 2001 integers.
For some number of repeats, do the following:
• Start with a value of 1000 (the center-value)
• Loop 1000 times
• Generate a random number 0 or 1. If the random number is zero, subtract 1 from the value. If it's 1, add 1 to the value.
• Increment the count in my array at the resulting index value.
So 1000 times, we randomly add 1 or subtract 1 from a starting value of 1000. On average, we'll add 1 and subtract 1 about equally often, so the outcome should be centered around 1000. Values further above or below 1000 should be less and less frequent. A value at index 0 or index 2000 would require a "coin toss" with the same result 1000 times in a row... a VERY unlikely event that is still possible.
Here is the code I came up with, written in C with a thin Objective-C wrapper:
#import "BellCurveUtils.h"
@implementation BellCurveUtils
#define KNumberOfEntries 1000
#define KPinCount 1000
#define KSlotCount (KPinCount*2+1)
static int bellCurveData[KSlotCount];
+(void) createBellCurveData;
{
    NSLog(@"Entering %s", __PRETTY_FUNCTION__);
    NSTimeInterval start = [NSDate timeIntervalSinceReferenceDate];
    int entry;
    int i;
    int random_index;
    //First zero out the data
    for (i = 0; i< KSlotCount; i++)
        bellCurveData[i] = 0;
    //Generate KNumberOfEntries entries in the array
    for (entry =0; entry<KNumberOfEntries; entry++)
    {
        //Start with a value of 1000 (center value)
        int value = 1000;
        //For each entry, add +/- 1 to the value 1000 times.
        for (random_index = 0; random_index<KPinCount; random_index++)
        {
            int random_value = arc4random_uniform(2) ? -1: 1;
            value += random_value;
        }
        bellCurveData[value] += 1;
    }
    NSTimeInterval elapsed = [NSDate timeIntervalSinceReferenceDate] - start;
    NSLog(@"Elapsed time = %.2f", elapsed);
    int startWithData=0;
    int endWithData=KSlotCount-1;
    for (i = 0; i< KSlotCount; i++)
    {
        if (bellCurveData[i] >0)
        {
            startWithData = i;
            break;
        }
    }
    for (i = KSlotCount-1; i>=0 ; i--)
    {
        if (bellCurveData[i] >0)
        {
            endWithData = i;
            break;
        }
    }
    for (i = startWithData; i <= endWithData; i++)
        printf("value[%d] = %d\n", i, bellCurveData[i]);
}
@end
The code does generate a bell-shaped curve. However, the array entries with odd indexes are ALL zero.
Here is some sample output:
value[990] = 23
value[991] = 0
value[992] = 22
value[993] = 0
value[994] = 20
value[995] = 0
value[996] = 25
value[997] = 0
value[998] = 37
value[999] = 0
value[1000] = 23
value[1001] = 0
value[1002] = 26
value[1003] = 0
value[1004] = 20
value[1005] = 0
value[1006] = 28
value[1007] = 0
value[1008] = 23
value[1009] = 0
value[1010] = 26
I have gone over this code line-by-line, and do not see why this is. When I step through it in the debugger, I get values that bounce around by single steps, starting at 1000, dropping to 999, incrementing to 1001, and various values even and odd. However, after 1000 iterations, the result of value is always even. What am I missing here?!?
I realize this isn't a typical SO development question, but I'm stumped. I cannot see what I am doing wrong. Can somebody explain these results?
//For each entry, add +/- 1 to the value 1000 times.
for (random_index = 0; random_index<KPinCount; random_index++)
{
    int random_value = arc4random_uniform(2) ? -1: 1;
    value += random_value;
}
For any two iterations of this loop, there are three potential outcomes:
random_value is -1 both times, in which case "value" decreases by 2.
random_value is +1 both times, in which case "value" increases by 2.
random_value is -1 once and +1 once, in which case "value" is unchanged.
Therefore, if the loop runs an even number of times (i.e. KPinCount is an even number), "value" ends with the same parity it started with. Since it begins as an even number (1000), it ends as an even number.
Edit: If you want to resolve the problem but keep the same basic approach, then rather than starting with value = 1000 and running 1000 iterations in which you either add or subtract one, perhaps you could start with value = 0 and run 2000 iterations in which you add either one or zero. I'd have posted this as a comment to the discussion above, but can't comment since I just registered.
Your immediate problem is at
for (random_index = 0; random_index < KPinCount; random_index++)
{
    int random_value = arc4random_uniform(2) ? -1: 1;
    value += random_value;
}
Because KPinCount is defined as 1000 (an even number), at the end of the loop, value will have changed by a multiple of 2.
Maybe try with KPinCount varying between 999 and 1000???
Ok, I've gotten some very useful feedback on this project.
To summarize:
If you always add or subtract one from a value, and do it twice, the possibilities are:
+1 +1 = change of +2 (even)
+1 -1 = no change (even)
-1 -1 = change of -2 (even)
Thus every pair of steps changes the value by -2, 0, or +2, so after an even number of steps an even starting value always gives an even result.
Likewise, if you always apply an odd number of +1/-1 changes to an even starting value, the result will always be odd.
A couple of solutions were proposed.
Option 1 (the change I used in my testing) was, before calculating each value, to randomly decide to loop either 999 or 1000 times. That way the result will be even half of the time and odd the other half.
This has the effect that the spread of the graph will be very slightly narrower, because half of the time the possible range of values is smaller by +/- 1.
Option 2 was to pick one of 3 random values on each iteration, and add +1, 0, or -1 to the value based on the result.
Option 3, suggested by @rhashimoto in the comments to one of the other answers, was to pick one of 4 random values on each iteration, and add +1, 0, 0, or -1 to the value based on the result.
I suspected that options 2 and 3 would cause a narrower spread of the curve because for 1/3 or 1/4 of the possible random values on each iteration, the value would not change, so the average spread of values would be smaller.
I've run a number of tests with different settings, and confirmed my suspicions.
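The narrowing is also what a standard variance calculation would predict. The sketch below assumes that every step is independent and that each of the listed choices is equally likely; neither assumption is stated in the original, and the symbols X (a single step), n (the number of steps) and \sigma (the standard deviation of the final value) are introduced here only for illustration. Option 1 is treated as n = 1000, ignoring the 999-step half of its runs.

```latex
\begin{align*}
\sigma &= \sqrt{n \operatorname{Var}(X)}, \qquad n = 1000 \\[4pt]
\text{Option 1: } X \in \{-1, +1\}
  &\;\Rightarrow\; \operatorname{Var}(X) = 1,
  &\sigma &= \sqrt{1000} \approx 31.6 \\
\text{Option 2: } X \in \{-1, 0, +1\}
  &\;\Rightarrow\; \operatorname{Var}(X) = \tfrac{2}{3},
  &\sigma &= \sqrt{2000/3} \approx 25.8 \\
\text{Option 3: } X \in \{-1, 0, 0, +1\}
  &\;\Rightarrow\; \operatorname{Var}(X) = \tfrac{1}{2},
  &\sigma &= \sqrt{500} \approx 22.4
\end{align*}
```

Under these assumptions, the more often a step leaves the value unchanged, the smaller the per-step variance, and the spread of the curve shrinks with its square root.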
Here are graphs of the different approaches. All sample graphs are plots of 1,000,000 points, with the graph clamped to values ranging from 800 to 1200 since there are never values outside that range in practice. The green bars on the graph are at the center point and +/- 50 steps.
First, option 1, which randomly applies either 999 or 1000 +/-1 changes to the starting value:
Option 2, 1000 iterations of applying 3 random possible changes, -1,0, or +1:
And option 3, 1000 iterations of applying 4 random possible changes, -1,0, 0, or +1, as suggested by rhashimoto in the comments to pmg's answer:
And overlaying all the graphs on top of each other in Photoshop:
I have created graphs using a much larger number of points (100 million instead of 1 million) and the graphs are much smoother and less "jittery", but the shape of the curve is for all practical purposes identical. Applying a modest rolling average to the results from a one-million iteration graph would no doubt yield a very smooth curve.