Apriori algorithm in Machine Learning in Action - apriori

The Apriori algorithm:
def aprioriGen(Lk, k):  # creates Ck
    retList = []
    lenLk = len(Lk)
    for i in range(lenLk):
        for j in range(i+1, lenLk):
            L1 = list(Lk[i])[:k-2]; L2 = list(Lk[j])[:k-2]
            L1.sort(); L2.sort()
            if L1==L2:  # if first k-2 elements are equal
                retList.append(Lk[i] | Lk[j])  # set union
    return retList

l3 = [frozenset([0, 1, 2]), frozenset([0, 1, 3]), frozenset([1, 2, 4])]
l4 = aprioriGen(l3, 4)
print(l4)
The result is [frozenset([0, 1, 2, 3])].
Is the algorithm right? Why is frozenset([0, 1, 2, 4]) not included?

Related

Ruby - Pick one element from array by probability

I have an array with 3 elements and I want to pick one and add it to another array based on probability.
For example, num 1 has a 5% chance of being picked, num 2 has a 60% chance and num 3 has a 35% chance.
arr = [{:num=>1, :diff=>-29}, {:num=>2, :diff=>5}, {:num=>3, :diff=>25}]
I found the method below on Stack Overflow. Would this work, or is there another way to do it?
def get_num(arr)
  case rand(100) + 1
  when 1..5
    p arr[0]
  when 6..65
    p arr[1]
  when 66..100
    p arr[2]
  end
end
get_num(arr)
Thanks!
Your code is fine but here are two other approaches.
Use a cumulative distribution function ("CDF")
CDF = [[0.05,0], [0.05+0.60,1], [0.05+0.60+0.35,2]]
  #=> [[0.05,0], [0.65,1], [1.0,2]]
def get_num(arr)
  n = rand
  arr[CDF.find { |mx,_idx| n <= mx }.last]
end
arr = [{:num=>1, :diff=>-29}, {:num=>2, :diff=>5}, {:num=>3, :diff=>25}]
get_num(arr)
#=> {:num=>2, :diff=>5}
get_num(arr)
#=> {:num=>2, :diff=>5}
get_num(arr)
#=> {:num=>3, :diff=>25}
get_num(arr)
#=> {:num=>1, :diff=>-29}
get_num(arr)
#=> {:num=>2, :diff=>5}
Suppose:
n = rand
#=> 0.5385005480168696
then
a = CDF.find { |mx,_idx| n <= mx }
#=> [0.65,1]
i = a.last
#=> 1
arr[i]
#=> {:num=>2, :diff=>5}
Note that I've followed the convention of beginning the name of find's second block variable (_idx) with an underscore to signal to the reader that that block variable is not used in the block calculation. Often just an underscore (_) is used.
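The lookup step can be separated from the random draw, which makes it easy to check by hand. The following is a minimal sketch under that assumption; `cdf_index` and `CDF_TABLE` are hypothetical names introduced here for illustration, not part of the answer's code.

```ruby
# Cumulative distribution: each entry is [upper bound, index into arr].
CDF_TABLE = [[0.05, 0], [0.65, 1], [1.0, 2]].freeze

# Hypothetical helper: return the index of the first entry whose
# cumulative bound is at least n (n is expected to lie in 0.0...1.0).
def cdf_index(n, cdf = CDF_TABLE)
  cdf.find { |mx, _idx| n <= mx }.last
end

p cdf_index(0.03)   #=> 0  (falls in the first 5%)
p cdf_index(0.5385) #=> 1  (falls in the next 60%)
p cdf_index(0.99)   #=> 2  (falls in the last 35%)
```

Passing the drawn number in as an argument, instead of calling rand inside the method, keeps the lookup deterministic and therefore testable.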
Now consider the fraction of times each element of arr will be randomly-drawn if n draws are made:
def outcome_fractions(arr, n)
  n.times
   .with_object(Hash.new(0)) { |_,h| h[get_num(arr)] += 1 }
   .transform_values { |v| v.fdiv(n) }
end
outcome_fractions(arr, 1_000)
  #=> {{:num=>2, :diff=>5}  =>0.612,
  #    {:num=>3, :diff=>25} =>0.328,
  #    {:num=>1, :diff=>-29}=>0.06}
outcome_fractions(arr, 100_000)
  #=> {{:num=>3, :diff=>25} =>0.34818,
  #    {:num=>1, :diff=>-29}=>0.04958,
  #    {:num=>2, :diff=>5}  =>0.60224}
Notice that the fraction of each hash that is randomly drawn approaches its specified population probability as the sample size is increased (though the "pseudo-random" draws are not truly random).
Do not be concerned with how outcome_fractions works.
Randomly select from an array of indices
Here is another way that is more efficient (because it does not use find, which performs a linear search) but uses more memory.
CHOICE = [*[0]*5, *[1]*60, *[2]*35]
#=> [0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
# 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
# 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,
# 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
# 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,
# 2, 2, 2, 2, 2]
def get_num(arr)
  arr[CHOICE[rand(100)]]
end
outcome_fractions(arr, 100_000)
  #=> {{:num=>2, :diff=>5}  =>0.60029,
  #    {:num=>3, :diff=>25} =>0.35022,
  #    {:num=>1, :diff=>-29}=>0.04949}
Note that:
[*[0]*5, *[1]*60, *[2]*35]
produces the same array as
[[0]*5, [1]*60, [2]*35].flatten
The first * in *[0]*5 is the splat operator; the second is the method Array#*. [0]*5 #=> [0,0,0,0,0] is evaluated first.
CHOICE has 100 elements. If the three probabilities were, say, 0.048, 0.604 and 0.348, CHOICE would have 10**3 #=> 1_000 elements (48 zeros, 604 ones and 348 twos).
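As a rough sketch of how the table size grows with precision, the lookup array can be generated from probabilities given to a fixed number of decimal places. `build_choice` and its `digits` parameter are hypothetical names used only for this illustration, and the sketch assumes the probabilities are exact at that precision.

```ruby
# Hypothetical helper: build a lookup table of indices from probabilities
# that are specified to `digits` decimal places.
def build_choice(probs, digits)
  scale = 10**digits
  # round guards against floating-point noise such as 48.00000000000001
  probs.each_with_index.flat_map { |pr, i| [i] * (pr * scale).round }
end

choice = build_choice([0.048, 0.604, 0.348], 3)
p choice.size     #=> 1000
p choice.count(0) #=> 48
p choice.count(1) #=> 604
p choice.count(2) #=> 348
```

Each extra decimal place multiplies the table size by ten, which is the memory cost mentioned above.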
Here's a small variation / addition to Cary's great answer.
Instead of calculating the cumulative sums yourself, you can let Ruby build it for you out of the initial probabilities:
probs = [5, 60, 35]
sum = 0
sums = probs.map { |x| sum += x }
#=> [5, 65, 100]
We can now draw a random integer from 0 up to (but not including) the total sum and find the corresponding index:
r = rand(sum) #=> 37
sums.find_index { |i| r < i } #=> 1
Note that the initial probabilities don't have to sum to 100. Instead of [5, 60, 35] you could also use:
probs = [1, 12, 7]
You can wrap the above code into a method:
def random_index(*probs)
  sum = 0
  sums = probs.map { |x| sum += x }
  r = rand(sum)
  sums.find_index { |i| r < i }
end
random_index(5, 60, 35) #=> 1
random_index(5, 60, 35) #=> 1
random_index(5, 60, 35) #=> 2
You could also make the method return a proc / lambda that can be reused:
def random_index_proc(*probs)
  sum = 0
  sums = probs.map { |x| sum += x }
  -> {
    r = rand(sum)
    sums.find_index { |i| r < i }
  }
end
prc = random_index_proc(5, 60, 35)
prc.call #=> 1
prc.call #=> 1
prc.call #=> 0
Last but not least, you can also pre-populate an array this way (using Cary's naming convention):
CHOICE = [5, 60, 35].flat_map.with_index { |v, i| [i] * v }
and get a random element via:
def get_num(arr)
  arr[CHOICE.sample]
end
To keep the array small, you should prefer [1, 12, 7] (20 elements) over [5, 60, 35] (100 elements). With a little help from gcd you don't even have to calculate it yourself:
probs = [5, 60, 35]
gcd = probs.reduce { |a, b| a.gcd(b) }
#=> 5
probs.map { |i| i / gcd }
#=> [1, 12, 7]
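Putting the two steps together, a small sketch might reduce the weights by their gcd and then build the index table in one go. `compact_choice` is a hypothetical name for this illustration, and the sketch assumes positive integer weights.

```ruby
# Hypothetical helper: build the smallest index table for integer weights
# by dividing every weight by their greatest common divisor first.
def compact_choice(weights)
  divisor = weights.reduce { |a, b| a.gcd(b) }
  weights.map { |w| w / divisor }
         .flat_map.with_index { |w, i| [i] * w }
end

table = compact_choice([5, 60, 35])
p(table.size)                           #=> 20
p([0, 1, 2].map { |i| table.count(i) }) #=> [1, 12, 7]
```

Calling `table.sample` then picks index 0, 1 or 2 in the ratio 1:12:7, which is the same distribution as 5:60:35.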

Ruby Increment array from certain starting point

I feel this is a super simple query but I'm having a real tough time with immutable nums in my arrays.
I'd like to have a super simple method, which increments numbers in an array by distributing them from the max value.
E.g. [1,3,5,1] becomes [1,3,0,1], and the removed 5 is then handed out one unit at a time, iterating upwards from the next position and wrapping back around, to create [2,4,1,3].
What I currently have is the following:
arr = [1,3,5,1]
with a method of
def increment_from_max_value(arr)
  max = arr.max
  max_index = arr.index(max)
  arr[max_index] = 0
  while max >= 0
    arr[max_index..-1].each do |element|
      element = element += 1
      max = max -= 1
    end
  end
end
Currently the array isn't even updating and just returns the original values. I believe this is due to the immutability of Fixnums in Ruby, but with a method like map!, which is able to modify those values, I can't get it to loop back through from a certain starting point like each will.
Many thanks in advance
I'd use divmod to calculate the increase for each element and the leftover.
For a max value of 5 and array size of 4 you'd get:
5.divmod(4) #=> [1, 1]
i.e. each element has to be incremented by 1 (first value) and 1 element (second value) has to be incremented by another 1.
Another example for a max value of 23 and 4 elements:
[1, 3, 23, 1]
23.divmod(4) #=> [5, 3]
Each element has to be incremented by 5 and 3 elements have to be incremented by another 1:
    [ 1,  3, 23,  1]
#    +5  +5  +5  +5
#    +1  +1      +1
# = [ 7,  9,  5,  7]
Applied to your method:
def increment_from_max_value(arr)
  max = arr.max
  max_index = arr.index(max)
  arr[max_index] = 0
  q, r = max.divmod(arr.size)
  arr.each_index { |j| arr[j] += q }
  r.times { |j| arr[(max_index + j + 1) % arr.size] += 1 }
end
arr.each_index { |j| arr[j] += q } simply adds q to each element.
r.times { |j| arr[(max_index + j + 1) % arr.size] += 1 } is a little more complicated. It distributes the remainder, starting from 1 after max_index. The modulo operation ensures that the index will wrap around:
0 % 4 #=> 0
1 % 4 #=> 1
2 % 4 #=> 2
3 % 4 #=> 3
4 % 4 #=> 0
5 % 4 #=> 1
6 % 4 #=> 2
# ...
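To see which positions receive the leftover increments, the index arithmetic can be pulled out on its own. This is only an illustrative sketch; `remainder_targets` is a hypothetical helper name, not part of the method above.

```ruby
# Hypothetical helper: the indices that receive the leftover +1s,
# starting one position after max_index and wrapping around the end.
def remainder_targets(max_index, remainder, size)
  Array.new(remainder) { |j| (max_index + j + 1) % size }
end

# [1, 3, 23, 1]: max_index is 2 and 23.divmod(4) leaves a remainder of 3
p remainder_targets(2, 3, 4) #=> [3, 0, 1]

# [1, 3, 5, 1]: max_index is 2 and 5.divmod(4) leaves a remainder of 1
p remainder_targets(2, 1, 4) #=> [3]
```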
I think there is something wrong with the while loop. I did not investigate but you can see that this line arr[max_index] = 0 mutates the array.
I don't know if I've understood the logic, but this should return the desired output:
def increment_from_max_value(arr)
  max = arr.max
  arr[arr.index(max)] = 0
  indexes = (0...arr.size).to_a.reverse
  max.times do
    arr[indexes.first] += 1
    indexes.rotate!
  end
end
The value of the block variable element in arr[max_index..-1].each do |element| is changed by element = element += 1 but that has no effect on arr.
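A short sketch makes the difference visible: reassigning the block variable rebinds a local name, whereas assigning through the index writes back into the array. The variable names here are chosen only for this illustration.

```ruby
arr = [1, 3, 0, 1]

# Reassigning the block variable only rebinds a local name; arr is untouched.
arr.each { |element| element += 1 }
after_each = arr.dup

# Assigning through the index writes the new value back into the array.
arr.each_index { |i| arr[i] += 1 }

p after_each #=> [1, 3, 0, 1]
p arr        #=> [2, 4, 1, 2]
```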
You could achieve your objective as follows.
def increment_from_max_value(arr)
  mx, mx_idx = arr.each_with_index.max_by(&:first)
  sz = arr.size
  arr[mx_idx] = 0
  arr.rotate!(mx_idx + 1).map!.with_index do |e,i|
    begin_nbr = mx - i
    e += (begin_nbr <= 0 ? 0 : 1 + ((begin_nbr - 1)/sz))
  end.rotate!(-mx_idx - 1)
end
arr = [1,3,5,1]
increment_from_max_value(arr)
arr
#=> [2, 4, 1, 3]
arr = [1,2,3,2,1]
increment_from_max_value(arr)
arr
#=> [2, 2, 0, 3, 2]
After computing the maximum value of arr, mx, and its index, mx_idx, and setting arr[mx_idx] to zero, I rotate the array (counter-clockwise) by mx_idx + 1, making the position of mx last. That way the "allocations" of mx begin with the first element of the rotated array. After performing the allocations I then rotate the array clockwise by the same mx_idx + 1.
begin_nbr equals mx minus the number of indices that precede i; in effect, the portion of mx that remains "unallocated" at index i in the first round of allocations.
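The per-position amounts produced by this formula can be inspected in isolation. The sketch below uses a hypothetical helper, `allocations`, that returns how much each position of the rotated array receives; it is an illustration of the arithmetic only.

```ruby
# Hypothetical helper: how many units each position of the rotated array
# receives when mx units are handed out one at a time over sz positions.
def allocations(mx, sz)
  (0...sz).map do |i|
    begin_nbr = mx - i # units still unallocated on reaching index i
    begin_nbr <= 0 ? 0 : 1 + (begin_nbr - 1) / sz
  end
end

p allocations(5, 4) #=> [2, 1, 1, 1]
p allocations(3, 5) #=> [1, 1, 1, 0, 0]
```

In both cases the amounts sum to mx, so nothing is lost or created by the allocation.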
I can best explain how this works by salting the method with puts statements.
def increment_from_max_value(arr)
  mx, mx_idx = arr.each_with_index.max_by(&:first)
  sz = arr.size
  puts "mx = #{mx}, mx_idx = #{mx_idx}, sz = #{sz}"
  arr[mx_idx] = 0
  arr.rotate!(mx_idx + 1).
      tap { |a| puts "arr rotated to make mx position last = #{a}" }.
      map!.with_index do |e,i|
        begin_nbr = mx - i
        puts "e before = #{e}, i = #{i}, begin_nbr = #{begin_nbr}"
        e += (begin_nbr <= 0 ? 0 : 1 + ((begin_nbr - 1)/sz))
        e.tap { |f| puts "e after change = #{f}" }
      end.
      tap { |a| puts "arr after changes = #{a}" }.
      rotate!(-mx_idx - 1).
      tap { |a| puts "arr after rotating back = #{a}" }
end
arr = [1,3,5,1]
increment_from_max_value(arr)
mx = 5, mx_idx = 2, sz = 4
arr rotated to make mx position last = [1, 1, 3, 0]
e before = 1, i = 0, begin_nbr = 5
e after change = 3
e before = 1, i = 1, begin_nbr = 4
e after change = 2
e before = 3, i = 2, begin_nbr = 3
e after change = 4
e before = 0, i = 3, begin_nbr = 2
e after change = 1
arr after changes = [3, 2, 4, 1]
arr after rotating back = [2, 4, 1, 3]
#=> [2, 4, 1, 3]

How to shrink an array: if two consecutive numbers are equal, remove one and increment the other

How can I shrink an array so that whenever two consecutive numbers are equal, one is removed and the other is incremented?
Example 1:
int a[6]={2,2,3,4,4,4};
// Output: 6
Example 2:
int b[7]={1,2,2,2,4,2,4};
// Output: {1,3,2,4,2,4}
lst = [2,2,3,4,4,4]
def shrink(lst):
    idx = 0
    while len(lst) > idx+1:
        a, b = lst.pop(idx), lst.pop(idx)
        if a == b:
            lst.insert(idx, a+1)
            idx = 0
        else:
            lst.insert(idx, b)
            lst.insert(idx, a)
            idx += 1
shrink(lst)
print(lst)
Prints:
[6]
For [5, 5, 5, 1] prints [6, 5, 1]
This can be done in near-linear time like so:
a = [2, 2, 3, 4, 4, 4]
b = [1, 2, 2, 2, 4, 2, 4]
c = [5, 5, 5, 1]
def shrink_array(a):
    res = []
    for i in range(1, len(a)+1):
        if i < len(a) and a[i] == a[i-1]:  # if equal to previous
            a[i] += 1  # increment and move on
        else:
            if len(res) > 0 and res[-1] == a[i-1]:  # if equal to last in res
                res[-1] += 1  # increment last in res
            else:
                res.append(a[i-1])  # add to res
            while len(res) > 1 and res[-1] == res[-2]:  # shrink possible duplicates
                res[-2] += 1
                del res[-1]
    return res
for arr in [a, b, c]:
    print(shrink_array(arr))
Output:
[6]
[1, 3, 2, 4, 2, 4]
[6, 5, 1]

Inner String array swap

I am trying to swap the inner values of an array without using an additional array, stack, etc.
Example:
s = [1,2,3,4,5,6,7,8]
output= [1,5,2,6,3,7,4,8]
My solution is shown below, but I don't think it is the best one. Can someone help me improve the efficiency of my code?
[python3]
class Solution:
    def inner_number(self, s):
        i=len(s)//2
        index=1
        while i < len(s):
            for j in range(i,index,-1):
                s[j-1],s[j]=s[j],s[j-1]
            i+=1
            index+=2
        return s
s = [1,2,3,4,5,6,7,8,9]
h = len(s)//2
res= []
if len(s)%2==1:
    res = [j for i in zip(s[:h],s[h:]) for j in i] + [s[-1]]
else:
    res = [j for i in zip(s[:h],s[h:]) for j in i]
print(res)
# output [1, 5, 2, 6, 3, 7, 4, 8, 9]
def inner_swap(input):
    req_length = int(len(input)/2) if len(input) % 2 == 0 else int(len(input)/2)+1
    s1 = input[:req_length]
    s2 = input[req_length:]
    result = [None]*len(input)
    result[::2] = s1
    result[1::2] = s2
    return result
assert inner_swap([1, 2, 3, 4]) == [1, 3, 2, 4]
assert inner_swap([1, 2, 3, 4, 5]) == [1, 4, 2, 5, 3]

Logical or every entry in two arrays

Say I have these
a = [0, 1, 0, 0, 0]
b = [0, 0, 0, 1, 0]
I want something like c = a | b
and get the answer as c = [0, 1, 0, 1, 0].
I would do something like this:
a = [0, 1, 0, 0, 0]
b = [0, 0, 0, 1, 0]
a.zip(b).map { |a, b| a | b }
#=> [0, 1, 0, 1, 0]
You are doing bit-twiddling with an integer's bits stored as the elements of an array, which is a rather roundabout way of doing that. I suggest you simply deal with the integers directly:
x = 8
y = 2
Note:
x.to_s(2) #=> "1000"
y.to_s(2) #=> "10"
or, say,
x.to_s(2).rjust(8,"0") #=> "00001000"
y.to_s(2).rjust(8,"0") #=> "00000010"
Now you can obtain the result you want very simply with Fixnum#|:
z = x | y #=> 10
Let's confirm:
z.to_s(2) #=> "1010"
To retrieve bit i (i=1), use Fixnum#[]:
y[0] #=> 0
y[1] #=> 1
y[2] #=> 0
y[99] #=> 0
To set bit i, you will need to use Fixnum#<< to obtain an integer that has a 1 in bit position i and 0 for all other bit positions:
1 << i
For example:
1 << 0 #=> 1
1 << 1 #=> 2
1 << 2 #=> 4
Alternatively, you could of course write:
2**i
To set bit i to 0 when it is currently 1, use Fixnum#^ (pronounced "XOR"), which flips the bit. For i=1,
y = y ^ (1<<1) #=> 0
which we can write more compactly as:
y ^= (1<<1)
To set bit i to 1, for i=1, (recall y is now 0):
y |= (1<<1) #=> 2
Similarly (y now equals 2),
y |= (1<<9) #=> 514
y.to_s(2) #=> "1000000010"
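XOR flips a bit rather than forcing it to 0, so it clears bit i only when that bit is currently 1. A minimal sketch of the three single-bit operations follows; the helper names `set_bit`, `clear_bit` and `toggle_bit` are hypothetical and exist only for this illustration.

```ruby
# Hypothetical helpers for single-bit operations on an integer n.

# Force bit i to 1, whatever it was before.
def set_bit(n, i)
  n | (1 << i)
end

# Force bit i to 0, whatever it was before.
def clear_bit(n, i)
  n & ~(1 << i)
end

# Flip bit i: 0 becomes 1 and 1 becomes 0.
def toggle_bit(n, i)
  n ^ (1 << i)
end

p set_bit(0, 1)     #=> 2
p clear_bit(10, 1)  #=> 8
p clear_bit(8, 1)   #=> 8  (bit was already 0, so nothing changes)
p toggle_bit(8, 1)  #=> 10 (XOR turned the 0 bit on)
```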
You can also convert your arrays to numbers and then just use the binary operators. This should be faster if you need to perform lots of such operations.
If needed you can convert the result back to array.
a = [0, 1, 0, 0, 0]
b = [0, 0, 0, 1, 0]
a_number = a.join.to_i(2)
b_number = b.join.to_i(2)
c_number = a_number | b_number
c_array = c_number.to_s(2).split('').map(&:to_i)
c_array = [0] * (a.size - c_array.size) + c_array if c_array.size < a.size
p c_number.to_s(2)
p c_array
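The padding step can also be done on the string before splitting it, using String#rjust. The following is a sketch of that variation; `or_arrays` is a hypothetical name, and it assumes both arrays have the same length and contain only 0s and 1s.

```ruby
# Hypothetical helper: element-wise OR of two equal-length 0/1 arrays,
# done by converting each array to an integer and back.
def or_arrays(a, b)
  bits = (a.join.to_i(2) | b.join.to_i(2)).to_s(2)
  # Left-pad with zeros so leading 0 entries are not lost.
  bits.rjust(a.size, "0").chars.map(&:to_i)
end

p or_arrays([0, 1, 0, 0, 0], [0, 0, 0, 1, 0]) #=> [0, 1, 0, 1, 0]
```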
