creating multiple arrays by splitting a hash - arrays

Let's say I have an assortment of numbers in a hash, e.g. 1 to 100.
Each number in the assortment can have a status of :free or :used.
An example assortment could be:
{ 1 => :free, 2 => :free, 3 => :used, ... }
Let's say I want to split the assortment up based on the used numbers, and extract the free numbers into their own hashes (or an array of hashes?).
For example, for an assortment of 1 to 100, if numbers 25 and 75 were :used and the rest were :free, then the result would be 3 new hashes of all the free values:
1 to 24
26 to 74
76 to 100
Similarly, let's say we have a different assortment with numbers 1 to 100, and I want to extract numbers 20 to 80, where numbers 30, 31, 32 and 40 are used. Then the result would be:
hash1 -> 20 to 29
hash2 -> 33 to 39
hash3 -> 41 to 80
Is there a nice functional way to do this in Ruby, where I can pass in a complete assortment of numbers and an optional range to extract, and get the resulting hashes, perhaps in an array?
I guess the original hash gets broken up or split based on the :used elements...
If you were to loop through the hash, every free number would be added to a new hash (e.g. hash1) until you reach a used number. You would then keep going through the loop until you reach the next free number; this and all subsequent free numbers get added to a new hash (hash2). Keep this going until all the free numbers are in new hashes...
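The loop described above can be sketched directly in Ruby. This is only a literal rendering of that description, assuming the hash is iterated in ascending key order (Ruby hashes keep insertion order); split_on_used is a hypothetical name.

```ruby
# Hypothetical helper: a literal version of the loop described above.
# Assumes the hash was built in ascending key order.
def split_on_used(assortment)
  groups = []
  current = {}
  assortment.each do |number, status|
    if status == :free
      current[number] = status      # extend the current run of free numbers
    elsif !current.empty?
      groups << current             # a used number closes the current run
      current = {}
    end
  end
  groups << current unless current.empty?  # keep the final run, if any
  groups
end
```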

assortment = (20..50).to_a.product([:free]).to_h
[30,31,32,40].each { |n| assortment[n] = :used }
assortment
# => {20=>:free, 21=>:free, 22=>:free, 23=>:free, 24=>:free, 25=>:free,
# 26=>:free, 27=>:free, 28=>:free, 29=>:free, 30=>:used, 31=>:used,
# 32=>:used, 33=>:free, 34=>:free, 35=>:free, 36=>:free, 37=>:free,
# 38=>:free, 39=>:free, 40=>:used, 41=>:free, 42=>:free, 43=>:free,
# 44=>:free, 45=>:free, 46=>:free, 47=>:free, 48=>:free, 49=>:free, 50=>:free}
Return an array of hashes
assortment.reject { |_,v| v == :used }.
  slice_when { |(a,_),(b,_)| b > a+1 }.
  to_a.
  map(&:to_h)
#=> [{20=>:free, 21=>:free,...29=>:free},
# {33=>:free, 34=>:free,...39=>:free},
# {41=>:free, 42=>:free,...50=>:free}]
See Hash#reject (which returns a hash) and Enumerable#slice_when.
Return an array of arrays
Having a hash whose values are all the same doesn't seem very useful. If you'd prefer an array of arrays, take the keys and drop map(&:to_h):
arr = assortment.reject { |_,v| v == :used }.
  keys.
  slice_when { |a,b| b > a+1 }.
  to_a
#=> [[20, 21, 22, 23, 24, 25, 26, 27, 28, 29],
# [33, 34, 35, 36, 37, 38, 39],
# [41, 42, 43, 44, 45, 46, 47, 48, 49, 50]]
Return an array of ranges
A third option is to return an array of ranges. To do that, map each of arr's elements (arrays) to a range:
arr.map { |f,*_,l| f..l }
#=> [20..29, 33..39, 41..50]
The first element of arr passed to the block is [20, 21, 22, 23, 24, 25, 26, 27, 28, 29]. The three block variables are computed using parallel assignment:
f,*_,l = [20, 21, 22, 23, 24, 25, 26, 27, 28, 29]
f #=> 20
_ #=> [21, 22, 23, 24, 25, 26, 27, 28]
l #=> 29
I wish to underscore that I've used an underscore for the second block variable to underscore that it is not used in the block calculation.
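The question also asks for an optional range to extract, which the snippets above do not cover. Here is a minimal sketch of one way to add it, assuming integer keys; free_runs and its parameters are hypothetical names, not part of the original answer.

```ruby
# Hypothetical helper: returns runs of consecutive :free keys as ranges,
# optionally restricted to the keys covered by `range`.
def free_runs(assortment, range = nil)
  keys = assortment.reject { |_, status| status == :used }.keys
  keys = keys.select { |k| range.cover?(k) } if range
  keys.sort.slice_when { |a, b| b > a + 1 }.map { |run| run.first..run.last }
end

assortment = (1..100).to_a.product([:free]).to_h
[30, 31, 32, 40].each { |n| assortment[n] = :used }
p free_runs(assortment, 20..80)
#=> [20..29, 33..39, 41..80]
```

Filtering by range before slicing keeps the run detection unchanged; the sort only guards against a hash that was not built in key order.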

Related

replace element at an index of an array then reset array using for loop

I want to update only one element in a 1-D array and then start over fresh. Viewed in matrix form, I just want the entries where i = j to be changed.
My code so far:
import numpy as np
a = np.array([10, 20, 30, 40, 50])
for i, j in enumerate(a):
    b = a
    b[i] = j + 1
    print(b)
I want each iteration of the for loop to only change one element and keep everything else the same.
the output I want looks like this:
[11, 20, 30, 40, 50]
[10, 21, 30, 40, 50]
[10, 20, 31, 40, 50]
[10, 20, 30, 41, 50]
[10, 20, 30, 40, 51]
But I'm getting the output below, because b is not resetting even though I am (or at least I think I am) restoring the original array at the start of each loop iteration.
[11, 20, 30, 40, 50]
[11, 21, 30, 40, 50]
[11, 21, 31, 40, 50]
[11, 21, 31, 41, 50]
[11, 21, 31, 41, 51]
Any ideas where I went wrong? TIA
Try replacing b = a with b = a.copy().
b = a does not copy anything: it makes b another name for the same array, so writing through b also changes a. b = a.copy() creates a new array with its own memory, so a stays unchanged between iterations.

In which array can we find elements in Julia?

Imagine we have the following array of 3 arrays, covering the range 1 to 150:
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ... 41, 42, 43, 44, 45, 46, 47, 48, 49, 50]
[51, 52, 53, 54, 55, 56, 57, 58, 59, 60 ... 92, 93, 94, 95, 96, 97, 98, 99, 100, 107]
[71, 73, 84, 101, 102, 103, 104, 105, 106, 108 ... 141, 142, 143, 144, 145, 146, 147, 148, 149, 150]
I want to build an array that stores in which array we find the values 1 to 150. The result must be then:
[1 1 1 ... 1 2 2 2 ... 2 3 2 3 2 ... 3 3 3 ... 3],
where each element corresponds to 1, 2, 3, ... ,150. The obtained array gives then the array-membership of the elements 1 to 150. The code must be applied for any number of arrays (so not only 3 arrays).
You can use an array comprehension. Here is an example with three vectors containing the range 1:10:
A = [1, 3, 4, 5, 7]
B = [2, 8, 9]
C = [6, 10]
Now we can write a comprehension using in:
julia> [x in A ? 1 : x in B ? 2 : 3 for x in 1:10]
10-element Array{Int64,1}:
1
⋮
3
Perhaps also include a fallback error, in case the input is wrong:
julia> [x in A ? 1 : x in B ? 2 : x in C ? 3 : error("not found") for x in 1:10]
10-element Array{Int64,1}:
1
⋮
3
Trade memory for search in this case:
Make an array to record which array each value is in.
# example arrays
N=100; A=rand(1:N,30);
B = rand(1:N,40);
C = rand(1:N,35);
# record array containing each value:
# A=1, B=2, C=3; not found=0
arrayin = zeros(Int32, max(maximum(A),maximum(B),maximum(C)));
arrayin[A] .= 1;
arrayin[B] .= 2;
arrayin[C] .=3;

Sorting arrays of variable length arrays of integers

I have a set of sizes and a target size. Using some knapsack algorithm I get all the possible answers. So, all the answers have the same sum, but contain differing number of elements. I first sort by count so that I get the maximum number of elements. Now, I'm left with a subset that all have the same sum and the same count.
Here is a sample (there are actually many, many, more answers):
[24, 33, 21, 22],
[24, 26, 27, 23],
[24, 27, 28, 21],
[34, 23, 21, 22],
[26, 23, 29, 22],
[24, 40, 36],
[34, 26, 40],
[30, 38, 32],
[24, 37, 39],
[40, 38, 22],
Now, I want to bring the ones that have the biggest elements to the top (without breaking the internal order!):
[24, 26, 27, 23], <-- 23 is the biggest smallest
[26, 23, 29, 22], <-- 22 is the second biggest smallest
[24, 27, 28, 21], <-- 21, 24 is the biggest second smallest
[24, 33, 21, 22], <-- 21, 22, 24 is the biggest third smallest
[34, 23, 21, 22], <-- 21, 22, 23 is the second biggest third smallest
[30, 38, 32], <-- 30 is the biggest smallest
[34, 26, 40], <-- 26 is the second biggest smallest
[24, 37, 39], <-- 24, 37 is the biggest second smallest
[24, 40, 36], <-- 24, 36 is the second biggest second smallest
[40, 38, 22], <-- 22 is the fourth biggest smallest
If that makes sense to you, then please tell me the name of the sorting algorithm that can handle it. Everything I've found so far expects to use only one fixed element of the internal array for sorting, but I want to use as many as is necessary to get it sorted.
Obviously, I can do this by brute force, but I'm hoping there's an elegant solution that already exists. I just don't know the right keywords to search for.
Thanks!
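One way to read the ordering above: sort a copy of each answer in ascending order, then compare those sorted copies element by element (a lexicographic comparison), larger values first, with the element count as the primary key. Below is a minimal Ruby sketch under that reading; rank_answers is a hypothetical name, and the internal order of each answer is left untouched.

```ruby
# Hypothetical helper: longer answers first, then by their ascending-sorted
# elements compared lexicographically, larger values first.
def rank_answers(answers)
  answers.sort_by { |answer| [-answer.size, answer.sort.map { |x| -x }] }
end

answers = [
  [24, 33, 21, 22], [24, 26, 27, 23], [24, 27, 28, 21],
  [34, 23, 21, 22], [26, 23, 29, 22],
  [24, 40, 36], [34, 26, 40], [30, 38, 32], [24, 37, 39], [40, 38, 22]
]
rank_answers(answers).each { |answer| p answer }
```

Ruby's Array#<=> already compares arrays element by element, so sorting on the negated, sorted copy is enough; no custom comparison loop is needed.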

Create a list with millions of elements

I need to create and work with lists with 2**30 elements, but it's too slow. Is there any way to increase the speed?
My code:
sup = []
for i in range(2**30):
    sup.append([i, pow(y, i, N)])
pow(y, i, N) == (y**i) % N, i.e. modular exponentiation.
I tried list comprehensions, but that isn't fast enough.
Different approach: why do you want to store those numbers in a list?
You have your formula right there; whenever some piece of code needs sup[i], you just compute pow(y, i, N).
In other words: instead of storing values in a list, just compute them when you need them.
Edit: as it seems that you have good reasons to store that data in an array, I would say: use the appropriate tool.
Meaning: instead of doing compute-intensive work directly in plain Python, look into the numpy framework, which is designed for exactly such purposes. Beyond that, I would also look at the way you are storing and preparing your data. Example: you mention that you later look for identical entries in that array. I wonder whether that means you should use a dictionary instead of a list; or did you really intend to check 2**30 entries each time you look for equal pow values?
Going by your comment and complementing the answer of GhostCat, go directly for the data you are looking for, for example like this
>>> from collections import defaultdict
>>> y = 2
>>> N = 10
>>> data = defaultdict(list)
>>> for i in range(100):
...     data[pow(y,i,N)].append(i)
...
>>> for x in data.items():
...     x
...
(8, [3, 7, 11, 15, 19, 23, 27, 31, 35, 39, 43, 47, 51, 55, 59, 63, 67, 71, 75, 79, 83, 87, 91, 95, 99])
(1, [0])
(2, [1, 5, 9, 13, 17, 21, 25, 29, 33, 37, 41, 45, 49, 53, 57, 61, 65, 69, 73, 77, 81, 85, 89, 93, 97])
(4, [2, 6, 10, 14, 18, 22, 26, 30, 34, 38, 42, 46, 50, 54, 58, 62, 66, 70, 74, 78, 82, 86, 90, 94, 98])
(6, [4, 8, 12, 16, 20, 24, 28, 32, 36, 40, 44, 48, 52, 56, 60, 64, 68, 72, 76, 80, 84, 88, 92, 96])
>>>
Or, more specifically, since you need a random sample, draw it from the start and don't waste time producing a huge amount of data you will not need, for example:
>>> import random
>>> random_data = defaultdict(list)
>>> for i in random.sample(range(2**30), 20):
...     random_data[pow(2,i,10)].append(i)
...
>>> for x in random_data.items():
...     x
...
(8, [633728687, 357300263, 208747091, 456291987, 1028949643, 23961003, 750842555])
(2, [602395153, 215460881, 144481457, 829193705])
(4, [752840814, 26689262])
(6, [423520476, 969809132, 326786996, 736424520, 929123176, 865279408, 338237708])
>>>
Depending on what you do with those i later on, you can instead try a more mathematical approach: uncover the underlying pattern that produces the i for which y**i mod N is the same, so that you can generate as many i as you need for that particular residue class.
For this example that is easy:
2**i = 8 (mod 10) for all i = 3 (mod 4) -> range(3,2**30,4)
2**i = 2 (mod 10) for all i = 1 (mod 4) -> range(1,2**30,4)
2**i = 4 (mod 10) for all i = 2 (mod 4) -> range(2,2**30,4)
2**i = 6 (mod 10) for all i = 0 (mod 4), i > 0 -> range(4,2**30,4)
2**i = 1 (mod 10) for i = 0

How do I compare elements of separate arrays at specific indexes?

I have two arrays, I want to return the larger number from the same position in each array.
def get_larger_numbers(a, b)
  c = []
  count = 0
  while count < 10 #assumes there are less than 10 elements in an array, not an ideal solution.
    if a[count] > b[count]
      c << a[count]
    elsif b[count] > a[count]
      c << b[count]
    else #if numbers are the same
      c << a[count]
    end
    count += 1
  end
  return c
end
a = [13, 64, 15, 17, 88]
b = [23, 14, 53, 17, 80]
should return:
c == [23, 64, 53, 17, 88]
Clearly, my code doesn't work. What's the best way to refer to increasing index positions?
Also interested to know simpler ways of doing this.
Your code isn't working because of the hard-coded 10 used as the length: once count passes the last index, a[count] is nil, and calling > on nil raises NoMethodError. Instead, I suggest you derive the number of iterations from the arrays themselves.
def get_larger_numbers(a,b)
  c = []
  [a.length, b.length].min.times do |i|
    if a[i] > b[i]
      c << a[i]
    else
      c << b[i]
    end
  end
  c
end
a = [13, 64, 15, 17, 88]
b = [23, 14, 53, 17, 80]
get_larger_numbers(a,b)
#=> [23, 64, 53, 17, 88]
This solution assumes that if the arrays are not equal size, you want to throw the rest away.
Okay... Here's what you should do:
def get_larger_numbers(a, b)
  c = [] #declare empty array for answer
  for i in 0...(a.length < b.length ? a.length : b.length) #see EDIT note
    c << (a[i] > b[i] ? a[i] : b[i])
  end
  c #this is an implicit return
end
a = [13, 64, 15, 17, 88]
b = [23, 14, 53, 17, 80]
puts get_larger_numbers(a,b)
This'll do a for loop that runs from 0 up to (but not including) the length of the shorter array, so any extra elements of the longer array are ignored. I figure this is what you want.
Anyway, there's a simple ternary that compares the value of each element in both arrays, one index at a time.
It'll push the bigger value to the c array, leaving you with the greater values in the c array to be returned.
EDIT: Added the ternary expression so that for loops through only the smaller array, because comparing with nil (which is what is at any n index beyond the array, presumably) would raise an error.
A compact solution would be:
def get_larger_numbers(a, b)
  return a.zip(b).map { |x, y| (x >= y) ? x : y } # Return optional, added for clarity
end
a = [13, 64, 15, 17, 88]
b = [23, 14, 53, 17, 80]
p get_larger_numbers(a, b)
Note that this assumes the input arrays are of the same length. If they are not, you can truncate to the length of the shorter array, or pad the end with the unpaired elements of the longer array. As written, the code raises an error when b is shorter than a (zip fills the missing elements with nil, and x >= nil fails), letting you know you've hit this unspecified case; when a is shorter, zip silently drops the extra elements of b.
As for how it works, the zip pairs the elements of the two arrays, so a.zip(b) becomes:
[[13, 23], [64, 14], [15, 53], [17, 17], [88, 80]]
It then loops over the array with map to produce a new array, passing each pair into the block, which returns the larger of the two elements to fill the output array.
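The two options mentioned for arrays of unequal length, truncating or padding, could be sketched as follows. Both method names are hypothetical, and the sketch assumes the arrays contain no nil elements of their own.

```ruby
# Hypothetical variants for arrays of unequal length.

# Truncate: compare only the positions that exist in both arrays.
def larger_truncated(a, b)
  n = [a.length, b.length].min
  a.first(n).zip(b.first(n)).map(&:max)
end

# Pad: keep the unpaired tail of the longer array as-is.
def larger_padded(a, b)
  longer, shorter = a.length >= b.length ? [a, b] : [b, a]
  # zip fills missing positions with nil; compact drops them before max
  longer.zip(shorter).map { |pair| pair.compact.max }
end

p larger_truncated([13, 64, 15], [23, 14])  #=> [23, 64]
p larger_padded([13, 64, 15], [23, 14])     #=> [23, 64, 15]
```

Calling zip on the longer array is what makes padding work, since zip always returns as many pairs as its receiver has elements.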
Assuming the two arrays are the same size, simply:
def largest_by_position(a,b)
  a.zip(b).map(&:max)
end
largest_by_position([13, 64, 15, 17, 88], [23, 14, 53, 17, 80])
#=> [23, 64, 53, 17, 88]
Alternatively, make the operative line:
[a,b].transpose.map(&:max)
For equal-size arrays a and b, Array#zip and Array#transpose always have this yin and yang relationship:
a.zip(b) == [a,b].transpose #=> true
