The number of same elements in an array

My aim is to display the number of identical elements in an array.
Here is my code:
a = [5, 2, 4, 1, 2]
b = []
for i in a
  unless b.include?(a[i])
    b << a[i]
    print i," appears ",a.count(i)," times\n"
  end
end
I get this output:
5 appears 1 times
2 appears 2 times
4 appears 1 times
The output misses 1.

Here's a different way to do it, assuming I understand what "it" is (counting elements in an array):
a = [5,2,4,1,2]
counts = a.each_with_object(Hash.new(0)) do |element, counter|
  counter[element] += 1
end
# => {5=>1, 2=>2, 4=>1, 1=>1}
# i.e. one 5, two 2s, one 4, one 1.
counts.each do |element, count|
  puts "#{element} appears #{count} times"
end
# => 5 appears 1 times
# => 2 appears 2 times
# => 4 appears 1 times
# => 1 appears 1 times
Hash.new(0) initialises a hash with a default value 0. We iterate on a (while passing the hash as an additional object), so element will be each element of a in order, and counter will be our hash. We will increment the value of the hash indexed by the element by one; on the first go for each element, there won't be anything there, but our default value saves our bacon (and 0 + 1 is 1). The next time we encounter an element, it will increment whatever value already is present in the hash under that index.
Having obtained a hash of elements and their counts, we can print them. puts is the same as print but automatically appends a newline, and rather than using commas to print several things, it is much nicer to put the values directly into the printed string using the string interpolation syntax ("...#{...}...").
The problems in your code are as follows:
[logic] for i in a will give you elements of a, not indices. Thus, a[i] will give you nil for the first element, not 5, since a[5] is outside the list. This is why 1 is missing from your output: a[1] (i.e. 2) is already in b when you try to process it.
[style] for ... in ... is almost never seen in Ruby code; there is a strong preference for each and the other methods of the Enumerable module.
[performance] a.count(i) inside a loop increases your algorithmic complexity: count itself has to scan the whole array, and you call it from inside a loop over that same array, so the work grows roughly quadratically with the size of the array. The method above only has one loop, and since access to hashes is very fast, it grows more or less linearly with the size of the array.
The stylistic and performance problems are minor, of course; you won't see performance drop till you need to process really large arrays, and style errors won't make your code not work; however, if you're learning Ruby, you should aim to work with the language from the start, to get used to its idioms as you go along, as it will give you much stronger foundation than transplanting other languages' idioms onto it.
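As a side note, on Ruby 2.7 or newer the counting step can probably be delegated to the built-in tally method. This is only a minimal sketch under that version assumption; count_elements is a hypothetical helper name, not something from the question or the answer above.

```ruby
# Hypothetical helper (not from the original answer): counts how often
# each element occurs. Assumes Ruby 2.7+, where Enumerable#tally exists.
def count_elements(array)
  array.tally
end

# Same reporting loop as in the answer above.
count_elements([5, 2, 4, 1, 2]).each do |element, count|
  puts "#{element} appears #{count} times"
end
```

The result is an ordinary hash keyed by element, in order of first appearance, so the each loop works unchanged.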

a = [5,2,4,1,2]
b = a.uniq
for i in b
  print i," appears ",a.count(i)," times\n"
end
print b
Result:
5 appears 1 times
2 appears 2 times
4 appears 1 times
1 appears 1 times
[5, 2, 4, 1]

Related

Select random elements with guaranteed spacing before choosing the same element again

I want to select random elements from a list without the possibility of repeating the same element twice in a row. I want to have a guaranteed amount of other elements between choosing the same element again.
Additionally it has to be impossible to 100% predict what the next choice will be.
My current solution is to select elements at random until I have selected one third of the total elements. Then I randomly select half of the other elements to get another third. After that I add the first third back to the remaining elements and repeat the process.
This way I have a guaranteed distance of 1/3 of the total elements before repeating an element. But, I would like to have an even larger spacing. Is there any way to achieve this without making the choice predictable?
I can't help you with Pascal: I don't have a copy and haven't used it for over 30 years, so I don't know what libraries you might have access to.
With that out of the way, the task is fairly straightforward if you have (or can fake) a queue data structure so you can store things in First-In-First-Out order.
Shuffle the original array, then slice the desired "spacing" number of elements off the end of it.
Select an element at random from the N - spacing items in the array by randomly generating an index.
Do whatever you want with that item, but then append it to the queue.
Pop the first element off the queue and store it in the location of the item you just selected/used.
Voila! Items that have recently been used are stored in the queue until they get to the front, then they are recycled into the set from which you are randomizing. Since they are out of circulation for the length of the queue, the spacing is guaranteed.
Here it is in Ruby, which is close to being pseudo-code. I've also annotated the heck out of it.
ary = (1..10).to_a # create an array "ary" containing the numbers 1 to 10
ary.shuffle! # shuffle the array
spacing = ary.length / 3 # set the desired spacing as fraction of ary
# Now we'll slice the last "spacing" elements off into a queue,
# starting at location ary.length - spacing
queue = ary.slice!(ary.length - spacing, spacing)
p ary, queue # print the array and queue to show the random splitting
# Now we're set up with "spacing" random elements (from the shuffling)
# in a queue, and the rest still in "ary"
20.times do # do the following 20 times for demo purposes
  index = rand(ary.length)   # Choose a random index in "ary",
  print ary[index]           # print it out,
  print ' '                  # and print a space after it.
  queue << ary[index]        # Now append it to the queue
  ary[index] = queue.shift   # and replace that location with the first element popped from the queue
end
puts # put a new-line at the end of the printed values
which produces, for example:
[7, 2, 3, 8, 6, 10, 5]
[9, 1, 4]
5 7 8 3 5 2 9 4 1 7 3 6 1 5 3 2 4 6 1 7
The first line is the shuffled array after slicing, the second line is the queue of sliced values, and the third line is the result of 20 iterations of the algorithm. Note that no element occurs within 3 of its prior usage.
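If the technique needs to be reused, it could be wrapped in a small class along these lines. This is a sketch, not a definitive implementation: the SpacedPicker name, the pick method, and the optional seedable rng: argument are all assumptions introduced here for illustration.

```ruby
# Hypothetical wrapper around the queue technique described above.
# The class name, method names, and the optional seedable Random are
# illustrative assumptions, not part of the original answer.
class SpacedPicker
  def initialize(items, spacing, rng: Random.new)
    unless spacing < items.length
      raise ArgumentError, "spacing must be smaller than the item count"
    end
    @rng = rng
    @pool = items.shuffle(random: rng)
    # The last `spacing` shuffled items start out "resting" in the queue.
    @queue = @pool.slice!(@pool.length - spacing, spacing)
  end

  # Returns a random item from the pool, then replaces it in the pool
  # with the item that has rested longest in the queue.
  def pick
    index = @rng.rand(@pool.length)
    chosen = @pool[index]
    @queue << chosen
    @pool[index] = @queue.shift
    chosen
  end
end

picker = SpacedPicker.new((1..10).to_a, 3)
puts Array.new(20) { picker.pick }.join(' ')
```

Passing a seeded Random makes runs reproducible for testing, while the default keeps the picks unpredictable.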

algorithm which finds the numbers in a sequence which appear 3 times or more, and prints their indexes

Suppose I input a sequence of numbers which ends with -1.
I want to print all the values of the sequence that occur in it 3 times or more, and also print their indexes in the sequence.
For example, if the input is: 2 3 4 2 2 5 2 4 3 4 2 -1
then the expected output is:
2: 0 3 4 6 10
4: 2 7 9
First I thought of using quick-sort , but then I realized that as a result I will lose the original indexes of the sequence. I also have been thinking of using count, but that sequence has no given range of numbers - so maybe count will be no good in that case.
Now I wonder if I might use an array of pointers (but how?)
Do you have any suggestions or tips for an algorithm with time complexity O(nlogn) for that ? It would be very appreciated.
Keep it simple!
The easiest way would be to scan the sequence and count the number of occurrence of each element, put the elements that match the condition in an auxiliary array.
Then, for each element in the auxiliary array, scan the sequence again and print out the indices.
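The two passes described above might look roughly like this in Ruby. It is a sketch under stated assumptions: frequent_indexes is a hypothetical helper name, and the trailing -1 terminator is assumed to have been stripped before the call.

```ruby
# Hypothetical helper: maps each value that occurs at least `min_count`
# times to the list of positions where it occurs. Assumes the trailing
# -1 terminator has already been removed from the input.
def frequent_indexes(sequence, min_count = 3)
  # First pass: count how often each value occurs.
  counts = Hash.new(0)
  sequence.each { |value| counts[value] += 1 }
  frequent = counts.select { |_, count| count >= min_count }.keys

  # One further scan per frequent value: collect the matching indexes.
  frequent.map do |value|
    [value, sequence.each_index.select { |i| sequence[i] == value }]
  end.to_h
end

frequent_indexes([2, 3, 4, 2, 2, 5, 2, 4, 3, 4, 2]).each do |value, indexes|
  puts "#{value}: #{indexes.join(' ')}"
end
```

Note that the rescans cost one pass per frequent value, so this is simple rather than optimal when many values qualify.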
First of all, sorry for my bad English (it's not my native language); I'll try my best.
Similar to what @vvigilante suggested, here is an algorithm implemented in Python (I chose Python because it is close to pseudo code, so you can translate it to any language you want, and I added a lot of comments... hope you get it!)
from typing import Dict, List
def three_or_more(input_arr: List[int]) -> None:
    indexes: Dict[int, List[int]] = {}
    # scan the array (the final element, the -1 terminator, is skipped)
    i: int
    for i in range(0, len(input_arr) - 1):
        # create the list for the number at position i
        # (if it doesn't exist)
        # and append the index
        indexes.setdefault(input_arr[i], []).append(i)
    # for each key in the dictionary
    n: int
    for n in indexes.keys():
        # if the number of indexes for that key is >= 3
        if len(indexes[n]) >= 3:
            # print the key
            print("%d: " % (n), end='')
            # print each index stored under the current key
            el: int
            for el in indexes[n]:
                print("%d," % (el), end='')
            # new line
            print("\n", end='')
# call the function
three_or_more([2, 3, 4, 2, 2, 5, 2, 4, 3, 4, 2, -1])
Complexity:
The first loop scans the input array, so it is O(N).
The second loop visits every distinct number in the array. There cannot be more distinct numbers than elements, so it runs at most N times: O(N).
The loop nested inside it goes through all the indexes stored for the current number, which looks like O(N) in the worst case (but it is not).
That would suggest a complexity of O(N) + O(N)*O(N) = O(N^2),
but remember that the two nested loops together print at most all N indexes, and since no index is stored twice, their combined cost is O(N).
So the total is O(N) + O(N) ~= O(N).
As for memory, it is O(N) for the input array + O(N) for the dictionary (because it contains all N indexes) ~= O(N).
If you do it in C++, remember that maps are much slower than arrays, so if N is small you should use an array of arrays (or std::vector<std::vector<int>>); otherwise you can also try an unordered map, which uses hashing.
P.S. Remember that getting the size of a vector takes O(1) time because it is a difference of pointers!
Starting with a sorted list is a good idea.
You could create a second array of original indices and duplicate all of the memory moves for the sort on the indices array. Then checking for triplicates is trivial and only requires sort + 1 traversal.
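A compact way to try the sort-based idea is to sort (value, index) pairs instead of maintaining a parallel index array by hand. The following is only a sketch; frequent_indexes_sorted is a hypothetical name, and it assumes Ruby 2.3+ for chunk_while and an input with the -1 terminator already removed.

```ruby
# Hypothetical helper illustrating the sort-based idea: pair every value
# with its original index, sort the pairs, then walk runs of equal values.
# Assumes the trailing -1 terminator has already been removed.
def frequent_indexes_sorted(sequence, min_count = 3)
  # Pairs are [value, index]; arrays sort by value first, then by index.
  pairs = sequence.each_with_index.sort
  # Group consecutive pairs that share the same value (one traversal).
  runs = pairs.chunk_while { |a, b| a[0] == b[0] }
  runs.select { |run| run.length >= min_count }
      .map { |run| [run.first[0], run.map { |_, index| index }] }
end

frequent_indexes_sorted([2, 3, 4, 2, 2, 5, 2, 4, 3, 4, 2]).each do |value, indexes|
  puts "#{value}: #{indexes.join(' ')}"
end
```

The sort dominates the cost at O(n log n), which matches the bound asked for in the question.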

Is it safe to delete from an Array inside each?

Is it possible to safely delete elements from an Array while iterating over it via each? A first test looks promising:
a = (1..4).to_a
a.each { |i| a.delete(i) if i == 2 }
# => [1, 3, 4]
However, I could not find hard facts on:
Whether it is safe (by design)
Since which Ruby version it is safe
At some points in the past, it seems that it was not possible to do:
It's not working because Ruby exits the .each loop when attempting to delete something.
The documentation does not state anything about deletability during iteration.
I am not looking for reject or delete_if. I want to do things with the elements of an array, and sometimes also remove an element from the array (after I've done other things with said element).
Update 1: I was not very clear on my definition of "safe", what I meant was:
do not raise any exceptions
do not skip any element in the Array
You should not rely on unauthorized answers too much. The answer you cited is wrong, as is pointed out by Kevin's comment to it.
It is safe (from the beginning of Ruby) to delete elements from an Array while iterating over it with each, in the sense that Ruby will not raise an error for doing that and will give a deterministic (i.e., not random) result.
However, you need to be careful because when you delete an element, the elements following it will be shifted, hence the element that was supposed to be iterated next would be moved to the position of the deleted element, which has been iterated over already, and will be skipped.
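The skipping described above can be observed with a small experiment. This is a sketch run against MRI behaviour; visited_while_deleting is a hypothetical helper name introduced only for the demonstration.

```ruby
# Records which elements `each` actually yields while the block deletes
# from the same array. `visited_while_deleting` is a hypothetical helper.
def visited_while_deleting(array, doomed)
  visited = []
  array.each do |item|
    visited << item
    array.delete(item) if item == doomed
  end
  visited
end

a = [1, 2, 3, 4]
p visited_while_deleting(a, 2) # on MRI, 3 is never yielded
p a
```

After 2 is deleted, 3 slides into the slot that was just visited, so the iteration moves straight on to 4.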
In order to answer your question, whether it is "safe" to do so, you will first have to define what you mean by "safe". Do you mean
1. it doesn't crash the runtime?
2. it doesn't raise an Exception?
3. it does raise an Exception?
4. it behaves deterministically?
5. it does what you expect it to do? (What do you expect it to do?)
Unfortunately, the Ruby Language Specification is not exactly helpful:
15.2.12.5.10 Array#each
each(&block)
Visibility: public
Behavior:
If block is given:
For each element of the receiver in the indexing order, call block with the element as the only argument.
Return the receiver.
This seems to imply that it is indeed completely safe in the sense of 1., 2., 4., and 5. above.
The documentation says:
each { |item| block } → ary
Calls the given block once for each element in self, passing that element as a parameter.
Again, this seems to imply the same thing as the spec.
Unfortunately, none of the currently existing Ruby implementations interpret the spec in this way.
What actually happens in MRI and YARV is the following: the mutation to the array, including any shifting of the elements and/or indices becomes visible immediately, including to the internal implementation of the iterator code which is based on array indices. So, if you delete an element at or before the position you are currently iterating, you will skip the next element, whereas if you delete an element after the position you are currently iterating, you will skip that element. For each_with_index, you will also observe that all elements after the deleted element have their indices shifted (or rather the other way around: the indices stay put, but the elements are shifted).
So, this behavior is "safe" in the sense of 1., 2., and 4.
The other Ruby implementations mostly copy this (undocumented) behavior, but being undocumented, you cannot rely on it, and in fact, I believe at least one did experiment briefly with raising some sort of ConcurrentModificationException instead.
I would say that it is safe, based on the following:
2.2.2 :035 > a = (1..4).to_a
=> [1, 2, 3, 4]
2.2.2 :036 > a.each { |i| a.delete(i+1) if i > 1 ; puts i }
1
2
4
=> [1, 2, 4]
I'd infer from this test that Ruby correctly recognises while iterating through the contents that the element "3" has been deleted while element "2" was being processed, otherwise element "4" would also have been deleted.
However,
2.2.2 :040 > a.each { |i| puts i; a.delete(i) if i > 1 ; puts i }
1
1
2
2
4
4
This suggests that after "2" is deleted, the next element processed is whichever is now third in the array, so the element that used to be in third place does not get processed at all. each appears to re-examine the array to find the next element to process on every iteration.
I think that with that in mind, you ought to duplicate the array in your circumstances prior to processing.
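Duplicating the array first could look something like the sketch below. The each_with_removal helper and its block protocol (return a truthy value to remove the element) are assumptions made up for this example.

```ruby
# Hypothetical helper: visits every element of a snapshot and deletes
# from the original array whenever the block returns a truthy value.
def each_with_removal(array)
  array.dup.each do |item|
    array.delete(item) if yield(item)
  end
  array
end

a = [1, 2, 3, 4]
visited = []
each_with_removal(a) do |item|
  visited << item  # "do things" with the element first
  item == 2        # then decide whether it should be removed
end
p visited
p a
```

Because the iteration runs over the copy, no element is skipped. Keep in mind that Array#delete removes every element equal to the argument, which matters if the array holds duplicates.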
It depends.
Called without a block, each returns an enumerator, which holds the collection and a pointer to where it left off. Example:
a = [1,2,3]
b = a.each # => #<Enumerator: [1, 2, 3]:each>
b.next # => 1
a.delete(2)
b.next # => 3
a.clear
b.next # => StopIteration: iteration reached an end
Conceptually, each with a block calls next until the iteration reaches its end. So as long as you don't modify any 'future' array records, it should be safe.
However there are so many helpful methods in ruby's Enumerable and Array you really shouldn't ever need to do this.
You are right, in the past it was advised not to remove items from the collection while iterating over it. In my tests and at least with version 1.9.3 in practice in an array this gives no problem, even when deleting prior or next elements.
It is my opinion that while you can you shouldn't.
A clearer and safer approach is to reject the elements and assign the result to a new array.
b = a.reject{ |i| i == 2 } #[1, 3, 4]
In case you want to reuse your a array that is also possible
a = a.reject{ |i| i == 2 } #[1, 3, 4]
which is in fact the same as
a.reject!{ |i| i == 2 } #[1, 3, 4]
You say you don't want to use reject because you want to do other things with the elements before deleting, but that is also possible.
a.reject!{ |i| puts i if i == 2;i == 2 }
# 2
#[1, 3, 4]

Array Addition, why start at 'i = 2'?

Using the Ruby language, have the function ArrayAdditionI(arr) take the array of numbers stored in arr and return the string true if any combination of numbers in the array can be added up to equal the largest number in the array, otherwise return the string false. For example: if arr contains [4, 6, 23, 10, 1, 3] the output should return true because 4 + 6 + 10 + 3 = 23. The array will not be empty, will not contain all the same elements, and may contain negative numbers.
Could someone please explain to me why this code starts at 'i=2' and not 'i=0'?
def ArrayAdditionI(arr)
  i = 2
  while i < arr.length
    return true if arr.combination(i).map{|comb| comb.inject(:+)}.include?(arr.max)
    i += 1
  end
  false
end
ArrayAdditionI(STDIN.gets)
Correct me if I'm wrong but with i=2, the while loop will iterate [2..4] and then stop. But does this allow for all the potential combinations?...=> code works, so obviously it does but I'm just not seeing it.
i is not an index into the array; it is the number of elements used to build each combination. So if the max number in the array can be made from the sum of just two elements, the loop stops; if not, it tries three, and so on.
array.combination(i) returns all possible combinations of i elements from the array.
For example
if
ar=[4, 6, 23, 10]
then
ar.combination(2).to_a
returns
[[4,6],[4,23],[4,10],[6,23],[6,10],[23,10]]
So basically your program needs to find a sum, and a sum requires at least two operands, so you need combinations of length two or more. Hence you don't start with i=0 or i=1.
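To see which combination size actually succeeds for the example from the task, a small helper like the following can be used. It is a sketch: combination_summing_to_max is a hypothetical name, and it mirrors the sizes tried by the loop in the question (2 up to arr.length - 1).

```ruby
# Hypothetical helper: returns the first combination (trying sizes
# 2, 3, ... up to arr.length - 1, like the loop in the question) whose
# sum equals the largest element, or nil if there is none.
def combination_summing_to_max(arr)
  target = arr.max
  (2...arr.length).each do |size|
    match = arr.combination(size).find { |comb| comb.inject(:+) == target }
    return match if match
  end
  nil
end

# Sums of all pairs from the small example above.
p [4, 6, 23, 10].combination(2).map { |comb| comb.inject(:+) }
# The first combination that adds up to 23 in the example from the task.
p combination_summing_to_max([4, 6, 23, 10, 1, 3])
```

For the task's example no pair or triple adds up to 23, so the match only turns up once the size reaches four.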
You cannot give it an empty array, so 0 leads to false. If you have 1 element in an array, it is also meaningless. So I guess 2 is a starting point that makes this test meaningful.

Algorithm to find "most common elements" in different arrays

I have for example 5 arrays with some inserted elements (numbers):
1,4,8,10
1,2,3,4,11,15
2,4,20,21
2,30
I need to find most common elements in those arrays and every element should go all the way till the end (see example below). In this example that would be the combination 4, 4, 4, 2 (bold in the original post; the same one but with "30" on the end is equivalent) because it contains the smallest number of different elements (only two, 4 and 2/30).
This combination (see below) isn't good because if I have for ex. "4" it must "go" till it ends (next array mustn't contain "4" at all). So combination must go all the way till the end.
1,4,8,10
1,2,3,4,11,15
2,4,20,21
2,30
EDIT2: OR
1,4,8,10
1,2,3,4,11,15
2,4,20,21
2,30
OR anything else is NOT good.
Is there some algorithm to speed this thing up (if I have thousands of arrays with hundreds of elements in each one)?
To make it clear - solution must contain lowest number of different elements and the groups (of the same numbers) must be grouped from first - larger ones to the last - smallest ones. So in upper example 4,4,4,2 is better than 4,2,2,2 because in first example group of 4's is larger than group of 2's.
EDIT: To be more specific. Solution must contain the smallest number of different elements and those elements must be grouped from first to last. So if I have three arrays like
1,2,3
1,4,5
4,5,6
Solution is 1,1,4 or 1,1,5 or 1,1,6 NOT 2,5,5 because 1's have larger group (two of them) than 2's (only one).
Thanks.
EDIT3: I can't be more specific :(
EDIT4: @spintheblack 1,1,1,2,4 is the correct solution because number used first time (let's say at position 1) can't be used later (except it's in the SAME group of 1's). I would say that grouping has the "priority"? Also, I didn't mention it (sorry about that) but the numbers in arrays are NOT sorted in any way, I typed it that way in this post because it was easier for me to follow.
Here is the approach you want to take, if arrays is an array that contains each individual array.
1. Starting at i = 0
2. current = arrays[i]
3. Loop i from i+1 to len(arrays)-1
4. new = current & arrays[i] (set intersection, finds common elements)
5. If there are any elements in new, do step 6, otherwise skip to 7
6. current = new, return to step 3 (continue loop)
7. print or yield an element from current, current = arrays[i], return to step 3 (continue loop)
Here is a Python implementation:
def mce(arrays):
    count = 1
    current = set(arrays[0])
    for i in range(1, len(arrays)):
        new = current & set(arrays[i])
        if new:
            count += 1
            current = new
        else:
            # the run ended: print one surviving element, repeated
            print(" ".join([str(current.pop())] * count), end=" ")
            count = 1
            current = set(arrays[i])
    print(" ".join([str(current.pop())] * count))
>>> mce([[1, 4, 8, 10], [1, 2, 3, 4, 11, 15], [2, 4, 20, 21], [2, 30]])
4 4 4 2
If all are number lists, and are all sorted, then,
1. Convert to array of bitmaps.
2. Keep 'AND'ing the bitmaps till you hit zero. The position of the 1 in the previous value indicates the first element.
3. Restart step 2 from the next element
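The bitmap steps above could be sketched in Ruby as follows. All helper names (to_bitmap, lowest_value, greedy_runs) are hypothetical, and the sketch assumes every value is a non-negative integer so that value v can be stored as bit v of an Integer.

```ruby
# Hypothetical sketch of the bitmap idea. Assumes all values are
# non-negative integers, so value v can be stored as bit v of an Integer.
def to_bitmap(values)
  values.reduce(0) { |bits, v| bits | (1 << v) }
end

# Position of the lowest set bit, i.e. the smallest value in the bitmap.
def lowest_value(bits)
  (bits & -bits).bit_length - 1
end

def greedy_runs(arrays)
  result = []
  current = to_bitmap(arrays.first)
  count = 1
  arrays.drop(1).each do |values|
    common = current & to_bitmap(values)
    if common.zero?
      # The run ended: emit one surviving value, then restart here.
      result.concat([lowest_value(current)] * count)
      current = to_bitmap(values)
      count = 1
    else
      current = common
      count += 1
    end
  end
  # Emit the final run and return the accumulated picks.
  result.concat([lowest_value(current)] * count)
end

p greedy_runs([[1, 4, 8, 10], [1, 2, 3, 4, 11, 15], [2, 4, 20, 21], [2, 30]])
```

Ruby integers grow as needed, so large values only cost memory; for very sparse data a set intersection is likely the better fit.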
This has now turned into a graph problem with a twist.
The problem is a directed acyclic graph of connections between stops, and the goal is to minimize the number of line switches when riding on a train/tram.
ie. this list of sets:
1,4,8,10 <-- stop A
1,2,3,4,11,15 <-- stop B
2,4,20,21 <-- stop C
2,30 <-- stop D, destination
He needs to pick lines that are available at his exit stop, and his arrival stop, so for instance, he can't pick 10 from stop A, because 10 does not go to stop B.
So, this is the set of available lines and the stops they stop on:
             A     B     C     D
line 1  -----X-----X-----------------
line 2  -----------X-----X-----X-----
line 3  -----------X-----------------
line 4  -----X-----X-----X-----------
line 8  -----X-----------------------
line 10 -----X-----------------------
line 11 -----------X-----------------
line 15 -----------X-----------------
line 20 -----------------X-----------
line 21 -----------------X-----------
line 30 -----------------------X-----
If we consider that a line under consideration must go between at least 2 consecutive stops, let me highlight the possible choices of lines with equal signs:
             A     B     C     D
line 1  -----X=====X-----------------
line 2  -----------X=====X=====X-----
line 3  -----------X-----------------
line 4  -----X=====X=====X-----------
line 8  -----X-----------------------
line 10 -----X-----------------------
line 11 -----------X-----------------
line 15 -----------X-----------------
line 20 -----------------X-----------
line 21 -----------------X-----------
line 30 -----------------------X-----
He then needs to pick a way that transports him from A to D, with the minimal number of line switches.
Since he explained that he wants the longest rides first, the following sequence seems the best solution:
take line 4 from stop A to stop C, then switch to line 2 from C to D
Code example:
stops = [
    [1, 4, 8, 10],
    [1, 2, 3, 4, 11, 15],
    [2, 4, 20, 21],
    [2, 30],
]
def calculate_possible_exit_lines(stops):
    """
    only return lines that are available at both exit
    and arrival stops, discard the rest.
    """
    result = []
    for index in range(0, len(stops) - 1):
        lines = []
        for value in stops[index]:
            if value in stops[index + 1]:
                lines.append(value)
        result.append(lines)
    return result
def all_combinations(lines):
    """
    produce all combinations which travel from one end
    of the journey to the other, across available lines.
    """
    if not lines:
        yield []
    else:
        for line in lines[0]:
            for rest_combination in all_combinations(lines[1:]):
                yield [line] + rest_combination
def reduce(combination):
    """
    reduce a combination by returning the number of
    times each value appear consecutively, ie.
    [1,1,4,4,3] would return [2,2,1] since
    the 1's appear twice, the 4's appear twice, and
    the 3 only appear once.
    """
    result = []
    while combination:
        count = 1
        value = combination[0]
        combination = combination[1:]
        while combination and combination[0] == value:
            combination = combination[1:]
            count += 1
        result.append(count)
    return tuple(result)
def calculate_best_choice(lines):
    """
    find the best choice by reducing each available
    combination down to the number of stops you can
    sit on a single line before having to switch,
    and then picking the one that has the most stops
    first, and then so on.
    """
    available = []
    for combination in all_combinations(lines):
        count_stops = reduce(combination)
        available.append((count_stops, combination))
    available = [k for k in reversed(sorted(available))]
    return available[0][1]
possible_lines = calculate_possible_exit_lines(stops)
print("possible lines: %s" % (str(possible_lines), ))
best_choice = calculate_best_choice(possible_lines)
print("best choice: %s" % (str(best_choice), ))
This code prints:
possible lines: [[1, 4], [2, 4], [2]]
best choice: [4, 4, 2]
Since, as I said, I list lines between stops, the above solution can be read either as the lines you leave each stop on or as the lines you arrive on at the next stop.
So the route is:
Hop onto line 4 at stop A and ride on that to stop B, then to stop C
Hop onto line 2 at stop C and ride on that to stop D
There are probably edge-cases here that the above code doesn't work for.
However, I'm not bothering more with this question. The OP has demonstrated a complete incapability in communicating his question in a clear and concise manner, and I fear that any corrections to the above text and/or code to accommodate the latest comments will only provoke more comments, which leads to yet another version of the question, and so on ad infinitum. The OP has gone to extraordinary lengths to avoid answering direct questions or to explain the problem.
I am assuming that "distinct elements" do not have to actually be distinct, they can repeat in the final solution. That is if presented with [1], [2], [1] that the obvious answer [1, 2, 1] is allowed. But we'd count this as having 3 distinct elements.
If so, then here is a Python solution:
def find_best_run(first_array, *argv):
    # initialize data structures.
    this_array_best_run = {}
    for x in first_array:
        this_array_best_run[x] = (1, (1,), (x,))
    for this_array in argv:
        # find the best runs ending at each value in this_array
        last_array_best_run = this_array_best_run
        this_array_best_run = {}
        for x in this_array:
            for (y, pattern) in last_array_best_run.items():
                (distinct_count, lengths, elements) = pattern
                if x == y:
                    lengths = tuple(lengths[:-1] + (lengths[-1] + 1,))
                else:
                    distinct_count += 1
                    lengths = tuple(lengths + (1,))
                    elements = tuple(elements + (x,))
                if x not in this_array_best_run:
                    this_array_best_run[x] = (distinct_count, lengths, elements)
                else:
                    (prev_count, prev_lengths, prev_elements) = this_array_best_run[x]
                    if distinct_count < prev_count or prev_lengths < lengths:
                        this_array_best_run[x] = (distinct_count, lengths, elements)
    # find the best overall run
    best_count = len(argv) + 10  # Needs to be bigger than any possible answer.
    for (distinct_count, lengths, elements) in this_array_best_run.values():
        if distinct_count < best_count:
            best_count = distinct_count
            best_lengths = lengths
            best_elements = elements
        elif distinct_count == best_count and best_lengths < lengths:
            best_count = distinct_count
            best_lengths = lengths
            best_elements = elements
    # convert it into a more normal representation.
    # (uses best_elements, not the leftover loop variable elements)
    answer = []
    for (length, element) in zip(best_lengths, best_elements):
        answer.extend([element] * length)
    return answer
# example
print(find_best_run(
    [1, 4, 8, 10],
    [1, 2, 3, 4, 11, 15],
    [2, 4, 20, 21],
    [2, 30]))  # e.g. [4, 4, 4, 2]; [4, 4, 4, 30] is an equally good answer
Here is an explanation. The ...this_run dictionaries have keys which are elements in the current array, and they have values which are tuples (distinct_count, lengths, elements). We are trying to minimize distinct_count, then maximize lengths (lengths is a tuple, so this will prefer the element with the largest value in the first spot) and are tracking elements for the end. At each step I construct all possible runs which are a combination of a run up to the previous array with this element next in sequence, and find which ones are best to the current. When I get to the end I pick the best possible overall run, then turn it into a conventional representation and return it.
If you have N arrays of length M, this should take O(N*M*M) time to run.
I'm going to take a crack here based on the comments, please feel free to comment further to clarify.
We have N arrays and we are trying to find the 'most common' value over all arrays when one value is picked from each array. There are several constraints: 1) we want the smallest number of distinct values; 2) the most common is the maximal grouping of similar letters (changing from above for clarity). Thus, 4 t's and 1 p beats 3 x's and 2 y's.
I don't think either problem can be solved greedily. Here's a counterexample: [[1,4],[1,2],[1,2],[2],[3,4]]. A greedy algorithm would pick [1,1,1,2,4] (three distinct numbers), whereas [4,2,2,2,4] uses only two distinct numbers.
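The counterexample can be checked mechanically by comparing a greedy pick with an exhaustive search. This is only a sketch: greedy_pick and fewest_distinct_pick are hypothetical helpers, the greedy variant shown is the run-extending one (intersect until empty), and the brute force is exponential, so it is only meant for tiny inputs like this one.

```ruby
# Hypothetical helpers for checking the counterexample by brute force.
# Greedy: extend the current run while the intersection is non-empty.
def greedy_pick(arrays)
  result = []
  current = arrays.first
  count = 1
  arrays.drop(1).each do |values|
    common = current & values
    if common.empty?
      result.concat([current.first] * count)
      current = values
      count = 1
    else
      current = common
      count += 1
    end
  end
  result.concat([current.first] * count)
end

# Brute force: try every way of picking one value per array and keep a
# pick with the fewest distinct values. Exponential; tiny inputs only.
def fewest_distinct_pick(arrays)
  first, *rest = arrays
  first.product(*rest).min_by { |pick| pick.uniq.length }
end

arrays = [[1, 4], [1, 2], [1, 2], [2], [3, 4]]
p greedy_pick(arrays).uniq.length          # distinct values used by greedy
p fewest_distinct_pick(arrays).uniq.length # best achievable by brute force
```

The gap between the two numbers is exactly the point of the counterexample: the longest-run-first greedy does not minimise the number of distinct values.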
This looks like a bipartite matching problem, but I'm still coming up with the formulation..
EDIT : ignore; This is a different problem, but if anyone can figure it out, I'd be really interested
EDIT 2 : For anyone that's interested, the problem that I misinterpreted can be formulated as an instance of the Hitting Set problem, see http://en.wikipedia.org/wiki/Vertex_cover#Hitting_set_and_set_cover. Basically the left hand side of the bipartite graph would be the arrays and the right hand side would be the numbers, edges would be drawn between arrays that contain each number. Unfortunately, this is NP complete, but the greedy solutions described above are essentially the best approximation.
