My aim is to display the number of identical elements in an array.
Here is my code:
a = [5, 2, 4, 1, 2]
b = []
for i in a
  unless b.include?(a[i])
    b << a[i]
    print i," appears ",a.count(i)," times\n"
  end
end
I get this output:
5 appears 1 times
2 appears 2 times
4 appears 1 times
The output misses 1.
Here's a different way to do it, assuming I understand what "it" is (counting elements in an array):
a = [5,2,4,1,2]
counts = a.each_with_object(Hash.new(0)) do |element, counter|
  counter[element] += 1
end
# => {5=>1, 2=>2, 4=>1, 1=>1}
# i.e. one 5, two 2s, one 4, one 1.
counts.each do |element, count|
  puts "#{element} appears #{count} times"
end
# => 5 appears 1 times
# => 2 appears 2 times
# => 4 appears 1 times
# => 1 appears 1 times
Hash.new(0) initialises a hash with a default value 0. We iterate on a (while passing the hash as an additional object), so element will be each element of a in order, and counter will be our hash. We will increment the value of the hash indexed by the element by one; on the first go for each element, there won't be anything there, but our default value saves our bacon (and 0 + 1 is 1). The next time we encounter an element, it will increment whatever value already is present in the hash under that index.
Having obtained a hash of elements and their counts, we can print them. puts is the same as print but automatically appends a newline, and rather than using commas to print several things, it is much nicer to put the values directly into the printed string using the string interpolation syntax ("...#{...}...").
The problems in your code are as follows:
[logic] for i in a will give you elements of a, not indices. Thus, a[i] will give you nil for the first element, not 5, since a[5] is outside the list. This is why 1 is missing from your output: a[1] (i.e. 2) is already in b when you try to process it.
[style] for ... in ... is almost never seen in Ruby code, which strongly prefers each and the other methods of the Enumerable module.
[performance] a.count(i) inside a loop increases your algorithmic complexity: count itself has to scan the whole array, and you also iterate over the array to get each i, so the running time grows quadratically with the size of the array and becomes noticeably slower on huge arrays. The method above only has one loop, and since access to hashes is very fast, it grows more or less linearly with the size of the array.
The stylistic and performance problems are minor, of course; you won't see performance drop till you need to process really large arrays, and style errors won't make your code not work; however, if you're learning Ruby, you should aim to work with the language from the start, to get used to its idioms as you go along, as it will give you much stronger foundation than transplanting other languages' idioms onto it.
a = [5,2,4,1,2]
b = a.uniq
for i in b
  print i," appears ",a.count(i)," times\n"
end
print b
Result:
5 appears 1 times
2 appears 2 times
4 appears 1 times
1 appears 1 times
[5, 2, 4, 1]
A friend of mine asked me this question a long time ago. He asked me to find the missing number in an array without iterating over the array. One idea I had was to compute the sum of the first N numbers and then subtract the sum of the array's numbers from it; another was an XOR calculation.
But these solutions still need to iterate over the array.
I wonder whether there is a solution or algorithm that does this without iterating over the array.
Also, if you are going to flag this question as a duplicate, please give me the link.
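For reference, the two ideas mentioned in the question can be sketched as follows. This is a minimal sketch that assumes (the question does not say so explicitly) that the array holds the numbers 1..N with exactly one of them missing; the function names are made up for illustration. Note that both versions still read every element once.

```python
from functools import reduce
from operator import xor


def missing_by_sum(values, n):
    """Sum of 1..n minus the sum of the array leaves the missing number."""
    return n * (n + 1) // 2 - sum(values)


def missing_by_xor(values, n):
    """XOR of 1..n with XOR of the array cancels every number that is present."""
    return reduce(xor, range(1, n + 1), 0) ^ reduce(xor, values, 0)


values = [1, 2, 3, 5, 6, 7]  # assumed input: 1..7 with one number missing
print(missing_by_sum(values, 7))
print(missing_by_xor(values, 7))
```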
What's the missing number in this array ?
1, 2, 3, 4, 5, 6, *
(The * stands for a number you are not allowed to read, otherwise that would be iterating.)
If there is a missing number in an array, you have to inspect the array, meaning iterate over it. There is no way to do this without iterating.
In the general case, you can't do this. Imagine that you're given Yves Daoust's sample:
[1, 2, 3, 4, 5, 6, ?]
and you're allowed to read any items of the array, but the last one. What is it? Do I hear seven? No, that's a typical wrong solution:
item = i + (i-1)*(i-2)*(i-3)*(i-4)*(i-5)*(i-6)*F(i)
where F(i) is an arbitrary function (well, not arbitrary, there're some loose restrictions, however - F(i) can't be, say 1/(i-3)). Let
F(i) == 0 -> last item == 7
F(i) == 1 -> last item == 727
F(i) == (pi-i)/720 -> last item == pi
...
You need more restrictions, e.g. that the array represents the values of a polynomial of the least possible degree; in that case the solution is 7.
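The formula above can be checked numerically. The following is a small sketch (the function name item and the dictionary of candidate F functions are only illustrative) showing that all three choices of F reproduce 1..6 for the first six items and disagree only on the seventh:

```python
import math


def item(i, F):
    """i-th item of the sequence for a given choice of F."""
    return i + (i - 1) * (i - 2) * (i - 3) * (i - 4) * (i - 5) * (i - 6) * F(i)


candidates = {
    "F(i) == 0": lambda i: 0,
    "F(i) == 1": lambda i: 1,
    "F(i) == (pi-i)/720": lambda i: (math.pi - i) / 720,
}

for name, F in candidates.items():
    first_six = [item(i, F) for i in range(1, 7)]  # the product vanishes for i = 1..6
    print(name, first_six, "-> last item:", item(7, F))
```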
I need to design an algorithm that finds the k'th smallest element in an unsorted array using a function called "MED3":
This function finds the n/3 (floor) and 2n/3 (ceil) elements of the array as if it were sorted (very similar to a median, but instead of the n/2 element it returns those two values).
I thought about some kind of partition around those 2 values and then continuing like QuickSelect, but the problem is that "MED3" doesn't return the indices of the 2 values, only the values.
For example, if the array is: 1, 2, 10, 1, 7, 6, 3, 4, 4 it returns 2 (n/3 value) and 4 (2n/3 value).
I also thought about running over the array and copying all the values between 2 and 4 (for example, in the array given above) to a new array and then using "MED3" again, but there can be duplicates (if the array is 2, 2, 2, 2, ..., 2 I would take all the elements each time).
Any ideas? I must use "MED3".
* MED3 is like a black box; it runs in linear time.
Thank you.
I think you're on the right track, but instead of taking 2 to 4, I'd suggest removing the first n/3 values that are <= MED3.floor() and the first n/3 values that are >= MED3.ceil(). That avoids issues with too many duplicates. If two passes per cycle aren't too expensive, you can remove all values < MED3.floor() plus enough values = MED3.floor() to bring the total to at most n/3 (and do the same for ceil()), then repeat until you reach the k'th smallest target.
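One way to make the duplicate handling concrete is sketched below. It is a variation on the idea above rather than a literal implementation of it: instead of removing elements, it counts how many values fall below, between, and above the two values returned by MED3 and continues in whichever group contains rank k. The med3 function here is only a sorting-based stand-in for the black box (so it is not linear time), and the rank convention (1-based, floor(n/3) and ceil(2n/3)) is an assumption taken from the example in the question.

```python
def med3(values):
    """Stand-in for the black-box MED3; uses sorting, so NOT linear time."""
    ordered = sorted(values)
    n = len(ordered)
    lo_rank = max(1, n // 3)            # floor(n/3), 1-based
    hi_rank = max(1, (2 * n + 2) // 3)  # ceil(2n/3), 1-based
    return ordered[lo_rank - 1], ordered[hi_rank - 1]


def kth_smallest(values, k):
    """Return the k-th smallest element (k is 1-based)."""
    values = list(values)
    while len(values) > 3:
        lo, hi = med3(values)
        below = [x for x in values if x < lo]
        between = [x for x in values if lo < x < hi]
        above = [x for x in values if x > hi]
        n_lo = values.count(lo)
        n_hi = values.count(hi) if hi != lo else 0
        # Walk through the five groups in sorted order until rank k is reached.
        if k <= len(below):
            values = below
            continue
        k -= len(below)
        if k <= n_lo:
            return lo  # duplicates of lo are counted, never copied
        k -= n_lo
        if k <= len(between):
            values = between
            continue
        k -= len(between)
        if k <= n_hi:
            return hi
        k -= n_hi
        values = above
    return sorted(values)[k - 1]


data = [1, 2, 10, 1, 7, 6, 3, 4, 4]
print(med3(data))
print([kth_smallest(data, k) for k in range(1, len(data) + 1)])
```

Because elements equal to the two pivot values are only counted, an array such as 2, 2, 2, ..., 2 finishes in a single pass, and each group that the loop continues with is strictly smaller than the current array.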
I am trying to solve a complex problem on HackerRank.com that involves creating a solution that accepts both small and large arrays of data ranging from 10 integers to 99,000 integers in length.
Find the problem here -> https://www.hackerrank.com/challenges/array-and-simple-queries
The Problem
To put it simply, I have to take an array, copy a range of numbers that the user specifies from that array, and then append it to a new array.
i = 2
j = 4
a = [1, 2, 3, 4, 5, 6, 7, 8]
b = []
for numbers in range(i, j + 1):
    b.append(a[numbers - 1])
The range of numbers is appended to the b[] array. This should be 2, 3, 4 in the above example. Now I want to remove() the 2, 3, 4 from the a[] array. This is where I run into problems.
for numbers in range(i, j + 1):
    a.remove(a[i-1])
This should remove numbers 2, 3, 4 and leave the a[] array as 1, 5, 6, 7, 8. This works in most cases as specified.
However, in larger arrays, such as ones 500 in length, I see that a.remove() seemingly randomly removes numbers that are not in the range i to j.
Example
i = 239
j = 422
It removes a[47] and places it in another position, as well as removing i through j. I have NO IDEA why a[47] is being removed by the code above. Is remove() buggy?
What I Need Help On
I'm not trying to have the problem solved for me. I'm trying to understand why remove() is not working correctly. Logic says that it should not be removing anything outside i through j, yet it is. Any help is greatly appreciated.
The .remove method on lists doesn't remove elements by their index, but by their value: it deletes the first element equal to its argument. If you want to delete part of the list, use the del statement (e.g. del a[5] to delete the sixth element, and del a[1:4] to delete the second, third, and fourth elements).
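A small illustration of the difference, using a made-up list with a duplicate value (the variable names by_value and by_index are only for this sketch):

```python
original = [3, 1, 2, 1, 5]

# remove() searches for the VALUE, so it deletes the first 1 (at index 1),
# not the element at index 3 that was passed in.
by_value = list(original)
by_value.remove(by_value[3])
print(by_value)   # [3, 2, 1, 5]

# del deletes by POSITION, so the element at index 3 is the one that goes.
by_index = list(original)
del by_index[3]
print(by_index)   # [3, 1, 2, 5]

# The range from the question (1-based i..j) expressed with slices.
i, j = 2, 4
a = [1, 2, 3, 4, 5, 6, 7, 8]
b = a[i - 1:j]    # copy positions i..j
del a[i - 1:j]    # delete the same positions
print(b, a)       # [2, 3, 4] [1, 5, 6, 7, 8]
```

With distinct values both approaches give the same result, which is presumably why the problem only shows up once the array contains repeated numbers.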
(As for solving this problem efficiently: if you look at the operations in reverse order, I think you don't have to actually manipulate an array.)
I have for example 5 arrays with some inserted elements (numbers):
1,4,8,10
1,2,3,4,11,15
2,4,20,21
2,30
I need to find the most common elements in those arrays, and every element should go all the way to the end (see example below). In this example that would be the combination 4, 4, 4, 2 (or the same one but with "30" at the end, which counts as the "same") because it contains the smallest number of different elements (only two, 4 and 2/30).
This combination (see below) isn't good because if I have, for example, "4", it must "go" until it ends (the next array mustn't contain "4" at all). So the combination must go all the way to the end.
1,4,8,10
1,2,3,4,11,15
2,4,20,21
2,30
EDIT2: OR
1,4,8,10
1,2,3,4,11,15
2,4,20,21
2,30
OR anything else is NOT good.
Is there some algorithm to speed this thing up (if I have thousands of arrays with hundreds of elements in each one)?
To make it clear: the solution must contain the lowest number of different elements, and the groups (of the same numbers) must be ordered from the largest first to the smallest last. So in the example above 4,4,4,2 is better than 4,2,2,2 because in the first one the group of 4's is larger than the group of 2's.
EDIT: To be more specific. The solution must contain the smallest number of different elements, and those elements must be grouped from first to last. So if I have three arrays like
1,2,3
1,4,5
4,5,6
Solution is 1,1,4 or 1,1,5 or 1,1,6 NOT 2,5,5 because 1's have larger group (two of them) than 2's (only one).
Thanks.
EDIT3: I can't be more specific :(
EDIT4: @spintheblack 1,1,1,2,4 is the correct solution because a number used for the first time (let's say at position 1) can't be used later (unless it's in the SAME group of 1's). I would say that grouping has the "priority"? Also, I didn't mention it (sorry about that), but the numbers in the arrays are NOT sorted in any way; I typed them that way in this post because it was easier for me to follow.
Here is the approach you want to take, if arrays is an array that contains each individual array.
1. Starting at i = 0
2. current = arrays[i]
3. Loop i from i+1 to len(arrays)-1
4. new = current & arrays[i] (set intersection, finds common elements)
5. If there are any elements in new, do step 6, otherwise skip to 7
6. current = new, return to step 3 (continue loop)
7. print or yield an element from current, current = arrays[i], return to step 3 (continue loop)
Here is a Python implementation:
def mce(arrays):
    count = 1
    current = set(arrays[0])
    for i in range(1, len(arrays)):
        new = current & set(arrays[i])
        if new:
            # still at least one common element: extend the current run
            count += 1
            current = new
        else:
            # no common element: emit the finished run and start a new one
            print(" ".join([str(current.pop())] * count), end=" ")
            count = 1
            current = set(arrays[i])
    print(" ".join([str(current.pop())] * count))
>>> mce([[1, 4, 8, 10], [1, 2, 3, 4, 11, 15], [2, 4, 20, 21], [2, 30]])
4 4 4 2
If all are number lists, and are all sorted, then,
1. Convert to array of bitmaps.
2. Keep 'AND'ing the bitmaps till you hit zero. The position of the 1 in the previous value indicates the first element.
3. Restart step 2 from the next element
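The steps above could be sketched like this in Python, using arbitrary-precision integers as bitmaps. This assumes non-negative integers and non-empty arrays; the function names are made up for the sketch, and taking the lowest set bit when several bits survive is my own choice, since the steps do not say which one to pick.

```python
def to_bitmap(numbers):
    """Set bit n for every number n in the list."""
    bitmap = 0
    for n in numbers:
        bitmap |= 1 << n
    return bitmap


def lowest_set_bit(bitmap):
    """Position of the lowest 1 bit (bitmap must be non-zero)."""
    return (bitmap & -bitmap).bit_length() - 1


def runs_by_bitmap(arrays):
    bitmaps = [to_bitmap(a) for a in arrays]
    result = []
    current, count = bitmaps[0], 1
    for bitmap in bitmaps[1:]:
        common = current & bitmap
        if common:
            # still non-zero: the run continues
            current, count = common, count + 1
        else:
            # hit zero: the previous value tells us the element of the run
            result.extend([lowest_set_bit(current)] * count)
            current, count = bitmap, 1
    result.extend([lowest_set_bit(current)] * count)
    return result


print(runs_by_bitmap([[1, 4, 8, 10], [1, 2, 3, 4, 11, 15], [2, 4, 20, 21], [2, 30]]))
```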
This has now turned into a graphing problem with a twist.
The problem is a directed acyclic graph of connections between stops, and the goal is to minimize the number of line switches when riding on a train/tram.
i.e. this list of sets:
1,4,8,10 <-- stop A
1,2,3,4,11,15 <-- stop B
2,4,20,21 <-- stop C
2,30 <-- stop D, destination
He needs to pick lines that are available at his exit stop, and his arrival stop, so for instance, he can't pick 10 from stop A, because 10 does not go to stop B.
So, this is the set of available lines and the stops they stop on:
             A     B     C     D
line 1  -----X-----X-----------------
line 2  -----------X-----X-----X-----
line 3  -----------X-----------------
line 4  -----X-----X-----X-----------
line 8  -----X-----------------------
line 10 -----X-----------------------
line 11 -----------X-----------------
line 15 -----------X-----------------
line 20 -----------------X-----------
line 21 -----------------X-----------
line 30 -----------------------X-----
If we consider that a line under consideration must go between at least 2 consecutive stops, let me highlight the possible choices of lines with equal signs:
             A     B     C     D
line 1  -----X=====X-----------------
line 2  -----------X=====X=====X-----
line 3  -----------X-----------------
line 4  -----X=====X=====X-----------
line 8  -----X-----------------------
line 10 -----X-----------------------
line 11 -----------X-----------------
line 15 -----------X-----------------
line 20 -----------------X-----------
line 21 -----------------X-----------
line 30 -----------------------X-----
He then needs to pick a way that transports him from A to D, with the minimal number of line switches.
Since he explained that he wants the longest rides first, the following sequence seems the best solution:
take line 4 from stop A to stop C, then switch to line 2 from C to D
Code example:
stops = [
    [1, 4, 8, 10],
    [1,2,3,4,11,15],
    [2,4,20,21],
    [2,30],
]
def calculate_possible_exit_lines(stops):
    """
    only return lines that are available at both exit
    and arrival stops, discard the rest.
    """
    result = []
    for index in range(0, len(stops) - 1):
        lines = []
        for value in stops[index]:
            if value in stops[index + 1]:
                lines.append(value)
        result.append(lines)
    return result
def all_combinations(lines):
    """
    produce all combinations which travel from one end
    of the journey to the other, across available lines.
    """
    if not lines:
        yield []
    else:
        for line in lines[0]:
            for rest_combination in all_combinations(lines[1:]):
                yield [line] + rest_combination
def reduce(combination):
    """
    reduce a combination by returning the number of
    times each value appear consecutively, ie.
    [1,1,4,4,3] would return [2,2,1] since
    the 1's appear twice, the 4's appear twice, and
    the 3 only appear once.
    """
    result = []
    while combination:
        count = 1
        value = combination[0]
        combination = combination[1:]
        while combination and combination[0] == value:
            combination = combination[1:]
            count += 1
        result.append(count)
    return tuple(result)
def calculate_best_choice(lines):
    """
    find the best choice by reducing each available
    combination down to the number of stops you can
    sit on a single line before having to switch,
    and then picking the one that has the most stops
    first, and then so on.
    """
    available = []
    for combination in all_combinations(lines):
        count_stops = reduce(combination)
        available.append((count_stops, combination))
    available = [k for k in reversed(sorted(available))]
    return available[0][1]
possible_lines = calculate_possible_exit_lines(stops)
print("possible lines: %s" % (str(possible_lines), ))
best_choice = calculate_best_choice(possible_lines)
print("best choice: %s" % (str(best_choice), ))
This code prints:
possible lines: [[1, 4], [2, 4], [2]]
best choice: [4, 4, 2]
Since, as I said, I list lines between stops, the above solution can be read either as the lines you leave each stop on or as the lines you arrive at the next stop on.
So the route is:
Hop onto line 4 at stop A and ride on that to stop B, then to stop C
Hop onto line 2 at stop C and ride on that to stop D
There are probably edge-cases here that the above code doesn't work for.
However, I'm not bothering more with this question. The OP has demonstrated a complete incapability in communicating his question in a clear and concise manner, and I fear that any corrections to the above text and/or code to accommodate the latest comments will only provoke more comments, which leads to yet another version of the question, and so on ad infinitum. The OP has gone to extraordinary lengths to avoid answering direct questions or to explain the problem.
I am assuming that "distinct elements" do not have to actually be distinct, they can repeat in the final solution. That is if presented with [1], [2], [1] that the obvious answer [1, 2, 1] is allowed. But we'd count this as having 3 distinct elements.
If so, then here is a Python solution:
def find_best_run(first_array, *argv):
    # initialize data structures.
    this_array_best_run = {}
    for x in first_array:
        this_array_best_run[x] = (1, (1,), (x,))
    for this_array in argv:
        # find the best runs ending at each value in this_array
        last_array_best_run = this_array_best_run
        this_array_best_run = {}
        for x in this_array:
            for (y, pattern) in last_array_best_run.items():
                (distinct_count, lengths, elements) = pattern
                if x == y:
                    lengths = tuple(lengths[:-1] + (lengths[-1] + 1,))
                else:
                    distinct_count += 1
                    lengths = tuple(lengths + (1,))
                    elements = tuple(elements + (x,))
                if x not in this_array_best_run:
                    this_array_best_run[x] = (distinct_count, lengths, elements)
                else:
                    (prev_count, prev_lengths, prev_elements) = this_array_best_run[x]
                    # fewer distinct values wins; on a tie, the larger lengths tuple wins
                    if distinct_count < prev_count or (
                            distinct_count == prev_count and prev_lengths < lengths):
                        this_array_best_run[x] = (distinct_count, lengths, elements)
    # find the best overall run
    best_count = len(argv) + 10 # Needs to be bigger than any possible answer.
    for (distinct_count, lengths, elements) in this_array_best_run.values():
        if distinct_count < best_count:
            best_count = distinct_count
            best_lengths = lengths
            best_elements = elements
        elif distinct_count == best_count and best_lengths < lengths:
            best_count = distinct_count
            best_lengths = lengths
            best_elements = elements
    # convert it into a more normal representation.
    answer = []
    # use best_elements here, not the leftover loop variable elements
    for (length, element) in zip(best_lengths, best_elements):
        answer.extend([element] * length)
    return answer
# example
print(find_best_run(
    [1,4,8,10],
    [1,2,3,4,11,15],
    [2,4,20,21],
    [2,30])) # prints [4, 4, 4, 2]
Here is an explanation. The ...this_run dictionaries have keys which are elements in the current array, and they have values which are tuples (distinct_count, lengths, elements). We are trying to minimize distinct_count, then maximize lengths (lengths is a tuple, so this will prefer the element with the largest value in the first spot) and are tracking elements for the end. At each step I construct all possible runs which are a combination of a run up to the previous array with this element next in sequence, and find which ones are best to the current. When I get to the end I pick the best possible overall run, then turn it into a conventional representation and return it.
If you have N arrays of length M, this should take O(N*M*M) time to run.
I'm going to take a crack here based on the comments, please feel free to comment further to clarify.
We have N arrays and we are trying to find the 'most common' value over all arrays when one value is picked from each array. There are several constraints: 1) We want the smallest number of distinct values. 2) The most common is the maximal grouping of similar letters (changing from above for clarity). Thus, 4 t's and 1 p beats 3 x's and 2 y's.
I don't think either problem can be solved greedily. Here's a counterexample: [[1,4],[1,2],[1,2],[2],[3,4]]. A greedy algorithm would pick [1,1,1,2,4] (three distinct numbers), whereas [4,2,2,2,4] uses only two distinct numbers.
This looks like a bipartite matching problem, but I'm still coming up with the formulation.
EDIT : ignore; This is a different problem, but if anyone can figure it out, I'd be really interested
EDIT 2 : For anyone that's interested, the problem that I misinterpreted can be formulated as an instance of the Hitting Set problem, see http://en.wikipedia.org/wiki/Vertex_cover#Hitting_set_and_set_cover. Basically, the left-hand side of the bipartite graph would be the arrays and the right-hand side would be the numbers, with an edge between each array and every number it contains. Unfortunately, this is NP-complete, but the greedy solutions described above are essentially the best approximation.
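As a rough illustration of the greedy approximation for the hitting-set reading of the problem, here is a minimal sketch. The function name and the tie-breaking rule (prefer the smaller number) are assumptions made for this example; it repeatedly picks the number that occurs in the most arrays not yet hit.

```python
def greedy_hitting_set(arrays):
    """Greedy approximation: pick the value hitting the most remaining arrays."""
    remaining = [set(a) for a in arrays]
    if any(not s for s in remaining):
        raise ValueError("an empty array can never be hit")
    chosen = []
    while remaining:
        counts = {}
        for s in remaining:
            for value in s:
                counts[value] = counts.get(value, 0) + 1
        # most frequent value first; ties broken by the smaller value
        best = min(counts, key=lambda value: (-counts[value], value))
        chosen.append(best)
        remaining = [s for s in remaining if best not in s]
    return chosen


print(greedy_hitting_set([[1, 4], [1, 2], [1, 2], [2], [3, 4]]))
```

On the counterexample from earlier in this answer the sketch picks three numbers (1, 2, 3) even though two (2 and 4) are enough, which shows that the greedy choice is only an approximation.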