How to sum columns with same date in Ruby - arrays

I have an array with this type of data inside it, and I need to sum up the rows that have the same date (column by column).
[["01-04-2013", 100.0, 110.0, 120, 0, 0, 0], ["02-04-2013", 100.0, 110.0, 130, 0, 0, 0], ["03-04-2013", 100.0, 110.0, 120, 0, 0, 0], ["10-04-2013", 100.0, 110.0, 100, 0, 0, 0], ["02-04-2013", 100.0, 140.0, 0, 70, 0, 0], ["10-04-2013", 100.0, 140.0, 0, 100, 0, 0], ["11-04-2013", 100.0, 140.0, 0, 110, 0, 0], ["12-04-2013", 100.0, 140.0, 0, 120, 0, 0], ["09-04-2013", 0.0, 0.0, 0, 0, 130, 0], ["17-04-2013", 0.0, 0.0, 0, 0, 30, 0], ["15-04-2013", 100.0, 130.0, 0, 0, 0, 17], ["17-04-2013", 100.0, 130.0, 0, 0, 0, 90], ["18-04-2013", 100.0, 130.0, 0, 0, 0, 100]]
How can I do it in Ruby? I mean combining the rows with the same date into one row by summing their values column by column, and keeping the rows whose dates are not duplicated as they are.

require 'pp'
require 'matrix'
d = [["01-04-2013", 100.0, 110.0, 120, 0, 0, 0], ["02-04-2013", 100.0, 110.0, 130, 0, 0, 0], ["03-04-2013", 100.0, 110.0, 120, 0, 0, 0], ["10-04-2013", 100.0, 110.0, 100, 0, 0, 0], ["02-04-2013", 100.0, 140.0, 0, 70, 0, 0], ["10-04-2013", 100.0, 140.0, 0, 100, 0, 0], ["11-04-2013", 100.0, 140.0, 0, 110, 0, 0], ["12-04-2013", 100.0, 140.0, 0, 120, 0, 0], ["09-04-2013", 0.0, 0.0, 0, 0, 130, 0], ["17-04-2013", 0.0, 0.0, 0, 0, 30, 0], ["15-04-2013", 100.0, 130.0, 0, 0, 0, 17], ["17-04-2013", 100.0, 130.0, 0, 0, 0, 90], ["18-04-2013", 100.0, 130.0, 0, 0, 0, 100]]
pp(
  d.group_by(&:first).values.reject do |v|
    v.size <= 1
  end.map do |e|
    e.inject do |m, e|
      (Vector.[](*m) + Vector.[](*e)).to_a
    end
  end
)
Update after comments:
d.group_by(&:first).values.map do |e|
  e.inject do |m, e|
    [e[0], (Vector.[](*m[1..-1]) + Vector.[](*e[1..-1])).to_a].flatten
  end
end.sort
Specification change alert:
def v(m)
  Vector.[](*m.drop(1))
end

d.group_by(&:first).values.map do |group|
  r = group.inject do |m, e|
    [e[0], *(v(m) + v(e)).to_a]
  end
  r[1] /= group.size
  r[2] /= group.size
  r
end.sort
Note: I'm not saying this is homework, but in the cases that are, it should be obvious that when we just do it for the students, we are not really doing them any favors, right? Plus, this solution is posted on a public site that is instantly indexed by Google and, being one of the top 100 sites in the world, is not exactly a secret to the prof or the grader. And what if the school is using a plagiarism-detection service like http://turnitin.com/ ? I suppose they could check public code snippets if they wanted to. And finally, there is some rather well-written code posted on SO by the, ahem, hobbyists. I'm not sure it could typically pass for lower-division intro-course original work, if I, ahem, say so myself. :-)

aggregated_rows = rows.group_by(&:first).map do |date, rows_by_date|
  values = rows_by_date.transpose.drop(1).map { |xs| xs.reduce(:+) }
  [date, values]
end
#=> [["01-04-2013", [100.0, 110.0, 120, 0, 0, 0]],
#    ["02-04-2013", [200.0, 250.0, 130, 70, 0, 0]],
#    ...
#    ["18-04-2013", [100.0, 130.0, 0, 0, 0, 100]]]

a = [["01-04-2013", 100.0, 110.0, 120, 0, 0, 0], ["02-04-2013", 100.0, 110.0, 130, 0, 0, 0], ["03-04-2013", 100.0, 110.0, 120, 0, 0, 0], ["10-04-2013", 100.0, 110.0, 100, 0, 0, 0], ["02-04-2013", 100.0, 140.0, 0, 70, 0, 0], ["10-04-2013", 100.0, 140.0, 0, 100, 0, 0], ["11-04-2013", 100.0, 140.0, 0, 110, 0, 0], ["12-04-2013", 100.0, 140.0, 0, 120, 0, 0], ["09-04-2013", 0.0, 0.0, 0, 0, 130, 0], ["17-04-2013", 0.0, 0.0, 0, 0, 30, 0], ["15-04-2013", 100.0, 130.0, 0, 0, 0, 17], ["17-04-2013", 100.0, 130.0, 0, 0, 0, 90], ["18-04-2013", 100.0, 130.0, 0, 0, 0, 100]]
require "pp"
def group_and_sum_rows_by_date_string(a)
  # a hash that returns an empty array for a key that doesn't exist
  h = Hash.new([])
  a.each do |row|
    # populate the hash with the date string as key and an array of
    # arrays of the values for that date string
    h[k = row.shift] = ([row] + h[k]).compact
  end
  # add up the corresponding values in each key's arrays and
  # return the result as an array of [date, sums] pairs
  h.map { |k, v| [k, v.transpose.map { |x| x.inject(:+) }] }
end
pp group_and_sum_rows_by_date_string(a)
[["15-04-2013", [100.0, 130.0, 0, 0, 0, 17]],
["03-04-2013", [100.0, 110.0, 120, 0, 0, 0]],
["02-04-2013", [200.0, 250.0, 130, 70, 0, 0]],
["17-04-2013", [100.0, 130.0, 0, 0, 30, 90]],
["18-04-2013", [100.0, 130.0, 0, 0, 0, 100]],
["09-04-2013", [0.0, 0.0, 0, 0, 130, 0]],
["01-04-2013", [100.0, 110.0, 120, 0, 0, 0]],
["12-04-2013", [100.0, 140.0, 0, 120, 0, 0]],
["10-04-2013", [200.0, 250.0, 100, 100, 0, 0]],
["11-04-2013", [100.0, 140.0, 0, 110, 0, 0]]]

require 'pp'
a = [["01-04-2013", 100.0, 110.0, 120, 0, 0, 0], ["02-04-2013", 100.0, 110.0, 130, 0, 0, 0], ["03-04-2013", 100.0, 110.0, 120, 0, 0, 0], ["10-04-2013", 100.0, 110.0, 100, 0, 0, 0], ["02-04-2013", 100.0, 140.0, 0, 70, 0, 0], ["10-04-2013", 100.0, 140.0, 0, 100, 0, 0], ["11-04-2013", 100.0, 140.0, 0, 110, 0, 0], ["12-04-2013", 100.0, 140.0, 0, 120, 0, 0], ["09-04-2013", 0.0, 0.0, 0, 0, 130, 0], ["17-04-2013", 0.0, 0.0, 0, 0, 30, 0], ["15-04-2013", 100.0, 130.0, 0, 0, 0, 17], ["17-04-2013", 100.0, 130.0, 0, 0, 0, 90], ["18-04-2013", 100.0, 130.0, 0, 0, 0, 100]]
h = {}
a.group_by(&:first).each{|k,v| v.flatten!.delete(k); h[k] = v.inject(:+)}
pp h
Output:
{"01-04-2013"=>330.0,
"02-04-2013"=>650.0,
"03-04-2013"=>330.0,
"10-04-2013"=>650.0,
"11-04-2013"=>350.0,
"12-04-2013"=>360.0,
"09-04-2013"=>130.0,
"17-04-2013"=>350.0,
"15-04-2013"=>247.0,
"18-04-2013"=>330.0}
pp a.group_by(&:first).map{|k,v| v.flatten!.uniq!}
Output:
[["01-04-2013", 100.0, 110.0, 120, 0],
["02-04-2013", 100.0, 110.0, 130, 0, 140.0, 70],
["03-04-2013", 100.0, 110.0, 120, 0],
["10-04-2013", 100.0, 110.0, 100, 0, 140.0],
["11-04-2013", 100.0, 140.0, 0, 110],
["12-04-2013", 100.0, 140.0, 0, 120],
["09-04-2013", 0.0, 0, 130],
["17-04-2013", 0.0, 0, 30, 100.0, 130.0, 90],
["15-04-2013", 100.0, 130.0, 0, 17],
["18-04-2013", 100.0, 130.0, 0, 100]]
pp a.group_by(&:first).map{|k,v| v.transpose.map!{|a| a.inject(:+)}}
Output:
[["01-04-2013", 100.0, 110.0, 120, 0, 0, 0],
["02-04-201302-04-2013", 200.0, 250.0, 130, 70, 0, 0],
["03-04-2013", 100.0, 110.0, 120, 0, 0, 0],
["10-04-201310-04-2013", 200.0, 250.0, 100, 100, 0, 0],
["11-04-2013", 100.0, 140.0, 0, 110, 0, 0],
["12-04-2013", 100.0, 140.0, 0, 120, 0, 0],
["09-04-2013", 0.0, 0.0, 0, 0, 130, 0],
["17-04-201317-04-2013", 100.0, 130.0, 0, 0, 30, 90],
["15-04-2013", 100.0, 130.0, 0, 0, 0, 17],
["18-04-2013", 100.0, 130.0, 0, 0, 0, 100]]

Related

Transformation of an array yields an empty placeholder

I have a numpy array called new_input_processed. The code below transforms it into a one-hot array of type float32 (cf. byte_list). But when I type byte_list to see the values of this array, I get an empty tensor. I would like to have a non-empty tensor instead. Is it possible?
In [30]: new_input_processed
Out[30]:
array([[ 83, 111, 109, 101, 32, 83, 101, 113, 117, 101, 110, 99, 101,
32, 111, 102, 32, 99, 104, 97, 114, 97, 99, 116, 101, 114,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0]], dtype=uint8)
In [31]: byte_list = tf.cast(tf.one_hot(new_input_processed, 256, 1, 0), dtype=tf.float32)
In [32]: byte_list
Out[32]: <tf.Tensor 'Cast_2:0' shape=(1, 100, 256) dtype=float32>
You are not getting an empty tensor. The Tensor object info is returned properly with:
<tf.Tensor 'Cast_2:0' shape=(1, 100, 256) dtype=float32>
Look at the shape; it is just as expected.
Nevertheless, if you want to see the content (i.e. the actual value of the byte_list Tensor object), one way is to call eval().
Something like this should do:
import numpy as np
import tensorflow as tf
new_input_processed = np.array([[ 83, 111, 109, 101, 32, 83, 101, 113, 117, 101, 110, 99, 101,
32, 111, 102, 32, 99, 104, 97, 114, 97, 99, 116, 101, 114,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
0, 0, 0, 0, 0, 0, 0, 0, 0]], dtype=np.uint8)
byte_list = tf.cast(tf.one_hot(new_input_processed, 256, 1, 0), dtype=tf.float32)
with tf.Session() as sess: print(byte_list.eval()) # here
Output:
[[[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]
[0. 0. 0. ... 0. 0. 0.]
...
[1. 0. 0. ... 0. 0. 0.]
[1. 0. 0. ... 0. 0. 0.]
[1. 0. 0. ... 0. 0. 0.]]]

Stack vertically 5 2D-arrays in diagonal to build a whole 2d-array

I have 5 adjacency matrices (numpy arrays): A, B, C, D, E, each of dimension [20, 20].
Given A, B, C, D, E, I would like to build F, which stacks the 5 adjacency matrices along the diagonal. Since we have five 2D arrays of [20, 20], F is of dimension [20*5, 20*5], as follows:
F=np.zeros((100,100))
F=[
[A,0,0,0,...,0],
[0,...,B,...,0],
[0,...,..,C,0],
[0,.........D,..,0],
[0,...........,E],
]
such that :
A is indexed at F[0][:20]
B is indexed at F[1][20:40]
C is indexed at F[2][40:60]
D is indexed at F[3][60:80]
E is indexed at F[4][80:100]
What is the efficient numpy way to do that for a large number of adjacency matrices? Say we have n adjacency matrices to stack on the diagonal of a new 2D array of [n*20, n*20].
You could use scipy.sparse.block_diag:
>>> AtoE = np.add.outer(np.arange(5, 10), np.zeros((3, 3), int))
>>> scipy.sparse.block_diag(AtoE).A
array([[5, 5, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[5, 5, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[5, 5, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 6, 6, 6, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 6, 6, 6, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 6, 6, 6, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 7, 7, 7, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 7, 7, 7, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 7, 7, 7, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 8, 8, 8, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 8, 8, 8, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 8, 8, 8, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9, 9, 9],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9, 9, 9],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9, 9, 9]], dtype=int64)
Sparse storage may be a good idea, anyway.
Alternatively, here is a more direct method in case you definitely want to use dense arrays:
>>> A = AtoE[0]
>>> N, N = A.shape
>>> k = len(AtoE)
>>> out = np.zeros((k, N, k, N), A.dtype)
>>> np.einsum('ijik->ijk', out)[...] = AtoE
>>> out.reshape(k*N, k*N)
array([[5, 5, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[5, 5, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[5, 5, 5, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 6, 6, 6, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 6, 6, 6, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 6, 6, 6, 0, 0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 7, 7, 7, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 7, 7, 7, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 7, 7, 7, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 8, 8, 8, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 8, 8, 8, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 8, 8, 8, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9, 9, 9],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9, 9, 9],
[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9, 9, 9]])

Insert values in 2d numpy array

I am stuck on this simple issue and I can't seem to figure it out. I have a diagonal array:
N = [1,2,3,4,5,6,7,8,9]
A = numpy.diag(N)
And I have a list of row and column indices such as this:
B = [[1,0],[2,1],[3,2]]
I want to insert a value of 1 into A at each location given by B; it helps to think of A as a 2-D matrix and B as the set of coordinates where I want to insert the value.
I tried to use numpy.put, but it doesn't seem to let me address a 2D array, and I don't know how to think about it in a for-loop sense.
The desired answer would look like this:
A = [[1,0,0,0,0,0,0,0,0],[1,2,0,0,0,0,0,0,0],[0,1,3,0,0,0,0,0,0],[0,0,0,4,0,0,0,0,0],...,[0,0,0,0,0,0,0,0,9]]
Any help is appreciated
Maybe a for loop:
for x in B:
    A[x[0], x[1]] = 1
A
Out[189]:
array([[1, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 2, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 3, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 4, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 5, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 6, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 7, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 8, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 9]])
You need to group the first and the second coordinates together:
I, J = zip(*B)
or
I, J = numpy.transpose(B)
Then you can index A directly
A[I, J] = 1
Make B a numpy array:
B = np.array(B)
Then just index using the first and second columns:
A[B[:, 0], B[:, 1]] = 1
array([[1, 0, 0, 0, 0, 0, 0, 0, 0],
[1, 2, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 3, 0, 0, 0, 0, 0, 0],
[0, 0, 1, 4, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 5, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 6, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 7, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 8, 0],
[0, 0, 0, 0, 0, 0, 0, 0, 9]])

How to find the number of elements of a two-dimensional array, read from a file?

I have a file. Inside the file I have stored a two-dimensional array, something like this:
[[0, 0, 1, 0, 1, 0, 1, 0, 1, 0], [0, 0, 0, 0, 0, 0, 1, 1, 0, 0], [0, 0, 0, 0, 1, 1, 1, 1, 0, 0], [0, 0, 0, 1, 0, 0, 1, 0, 0, 0], [0, 1, 1, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 1, 1, 1, 0], [0, 1, 1, 0, 1, 0, 1, 0, 1, 0], [0, 1, 0, 0, 0, 0, 0, 1, 0, 0], [0, 0, 0, 1, 0, 0, 0, 1, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]
Lengths of arrays can vary and they are not always 10 elements long.
I read the array from the file using this method:
map = IO.readlines("test.txt")
and when I print the result using:
map.each {|x| puts "#{x}"}
the output is what I expect it to be. But if I try to get the row length using:
puts map[0].length
I get 320 instead of 10 (which is what I expect).
Can someone explain me why am I getting 320 instead of 10 ?
IO#readlines returns the lines of the file as Strings, so map[0] is the entire first line as a String, and length is its character count (320), not the number of nested elements. Instead of IO#readlines you should use JSON#parse, since the file content is valid JSON:
require 'json'
JSON.parse(File.read("test.txt"))
#⇒ [[0, 0, 1, 0, 1, 0, 1, 0, 1, 0],
# [0, 0, 0, 0, 0, 0, 1, 1, 0, 0],
# [0, 0, 0, 0, 1, 1, 1, 1, 0, 0],
# [0, 0, 0, 1, 0, 0, 1, 0, 0, 0],
# [0, 1, 1, 0, 0, 0, 0, 0, 0, 0],
# [0, 0, 0, 0, 0, 0, 1, 1, 1, 0],
# [0, 1, 1, 0, 1, 0, 1, 0, 1, 0],
# [0, 1, 0, 0, 0, 0, 0, 1, 0, 0],
# [0, 0, 0, 1, 0, 0, 0, 1, 0, 0],
# [0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]
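After parsing, the lengths behave as expected. For example, a quick check (assuming the file contents shown above):

require 'json'

map = JSON.parse(File.read("test.txt"))
puts map.length     # number of rows
puts map[0].length  #=> 10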

NumPy Array: Minesweeper - substituting random items

I am at the beginning of an attempt to make a "minesweeper" game. I have an 8 x 8 array of 0's. I would like to substitute 8 random 0's within the array with the value 1 (to represent "mines"). I have no clue where to begin. Here is my code:
import numpy as np
import sys
import random
a = np.array([(0, 0, 0, 0, 0, 0, 0, 0),
(0, 0, 0, 0, 0, 0, 0, 0),
(0, 0, 0, 0, 0, 0, 0, 0),
(0, 0, 0, 0, 0, 0, 0, 0),
(0, 0, 0, 0, 0, 0, 0, 0),
(0, 0, 0, 0, 0, 0, 0, 0),
(0, 0, 0, 0, 0, 0, 0, 0),
(0, 0, 0, 0, 0, 0, 0, 0)])
for random.item in a:
    item.replace(1)
print(a)
row = int(input("row "))
column = int(input("column "))
print(a[row - 1, column - 1])
How do I replace 8 random 0's within the array with 1's?
Use np.random.choice with replacement disabled (replace=False) -
In [3]: a # input array of all zeros
Out[3]:
array([[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0]])
# Generate unique flattened indices and on a flattened view of
# input array assign those as 1s
In [8]: a.flat[np.random.choice(a.size,8,replace=False)] = 1
# Verify results
In [9]: a
Out[9]:
array([[0, 0, 0, 0, 0, 1, 0, 0],
[0, 0, 1, 0, 0, 0, 0, 0],
[0, 1, 1, 0, 1, 0, 0, 0],
[0, 0, 0, 1, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0],
[0, 1, 0, 0, 1, 0, 0, 0],
[0, 0, 0, 0, 0, 0, 0, 0]])
