I have a numpy array called new_input_processed. The code below transforms it into a one-hot array of type float32 (cf. byte_list). But when I type byte_list to see the values of this array, I get what looks like an empty tensor. I would like to see a non-empty tensor instead. Is that possible?
In [30]: new_input_processed
Out[30]:
array([[ 83, 111, 109, 101,  32,  83, 101, 113, 117, 101, 110,  99, 101,
         32, 111, 102,  32,  99, 104,  97, 114,  97,  99, 116, 101, 114,
          0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
          0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
          0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
          0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
          0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
          0,   0,   0,   0,   0,   0,   0,   0,   0]], dtype=uint8)
In [31]: byte_list = tf.cast(tf.one_hot(new_input_processed, 256, 1, 0), dtype=tf.float32)
In [32]: byte_list
Out[32]: <tf.Tensor 'Cast_2:0' shape=(1, 100, 256) dtype=float32>
You are not getting an empty tensor. In TensorFlow 1.x, building an op only adds a node to the graph, so printing a Tensor shows its metadata (name, shape, dtype) rather than its values:
<tf.Tensor 'Cast_2:0' shape=(1, 100, 256) dtype=float32>
Look at the shape; it is just as expected. Nevertheless, if you want to see the content (i.e. the actual values of the byte_list Tensor object), one way is to call eval() inside a session.
Something like this should do:
import numpy as np
import tensorflow as tf
new_input_processed = np.array([[ 83, 111, 109, 101,  32,  83, 101, 113, 117, 101, 110,  99, 101,
                                  32, 111, 102,  32,  99, 104,  97, 114,  97,  99, 116, 101, 114,
                                   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
                                   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
                                   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
                                   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
                                   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,   0,
                                   0,   0,   0,   0,   0,   0,   0,   0,   0]], dtype=np.uint8)
byte_list = tf.cast(tf.one_hot(new_input_processed, 256, 1, 0), dtype=tf.float32)
with tf.Session() as sess:
    print(byte_list.eval())  # runs the graph and prints the tensor's values
Output:
[[[0. 0. 0. ... 0. 0. 0.]
  [0. 0. 0. ... 0. 0. 0.]
  [0. 0. 0. ... 0. 0. 0.]
  ...
  [1. 0. 0. ... 0. 0. 0.]
  [1. 0. 0. ... 0. 0. 0.]
  [1. 0. 0. ... 0. 0. 0.]]]
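As a side note, if you are on TensorFlow 2.x rather than 1.x (an assumption; the session-based code above is 1.x style), eager execution is enabled by default, so you can read the values directly with .numpy() and no session is needed. A minimal sketch:
import numpy as np
import tensorflow as tf  # assuming TF 2.x, where eager execution is the default

new_input_processed = np.zeros((1, 100), dtype=np.uint8)  # stand-in input of the same shape
byte_list = tf.cast(tf.one_hot(new_input_processed, 256, 1, 0), dtype=tf.float32)
print(byte_list.numpy())  # .numpy() materializes the eager tensor as a numpy array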
I have a 2D numpy array1 that contains only 0 and 255 values
([[255,   0, 255,   0,   0],
  [  0, 255,   0,   0,   0],
  [  0,   0, 255,   0, 255],
  [  0, 255, 255, 255, 255],
  [255,   0, 255,   0, 255]])
and an array2 that is identical in size and shape to array1 and also contains only 0 and 255 values
([[255,   0, 255,   0, 255],
  [  0, 255,   0,   0,   0],
  [255,   0,   0,   0, 255],
  [  0,   0, 255, 255, 255],
  [255,   0, 255,   0,   0]])
How can I compare array1 to array2 to determine a similarity percentage?
As you only have two possible values, I would propose this algorithm for similarity-checking:
import numpy as np
A = np.array([[255,   0, 255,   0,   0],
              [  0, 255,   0,   0,   0],
              [  0,   0, 255,   0, 255],
              [  0, 255, 255, 255, 255],
              [255,   0, 255,   0, 255]])
B = np.array([[255,   0, 255,   0, 255],
              [  0, 255,   0,   0,   0],
              [255,   0,   0,   0, 255],
              [  0,   0, 255, 255, 255],
              [255,   0, 255,   0,   0]])
number_of_equal_elements = np.sum(A == B)  # True counts as 1, so this counts the matches
total_elements = np.multiply(*A.shape)     # rows * columns (A.size would work too)
percentage = number_of_equal_elements / total_elements
print('total number of elements: \t\t{}'.format(total_elements))
print('number of identical elements: \t\t{}'.format(number_of_equal_elements))
print('number of different elements: \t\t{}'.format(total_elements-number_of_equal_elements))
print('percentage of identical elements: \t{:.2f}%'.format(percentage*100))
It counts the equal elements and divides by the total number of elements, giving the fraction of identical elements.
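Since A == B is itself a boolean array and True counts as 1, the whole computation also collapses to a one-liner with np.mean, which averages the matches:
percentage = np.mean(A == B)  # 0.8 for the arrays above, i.e. 80% identical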
I have an array:
a = np.array([[ 0,  1,  2,  0,  0,  0],
              [ 0,  4,  1, 35,  0, 10],
              [ 0,  0,  5,  4,  0,  4],
              [ 1,  2,  5,  4,  0,  4]])
I need to select only the first run of consecutive 0s in each row:
[[ True False False False False False]
 [ True False False False False False]
 [ True  True False False False False]
 [False False False False False False]]
I tried:
a[np.arange(len(a)), a.argmax(1): np.arange(len(a)), [0,0,0]] = True
But this is wrong.
You can use np.cumsum.
Assumption: you are looking for zeros only at the start of each row.
a = np.array([[ 0,  1,  2,  0,  0,  0],
              [ 0,  4,  1, 35,  0, 10],
              [ 0,  0,  5,  4,  0,  4]])
a.cumsum(axis=1) == 0
array([[ True, False, False, False, False, False],
       [ True, False, False, False, False, False],
       [ True,  True, False, False, False, False]], dtype=bool)
Rationale: a position stays True for as long as the cumulative sum along its row is still 0.
Caveat: an array containing negative ints can break this. E.g. for [-1, 1], the cumulative sum is [-1, 0], so position 1 would wrongly evaluate to True.
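A quick check of that failure mode:
import numpy as np
print(np.array([[-1, 1]]).cumsum(axis=1) == 0)
# [[False  True]] -- position 1 is wrongly flagged even though the element is non-zero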
You can use np.minimum.accumulate on the condition a == 0 (along the rows). Since a non-zero element gives False, every element after the first non-zero one is forced to False by the accumulated minimum:
np.minimum.accumulate(a == 0, axis=1)
#array([[ True, False, False, False, False, False],
#       [ True, False, False, False, False, False],
#       [ True,  True, False, False, False, False],
#       [False, False, False, False, False, False]], dtype=bool)
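To see why the accumulated minimum does the trick, here is a single row in isolation (True behaves as 1 and False as 0, so the running minimum stays False once the first False appears):
import numpy as np
row = np.array([True, False, True])  # a == 0 for the row [0, 1, 0]
print(np.minimum.accumulate(row))    # [ True False False]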
Here's one with argmin + broadcasting -
(a==0).argmin(1)[:,None] > np.arange(a.shape[1])
Explanation with a sample step-by-step run
1) Input array:
In [207]: a
Out[207]:
array([[ 0,  1,  2,  0,  0,  0],
       [ 0,  4,  1, 35,  0, 10],
       [ 0,  0,  5,  4,  0,  4],
       [ 1,  2,  5,  4,  0,  4]])
2) Mask of zeros
In [208]: (a==0)
Out[208]:
array([[ True, False, False,  True,  True,  True],
       [ True, False, False, False,  True, False],
       [ True,  True, False, False,  True, False],
       [False, False, False, False,  True, False]], dtype=bool)
3) Get the index at which the first False occurs, signalling the end of the first True island in each row. For any row that has no zero at all, or whose first element is non-zero, argmin returns 0. The next task is to use broadcasting to build a mask that is True from the first column onward and stops being True at those argmin indices. That is a broadcasted comparison against a range array covering all the columns.
In [209]: (a==0).argmin(1)
Out[209]: array([1, 1, 2, 0])
In [210]: (a==0).argmin(1)[:,None] > np.arange(a.shape[1])
Out[210]:
array([[ True, False, False, False, False, False],
       [ True, False, False, False, False, False],
       [ True,  True, False, False, False, False],
       [False, False, False, False, False, False]], dtype=bool)
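One edge case worth flagging (my observation, not from the original walk-through): a row made up entirely of zeros has no False in its mask, so argmin also returns 0 there and the row comes out all False instead of all True. A quick check:
import numpy as np
z = np.array([[0, 0, 0]])
print((z == 0).argmin(1)[:, None] > np.arange(z.shape[1]))
# [[False False False]] -- an all-zero row is missed by this approach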
Timings
In [196]: a = np.random.randint(0,9,(5000,5000))
In [197]: %timeit a.cumsum(axis=1) == 0 ##Brad Solomon
...: %timeit np.minimum.accumulate(a == 0, axis=1) ##Psidom
...: %timeit (a==0).argmin(1)[:,None] > np.arange(a.shape[1])
...:
10 loops, best of 3: 69 ms per loop
10 loops, best of 3: 64.9 ms per loop
10 loops, best of 3: 32.8 ms per loop
I have an array with this kind of data inside it, and I need to sum up the value columns of rows that share the same date.
[["01-04-2013", 100.0, 110.0, 120, 0, 0, 0], ["02-04-2013", 100.0, 110.0, 130, 0, 0, 0], ["03-04-2013", 100.0, 110.0, 120, 0, 0, 0], ["10-04-2013", 100.0, 110.0, 100, 0, 0, 0], ["02-04-2013", 100.0, 140.0, 0, 70, 0, 0], ["10-04-2013", 100.0, 140.0, 0, 100, 0, 0], ["11-04-2013", 100.0, 140.0, 0, 110, 0, 0], ["12-04-2013", 100.0, 140.0, 0, 120, 0, 0], ["09-04-2013", 0.0, 0.0, 0, 0, 130, 0], ["17-04-2013", 0.0, 0.0, 0, 0, 30, 0], ["15-04-2013", 100.0, 130.0, 0, 0, 0, 17], ["17-04-2013", 100.0, 130.0, 0, 0, 0, 90], ["18-04-2013", 100.0, 130.0, 0, 0, 0, 100]]
How can I do this in Ruby? That is, sum the rows that share a date into one row, and keep rows whose date is not duplicated as they are.
require 'pp'
require 'matrix'
d = [["01-04-2013", 100.0, 110.0, 120, 0, 0, 0], ["02-04-2013", 100.0, 110.0, 130, 0, 0, 0], ["03-04-2013", 100.0, 110.0, 120, 0, 0, 0], ["10-04-2013", 100.0, 110.0, 100, 0, 0, 0], ["02-04-2013", 100.0, 140.0, 0, 70, 0, 0], ["10-04-2013", 100.0, 140.0, 0, 100, 0, 0], ["11-04-2013", 100.0, 140.0, 0, 110, 0, 0], ["12-04-2013", 100.0, 140.0, 0, 120, 0, 0], ["09-04-2013", 0.0, 0.0, 0, 0, 130, 0], ["17-04-2013", 0.0, 0.0, 0, 0, 30, 0], ["15-04-2013", 100.0, 130.0, 0, 0, 0, 17], ["17-04-2013", 100.0, 130.0, 0, 0, 0, 90], ["18-04-2013", 100.0, 130.0, 0, 0, 0, 100]]
pp(
  d.group_by(&:first).values.reject do |v|
    v.size <= 1                 # keep only dates that occur more than once
  end.map do |e|
    e.inject do |m, e|
      # element-wise sum via Vector (note: this also concatenates the date strings)
      (Vector.[](*m) + Vector.[](*e)).to_a
    end
  end
)
Update after comments:
d.group_by(&:first).values.map do |e|
  e.inject do |m, e|
    # keep the date, sum the numeric tail element-wise
    [e[0], (Vector.[](*m[1..-1]) + Vector.[](*e[1..-1])).to_a].flatten
  end
end.sort
Specification change alert:
def v m
  Vector.[](*m.drop(1))  # the numeric part of a row as a Vector (drops the date)
end

d.group_by(&:first).values.map do |group|
  r = group.inject do |m, e|
    [e[0], *(v(m) + v(e)).to_a]
  end
  r[1] /= group.size  # average the first two value columns instead of summing them
  r[2] /= group.size
  r
end.sort
Note. I'm not saying this is homework, but in the cases where it is, it should be obvious that when we just do it for the students, we are not really doing them any favors, right? Plus, this solution is provided on a public site that is instantly indexed by Google and, being in the top 100 sites in the world, it is not exactly a secret to the prof or the grader. And what if the school is using a national database like http://turnitin.com/ ? I suppose they could check public code snippets if they wanted to. And finally, there is some rather well-written code posted on SO by the, ahem, hobbyists. I'm not sure it can typically pass for lower-division intro-course original work, if I, ahem, say so myself. :-)
aggregated_rows = rows.group_by(&:first).map do |date, rows_by_date|
  # transpose the group, drop the date column, then sum each value column
  values = rows_by_date.transpose.drop(1).map { |xs| xs.reduce(:+) }
  [date, values]
end
#[["01-04-2013", [100.0, 110.0, 120, 0, 0, 0]],
# ["02-04-2013", [200.0, 250.0, 130, 70, 0, 0]],
...
# ["18-04-2013", [100.0, 130.0, 0, 0, 0, 100]]]
a = [["01-04-2013", 100.0, 110.0, 120, 0, 0, 0], ["02-04-2013", 100.0, 110.0, 130, 0, 0, 0], ["03-04-2013", 100.0, 110.0, 120, 0, 0, 0], ["10-04-2013", 100.0, 110.0, 100, 0, 0, 0], ["02-04-2013", 100.0, 140.0, 0, 70, 0, 0], ["10-04-2013", 100.0, 140.0, 0, 100, 0, 0], ["11-04-2013", 100.0, 140.0, 0, 110, 0, 0], ["12-04-2013", 100.0, 140.0, 0, 120, 0, 0], ["09-04-2013", 0.0, 0.0, 0, 0, 130, 0], ["17-04-2013", 0.0, 0.0, 0, 0, 30, 0], ["15-04-2013", 100.0, 130.0, 0, 0, 0, 17], ["17-04-2013", 100.0, 130.0, 0, 0, 0, 90], ["18-04-2013", 100.0, 130.0, 0, 0, 0, 100]]
require "pp"
def group_and_sum_rows_by_date_string(a)
  # instantiate a hash that returns an empty array for a key
  # that doesn't exist
  h = Hash.new([])
  a.each do |row|
    # populate the hash with the date string as key and an array of
    # arrays of the values for that date string
    # (note: row.shift mutates the input rows)
    h[k = row.shift] = ([row] + h[k]).compact
  end
  # add up the corresponding values in each key's array of arrays,
  # and return the result as an array of [date, sums] pairs
  h.map { |k, v| [k, v.transpose.map { |x| x.inject(:+) }] }
end
pp group_and_sum_rows_by_date_string(a)
[["15-04-2013", [100.0, 130.0, 0, 0, 0, 17]],
["03-04-2013", [100.0, 110.0, 120, 0, 0, 0]],
["02-04-2013", [200.0, 250.0, 130, 70, 0, 0]],
["17-04-2013", [100.0, 130.0, 0, 0, 30, 90]],
["18-04-2013", [100.0, 130.0, 0, 0, 0, 100]],
["09-04-2013", [0.0, 0.0, 0, 0, 130, 0]],
["01-04-2013", [100.0, 110.0, 120, 0, 0, 0]],
["12-04-2013", [100.0, 140.0, 0, 120, 0, 0]],
["10-04-2013", [200.0, 250.0, 100, 100, 0, 0]],
["11-04-2013", [100.0, 140.0, 0, 110, 0, 0]]]
require 'pp'
a = [["01-04-2013", 100.0, 110.0, 120, 0, 0, 0], ["02-04-2013", 100.0, 110.0, 130, 0, 0, 0], ["03-04-2013", 100.0, 110.0, 120, 0, 0, 0], ["10-04-2013", 100.0, 110.0, 100, 0, 0, 0], ["02-04-2013", 100.0, 140.0, 0, 70, 0, 0], ["10-04-2013", 100.0, 140.0, 0, 100, 0, 0], ["11-04-2013", 100.0, 140.0, 0, 110, 0, 0], ["12-04-2013", 100.0, 140.0, 0, 120, 0, 0], ["09-04-2013", 0.0, 0.0, 0, 0, 130, 0], ["17-04-2013", 0.0, 0.0, 0, 0, 30, 0], ["15-04-2013", 100.0, 130.0, 0, 0, 0, 17], ["17-04-2013", 100.0, 130.0, 0, 0, 0, 90], ["18-04-2013", 100.0, 130.0, 0, 0, 0, 100]]
h = {}
# for each date, flatten the group, drop the date strings, and sum every remaining value into one number
a.group_by(&:first).each { |k, v| v.flatten!.delete(k); h[k] = v.inject(:+) }
pp h
Output:
{"01-04-2013"=>330.0,
"02-04-2013"=>650.0,
"03-04-2013"=>330.0,
"10-04-2013"=>650.0,
"11-04-2013"=>350.0,
"12-04-2013"=>360.0,
"09-04-2013"=>130.0,
"17-04-2013"=>350.0,
"15-04-2013"=>247.0,
"18-04-2013"=>330.0}
pp a.group_by(&:first).map{|k,v| v.flatten!.uniq!}
Output:
[["01-04-2013", 100.0, 110.0, 120, 0],
["02-04-2013", 100.0, 110.0, 130, 0, 140.0, 70],
["03-04-2013", 100.0, 110.0, 120, 0],
["10-04-2013", 100.0, 110.0, 100, 0, 140.0],
["11-04-2013", 100.0, 140.0, 0, 110],
["12-04-2013", 100.0, 140.0, 0, 120],
["09-04-2013", 0.0, 0, 130],
["17-04-2013", 0.0, 0, 30, 100.0, 130.0, 90],
["15-04-2013", 100.0, 130.0, 0, 17],
["18-04-2013", 100.0, 130.0, 0, 100]]
pp a.group_by(&:first).map{|k,v| v.transpose.map!{|a| a.inject(:+)}}
Output:
[["01-04-2013", 100.0, 110.0, 120, 0, 0, 0],
["02-04-201302-04-2013", 200.0, 250.0, 130, 70, 0, 0],
["03-04-2013", 100.0, 110.0, 120, 0, 0, 0],
["10-04-201310-04-2013", 200.0, 250.0, 100, 100, 0, 0],
["11-04-2013", 100.0, 140.0, 0, 110, 0, 0],
["12-04-2013", 100.0, 140.0, 0, 120, 0, 0],
["09-04-2013", 0.0, 0.0, 0, 0, 130, 0],
["17-04-201317-04-2013", 100.0, 130.0, 0, 0, 30, 90],
["15-04-2013", 100.0, 130.0, 0, 0, 0, 17],
["18-04-2013", 100.0, 130.0, 0, 0, 0, 100]]