3-D batch matrix multiplication without knowing the batch size

I'm currently writing a TensorFlow program that multiplies a batch of 2-D tensors (a 3-D tensor of shape [None, ...]) by a 2-D matrix W. This requires turning W into a 3-D tensor, which in turn requires knowing the batch size.
I have not been able to do this: tf.batch_matmul no longer exists, and x.get_shape().as_list()[0] returns None, which is not a valid argument for a reshape or tile operation. Any suggestions? I've seen some people use config.cfg.batch_size, but I don't know what that is.

The solution is to combine tf.shape (which returns the shape as a tensor at runtime) with tf.tile (which accepts that dynamic shape in its multiples argument).
import numpy as np
import tensorflow as tf

x = tf.placeholder(shape=[None, 2, 3], dtype=tf.float32)
W = tf.Variable(initial_value=np.ones([3, 4]), dtype=tf.float32)
print(x.shape)  # Static shape: (?, 2, 3) -- the batch size is unknown when the graph is built

batch_size = tf.shape(x)[0]  # A scalar tensor that yields the batch size at runtime
W_expand = tf.expand_dims(W, axis=0)                      # [3, 4] -> [1, 3, 4]
W_tile = tf.tile(W_expand, multiples=[batch_size, 1, 1])  # [1, 3, 4] -> [batch_size, 3, 4]
result = tf.matmul(x, W_tile)  # Can multiply now!

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    feed_dict = {x: np.ones([10, 2, 3])}
    print(sess.run(batch_size, feed_dict=feed_dict))    # 10
    print(sess.run(result, feed_dict=feed_dict).shape)  # (10, 2, 4)
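As a side note, the tiling can be avoided entirely by folding the batch dimension into the first axis, doing a plain 2-D matmul, and unfolding again; the -1 in tf.reshape lets TensorFlow infer the unknown batch size. A minimal sketch under the same shapes as above:

x_flat = tf.reshape(x, [-1, 3])             # [batch * 2, 3]
result2 = tf.reshape(tf.matmul(x_flat, W),  # ordinary 2-D matmul against W
                     [-1, 2, 4])            # back to [batch, 2, 4]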

Related

Julia iteratively make 3D array from 2D arrays

I am trying to make a 3D array from many 2D arrays.
Image files (each image becomes a 2D array):
https://drive.google.com/drive/folders/1xBucvqhKFjAfbRIhq5wjr40kSjNor_0t?usp=sharing
using Images, Colors

paths = readdir(
    "/Users/me/Downloads/ct_scans",
    join = true
)
images_3d = []
for p = paths
    img = load(p)
    gray = Gray.(img)
    arr = convert(Array{Float64}, gray) # <----- 2D array
    append!(images_3d, arr)
end
>>> size(images_3d)
(1536000,) # <--- flattened to 1D
>>> 1536000 == 80*160*120
true
>>> reshaped_3d = reshape(images_3d, (80,160,120))
>>> Gray.(reshaped_3d[1,:,:])
# 160x120 scrambled mess of pixels not rearranged as expected
append! flattens each matrix as it goes, producing a 1,536,000-element 1D array that does not reshape back as expected.
Whereas push! creates an 80-element vector of matrices that each keep their shape; it's not technically 3D, just an 80-element vector of arrays.
When I tried to initialize an empty 3D array and then overwrite each 2D slice with one of my 2D images, I got Matrix{Float64}-to-Float64 type conversion failures.
I can't iteratively vcat 2D arrays, because I cannot overwrite variables.
Part of the reason for posting this is to see how Julia programmers approach multi-dimensional arrays.
There are multiple ways to do this; you'll have to try and test which one is best in your case.
with append! and reshape
Julia arrays are column-major, so the first index iterates fastest and the image number should be the last index. If 80 is the number of images, the reshape should be
reshape(images_3d, (160,120,80))
(maybe exchange 120 and 160, not sure about this one).
And then to get the first image, it's reshaped_3d[:,:,1]
with push!
push!ing the matrices and then creating the 3D array with cat would work too:
julia> A = [rand(3,4) for i in 1:2];
julia> cat(A..., dims=3)
3×4×2 Array{Float64, 3}:
[:, :, 1] =
0.372747 0.17654 0.398272 0.231992
0.514789 0.342374 0.399816 0.277959
0.908909 0.864676 0.9788 0.585375
[:, :, 2] =
0.358169 0.816448 0.0558052 0.404178
0.747453 0.80815 0.384903 0.447053
0.314895 0.46264 0.947465 0.170982
initialize the 3D Array (probably the best one)
and fill it up progressively
julia> A = Array{Float64}(undef,3,4,2);
julia> for i in 1:2
           A[:,:,i] = rand(3,4)
       end
julia> A
3×4×2 Array{Float64, 3}:
[:, :, 1] =
0.478106 0.829818 0.526572 0.644238
0.714812 0.781246 0.93239 0.759864
0.523958 0.955136 0.70079 0.193489
[:, :, 2] =
0.481405 0.561407 0.184557 0.449584
0.547769 0.170311 0.371797 0.538843
0.0285712 0.731686 0.00126473 0.452273
Just to add to the accepted answer: when filling the preallocated array, loop over the last index, not the first. Julia stores arrays in column-major order, so a slice that fixes the last index (A[:,:,ii]) is contiguous in memory, while one that fixes the first index (A[ii,:,:]) is strided. Of the following two functions, test2() is the faster one to run, because its loop is over the last index.
aa_stack1 = zeros(3, 10000, 10000);
aa_stack3 = zeros(10000, 10000, 3);
function test1()
    for ii = 1:3
        aa_stack1[ii, :, :] = rand(10000, 10000)
    end
end
function test2()
    for ii = 1:3
        aa_stack3[:, :, ii] = rand(10000, 10000)
    end
end
@time test1()
@time test2()
The second way maximizes memory locality and reduces cache misses: with the last index fixed, the remaining two indices sweep over a contiguous block of memory, so every cache line fetched is fully used.

Using numpy `as_strided` function to create patches, tiles, rolling or sliding windows of arbitrary dimension

Spent a while this morning looking for a generalized question to point duplicates to for questions about as_strided and/or how to make generalized window functions. There seem to be a lot of questions on how to (safely) create patches, sliding windows, rolling windows, tiles, or views onto an array for machine learning, convolution, image processing and/or numerical integration.
I'm looking for a generalized function that can accept a window, step and axis parameter and return an as_strided view for over arbitrary dimensions. I will give my answer below, but I'm interested if anyone can make a more efficient method, as I'm not sure using np.squeeze() is the best method, I'm not sure my assert statements make the function safe enough to write to the resulting view, and I'm not sure how to handle the edge case of axis not being in ascending order.
DUE DILIGENCE
The most generalized function I can find is sklearn.feature_extraction.image.extract_patches, written by @eickenberg (as well as the apparently equivalent skimage.util.view_as_windows), but those are not well documented on the net, and can't do windows over fewer axes than there are in the original array (for example, this question asks for a window of a certain size over just one axis). Also, such questions often want a numpy-only answer.
@Divakar created a generalized numpy function for 1-d inputs here, but higher-dimension inputs require a bit more care. I've made a bare-bones 2D-window-over-3D-input method, but it's not very extensible.
EDIT JAN 2020: Changed the iterable return from a list to a generator to save memory.
EDIT OCT 2020: Put the generator in a separate function, since mixing generators and return statements doesn't work intuitively.
Here's the recipe I have so far:
import numpy as np

def window_nd(a, window, steps = None, axis = None, gen_data = False):
    """
    Create a windowed view over `n`-dimensional input that uses an
    `m`-dimensional window, with `m <= n`

    Parameters
    -------------
    a : Array-like
        The array to create the view on
    window : tuple or int
        If int, the size of the window in `axis`, or in all dimensions if
        `axis == None`
        If tuple, the shape of the desired window.  `window.size` must be:
            equal to `len(axis)` if `axis != None`, else
            equal to `len(a.shape)`, or
            1
    steps : tuple, int or None
        The offset between consecutive windows in desired dimension
        If None, offset is one in all dimensions
        If int, the offset for all windows over `axis`
        If tuple, the steps along each `axis`.
            `len(steps)` must be equal to `len(axis)`
    axis : tuple, int or None
        The axes over which to apply the window
        If None, apply over all dimensions
        if tuple or int, the dimensions over which to apply the window
    gen_data : boolean
        returns data needed for a generator

    Returns
    -------
    a_view : ndarray
        A windowed view on the input array `a`, or `a, wshp`, where `wshp`
        is the window shape needed for creating the generator
    """
    ashp = np.array(a.shape)

    if axis is not None:
        axs = np.array(axis, ndmin = 1)
        assert np.all(np.in1d(axs, np.arange(ashp.size))), "Axes out of range"
    else:
        axs = np.arange(ashp.size)

    window = np.array(window, ndmin = 1)
    assert (window.size == axs.size) | (window.size == 1), "Window dims and axes don't match"
    wshp = ashp.copy()
    wshp[axs] = window
    assert np.all(wshp <= ashp), "Window is bigger than input array in axes"

    stp = np.ones_like(ashp)
    if steps:
        steps = np.array(steps, ndmin = 1)
        assert np.all(steps > 0), "Only positive steps allowed"
        assert (steps.size == axs.size) | (steps.size == 1), "Steps and axes don't match"
        stp[axs] = steps

    astr = np.array(a.strides)
    shape = tuple((ashp - wshp) // stp + 1) + tuple(wshp)
    strides = tuple(astr * stp) + tuple(astr)

    as_strided = np.lib.stride_tricks.as_strided
    a_view = np.squeeze(as_strided(a,
                                   shape = shape,
                                   strides = strides))
    if gen_data:
        return a_view, shape[:-wshp.size]
    else:
        return a_view

def window_gen(a, window, **kwargs):
    # Same docstring as above, returns a generator
    _ = kwargs.pop('gen_data', False)  # note the quotes: pop the keyword by name
    a_view, shp = window_nd(a, window, gen_data = True, **kwargs)
    for idx in np.ndindex(shp):
        yield a_view[idx]
Some test cases:
a = np.arange(1000).reshape(10,10,10)
window_nd(a, 4).shape # sliding (4x4x4) window
Out: (7, 7, 7, 4, 4, 4)
window_nd(a, 2, 2).shape # (2x2x2) blocks
Out: (5, 5, 5, 2, 2, 2)
window_nd(a, 2, 1, 0).shape # sliding window of width 2 over axis 0
Out: (9, 2, 10, 10)
window_nd(a, 2, 2, (0,1)).shape # tiled (2x2) windows over first and second axes
Out: (5, 5, 2, 2, 10)
window_nd(a,(4,3,2)).shape # arbitrary sliding window
Out: (7, 8, 9, 4, 3, 2)
window_nd(a,(4,3,2),(1,5,2),(0,2,1)).shape #arbitrary windows, steps and axis
Out: (7, 5, 2, 4, 2, 3) # note shape[-3:] != window as axes are out of order
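For readers on current NumPy: since version 1.20 there is a built-in np.lib.stride_tricks.sliding_window_view that covers the step-1 cases above safely (it returns a read-only view by default); steps can be emulated by slicing the result. A minimal sketch:

import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

a = np.arange(1000).reshape(10, 10, 10)
sliding_window_view(a, (4, 4, 4)).shape   # (7, 7, 7, 4, 4, 4), like window_nd(a, 4)
sliding_window_view(a, 2, axis=0).shape   # (9, 10, 10, 2); window axes always go last
sliding_window_view(a, (2, 2), axis=(0, 1))[::2, ::2].shape  # (5, 5, 10, 2, 2), step 2 via slicing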

Looping through slices of Theano tensor

I have two 2D Theano tensors, call them x_1 and x_2, and suppose for the sake of example, both x_1 and x_2 have shape (1, 50). Now, to compute their mean squared error, I simply run:
T.sqr(x_1 - x_2).mean(axis = -1).
However, what I wanted to do was construct a new tensor that consists of their mean squared error in chunks of 10. In other words, since I'm more familiar with NumPy, what I had in mind was to create the following tensor M in Theano:
M = [theano.tensor.sqr(x_1[:, i:i+10] - x_2[:, i:i+10]).mean(axis = -1) for i in xrange(0, 50, 10)]
Now, since Theano doesn't have for loops, but instead uses scan (of which map is a special case), I thought I would try the following:
sequence = T.arange(0, 50, 10)
M = theano.map(lambda i: theano.tensor.sqr(x_1[:, i:i+10] - x_2[:, i:i+10]).mean(axis = -1), sequence)
However, this does not seem to work, as I get the error:
only integers, slices (:), ellipsis (...), numpy.newaxis (None) and integer or boolean arrays are valid indices
Is there a way to loop through the slices using theano.scan (or map)? Thanks in advance, as I'm new to Theano!
Similar to what can be done in numpy, a solution is to reshape your (1, 50) tensor to a (1, 5, 10) tensor (or even a (5, 10) tensor), and then compute the mean along the last axis, so that each mean is taken over a contiguous chunk of 10.
To illustrate this with numpy, suppose I want to compute means by slices of 2
x = np.array([0, 2, 0, 4, 0, 6])
x = x.reshape([3, 2])
np.mean(x, axis=1)
outputs
array([ 1., 2., 3.])
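Translated back to Theano, a minimal sketch of the same idea, assuming the (1, 50) shapes from the question:

import theano
import theano.tensor as T

x_1 = T.matrix('x_1')
x_2 = T.matrix('x_2')
# (1, 50) -> (1, 5, 10): five contiguous chunks of 10, then average each chunk
M = T.sqr(x_1 - x_2).reshape((1, 5, 10)).mean(axis=-1)
f = theano.function([x_1, x_2], M)  # f returns a (1, 5) array of chunked MSEs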

Julia Approach to python equivalent list of lists

I just started tinkering with Julia and I'm really getting to like it. However, I am running into a road block. For example, in Python (although not very efficient or pythonic), I would create an empty list, append lists of a known size and type to it, and then convert the result to a NumPy array:
Python Snippet
a = []
for ....:
    a.append([1.,2.,3.,4.])
b = numpy.array(a)
I want to be able to do something similar in Julia, but I can't seem to figure it out. This is what I have so far:
Julia snippet
a = Array{Float64}[]
for .....
    push!(a,[1.,2.,3.,4.])
end
The result is an n-element Array{Array{Float64,N},1} of size (n,), but I would like it to be an nx4 Array{Float64,2}.
Any suggestions or better way of doing this?
The literal translation of your code would be
# Building up as rows
a = [1. 2. 3. 4.]
for i in 1:3
    a = vcat(a, [1. 2. 3. 4.])
end

# Building up as columns
b = [1.,2.,3.,4.]
for i in 1:3
    b = hcat(b, [1.,2.,3.,4.])
end
But this isn't a natural pattern in Julia; you'd do something like
A = zeros(4,4)
for i in 1:4, j in 1:4
    A[i,j] = j
end
or even
A = Float64[j for i in 1:4, j in 1:4]
Basically allocating all the memory at once.
Does this do what you want?
julia> a = Array{Float64}[]
0-element Array{Array{Float64,N},1}
julia> for i=1:3
           push!(a,[1.,2.,3.,4.])
       end
julia> a
3-element Array{Array{Float64,N},1}:
[1.0,2.0,3.0,4.0]
[1.0,2.0,3.0,4.0]
[1.0,2.0,3.0,4.0]
julia> b = hcat(a...)'
3x4 Array{Float64,2}:
1.0 2.0 3.0 4.0
1.0 2.0 3.0 4.0
1.0 2.0 3.0 4.0
It seems to match the python output:
In [9]: a = []
In [10]: for i in range(3):
    ...:     a.append([1, 2, 3, 4])
    ...:
In [11]: b = numpy.array(a); b
Out[11]:
array([[1, 2, 3, 4],
[1, 2, 3, 4],
[1, 2, 3, 4]])
I should add that this is probably not what you actually want to be doing as the hcat(a...)' can be expensive if a has many elements. Is there a reason not to use a 2d array from the beginning? Perhaps more context to the question (i.e. the code you are actually trying to write) would help.
The other answers don't work if the number of loop iterations isn't known in advance, or they assume that the underlying arrays being merged are one-dimensional. It seems Julia lacks a built-in function for "take this list of N-D arrays and return a new (N+1)-D array".
Julia requires a different concatenation call depending on the dimension of the underlying data. So, for example, if the underlying elements of a are vectors, one can use hcat(a...) or cat(a..., dims=2). But if a holds e.g. 2D arrays, one must use cat(a..., dims=3), etc. The dims argument to cat is not optional, and there is no default value meaning "the last dimension".
Here is a helper function that mimics the np.array functionality for this use case. (I called it collapse instead of array, because it doesn't behave quite the same way as np.array)
function collapse(x)
    return cat(x..., dims=length(size(x[1]))+1)
end
One would use this as
a = []
for ...
    ... compute new_a ...
    push!(a, new_a)
end
a = collapse(a)
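For reference, the NumPy-side counterpart of collapse is np.stack with axis=-1 rather than np.array, since np.array stacks along a new first axis while collapse appends a new last dimension. A minimal sketch for comparison:

import numpy as np

mats = [np.ones((3, 4)) for _ in range(2)]
np.array(mats).shape           # (2, 3, 4): new axis first
np.stack(mats, axis=-1).shape  # (3, 4, 2): new axis last, like collapse above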

Python 2.7: looping over 1D fibers in a multidimensional Numpy array

I am looking for a way to loop over 1D fibers (row, column, and multi-dimensional equivalents) along any dimension in a 3+-dimensional array.
In a 2D array this is fairly trivial since the fibers are rows and columns, so just saying for row in A gets the job done. But for 3D arrays for example, this expression iterates over 2D slices, not 1D fibers.
A working solution is the one below:
import numpy as np
A = np.arange(27).reshape((3,3,3))
func = np.sum
for fiber_index in np.ndindex(A.shape[:-1]):
    print func(A[fiber_index])
However, I am wondering whether there is something that is:
More idiomatic
Faster
Hope you can help!
I think you might be looking for numpy.apply_along_axis
In [10]: def my_func(x):
    ...:     return x**2 + x
In [11]: np.apply_along_axis(my_func, 2, A)
Out[11]:
array([[[ 0, 2, 6],
[ 12, 20, 30],
[ 42, 56, 72]],
[[ 90, 110, 132],
[156, 182, 210],
[240, 272, 306]],
[[342, 380, 420],
[462, 506, 552],
[600, 650, 702]]])
Although many NumPy functions (including sum) have their own axis argument to specify which axis to use:
In [12]: np.sum(A, axis=2)
Out[12]:
array([[ 3, 12, 21],
[30, 39, 48],
[57, 66, 75]])
numpy provides a number of different ways of looping over 1 or more dimensions.
Your example:
func = np.sum
for fiber_index in np.ndindex(A.shape[:-1]):
    print func(fiber_index)
    print A[fiber_index]
produces something like:
(0, 0)
[0 1 2]
(0, 1)
[3 4 5]
(0, 2)
[6 7 8]
...
This generates all index combinations over the first two dimensions, giving your function the 1D fiber on the last.
Look at the code for ndindex. It's instructive. I tried to extract its essence in https://stackoverflow.com/a/25097271/901925.
It uses as_strided to generate a dummy matrix over which an nditer iterates. It uses the 'multi_index' mode to generate an index set, rather than the elements of that dummy. The iteration itself is done with a __next__ method. This is the same style of indexing that is currently used in numpy compiled code.
Iterating Over Arrays (http://docs.scipy.org/doc/numpy-dev/reference/arrays.nditer.html) has a good explanation, including an example of doing this in Cython.
Many functions, among them sum, max, product, let you specify which axis (axes) you want to iterate over. Your example, with sum, can be written as:
np.sum(A, axis=-1)
np.sum(A, axis=(1,2)) # sum over 2 axes
An equivalent is
np.add.reduce(A, axis=-1)
np.add is a ufunc, and reduce specifies an iteration mode. There are many other ufuncs, and other iteration modes: accumulate, reduceat. You can also define your own ufunc.
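To make those iteration modes concrete, a minimal sketch:

import numpy as np

A = np.arange(27).reshape((3, 3, 3))
np.add.reduce(A, axis=-1)              # identical to np.sum(A, axis=-1)
np.add.accumulate(np.arange(1, 5))     # running sum: [ 1,  3,  6, 10]
np.add.reduceat(np.arange(8), [0, 4])  # chunked sums over [0:4] and [4:8]: [ 6, 22]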
xnx suggests
np.apply_along_axis(np.sum, 2, A)
It's worth digging through apply_along_axis to see how it steps through the dimensions of A. In your example, it steps over all possible i,j in a while loop, calculating:
outarr[(i,j)] = np.sum(A[(i, j, slice(None))])
Including slice objects in the indexing tuple is a nice trick. Note that it edits a list, and then converts it to a tuple for indexing. That's because tuples are immutable.
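As an illustration of that trick, a minimal sketch of building an index tuple with an explicit slice object:

import numpy as np

A = np.arange(27).reshape(3, 3, 3)
idx = [1, 2, slice(None)]  # build as a list, since tuples are immutable
print(A[tuple(idx)])       # same as A[1, 2, :]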
Your iteration can be applied along any axis by rolling that axis to the end. This is a 'cheap' operation, since it only changes the strides.
def with_ndindex(A, func, ax=-1):
    # apply func along axis ax
    A = np.rollaxis(A, ax, A.ndim)  # roll ax to end (changes strides)
    shape = A.shape[:-1]
    B = np.empty(shape, dtype=A.dtype)
    for ii in np.ndindex(shape):
        B[ii] = func(A[ii])
    return B
I did some timings on 3x3x3, 10x10x10 and 100x100x100 A arrays. This np.ndindex approach is consistently a third faster than the apply_along_axis approach. Direct use of np.sum(A, -1) is much faster.
So if func is limited to operating on a 1D fiber (unlike sum), then the ndindex approach is a good choice.
