Mapping elements in 3D lower "triangle" to linear structure - arrays

This is the 3D version of an existing question.
A 3D array M[x,y,z] of shape (n,n,n) should be mapped to a flat vector containing only the elements with x<=y<=z in order to save space. So what I need is an expression similar to the 2D case (index := x + (y+1)*y/2). I tried to derive some formulas but just can't get it right. Note that the element order inside the vector doesn't matter.

This is an extension of user3386109's answer for mapping an array of arbitrary dimension d with shape (n,...,n) into a vector of size size(d,n) = binomial(n+d-1, d), containing only the elements whose indices satisfy X_1 <= X_2 <= ... <= X_d.

The 3D version of the equation is
index := (z * (z+1) * (z+2)) / 6 + (y * (y+1))/2 + x
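As a quick sanity check (my own snippet, not part of the original answer), a few lines of Python confirm that this formula maps the triples with x <= y <= z bijectively onto 0 .. n(n+1)(n+2)/6 - 1:

from itertools import combinations_with_replacement

n = 5  # small test size, chosen arbitrarily
# enumerate all triples with x <= y <= z and apply the 3D formula
indices = [x + y * (y + 1) // 2 + z * (z + 1) * (z + 2) // 6
           for x, y, z in combinations_with_replacement(range(n), 3)]
# every slot 0 .. n(n+1)(n+2)/6 - 1 should be hit exactly once
assert sorted(indices) == list(range(n * (n + 1) * (n + 2) // 6))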

In case someone is interested, here is the code of @letmaik's answer in Python:
import math
from itertools import combinations_with_replacement

import numpy as np

ndim = 3  # the dimension you'd like
size = 4  # the size you'd like

# fill the full hypercube with -1 so unmapped entries are easy to spot
array = np.ones([size for _ in range(ndim)]) * -1

# all index tuples with X_1 <= X_2 <= ... <= X_d
indexes = combinations_with_replacement([n for n in range(size)], ndim)

def index(*args):
    # sum of the per-dimension terms: val*(val+1)*...*(val+idx) / (idx+1)!
    acc = []
    for idx, val in enumerate(args):
        rx = np.prod([val + i for i in range(idx + 1)])
        acc.append(rx / math.factorial(idx + 1))
    return sum(acc)

for args in indexes:
    array[args] = index(*args)

print(array)
Although I must confess it could be improved, as the order of the elements does not seem natural.

Related

Function handles in MATLAB

I'm starting to use function handles in Matlab and I have a question: what does Matlab compute when I do
y = (0:.1:1)';
fun = @(x) x(1) + x(2).^2 + exp(x(3)*y)
and what does Matlab compute when I do:
fun = @(x) x + x.^2 + exp(x*y)
I ask because I'm evaluating the Jacobian of these functions (from this code) and it gives different results. I don't understand the difference between using x(i) and just x.
Let's define a vector vec as vec = [1, 2, 3].
When you use this vec in your first function as results = fun(vec), the program takes only particular elements of the vector, meaning x(1) = vec(1), x(2) = vec(2) and x(3) = vec(3). The whole expression then looks like
results = vec(1) + vec(2).^2 + exp(vec(3)*y)
or, written out,
results = 1 + 2^2 + exp(3*y)
However, when you use your second expression as results = fun(vec), it uses the entire vector vec in every term, like this:
results = vec + vec.^2 + exp(vec*y)
or, written out,
results = [1, 2, 3] + [1^2, 2^2, 3^2] + exp([1, 2, 3]*y)
You can also see clearly that in the first case I don't really need to care about matrix dimensions, and the final dimensions of the results variable are the same as the dimensions of your y variable. This is not the case in the second example, because you multiply the matrices vec and y, which (in this particular example) results in an error, since vec has dimensions 1x3 and y has dimensions 11x1.
If you want to investigate this, I recommend splitting it up into subexpressions and debugging, e.g.
f1 = @(x) x(1);
f2 = @(x) x(2).^2;
f3 = @(x) exp(x(3)*y);
f = @(x) f1(x) + f2(x) + f3(x)
You can split it up even further if any subexpression is unclear.
The distinction is that one is a matrix-matrix multiplication (x * y; I'm assuming x is an array with 11 columns in order for the matrix multiplication to be consistent) and the other is a scalar-by-array multiplication (x(3) * y). The subscript operator (n) on any matrix extracts the n-th value from that matrix. For a scalar, the index can only be 1. For a 1D array, it extracts the n-th element of the column/row vector. For a 2D array, it's the n-th element when traversed column-wise.
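As an aside (a NumPy analogue of my own, not from the MATLAB thread itself): MATLAB's single-subscript indexing walks a matrix column-wise, which corresponds to flattening in Fortran order:

import numpy as np

A = np.array([[1, 2],
              [3, 4]])          # in MATLAB: A = [1 2; 3 4]
# MATLAB's linear indexing is column-wise: A(1)=1, A(2)=3, A(3)=2, A(4)=4
print(A.flatten(order='F')[2])  # 0-based index 2 prints 2, matching MATLAB's A(3)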
Also, if you only require the first derivative, I suggest using complex-step differentiation. It provides reduced numerical error and is computationally efficient.
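To sketch the idea (my own example, in Python rather than MATLAB, with an arbitrary test function): complex-step differentiation evaluates f(x + ih) for a tiny h and takes the imaginary part, which avoids the subtractive cancellation of finite differences:

import numpy as np

def complex_step_derivative(f, x, h=1e-20):
    # approximate f'(x) as Im(f(x + i*h)) / h; f must accept complex input
    return np.imag(f(x + 1j * h)) / h

f = lambda x: np.exp(x) * np.sin(x)               # arbitrary smooth test function
print(complex_step_derivative(f, 1.0))            # numerical derivative at x = 1
print(np.exp(1.0) * (np.sin(1.0) + np.cos(1.0)))  # analytic derivative for comparison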

Array subsetting in Julia

With the Julia Language, I defined a function to sample points uniformly inside the sphere of radius pi (~3.14) using rejection sampling as follows:
function spherical_sample(N::Int64)
    # generate N points uniformly distributed inside sphere
    # using rejection sampling:
    points = pi*(2*rand(5*N,3).-1.0)
    ind = sum(points.^2,dims=2) .<= pi^2
    ## ideally I wouldn't have to do this:
    ind_ = dropdims(ind,dims=2)
    return points[ind_,:][1:N,:]
end
I found a hack for subsetting arrays:
ind = sum(points.^2,dims=2) .<= pi^2
## ideally I wouldn't have to do this:
ind_ = dropdims(ind,dims=2)
But in principle, array indexing should be a one-liner. How could I do this better in Julia?
The problem is that you are creating a 2-dimensional index vector. You can avoid it by using eachrow:
ind = sum.(eachrow(points.^2)) .<= pi^2
So that your full answer would be:
function spherical_sample(N::Int64)
    points = pi*(2*rand(5*N,3).-1.0)
    ind = sum.(eachrow(points.^2)) .<= pi^2
    return points[ind,:][1:N,:]
end
Here is a one-liner:
points[(sum(points.^2,dims=2) .<= pi^2)[:],:][1:N, :]
Note that [:] is dropping a dimension so the BitArray can be used for indexing.
This does not answer your question directly (as you already got two suggestions), but I rather thought to hint how you could implement the whole procedure differently if you want it to be efficient.
The first point is to avoid generating 5*N rows of data up front. The probability that a candidate point is valid in your model is only about 52% (pi/6), so it is possible (though unlikely for large N) that fewer than N points pass the test, in which case the [1:N, :] selection will throw an error; and in any case most of the generated rows are wasted.
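For reference (a quick check of my own, in Python/NumPy rather than Julia): the acceptance probability is the volume of the radius-pi ball divided by the volume of the cube [-pi, pi]^3, i.e. pi/6 ≈ 0.524, which a short simulation confirms:

import numpy as np

rng = np.random.default_rng(0)
pts = np.pi * (2 * rng.random((100_000, 3)) - 1)   # uniform in [-pi, pi]^3
accept = (pts ** 2).sum(axis=1) <= np.pi ** 2
print(accept.mean(), np.pi / 6)                     # empirical vs exact acceptance rate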
Below is the code I would use that avoids this problem:
using Random: rand!  # rand! is used below to resample a row in place

function spherical_sample(N::Integer) # no need to require Int64 only here
    points = 2 .* pi .* rand(N, 3) .- pi # note that all operations are vectorized to avoid excessive allocations
    while N > 0 # we will run the code until we have N valid rows
        v = @view points[N, :] # use a view to avoid allocating
        if sum(x -> x^2, v) <= pi^2 # sum accepts a transformation function as a first argument
            N -= 1 # row is valid - move to the previous one
        else
            rand!(v) # row is invalid - resample it in place
            @. v = 2 * pi * v - pi # again - do the computation in place via broadcasting
        end
    end
    return points
end
This one is pretty fast, and uses StaticArrays. You can probably also implement something similar with ordinary tuples:
using StaticArrays

function sphsample(N)
    T = SVector{3, Float64}
    v = Vector{T}(undef, N)
    n = 1
    while n <= N
        p = rand(T) .- 0.5
        @inbounds v[n] = p .* 2π
        n += (sum(abs2, p) <= 0.25)
    end
    return v
end
On my laptop it is ~9x faster than the solution with views.

Split, group and mean: computation with arrays

A is a given N x R x T array. I must split it along the first dimension into N sub-arrays of size R x T, then group every z of them together into an array K_i and take the mean of each group.
For example: A is the array rand(N,R,T) = rand(16, 3, 3). Now I am going to split it:
A = rand(16, 3, 3): A(1,:,:), A(2,:,:), A(3,:,:), A(4,:,:), ..., A(16,:,:).
I have 16 slices.
B_1=A(1,:,:); B_2=A(2,:,:); B_3=A(3,:,:); ... ; B_16=A(16,:,:);
The next step is grouping every 3 together (for example).
Now I am going to create K_i as:
K_1(1,:,:)=B_1;
K_1(2,:,:)=B_2;
K_1(3,:,:)=B_3;
...
K_8(1,:,:)=B_14;
K_8(2,:,:)=B_15;
K_8(3,:,:)=B_16;
The average array is found as:
C_1=[B_1 + B_2 + B_3]/3
...
C_8= [ B_14 + B_15 + B_16] /3
I have implemented it as:
A_reshape = reshape(squeeze(A), size(A,2), size(A,3),2, []);
mean_of_all_slices = permute(mean(A_reshape , 3), [1 2 4 3]);
Question 1: I have checked by hand, and it gives me a wrong result. How do I fix it? [SOLVED]
EDIT 2: I need to simulate the following computation:
take the product of each slice of the array K_i with another array P_p. That means:
for `K_1` (given `P_1`): `B_1 * P_1`, `B_2 * P_1`, `B_3 * P_1`
...
for `K_8` (given `P_8`): `B_14 * P_8`, `B_15 * P_8`, `B_16 * P_8`
I have solved it!
Disclaimer: this answers a previous version of the question.
In cases such as this I would suggest relying on built-ins, which have a predictable behavior. In your case, this would be movmean (introduced in R2016a):
WIN_SZ = 2; % Window size for averaging
AVG_DIM = 1; % Dimension for averaging
tmp = movmean(A, WIN_SZ, AVG_DIM, 'Endpoints', 'discard');
C = tmp(1:WIN_SZ:end, :, :); % This only selects A1+A2, A3+A4 etc.
If your MATLAB is a bit older, this can also be done using convolution (convn, introduced before R2006):
WIN_SZ = 3;
tmp = convn(A, ones(WIN_SZ,1)./WIN_SZ, 'valid'); % Shorter than A in dim1 by (WIN_SZ-1)
C = tmp(1:WIN_SZ:end, :, :); % dim1 size is: ceil((size(A,1)-(WIN_SZ-1))/WIN_SZ)
BTW, the step where you create B from slices of A can be done using
B = num2cell(A,[2,3]); % yields a 16x1 cell array of 1x3x3 double arrays
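For comparison (not from the original thread; a NumPy sketch of my own, using the 16x3x3 shape from the question and the group size 2 from the movmean snippet above), the split-group-mean step amounts to reshaping the first dimension into (groups, group size) and averaging over the group axis:

import numpy as np

A = np.random.rand(16, 3, 3)               # same shape as the example
win = 2                                    # group size, as in the movmean example
grouped = A.reshape(16 // win, win, 3, 3)  # axes: (group, member, R, T)
C = grouped.mean(axis=1)                   # shape (8, 3, 3); C[0] == (A[0] + A[1]) / 2
assert np.allclose(C[0], (A[0] + A[1]) / 2)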

Fractal dimension algorithms give results of >2 for time series

I'm trying to compute the fractal dimension of a very specific time-series array.
I've found implementations of Higuchi FD algorithm:
import numpy as np

def hFD(a, k_max): #Higuchi FD
    L = []
    x = []
    N = len(a)
    for k in range(1, k_max):
        Lk = 0
        for m in range(0, k):
            #we pregenerate all idxs
            idxs = np.arange(1, int(np.floor((N-m)/k)), dtype=np.int32)
            Lmk = np.sum(np.abs(a[m+idxs*k] - a[m+k*(idxs-1)]))
            Lmk = (Lmk*(N - 1)/(((N - m)/ k)* k)) / k
            Lk += Lmk
        L.append(np.log(Lk/(m+1)))
        x.append([np.log(1.0/ k), 1])
    (p, r1, r2, s) = np.linalg.lstsq(x, L)
    return p[0]
from https://github.com/gilestrolab/pyrem/blob/master/src/pyrem/univariate.py
and Katz FD algorithm:
import numpy as np

def katz(data):
    n = len(data) - 1
    L = np.hypot(np.diff(data), 1).sum()  # Sum of distances
    d = np.hypot(data - data[0], np.arange(len(data))).max()  # furthest distance from first point
    return np.log10(n) / (np.log10(d/L) + np.log10(n))
from https://github.com/ProjectBrain/brainbits/blob/master/katz.py
I expect results of ~1.5 in both cases, however I get 2.2 and 4 instead...
hFD(x,4) = 2.23965648024 (the k value here is chosen as an example, however the result won't change much in the range 4-12; edit: I was able to get a result of ~1.9 with k=22, however this still does not make any sense);
katz(x) = 4.03911343057
which in theory should not be possible for a 1D time-series array.
The questions here are: are the Higuchi and Katz algorithms not suitable for time-series analysis in general, or am I doing something wrong on my side? Also, are there any other Python libraries with already implemented, error-free algorithms to verify my results?
My array of interest (each element represents a point in time t, t+1, t+2, ..., t+N):
x = np.array([373.4413096546802, 418.58026161917803,
395.7387698762124, 416.21163042783206,
407.9812265426947, 430.2355284504048,
389.66095393296763, 442.18969320408166,
383.7448638776275, 452.8931822090381,
413.5696828065546, 434.45932712853585
,429.95212301648996, 436.67612861616215,
431.10235365546964, 418.86935850068545,
410.84902747247423, 444.4188867775925,
397.1576881118471, 451.6129904245434,
440.9181246439599, 438.9857353268666,
437.1800408012741, 460.6251405281339,
404.3208481355302, 500.0432305427639,
380.49579242696177, 467.72953450552893,
333.11328535523967, 444.1171938340972,
303.3024198243042, 453.16332062153276,
356.9697406524534, 520.0720647379901,
402.7949987727925, 536.0721418821788,
448.21609036718445, 521.9137447208354,
470.5822486372967, 534.0572029633416,
480.03741443274765, 549.2104258193126,
460.0853321729541, 561.2705350421926,
444.52689144575794, 560.0835589548401,
462.2154563472787, 559.7166600213686,
453.42374550322353, 559.0591804941763,
421.4899935529862, 540.7970410737004,
454.34364779193913, 531.6018122709779,
437.1545739076901, 522.4262260216169,
444.6017030695873, 533.3991716674865,
458.3492761150962, 513.1735160522104])
The array for which you are trying to estimate the hFD is too short. You need to get a longer sample, or oversample the current one, to have at least 128 points for hFD and more than 4000 points for Katz:
import scipy.signal as signal
...
x_res = signal.resample(x, 128)
hFD(x_res, 4) will then be 1.74383694265

Data.Map vs. Data.Array for symmetric matrices?

Sorry for the vague question, but I hope for an experienced Haskeller this is a no-brainer.
I have to represent and manipulate symmetric matrices, so there are basically three different choices for the data type:
Complete matrix storing both the (i,j) and (j,i) element, although m(i,j) = m(j,i)
Data.Array (Int, Int) Int
A map, storing only elements (i,j) with i <= j (upper triangular matrix)
Data.Map (Int, Int) Int
A vector indexed by k, storing the upper triangular matrix given some vector order f(i,j) = k
Data.Array Int Int
Many operations are going to be necessary on the matrices: updating a single element, querying for rows and columns, etc. However, they will mainly act as containers; no linear algebra operations (inversion, det, etc.) will be required.
Which one of the options would be the fastest one in general, if the dimensionality of the matrices is going to be around 20x20? If I understand correctly, every update (with (//) in the case of arrays) requires a full copy, so going from 20x20 = 400 elements to 20*21/2 = 210 elements in cases 2 or 3 would make a lot of sense, but access is slower in case 2, and case 3 needs a conversion at some point.
Are there any guidelines?
Btw: the 3rd option is not really a good one, as computing f^-1 requires square roots.
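To make that last point concrete (my own illustration, in Python, assuming the row-major packed-triangle ordering f(i,j) = i*(i+1)/2 + j for j <= i, which matches the 2D formula from the first question above), the inverse mapping does indeed need a square root:

import math

def f(i, j):
    # linear index of element (i, j) with j <= i in the packed triangle
    return i * (i + 1) // 2 + j

def f_inv(k):
    # recover (i, j) from the linear index; note the square root
    i = (math.isqrt(8 * k + 1) - 1) // 2
    j = k - i * (i + 1) // 2
    return i, j

# round-trip check for a 20x20 lower triangle (210 elements)
assert all(f_inv(f(i, j)) == (i, j) for i in range(20) for j in range(i + 1))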
You could try using Data.Array with a specialized Ix instance that only generates the upper half of the matrix:
import Data.Array  -- brings the Ix class into scope (and listArray, used in the GHCi session below)

newtype Symmetric = Symmetric { pair :: (Int, Int) } deriving (Ord, Eq)

instance Ix Symmetric where
    range ((Symmetric (x1,y1)), (Symmetric (x2,y2))) =
        map Symmetric [(x,y) | x <- range (x1,x2), y <- range (y1,y2), x >= y]

    inRange (lo,hi) i = x <= hix && x >= lox && y <= hiy && y >= loy && x >= y
      where
        (lox,loy) = pair lo
        (hix,hiy) = pair hi
        (x,y) = pair i

    index (lo,hi) i
        | inRange (lo,hi) i = (x-loy) + (sum $ take (y-loy) [hix-lox, hix-lox-1 ..])
        | otherwise = error "Error in array index"
      where
        (lox,loy) = pair lo
        (hix,hiy) = pair hi
        (x,y) = pair i

sym x y
    | x < y     = Symmetric (y,x)
    | otherwise = Symmetric (x,y)
*Main Data.Ix> let a = listArray (sym 0 0, sym 6 6) [0..]
*Main Data.Ix> a ! sym 3 2
14
*Main Data.Ix> a ! sym 2 3
14
*Main Data.Ix> a ! sym 2 2
13
*Main Data.Ix> length $ elems a
28
*Main Data.Ix> let b = listArray (sym 0 0, sym 19 19) [0..]
*Main Data.Ix> length $ elems b
210
There is a fourth option: use an array of decreasingly-large arrays. I would go with either option 1 (using a full array and just storing every element twice) or this last one. If you intend to update a lot of elements, I strongly recommend using a mutable array; IOArray and STArray are popular choices.
Unless this is for homework or something, you should also take a peek at Hackage. A quick look suggests the problem of manipulating matrices has been solved several times already.
