Julia: faster matrix calculation - arrays

In my work, I have to deal with very large matrices.
For example, I use the following matrices.
using LinearAlgebra
#Pauli matrices
σ₁ = [0 1; 1 0]
σ₂ = [0 -im; im 0]
τ₁ = [0 1; 1 0]
τ₃ = [1 0; 0 -1]
#Trigonometric functions in real space
function EYE(Lx,Ly,Lz)
    N = Lx*Ly*Lz
    mat = Matrix{Complex{Float64}}(I, N, N)
    return mat
end
function SINk₁(Lx,Ly,Lz)
    N = Lx*Ly*Lz
    mat = zeros(Complex{Float64},N,N)
    for ix = 1:Lx
        for iy = 1:Ly
            for iz = 1:Lz
                for dx in -1:1
                    jx = ix + dx
                    jx += ifelse(jx > Lx,-Lx,0)
                    jx += ifelse(jx < 1,Lx,0)
                    for dy in -1:1
                        jy = iy + dy
                        jy += ifelse(jy > Ly,-Ly,0)
                        jy += ifelse(jy < 1,Ly,0)
                        for dz in -1:1
                            jz = iz + dz
                            ii = (iz-1)*Lx*Ly + (ix-1)*Ly + iy
                            jj = (jz-1)*Lx*Ly + (jx-1)*Ly + jy
                            if 1 <= jz <= Lz
                                if dx == +1 && dy == 0 && dz == 0
                                    mat[ii,jj] += -(im/2)
                                end
                                if dx == -1 && dy == 0 && dz == 0
                                    mat[ii,jj] += im/2
                                end
                            end
                        end
                    end
                end
            end
        end
    end
    return mat
end
function COSk₃(Lx,Ly,Lz)
    N = Lx*Ly*Lz
    mat = zeros(Complex{Float64},N,N)
    for ix = 1:Lx
        for iy = 1:Ly
            for iz = 1:Lz
                for dx in -1:1
                    jx = ix + dx
                    jx += ifelse(jx > Lx,-Lx,0)
                    jx += ifelse(jx < 1,Lx,0)
                    for dy in -1:1
                        jy = iy + dy
                        jy += ifelse(jy > Ly,-Ly,0)
                        jy += ifelse(jy < 1,Ly,0)
                        for dz in -1:1
                            jz = iz + dz
                            ii = (iz-1)*Lx*Ly + (ix-1)*Ly + iy
                            jj = (jz-1)*Lx*Ly + (jx-1)*Ly + jy
                            if 1 <= jz <= Lz
                                if dx == 0 && dy == 0 && dz == +1
                                    mat[ii,jj] += 1/2
                                end
                                if dx == 0 && dy == 0 && dz == -1
                                    mat[ii,jj] += 1/2
                                end
                            end
                        end
                    end
                end
            end
        end
    end
    return mat
end
Then, I calculate
kron(SINk₁(Lx,Ly,Lz),kron(σ₁,τ₁)) + kron(EYE(Lx,Ly,Lz) + COSk₃(Lx,Ly,Lz),kron(σ₂,τ₃))
This calculation, however, takes a long time for large Lx, Ly, Lz:
Lx = Ly = Lz = 15
@time kron(SINk₁(Lx,Ly,Lz),kron(σ₁,τ₁)) + kron(EYE(Lx,Ly,Lz) + COSk₃(Lx,Ly,Lz),kron(σ₂,τ₃))
4.692591 seconds (20 allocations: 8.826 GiB, 6.53% gc time)
Lx = Ly = Lz = 20
@time kron(SINk₁(Lx,Ly,Lz),kron(σ₁,τ₁)) + kron(EYE(Lx,Ly,Lz) + COSk₃(Lx,Ly,Lz),kron(σ₂,τ₃))
52.687861 seconds (20 allocations: 49.591 GiB, 2.69% gc time)
Are there faster ways to compute the Kronecker products and the addition, or more suitable definitions of EYE(Lx,Ly,Lz), SINk₁(Lx,Ly,Lz), and COSk₃(Lx,Ly,Lz)?

The problem you're having is really simple: you are allocating dense matrices with O((Lx·Ly·Lz)²) entries, and almost all of those entries are 0, so this is a huge waste of memory. You should almost certainly be using sparse arrays. That should bring your runtime down to reasonable levels.
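One possible sketch of that approach (not from the original answer), using the SparseArrays standard library. SINk₁s below is a hypothetical sparse rewrite of SINk₁ that keeps the same periodic stencil; COSk₃ can be rewritten in the same way:
using SparseArrays, LinearAlgebra
#Sparse identity of size N = Lx*Ly*Lz
EYEs(Lx,Ly,Lz) = sparse((1.0+0.0im)*I, Lx*Ly*Lz, Lx*Ly*Lz)
#Sparse analogue of SINk₁: collect the nonzero entries, then build the matrix once
function SINk₁s(Lx,Ly,Lz)
    N = Lx*Ly*Lz
    Is = Int[]; Js = Int[]; Vs = Complex{Float64}[]
    for ix = 1:Lx, iy = 1:Ly, iz = 1:Lz
        ii = (iz-1)*Lx*Ly + (ix-1)*Ly + iy
        for (dx, v) in ((+1, -im/2), (-1, im/2))   #the only nonzero hops in SINk₁
            jx = mod1(ix + dx, Lx)                 #periodic wrap, same as the ifelse pairs
            jj = (iz-1)*Lx*Ly + (jx-1)*Ly + iy
            push!(Is, ii); push!(Js, jj); push!(Vs, v)
        end
    end
    return sparse(Is, Js, Vs, N, N)                #duplicate entries are summed, like +=
end
With σ₁, τ₁, σ₂, τ₃ also wrapped in sparse(...), the kron calls and the addition then stay sparse end to end, so the full (4·Lx·Ly·Lz)×(4·Lx·Ly·Lz) operator is never materialized as a dense array.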

Related

How do you invert euclidean (transform and rotation only) matrices in C?

How do you invert 4x3 matrices that are only translation and rotation, no scale? The sort of thing you would use to do an OpenGL Matrix inverse (just without scaling)?
Assuming your TypeMatrix3x4 is a [3][4] matrix, and you are only transforming a matrix that contains rotation and translation at 1:1 scale, the following code seems to work:
This transposes the rotation matrix and applies the inverse of the translation.
TypeMatrix3x4 InvertHmdMatrix34( TypeMatrix3x4 mtoinv )
{
    int i, j;
    TypeMatrix3x4 out = { 0 };
    for( i = 0; i < 3; i++ )
        for( j = 0; j < 3; j++ )
            out.m[j][i] = mtoinv.m[i][j];
    for ( i = 0; i < 3; i++ )
    {
        out.m[i][3] = 0;
        for( j = 0; j < 3; j++ )
            out.m[i][3] += out.m[i][j] * -mtoinv.m[j][3];
    }
    return out;
}
You can solve this for any 3-dimensional affine transformation whose 3x3 transformation matrix is invertible. This allows you to include scaling and non-conformal transformations; the only requirement is that the 3x3 matrix be invertible.
Simply extend your 3x4 matrix to 4x4 by adding a row of all zeros except for a 1 in the last position, and invert that matrix, as shown below:
[[a b c d]   [[x]   [[x']
 [e f g h] *  [y] =  [y']
 [i j k l]    [z]    [z']
 [0 0 0 1]]   [1]]   [1 ]]   (added row)
It's easy to see that this 4x4 matrix, applied to your vector, produces exactly the same vector as before the extension.
If you get the inverse of that matrix, you'll have:
[[A B C D]   [[x']   [[x]
 [E F G H] *  [y'] =  [y]
 [I J K L]    [z']    [z]
 [0 0 0 1]]   [1 ]]   [1]]
It's easy to see that if this works in one direction, it must also work in the reverse direction: if A is the image of B, then B is the image of A through the inverse transformation. The only requirement is that the matrix be invertible.
Moreover, if you have a list of vectors you want to process, you can apply the Gauss elimination method to an extended matrix of the form:
[[a b c d x0' x1' x2' ... xn']
[e f g h y0' y1' y2' ... yn']
[i j k l z0' z1' z2' ... zn']
[0 0 0 1 1 1 1 ... 1 ]]
To obtain the inverse images of all the vectors, apply Gauss elimination to the matrix above to get:
[[1 0 0 0 x0 x1 x2 ... xn ]
[0 1 0 0 y0 y1 y2 ... yn ]
[0 0 1 0 z0 z1 z2 ... zn ]
[0 0 0 1 1 1 1 ... 1 ]]
and you will have solved n problems in one shot, because the column vectors above are exactly the ones that, once transformed, produce the original ones.
You can get a simple implementation of the Gauss/Jordan elimination method, which I wrote to teach my son linear algebra, here. It's open source (BSD license) and you can modify/adapt it to your needs. This method uses the last approach, and you can use it out of the box by trying the sist_lin program.
If you want the inverse transformation itself, put the following contents in the matrix and apply Gauss elimination:
a b c d 1 0 0 0
e f g h 0 1 0 0
i j k l 0 0 1 0
0 0 0 1 0 0 0 1
as input to sist_lin and you get:
1 0 0 0 A B C D <-- these are the coefs of the
0 1 0 0 E F G H inverse transformation
0 0 1 0 I J K L
0 0 0 1 0 0 0 1
you will have:
a * x + b * y + c * z + d = X
e * x + f * y + g * z + h = Y
i * x + j * y + k * z + l = Z
0 * x + 0 * y + 0 * z + 1 = 1
and
A * X + B * Y + C * Z + D = x
E * X + F * Y + G * Z + H = y
I * X + J * Y + K * Z + L = z
0 * X + 0 * Y + 0 * Z + 1 = 1
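As a quick illustration of the extend-and-invert idea (not from the original answer, written in Julia rather than C, with a made-up rotation-plus-translation matrix):
using LinearAlgebra
θ = 0.3
#Hypothetical 3x4 affine transform [R | t]: rotation about z plus a translation
M34 = [cos(θ)  (-sin(θ))  0  1.0;
       sin(θ)    cos(θ)   0  2.0;
       0         0        1  3.0]
M44  = [M34; 0 0 0 1]           #append the row [0 0 0 1]
Minv = inv(M44)                 #invert the extended 4x4 matrix
p = [0.5, -1.0, 2.0, 1.0]       #a point in homogeneous coordinates
q = M44 * p
@assert Minv * q ≈ p            #the inverse maps the image back onto the original point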

Defining variables explicitly vs accessing arrays

I am implementing the Runge-Kutta-Fehlberg method with adaptive step-size (RK45). I define and call my Butcher tableau in a notebook with
module FehlbergTableau
using StaticArrays
export A, B, CH, CT
A  = @SVector [ 0 , 2/9 , 1/3 , 3/4 , 1 , 5/6 ]
B  = @SMatrix [ 0        0         0        0      0
                2/9      0         0        0      0
                1/12     1/4       0        0      0
                69/128  -243/128   135/64   0      0
               -17/12    27/4     -27/5     16/15  0
                65/432  -5/16      13/16    4/27   5/144 ]
CH = @SVector [ 47/450 , 0 , 12/25 , 32/225 , 1/30 , 6/25 ]
CT = @SVector [ -1/150 , 0 , 3/100 , -16/75 , -1/20 , 6/25 ]
end
using .FehlbergTableau
If I code the algorithm for RK45 straightforwardly as
function infinitesimal_flow(A::SVector{6,Float64}, B::SMatrix{6,5,Float64}, CH::SVector{6,Float64}, CT::SVector{6,Float64}, t0::Float64, Δt::Float64, J∇H::Function, x0::SVector{N,Float64}) where N
    k1 = Δt * J∇H( t0 + Δt*A[1], x0 )
    k2 = Δt * J∇H( t0 + Δt*A[2], x0 + B[2,1]*k1 )
    k3 = Δt * J∇H( t0 + Δt*A[3], x0 + B[3,1]*k1 + B[3,2]*k2 )
    k4 = Δt * J∇H( t0 + Δt*A[4], x0 + B[4,1]*k1 + B[4,2]*k2 + B[4,3]*k3 )
    k5 = Δt * J∇H( t0 + Δt*A[5], x0 + B[5,1]*k1 + B[5,2]*k2 + B[5,3]*k3 + B[5,4]*k4 )
    k6 = Δt * J∇H( t0 + Δt*A[6], x0 + B[6,1]*k1 + B[6,2]*k2 + B[6,3]*k3 + B[6,4]*k4 + B[6,5]*k5 )
    TE = CT[1]*k1 + CT[2]*k2 + CT[3]*k3 + CT[4]*k4 + CT[5]*k5 + CT[6]*k6
    xt = x0 + CH[1]*k1 + CH[2]*k2 + CH[3]*k3 + CH[4]*k4 + CH[5]*k5 + CH[6]*k6
    norm(TE), xt
end
and compare it with the more compact implementation
function infinitesimal_flow_2(A::SVector{6,Float64}, B::SMatrix{6,5,Float64}, CH::SVector{6,Float64}, CT::SVector{6,Float64}, t0::Float64, Δt::Float64, J∇H::Function, x0::SVector{N,Float64}) where N
    k = MMatrix{N,6}(0.0I)
    TE = zero(x0); xt = x0
    for i=1:6
        # EDIT: this is wrong! there should be a new variable here, as pointed
        # out by Lutz Lehmann: xs = x0
        for j=1:i-1
            # xs += B[i,j] * k[:,j]
            x0 += B[i,j] * k[:,j] #wrong
        end
        k[:,i] = Δt * J∇H(t0 + Δt*A[i], x0)
        TE += CT[i]*k[:,i]
        xt += CH[i]*k[:,i]
    end
    norm(TE), xt
end
Then the first function, which defines variables explicitly, is much faster:
J∇H(t::Float64, X::SVector{N,Float64}) where N = @SVector [ -X[2]^2, X[1] ]
x0 = SVector{2}([0.0, 1.0])
infinitesimal_flow(A, B, CH, CT, 0.0, 1e-2, J∇H, x0)
infinitesimal_flow_2(A, B, CH, CT, 0.0, 1e-2, J∇H, x0)
using BenchmarkTools
@btime infinitesimal_flow($A, $B, $CH, $CT, 0.0, 1e-2, $J∇H, $x0)
>> 19.387 ns (0 allocations: 0 bytes)
@btime infinitesimal_flow_2($A, $B, $CH, $CT, 0.0, 1e-2, $J∇H, $x0)
>> 50.985 ns (0 allocations: 0 bytes)
I cannot find a type instability or anything to justify the lag, and for more complex tableaus it is mandatory that I use the algorithm in loop form. What am I doing wrong?
P.S.: The bottleneck in infinitesimal_flow_2 is the line k[:,i] = Δt * J∇H(t0 + Δt*A[i], x0).
Each stage of the RK method computes its evaluation point directly from the base point of the RK step. This is explicit in the first method. In the second method you would have to reset the point computation in each stage, such as in
for i=1:6
    xs = x0
    for j=1:i-1
        xs += B[i,j] * k[:,j]
    end
    k[:,i] = Δt * J∇H(t0 + Δt*A[i], xs)
    ...
The slightest error in the step computation can catastrophically throw off the step-size controller, forcing the step size to fall towards zero and thus the effort to increase drastically. An example is the 4101 error in RKF45
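For completeness, a sketch of the whole loop version with that fix applied (this is the question's second function with the xs variable added and renamed infinitesimal_flow_2_fixed; it is not code from the original answer):
using LinearAlgebra, StaticArrays
function infinitesimal_flow_2_fixed(A::SVector{6,Float64}, B::SMatrix{6,5,Float64}, CH::SVector{6,Float64}, CT::SVector{6,Float64}, t0::Float64, Δt::Float64, J∇H::Function, x0::SVector{N,Float64}) where N
    k = zeros(MMatrix{N,6,Float64})
    TE = zero(x0); xt = x0
    for i = 1:6
        xs = x0                          # every stage restarts from the base point x0
        for j = 1:i-1
            xs += B[i,j] * k[:,j]
        end
        k[:,i] = Δt * J∇H(t0 + Δt*A[i], xs)
        TE += CT[i]*k[:,i]
        xt += CH[i]*k[:,i]
    end
    norm(TE), xt
end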

Julia- Make Array{Float64,3} in a short time

I want to get an Array{Float64,3} from a matrix-valued function H(Lx,Ly,Lz), where Lx, Ly, Lz are parameters and H is an (Lx×Ly×Lz)×(Lx×Ly×Lz) matrix.
The sample code is
using LinearAlgebra
eye(T::Type,n) = Diagonal{T}(I, n)
eye(n) = eye(Float64,n)
function H(Lx,Ly,Lz) #def of H
N = Lx*Ly*Lz 
 mat_Htb = zeros(Complex{Float64},N,N)
 for iz = 1:Lz
 for ix = 1:Lx
 for iy=1:Ly
 for dz in -1:1
 jz = iz + dz
 for dx in -1:1
 jx = ix + dx
 for dy in -1:1
 jy = iy + dy
 ii = (iz-1)*Lx*Ly + (ix-1)*Ly + (iy-1) + 1
 jj = (jz-1)*Lx*Ly + (jx-1)*Ly + (jy-1) + 1
 if 1 <= jx <= Lx && 1 <= jy <= Ly && 1 <= jz <= Lz
 if dx == +1 && dy == 0 && dz == 0
 mat_Htb[ii,jj] += im  
 end
 if dx == -1 && dy == 0 && dz == 0
 mat_Htb[ii,jj] += im/4  
 end
 if dx == 0 && dy == +1 && dz == 0
 mat_Htb[ii,jj] += im/2
 end
 if dx == 0 && dy == -1 && dz == 0
 mat_Htb[ii,jj] += im
 end
 if dx == 0 && dy == 0 && dz == +1
 mat_Htb[ii,jj] += -im
 end
 if dx == 0 && dy == 0 && dz == -1
 mat_Htb[ii,jj] += im*(3/7)
 end
 if dx == 0 && dy == 0 && dz == 0
 mat_Htb[ii,jj] += im
 end
 end
 end
 end
 end
 end
 end
 end
 return mat_Htb
end
Lx = 10 #systemsize-parameters
Ly = 10
Lz = 10
ψ0 = Complex{Float64}[] #def of \psi0 ,(Lx×Ly×Lz)×1 vector
for iz = 1:Lz
for ix = 1:Lx
for iy=1:Ly
gauss = exp(-((ix-5)^2 + (iy-5)^2 + (iz-5)^2))
push!(ψ0,gauss)
end
end
end
ψ(t) = exp((-im*t).*H(Lx,Ly,Lz))*ψ0 #time-evolution
abs2ψ(t) = abs2.(ψ(t)./norm(ψ(t))) #normalized density
Then, I tried to make an Array{Float64,3} like this.
x = 1:Lx # our value range
y = 1:Ly
z = 1:Lz
t = 15 #time
ρ(ix,iy,iz) = abs2ψ(t)[(iz-1)*Lx*Ly + (ix-1)*Ly + (iy-1) + 1]
density = Float64[ρ(ix,iy,iz) for ix in x, iy in y,iz in z]
H(Lx,Ly,Lz), ψ(t), abs2ψ(t) and ρ(ix,iy,iz) are each computed without trouble.
But building density takes about 30 minutes.
Ultimately, I will loop this calculation over t, so I want to reduce the computation time.
Could you tell me how to solve this problem?
There are still numerous things that could probably be improved, but the following version should already be much faster than yours.
The key thing to remember is to try not to recompute the same thing several times, especially if it takes some time to compute and you're going to re-use the result a large number of times.
In your example, this applies to:
H, which only depends on Lx, Ly and Lz, and as such can be computed once and for all;
ψ and abs2ψ, which depend on H and t, and should therefore be updated at each time step, but can be re-used for all (ix, iy, iz) triplets.
using LinearAlgebra
function H(Lx,Ly,Lz)
N = Lx*Ly*Lz 
mat_Htb = zeros(Complex{Float64},N,N)
for iz = 1:Lz
for ix = 1:Lx
for iy=1:Ly
for dz in -1:1
jz = iz + dz
for dx in -1:1
jx = ix + dx
for dy in -1:1
jy = iy + dy
ii = (iz-1)*Lx*Ly + (ix-1)*Ly + (iy-1) + 1
jj = (jz-1)*Lx*Ly + (jx-1)*Ly + (jy-1) + 1
if 1 <= jx <= Lx && 1 <= jy <= Ly && 1 <= jz <= Lz
if dx == +1 && dy == 0 && dz == 0
mat_Htb[ii,jj] += im
end
if dx == -1 && dy == 0 && dz == 0
mat_Htb[ii,jj] += im/4
end
if dx == 0 && dy == +1 && dz == 0
mat_Htb[ii,jj] += im/2
end
if dx == 0 && dy == -1 && dz == 0
mat_Htb[ii,jj] += im
end
if dx == 0 && dy == 0 && dz == +1
mat_Htb[ii,jj] += -im
end
if dx == 0 && dy == 0 && dz == -1
mat_Htb[ii,jj] += im*(3/7)
end
if dx == 0 && dy == 0 && dz == 0
mat_Htb[ii,jj] += im
end
end
end
end
end
end
end
end
return mat_Htb
end
function run(Lx, Ly, Lz)
ψ0 = Complex{Float64}[] #def of \psi0 ,(Lx×Ly×Lz)×1 vector
for iz = 1:Lz
for ix = 1:Lx
for iy=1:Ly
gauss = exp(-((ix-5)^2 + (iy-5)^2 + (iz-5)^2))
push!(ψ0,gauss)
end
end
end
x = 1:Lx # our value range
y = 1:Ly
z = 1:Lz
t = 15 #time
H_ = H(Lx,Ly,Lz)
ψ = exp((-im*t).*H_)*ψ0 #time-evolution
abs2ψ = abs2.(ψ./norm(ψ)) #normalized density
ρ(ix,iy,iz) = abs2ψ[(iz-1)*Lx*Ly + (ix-1)*Ly + (iy-1) + 1]
density = Float64[ρ(ix,iy,iz) for ix in x, iy in y,iz in z]
end
Lx = 10 #systemsize-parameters
Ly = 10
Lz = 10
run(Lx, Ly, Lz)
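Two further remarks (not part of the original answer, and assuming the same definitions as above). First, if you loop over several times t, H_ = H(Lx,Ly,Lz) only needs to be computed once, outside the loop. Second, since the linear index used in ρ runs fastest over iy, then ix, then iz, the comprehension can be replaced by a reshape plus permutedims:
H_ = H(Lx, Ly, Lz)                  #computed once, outside the time loop
for t in 0:15                       #hypothetical set of times
    ψ = exp((-im*t) .* H_) * ψ0
    abs2ψ = abs2.(ψ ./ norm(ψ))
    density = permutedims(reshape(abs2ψ, (Ly, Lx, Lz)), (2, 1, 3))   #density[ix, iy, iz]
    #... use density here ...
end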
For questions like this, which are very specific to a code that you want to optimize, I tend to think that posting on Julia's discourse forum would be more appropriate.

Optimization of finding all pairs between two arrays

The problem: for two arrays of integers A and B, I'm trying to find all (i, j) such that A[i] == B[j].
The issue is that I need to run this on millions upon millions of arrays of integers (each up to maybe 100 in length), so speed is the name of the game! I'd like help milking all the speed I can from this, or even just a better overall function.
My approach so far is to virtually sort A and B via:
sorted(range(len(A)), key=lambda k: A[k])
It then goes through, raising either i or j (depending on which is lower)
If a match is found between A and B, it scans A and B until it finds the next value that doesn't match, and treats it as a "matching range".
It then saves the cross product of these ranges as found pairs.
The code terminates when either:
A and B are at the end of their respective arrays, or when one is at the end and the other is currently pointing to a value that is greater (as it would be impossible to find a pair by scanning farther)
A = [3,7,8,2,1] #Example arrays. Lengths can be different.
B = [4,2,9,3,6,3] #Real arrays will be longer (up to 100).
SortA = sorted(range(len(A)), key=lambda k: A[k])
SortB = sorted(range(len(B)), key=lambda k: B[k])
solns = []
i = 0
j = 0
done = False
while(not done):
    if(A[SortA[i]] == B[SortB[j]]):
        if(i == len(A) - 1):
            endA = 0
        for m in range(len(A) - i - 1):
            endA = m + 1
            if(A[SortA[i]] != A[SortA[i + m + 1]]):
                endA = m
                break
        if(j == len(B) - 1):
            endB = 0
        for n in range(len(B) - j - 1):
            endB = n + 1
            if(B[SortB[j]] != B[SortB[j + n + 1]]):
                endB = n
                break
        for r in range(i, i + endA + 1):
            for s in range(j, j + endB + 1):
                solns.append((r,s))
        i = i + endA + 1
        j = j + endB + 1
    else:
        if(A[SortA[i]] < B[SortB[j]]):
            i = i + 1
        else:
            j = j + 1
    if(i == len(A)):
        if(j == len(B)):
            done = True
        else:
            if(A[SortA[len(A)-1]] < B[SortB[j]]):
                done = True
        i = i - 1
    if(j == len(B)):
        if(B[SortB[len(B)-1]] < A[SortA[i]]):
            done = True
        j = j - 1
The expected results for A and B in the above code are:
Match : (2, 2) Index: (3,1)
Match : (3, 3) Index: (0,3)
Match : (3, 3) Index: (0,5)
You can simply use the built-in == operator together with a library like pandas. Here is an example:
import pandas as pd
A = pd.DataFrame({'a': [0,1], 'b': [2,3]})
B = pd.DataFrame({'a': [0,1], 'b': [2,3]})
print( (A==B).all().all() )
A = pd.DataFrame({'a': [0,-1], 'b': [2,3]})
B = pd.DataFrame({'a': [0,1], 'b': [2,3]})
print( (A==B).all().all() )
output:
True
False

Returning a value from a matrix in a function

I have a problem with my function: it is not giving me the appropriate value from the matrix MB. The function is meant to calculate the inverse of a matrix. Every time I hit the button, Label1.Text shows '0', which is not the right number.
Can you please help me find out what I did wrong here, so that I get the right value of the inverse of jA at position (1,1)?
Public Class Form1
Function MatrixInverse(ma(,), cf, c) As Double
Dim JJ = 0
Dim J = 0
Dim L = 0
Dim K = 0
Dim F = 0
Dim D As Double = 0
Dim EA As Double = 0.0
Dim i
i = ma.GetLength(0)
J = 0
JJ = 0
Dim MB(i, i)
While JJ < i
While J < i
If J = JJ Then
MB(JJ, J) = 1
Else
MB(JJ, J) = 0
End If
J = J + 1
End While
JJ = JJ + 1
End While
JJ = 0
J = 0
While JJ < i
While J < i
D = 1 / ma(JJ, JJ)
L = JJ
While K < i
ma(L, K) = ma(L, K) * D
MB(L, K) = MB(L, K) * D
K = K + 1
End While
EA = ma(J, JJ)
If J <> JJ Then
F = 0
While F < i
ma(J, F) = ma(J, F) - (EA * ma(JJ, F))
MB(J, F) = MB(J, F) - (EA * MB(JJ, F))
F = F + 1
End While
End If
J = J + 1
End While
F = 0
EA = 0
D = 0
K = 0
J = 0
JJ = JJ + 1
End While
Return MB(cf, c)
End Function
Private Sub Form1_Load(sender As Object, e As EventArgs) Handles MyBase.Load
End Sub
Private Sub Button1_Click(sender As Object, e As EventArgs) Handles Button1.Click
Dim jA(3, 3)
jA = {{11, 4, 12, 5}, {7, 5, 6, 2.1}, {13, 14, 10, 8.1}, {3.1, 2, 1.09, 3.4}}
Label1.Text = MatrixInverse(jA, 1, 1)
End Sub
End Class
I found my error. I simply forgot one line: resetting J to 0 before adding 1 to JJ:
While JJ < i
    While J < i
        If J = JJ Then
            MB(JJ, J) = 1
        Else
            MB(JJ, J) = 0
        End If
        J = J + 1
    End While
    J = 0   ' <-- the missing line
    JJ = JJ + 1
End While
