1D Coupled Transient Diffusion in FiPy with a Reactive Boundary Condition

I would like to solve the transient diffusion equation for two compounds, A and B, as shown in the image below; I think the image is the clearest way to state the problem.
[Image: diffusion equations and boundary conditions]
As you can see, the reaction occurs only at the surface, and the flux of A is equal to the flux of B, so the two equations are coupled only at the surface. The boundary condition is similar to the Robin boundary condition explained in the FiPy manual; the main difference is the presence of a second variable in the boundary condition. Does anybody have an idea how to formulate this boundary condition in FiPy?
I guess I need to add some extra term to the Robin boundary condition, but I couldn't figure it out.
I really appreciate your help.
This is the code which solves the mentioned equations with an ordinary (uncoupled) Robin boundary condition at x = 0:
-D(dC_A/dx) = -kC_A
-D(dC_B/dx) = -kC_B
With this condition I can easily use the Robin boundary condition to solve the equations, and the results seem reasonable.
"""
Question for StackOverflow
"""
#%%
from fipy import Variable, FaceVariable, CellVariable, Grid1D, TransientTerm, DiffusionTerm, Viewer, ImplicitSourceTerm
from fipy.tools import numerix
#%%
##### Model parameters
L = 8.4853e-4 # m, boundary-layer thickness
dx = 1e-8 # mesh size
nx = int(L/dx)+1 # number of mesh cells
D = 1e-9 # m^2/s, diffusion coefficient
k = 1e-4 # m/s, reaction coefficient, R = k [c_A]
c_inf = 0. # general Robin condition; one can think of R = k ([c_A]-[c_inf])
c_init = 1. # initial concentration of compound A, mol/m^3
#%%
###### Meshing and variable definition
mesh = Grid1D(nx=nx, dx=dx)
c_A = CellVariable(name="c_A", hasOld=True,
                   mesh=mesh,
                   value=c_init)
c_B = CellVariable(name="c_B", hasOld=True,
                   mesh=mesh,
                   value=0.)
#%%
##### Right boundary condition
valueRight = c_init
c_A.constrain(valueRight, mesh.facesRight)
c_B.constrain(0., mesh.facesRight)
#%%
### ROBIN BC requirements, defining cellDistanceVectors
## This code fixes the cell distance calculation, following this link:
## https://stackoverflow.com/questions/60073399/fipy-problem-with-grid2d-celltofacedistancevectors-gives-error-uniformgrid2d
MA = numerix.MA
tmp = MA.repeat(mesh._faceCenters[..., numerix.NewAxis,:], 2, 1)
cellToFaceDistanceVectors = tmp - numerix.take(mesh._cellCenters, mesh.faceCellIDs, axis=1)
tmp = numerix.take(mesh._cellCenters, mesh.faceCellIDs, axis=1)
tmp = tmp[..., 1,:] - tmp[..., 0,:]
cellDistanceVectors = MA.filled(MA.where(MA.getmaskarray(tmp), cellToFaceDistanceVectors[:, 0], tmp))
#%%
##### Defining mask and Robin BC at left boundary
mask = mesh.facesLeft
Gamma0 = D
Gamma = FaceVariable(mesh=mesh, value=Gamma0)
Gamma.setValue(0., where=mask)
dPf = FaceVariable(mesh=mesh,
                   value=mesh._faceToCellDistanceRatio * cellDistanceVectors)
n = mesh.faceNormals
a = FaceVariable(mesh=mesh, value=k, rank=1)
b = FaceVariable(mesh=mesh, value=D, rank=0)
g = FaceVariable(mesh=mesh, value= k * c_inf, rank=0)
RobinCoeff = (mask * Gamma0 * n / (-dPf.dot(a)+b))
#%%
#### Making a plot
viewer = Viewer(vars=(c_A, c_B),
                datamin=-0.2, datamax=c_init * 1.4)
viewer.plot()
#%% Time step and simulation time definition
time = Variable()
t_simulation = 4 # seconds
timeStepDuration = .05
steps = int(t_simulation/timeStepDuration)
#%% PDE Equations
eqcA = (TransientTerm(var=c_A) == DiffusionTerm(var=c_A, coeff=Gamma) +
        (RobinCoeff * g).divergence
        - ImplicitSourceTerm(var=c_A, coeff=(RobinCoeff * a.dot(-n)).divergence))
eqcB = (TransientTerm(var=c_B) == DiffusionTerm(var=c_B, coeff=Gamma) -
        (RobinCoeff * g).divergence
        + ImplicitSourceTerm(var=c_B, coeff=(RobinCoeff * a.dot(-n)).divergence))
#%% A loop for solving PDE equations
while time() <= t_simulation:
    time.setValue(time() + timeStepDuration)
    c_B.updateOld()
    c_A.updateOld()
    res1 = res2 = 1e10
    viewer.plot()
    while (res1 > 1e-6) & (res2 > 1e-6):
        res1 = eqcA.sweep(var=c_A, dt=timeStepDuration)
        res2 = eqcB.sweep(var=c_B, dt=timeStepDuration)

It's possible to solve this as a fully implicit, coupled system. The code below simplifies the problem to a unit domain size and diffusion coefficient, with k set to 0.2. It captures the analytical solution quite well, with some caveats (see below).
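Reading the boundary condition off the source terms below (my transcription, since the question's image is not reproduced here), the surface at x = 0 carries a reversible exchange between A and B with equal and opposite fluxes:
D dc_A/dx |_{x=0} = k (c_A - c_B)
D dc_B/dx |_{x=0} = -k (c_A - c_B)
This is consistent with the expected steady-state values printed at the end of the script: with D = L = 1 and k = 0.2, linear steady-state profiles satisfying these conditions give c_A(0) = (1+k)/(1+2k) and c_B(0) = k/(1+2k).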
from fipy import (
    CellVariable,
    TransientTerm,
    DiffusionTerm,
    ImplicitSourceTerm,
    Grid1D,
    Viewer,
)
L = 1.0
nx = 1000
dx = L / nx
konstant = 0.2
coeff = 1.0
mesh = Grid1D(nx=nx, dx=dx)
var_a = CellVariable(mesh=mesh, value=1.0, hasOld=True)
var_b = CellVariable(mesh=mesh, value=0.0, hasOld=True)
var_a.constrain(1.0, mesh.facesRight)
var_b.constrain(0.0, mesh.facesRight)
coeff_mask = ~mesh.facesLeft * coeff
boundary_coeff = konstant * (mesh.facesLeft * mesh.faceNormals).divergence
eqn_a = TransientTerm(var=var_a) == DiffusionTerm(
    coeff_mask, var=var_a
) - ImplicitSourceTerm(boundary_coeff, var=var_a) + ImplicitSourceTerm(
    boundary_coeff, var=var_b
)
eqn_b = TransientTerm(var=var_b) == DiffusionTerm(
    coeff_mask, var=var_b
) - ImplicitSourceTerm(boundary_coeff, var=var_b) + ImplicitSourceTerm(
    boundary_coeff, var=var_a
)
eqn = eqn_a & eqn_b
for _ in range(5):
    var_a.updateOld()
    var_b.updateOld()
    eqn.sweep(dt=1e10)
Viewer((var_a, var_b)).plot()
print("var_a[0] (expected):", (1 + konstant) / (1 + 2 * konstant))
print("var_b[0] (expected):", konstant / (1 + 2 * konstant))
print("var_a[0] (actual):", var_a[0])
print("var_b[0] (actual):", var_b[0])
input("wait")
Note the following:
As written, the boundary condition is only first-order accurate, which doesn't really matter for this problem, but might hurt in higher dimensions. There might be ways to fix this, such as using a small cell near the boundary or adding an explicit second-order correction for the boundary condition.
The equations are coupled here. If uncoupled, it would probably require loads of iterations to reach equilibrium.
It did require a few iterations to reach equilibrium, but it shouldn't. That's probably due to the solver not converging adequately without a few tries; it might be that the coupled equations have some bad conditioning.
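If you need the transient behaviour rather than just the steady state, a minimal sketch reusing eqn, var_a and var_b from the script above could look like the following; the time step and the sweep/step counts here are assumptions, not tuned values:
# restart from the initial condition, then march in time
var_a.setValue(1.0)
var_b.setValue(0.0)
dt = 0.001
for step in range(100):
    var_a.updateOld()
    var_b.updateOld()
    for sweep in range(5):  # a few sweeps per step since the equations are coupled
        res = eqn.sweep(dt=dt)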

Related

Bootstrapping the uncertainty on an RMSE estimate of a location-scale generalized additive model

I have height data (numeric height data in cm; Height) of plants measured over time (numeric data expressed in days of the year; Doy). These data are grouped per genotype (factor data; Genotype) and individual plant (factor data; Individual). I've managed to calculate the RMSE of the location-scale GAM, but I can't figure out how to bootstrap the uncertainty estimate on the RMSE calculation, given that it is a hierarchical location-scale generalized additive model.
The code to extract the RMSE value looks something like this:
# The GAM
model <- gam(list(Height ~ s(Doy, bs = 'ps', by = Genotype) +
                    s(Doy, Individual, bs = "re") +
                    Genotype,
                  ~ s(Doy, bs = 'ps', by = Genotype) +
                    s(Doy, Individual, bs = "re") +
                    Genotype),
             family = gaulss(), # Gaussian location-scale
             method = "REML",
             data = data)
# Extract the model formula
form <- formula.gam(model)
# Cross-validation for the location
CV <- CVgam(form[[1]], data, nfold = 10, debug.level = 0, method = "GCV.Cp",
            printit = TRUE, cvparts = NULL, gamma = 1, seed = 29)
# The root mean square error is given by taking the square root of the MSE
sqrt(CV$cvscale[1])
There is only one height measurement per Individual per day of the year. I figure this is problematic for maintaining the exact same formulation of the GAM. In this regard, I was thinking of making sure that the same few Individuals of each genotype (let's say n = 4) were randomly sampled for each day of the year. I can't figure out how to proceed, though. Any ideas?
I've tried several methods, such as the boot package and for loops. An example of one of things I've tried is:
loops <- 3
RMSE <- numeric(loops) # initialize the output vector (the original assigned RMSE[i] without ever creating RMSE)
for (i in 1:loops){
  datax <- data %>%
    group_by(Doy, Genotype) %>%
    slice_sample(prop = 0.6, replace = T)
  model <- gam(list(Height ~ s(Doy, bs = 'ps', by = Genotype) +
                      s(Doy, Individual, bs = "re") +
                      Genotype,
                    ~ s(Doy, bs = 'ps', by = Genotype) +
                      s(Doy, Individual, bs = "re") +
                      Genotype),
               family = gaulss(),
               method = "REML",
               data = datax)
  # Extract the model formula
  form <- formula.gam(model)
  # Cross-validation for the location
  CV <- CVgam(form[[1]], datax, nfold = 10, debug.level = 0, method = "GCV.Cp",
              printit = TRUE, cvparts = NULL, gamma = 1, seed = 29)
  RMSE[i] <- sqrt(CV$cvscale[1])
}
RMSE
This loop runs very slowly and just returns the same RMSE value three times; surely there is an issue with the sampling.
Unfortunately, I can't share my data, but maybe somebody has an idea of how to proceed?
Many thanks!

Quantum walk on 3D grid

I am trying to apply the quantum coin walk on a 3D grid, with 3 Hadamard coins. However, I can't seem to get symmetric results after 3 steps. Is it simply not possible to have a symmetric probability distribution with such a coin?
Thank you.
P.S. The implementation is based on http://susan-stepney.blogspot.com/2014/02/mathjax.html and the position vector captures a 3D grid.
P.P.S. Has this been attempted in Qiskit? I couldn't get a perfectly symmetric result from the hard-coded matrix for some reason...
Not sure I answered your question, but starting from the code reference you mentioned, I only changed line 30 to ax = fig.add_subplot(111, projection = '3d') and line 3 to from mpl_toolkits.mplot3d import Axes3D:
from numpy import *
from matplotlib.pyplot import *
from mpl_toolkits.mplot3d import Axes3D
N = 100 # number of random steps
P = 2*N+1 # number of positions
coin0 = array([1, 0]) # |0>
coin1 = array([0, 1]) # |1>
C00 = outer(coin0, coin0) # |0><0|
C01 = outer(coin0, coin1) # |0><1|
C10 = outer(coin1, coin0) # |1><0|
C11 = outer(coin1, coin1) # |1><1|
C_hat = (C00 + C01 + C10 - C11)/sqrt(2.)
ShiftPlus = roll(eye(P), 1, axis=0)
ShiftMinus = roll(eye(P), -1, axis=0)
S_hat = kron(ShiftPlus, C00) + kron(ShiftMinus, C11)
U = S_hat.dot(kron(eye(P), C_hat))
posn0 = zeros(P)
posn0[N] = 1 # array indexing starts from 0, so index N is the central posn
psi0 = kron(posn0,(coin0+coin1*1j)/sqrt(2.))
psiN = linalg.matrix_power(U, N).dot(psi0)
prob = empty(P)
for k in range(P):
    posn = zeros(P)
    posn[k] = 1
    M_hat_k = kron(outer(posn, posn), eye(2))
    proj = M_hat_k.dot(psiN)
    prob[k] = proj.dot(proj.conjugate()).real
fig = figure()
ax = fig.add_subplot(111, projection = '3d')
plot(arange(P), prob)
plot(arange(P), prob, 'o')
loc = range(0, P, P // 10) #Location of ticks
xticks(loc)
xlim(0, P)
ax.set_xticklabels(range(-N, N+1, P // 10))
show()
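For completeness - this is not from the linked post, just a minimal NumPy sketch of a genuinely 3D walk with three independent Hadamard coins, one per axis. It updates the state tensor directly instead of building the (huge) full unitary, and it uses the coin state (|0> + i|1>)/√2 on each axis, which is the standard choice that makes the 1D Hadamard walk symmetric; the product structure then carries that symmetry to 3D:
import numpy as np

steps = 3
P = 2 * steps + 1  # positions per axis; large enough that nothing wraps around
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2.0)

# state indexed as psi[x, y, z, c1, c2, c3]; start at the centre with the
# symmetric coin state (|0> + i|1>)/sqrt(2) on each of the three coins
sym = np.array([1.0, 1.0j]) / np.sqrt(2.0)
psi = np.zeros((P, P, P, 2, 2, 2), dtype=complex)
psi[steps, steps, steps] = np.einsum('a,b,c->abc', sym, sym, sym)

for _ in range(steps):
    # toss the three coins (a Hadamard on each coin index)
    psi = np.einsum('ij,xyzjkl->xyzikl', H, psi)
    psi = np.einsum('ij,xyzkjl->xyzkil', H, psi)
    psi = np.einsum('ij,xyzklj->xyzkli', H, psi)
    # shift: coin m moves the walker along axis m (+1 for |0>, -1 for |1>)
    for axis, coin_axis in [(0, 3), (1, 4), (2, 5)]:
        up = np.take(psi, [0], axis=coin_axis)
        down = np.take(psi, [1], axis=coin_axis)
        psi = np.concatenate([np.roll(up, 1, axis=axis),
                              np.roll(down, -1, axis=axis)], axis=coin_axis)

prob = (np.abs(psi) ** 2).sum(axis=(3, 4, 5))
print("total probability:", prob.sum())  # should be 1
print("symmetric in x, y, z:",
      np.allclose(prob, prob[::-1]),
      np.allclose(prob, prob[:, ::-1]),
      np.allclose(prob, prob[:, :, ::-1]))
With this initial coin state the symmetry checks print True after 3 steps, so a symmetric distribution is possible; starting all three coins in |0> instead would give the familiar lopsided Hadamard-walk profile along each axis.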

Array subsetting in Julia

With the Julia Language, I defined a function to sample points uniformly inside the sphere of radius 3.14 using rejection sampling as follows:
function spherical_sample(N::Int64)
    # generate N points uniformly distributed inside the sphere
    # using rejection sampling:
    points = pi*(2*rand(5*N,3).-1.0)
    ind = sum(points.^2,dims=2) .<= pi^2
    ## ideally I wouldn't have to do this:
    ind_ = dropdims(ind,dims=2)
    return points[ind_,:][1:N,:]
end
I found a hack for subsetting arrays:
ind = sum(points.^2,dims=2) .<= pi^2
## ideally I wouldn't have to do this:
ind_ = dropdims(ind,dims=2)
But, in principle array indexing should be a one-liner. How could I do this better in Julia?
The problem is that you are creating a 2-dimensional index vector. You can avoid it by using eachrow:
ind = sum.(eachrow(points.^2)) .<= pi^2
So that your full answer would be:
function spherical_sample(N::Int64)
    points = pi*(2*rand(5*N,3).-1.0)
    ind = sum.(eachrow(points.^2)) .<= pi^2
    return points[ind,:][1:N,:]
end
Here is a one-liner:
points[(sum(points.^2,dims=2) .<= pi^2)[:],:][1:N, :]
Note that [:] is dropping a dimension so the BitArray can be used for indexing.
This does not answer your question directly (you already got two suggestions), but rather hints at how you could implement the whole procedure differently if you want it to be efficient.
The first point is to avoid generating 5*N rows of data up front - the problem is that this may well not be enough to produce N valid samples. The probability that a candidate is valid in your model is only about 50% (see the calculation below), so it is possible that there will not be enough points to choose from, and the [1:N, :] selection will throw an error.
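For reference, the acceptance probability is the ratio of the ball's volume to the bounding cube's volume:
V_ball / V_cube = (4/3)π·r³ / (2r)³ = π/6 ≈ 0.524
independent of the radius, so on average only about half of the candidate rows survive the test.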
Below is the code I would use that avoids this problem:
using Random # for rand!

function spherical_sample(N::Integer) # no need to require Int64 only here
    points = 2 .* pi .* rand(N, 3) .- pi # draw in [-pi, pi]^3; all operations are vectorized to avoid excessive allocations
    while N > 0 # run the loop until we have N valid rows
        v = @view points[N, :] # use a view to avoid allocating
        if sum(x -> x^2, v) <= pi^2 # sum accepts a transformation function as its first argument
            N -= 1 # row is valid - move to the previous one
        else
            rand!(v) # row is invalid - resample it in place
            @. v = 2 * pi * v - pi # again - do the computation in place via broadcasting
        end
    end
    return points
end
This one is pretty fast, and uses StaticArrays. You can probably also implement something similar with ordinary tuples:
using StaticArrays
function sphsample(N)
    T = SVector{3, Float64}
    v = Vector{T}(undef, N)
    n = 1
    while n <= N
        p = rand(T) .- 0.5 # candidate in [-0.5, 0.5]^3
        @inbounds v[n] = p .* 2π # scale to radius pi; overwritten if the candidate is invalid
        n += (sum(abs2, p) <= 0.25) # advance only when p lies inside the ball of radius 0.5
    end
    return v
end
On my laptop it is ~9x faster than the solution with views.

How to vectorize the antenna array factor expression in MATLAB

I have the antenna array factor expression here:
I have coded the array factor expression as given below:
lambda = 1;
M = 100;N = 200; %an M x N array
dx = 0.3*lambda; %inter-element spacing in x direction
m = 1:M;
xm = (m - 0.5*(M+1))*dx; %element positions in x direction
dy = 0.4*lambda;
n = 1:N;
yn = (n - 0.5*(N+1))*dy;
thetaCount = 360; % no of theta values
thetaRes = 2*pi/thetaCount; % theta resolution
thetas = 0:thetaRes:2*pi-thetaRes; % theta values
phiCount = 180;
phiRes = pi/phiCount;
phis = -pi/2:phiRes:pi/2-phiRes;
cmpWeights = rand(N,M); %complex Weights
AF = zeros(phiCount,thetaCount); %Array factor
tic
for i = 1:phiCount
    for j = 1:thetaCount
        for p = 1:M
            for q = 1:N
                AF(i,j) = AF(i,j) + cmpWeights(q,p)*exp((2*pi*1j/lambda)*(xm(p)*sin(thetas(j))*cos(phis(i)) + yn(q)*sin(thetas(j))*sin(phis(i))));
            end
        end
    end
end
toc
How can I vectorize the code for calculating the array factor (AF)?
I want the line:
AF(i,j) = AF(i,j) + cmpWeights(q,p)*exp((2*pi*1j/lambda)*(xm(p)*sin(thetas(j))*cos(phis(i)) + yn(q)*sin(thetas(j))*sin(phis(i))));
to be written in vectorized form (by modifying the for loop).
Approach #1: Full-throttle
The innermost nested loop generates this term on every iteration - cmpWeights(q,p)*exp((2*pi*1j/lambda)*(xm(p)*sin(thetas(j))*cos(phis(i)) + yn(q)*sin(thetas(j))*sin(phis(i)))) - and these terms are summed up iteratively to give us the final output in AF.
Let's call the exp(...) part B. Now, B basically has two parts: one is the scalar (2*pi*1j/lambda), and the other is (xm(p)*sin(thetas(j))*cos(phis(i)) + yn(q)*sin(thetas(j))*sin(phis(i))), formed from the variables that depend on the four iterators used in the original loopy version - i, j, p, q. Let's call this other part C for easy reference later on.
Let's put all that into perspective:
The loopy version had AF(i,j) = AF(i,j) + cmpWeights(q,p)*exp((2*pi*1j/lambda)*(xm(p)*sin(thetas(j))*cos(phis(i)) + yn(q)*sin(thetas(j))*sin(phis(i)))), which is equivalent to AF(i,j) = AF(i,j) + cmpWeights(q,p)*B, where B = exp((2*pi*1j/lambda)*C).
C = (xm(p)*sin(thetas(j))*cos(phis(i)) + yn(q)*sin(thetas(j))*sin(phis(i))) depends on the iterators i, j, p, q.
So, after porting this to a vectorized form, it ends up as follows -
%// 1) Define vectors corresponding to the iterators used in the loopy version
I = 1:phiCount;
J = 1:thetaCount;
P = 1:M;
Q = 1:N;

%// 2) Create a vectorized version of C using all four vector iterators
mult1 = bsxfun(@times,sin(thetas(J)),cos(phis(I)).');
mult2 = bsxfun(@times,sin(thetas(J)),sin(phis(I)).');
mult1_xm = bsxfun(@times,mult1(:),permute(xm,[1 3 2]));
mult2_yn = bsxfun(@times,mult2(:),yn);
C_vect = bsxfun(@plus,mult1_xm,mult2_yn);

%// 3) Create a vectorized version of B using the vectorized C
B_vect = reshape(exp((2*pi*1j/lambda)*C_vect),phiCount*thetaCount,[]);

%// 4) Final output as a matrix multiplication between the vectorized B and the weights
AF_vect = reshape(B_vect*cmpWeights(:),phiCount,thetaCount);
Approach #2: Less-memory intensive
This second approach reduces memory traffic by using the multiplicative property of the exponential - exp(A+B) = exp(A)*exp(B).
Now, the original loopy version was this -
AF(i,j) = AF(i,j) + cmpWeights(q,p)*exp((2*pi*1j/lambda)*...
    (xm(p)*sin(thetas(j))*cos(phis(i)) + yn(q)*sin(thetas(j))*sin(phis(i))))
So, after using that property, we end up with something like this -
K = (2*pi*1j/lambda);
part1 = K*xm(p)*sin(thetas(j))*cos(phis(i));
part2 = K*yn(q)*sin(thetas(j))*sin(phis(i));
AF(i,j) = AF(i,j) + cmpWeights(q,p)*exp(part1)*exp(part2);
Thus, the relevant vectorized approach would become something like this -
%// 1) Define vectors corresponding to the iterators used in the loopy version
I = 1:phiCount;
J = 1:thetaCount;
P = 1:M;
Q = 1:N;

%// 2) Define the constant used at the start of the EXP() call
K = (2*pi*1j/lambda);

%// 3) Perform the sine-cosine operations for part1 & part2 in a vectorized manner
mult1 = K*bsxfun(@times,sin(thetas(J)),cos(phis(I)).');
mult2 = K*bsxfun(@times,sin(thetas(J)),sin(phis(I)).');

%// 4) Perform exp(part1) & exp(part2) in a vectorized manner
part1_vect = exp(bsxfun(@times,mult1(:),xm));
part2_vect = exp(bsxfun(@times,mult2(:),yn));

%// 5) Perform the multiplications with cmpWeights for the final output
AF = reshape(sum((part1_vect*cmpWeights.').*part2_vect,2),phiCount,[]);
Quick Benchmarking
Here are the runtimes with the input data listed in the question for the original loopy approach and proposed approach #2 -
---------------------------- With Original Approach
Elapsed time is 358.081507 seconds.
---------------------------- With Proposed Approach #2
Elapsed time is 0.405038 seconds.
The runtimes suggest a huge performance improvement with Approach #2!
The basic trick is to figure out which things are constant and which things depend on the subscript terms - and are therefore matrix terms.
Within the sum:
C(n,m) (the complex weights) is a matrix
2π/λ is a constant
sin(θ)cos(φ) is a constant
x(m) and y(n) are vectors
So the two things I would do are:
Expand xm and yn into matrices using meshgrid()
Take all the constant-term stuff outside the loop.
Like this:
...
piFactor = 2 * pi * 1j / lambda;
[xgrid, ygrid] = meshgrid(xm, yn); % xgrid and ygrid will be size (N, M)
for i = 1:phiCount
    for j = 1:thetaCount
        xFactor = sin(thetas(j)) * cos(phis(i));
        yFactor = sin(thetas(j)) * sin(phis(i));
        expFactor = exp(piFactor * (xgrid * xFactor + ygrid * yFactor)); % expFactor is size (N, M)
        elements = cmpWeights .* expFactor; % elements of the sum, size (N, M)
        AF(i, j) = AF(i, j) + sum(elements(:)); % sum over all elements
    end
end
You could probably figure out how to vectorise the outer loop too, but hopefully that gives you a starting point.
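Not part of either MATLAB answer above, but if you happen to work in Python, the same split-the-exponential idea maps directly onto NumPy broadcasting; this sketch copies the sizes from the question (note that ex and ey together occupy a few hundred MB at these sizes, so shrink the grids or loop over phi if memory is tight):
import numpy as np

lam = 1.0
M, N = 100, 200  # an M x N array
xm = (np.arange(1, M + 1) - 0.5 * (M + 1)) * 0.3 * lam
yn = (np.arange(1, N + 1) - 0.5 * (N + 1)) * 0.4 * lam
thetas = np.arange(360) * 2 * np.pi / 360
phis = -np.pi / 2 + np.arange(180) * np.pi / 180
weights = np.random.rand(N, M)  # stand-in for the complex weights

K = 2j * np.pi / lam
sx = np.sin(thetas)[None, :] * np.cos(phis)[:, None]  # shape (phi, theta)
sy = np.sin(thetas)[None, :] * np.sin(phis)[:, None]
ex = np.exp(K * sx[..., None] * xm)  # shape (phi, theta, M)
ey = np.exp(K * sy[..., None] * yn)  # shape (phi, theta, N)
# AF[i, j] = sum over q, p of weights[q, p] * ex[i, j, p] * ey[i, j, q]
AF = np.einsum('ijq,qp,ijp->ij', ey, weights, ex, optimize=True)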

Repeating utility values in Value Iteration (Markov Decision Process)

I am trying to implement the value iteration algorithm for a Markov decision process using Python. I have one implementation, but it gives me many repeated values for the utilities. My transition matrix is quite sparse; probably this is causing the problem, but I am not very sure this assumption is correct. How should I correct this?
The code might be pretty shoddy - I am very new to value iteration, so please help me identify problems with my code. The reference code is http://carlo-hamalainen.net/stuff/mdpnotes/; I have used the ipod_mdp.py code file. Here is a snippet of my implementation:
num_of_states = 470 #total number of states
#initialization
V1 = [0.25] * num_of_states
get_target_index = state_index[(u'48.137654', u'11.579949')] #each state is a location
#print "The target index is ", get_target_index
V1[get_target_index] = -100 #assigning least cost to the target state
V2 = [0.0] * num_of_states
policy = [0.0] * num_of_states
count = 0.0
while max([abs(V1[i] - V2[i]) for i in range(num_of_states)]) > 0.001:
    print max([abs(V1[i] - V2[i]) for i in range(num_of_states)])
    print count
    for s in range(num_of_states): #for each state
        #initialize minimum action to the first action in the list
        min_action = actions_index[actions[0]] #initialize - get the action index for the first iteration
        min_action_cost = cost[s, actions_index[actions[0]]] #initialize the cost
        for w in range(num_of_states):
            if (s, state_index[actions[0]], w) in transitions: #if this transition exists in the matrix - non-zero value
                min_action_cost += 0.9 * transitions[s, state_index[actions[0]], w] * V1[w]
            else:
                min_action_cost += 0.9 * 0.001 * V1[w] #if not - give it a small value of 0.001 instead of 0.0
        #get the minimum action cost for the state
        for a in actions:
            this_cost = cost[s, actions_index[a]]
            for w in range(num_of_states):
                # if index_state[w] != 'm':
                if (s, state_index[a], w) in transitions:
                    this_cost += 0.9 * transitions[s, state_index[a], w] * V1[w]
                else:
                    this_cost += 0.9 * 0.001 * V1[w]
            if this_cost < min_action_cost:
                min_action = actions_index[a]
                min_action_cost = this_cost
        V2[s] = min_action_cost
        policy[s] = min_action
    V1, V2 = V2, V1 #swap
    count += 1
Thank you very much.
I am not sure I understand your code fully. I will just leave my implementation here in case someone needs it.
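For reference, the update the code below implements is the standard Bellman backup for a state-dependent reward:
V_{k+1}(s) = R(s) + γ · max_a Σ_{s'} P(s, a, s') · V_k(s')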
import numpy as np

def valueIteration(R, P, discount, threshold):
    V = np.copy(R)
    old_V = np.copy(V)
    error = float("inf")
    while error > threshold:
        old_V, V = (V, old_V)
        max_values = np.dot(P, old_V).max(axis=1)
        np.copyto(V, R + discount * max_values)
        error = np.linalg.norm(V - old_V)
    return V
S = 30
A = 4
R = np.zeros(S)
# Goal state is S-2; S-1 is an absorbing dwell state
R[S-2] = 1
P = np.random.rand(S, A, S)
# Goal state goes to the dwell state
P[S-2, :, :] = 0
P[S-2, :, S-1] = 1
P[S-1, :, :] = 0
P[S-1, :, S-1] = 1
for s in range(S-2): # goal and dwell states do not need normalization
    for a in range(A):
        P[s, a, :] /= P[s, a, :].sum()
V = valueIteration(R, P, 0.97, 0.001)
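The question also tracked a policy; that is not part of this answer's function, but as a sketch, a greedy policy can be read off the converged values under the same (S, A, S) conventions:
policy = np.argmax(np.dot(P, V), axis=1)  # np.dot(P, V) has shape (S, A); pick the best action per state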
