How to fully parallelise sequential inner loops in Fortran with OpenMP

I'm attempting to parallelise some old fixed format Fortran code with OpenMP. I can't figure out how to fully parallelise the following structure of nested loops, with one outer loop and 2 sequential inner loops:
      do y = 1,ny
         do x = 1,nx
            calculation 1
         enddo
         intermediate calculation (calculation 1)
         do x = 1,nx
            calculation 2 (intermediate calculation)
         enddo
      enddo
Calculation 2 cannot be included in the first inner loop, and it must stay inside the outer loop rather than move to a separate outer loop. This is because it depends on an intermediate calculation, which itself depends on all values of calculation 1.
I am compiling with gfortran, and setting the environment variable OMP_NUM_THREADS=4. The following code illustrates the 3 approaches I've tested:
c Test of using OpenMP to parallelise sequential inner loops and an
c outer loop
c Use export OMP_NUM_THREADS=4
      program parallelTest
c declarations
      implicit none
      integer nx, ny
      parameter (nx = 2, ny = 5)
      integer omp_get_thread_num, i, j, A(nx,ny), B(nx,ny), C(nx,ny),
     >        D(nx,ny), E(nx,ny), F(nx,ny)
c initialisation
      A = 7
      B = 7
      C = 7
      D = 7
      E = 7
      F = 7
c attempt 1: just executes first loop, 2nd loop is ignored
c$omp parallel do shared(A,B) private(i,j) schedule(static) collapse(2)
      do j = 1,ny
         do i = 1,nx
            A(i,j) = omp_get_thread_num()
         enddo
         do i = 1,nx
            B(i,j) = omp_get_thread_num()
         enddo
      enddo
c$omp end parallel do
c attempt 2: only parallelises outer loop
c$omp parallel do shared(C,D) private(i,j) schedule(static)
      do j = 1,ny
         do i = 1,nx
            C(i,j) = omp_get_thread_num()
         enddo
         do i = 1,nx
            D(i,j) = omp_get_thread_num()
         enddo
      enddo
c$omp end parallel do
c attempt 3: only parallelises inner loops
c$omp parallel shared(E,F) private(i,j)
      do j = 1,ny
c$omp do schedule(static)
         do i = 1,nx
            E(i,j) = omp_get_thread_num()
         enddo
c$omp end do
c$omp do schedule(static)
         do i = 1,nx
            F(i,j) = omp_get_thread_num()
         enddo
c$omp end do
      enddo
c$omp end parallel
c print output to terminal
      do i = 1,nx
         print *, 'A(', i, ',:) = ', A(i,:)
      enddo
      print *
      do i = 1,nx
         print *, 'B(', i, ',:) = ', B(i,:)
      enddo
      print *
      print *
      do i = 1,nx
         print *, 'C(', i, ',:) = ', C(i,:)
      enddo
      print *
      do i = 1,nx
         print *, 'D(', i, ',:) = ', D(i,:)
      enddo
      print *
      print *
      do i = 1,nx
         print *, 'E(', i, ',:) = ', E(i,:)
      enddo
      print *
      do i = 1,nx
         print *, 'F(', i, ',:) = ', F(i,:)
      enddo
      end
This gives the following output:
A( 1 ,:) = 0 0 1 2 3
A( 2 ,:) = 0 1 1 2 3
B( 1 ,:) = 7 7 7 7 7
B( 2 ,:) = 7 7 7 7 7
C( 1 ,:) = 0 0 1 2 3
C( 2 ,:) = 0 0 1 2 3
D( 1 ,:) = 0 0 1 2 3
D( 2 ,:) = 0 0 1 2 3
E( 1 ,:) = 0 0 0 0 0
E( 2 ,:) = 1 1 1 1 1
F( 1 ,:) = 0 0 0 0 0
F( 2 ,:) = 1 1 1 1 1
In approach 1, the outer loop and first inner loop are parallelised as I expect, but the 2nd loop doesn't execute (presumably because the do doesn't immediately follow a c$omp do?). In approach 2, only the outer loop is parallelised. In approach 3, only the inner loops are parallelised.
My question is: how do I get matrix B to be the same as matrix A? This seems like it should be a straightforward task; I'm assuming there's an OpenMP clause or structure I'm not utilising, but would much appreciate some pointing in the right direction of what to search for (new to OpenMP!).

If the information flow is as you have described it, I would re-write your loops as
      do y = 1,ny
         do x = 1,nx
            calculation 1
         enddo
      enddo
      do y = 1,ny
         intermediate calculation (calculation 1)
      enddo
      do y = 1,ny
         do x = 1,nx
            calculation 2 (intermediate calculation)
         enddo
      enddo
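Then parallelise each y loop separately. This requires storing the results of calculation 1 and of the intermediate calculation for every y. The first and third nests are then perfectly nested, so collapse(2) can be applied to them.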
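The restructuring above can be checked on a toy problem before touching the Fortran. The following Python sketch only illustrates the loop splitting, not OpenMP itself. The arithmetic (x + y, a row sum, a product) is a made-up stand-in for calculation 1, the intermediate calculation and calculation 2, and the function names are hypothetical.

```python
def fused(nx, ny):
    """Original structure: both inner loops live inside one outer loop."""
    out = []
    for y in range(ny):
        calc1 = [x + y for x in range(nx)]           # stand-in for calculation 1
        inter = sum(calc1)                           # needs all of calc1 for this y
        out.append([inter * x for x in range(nx)])   # stand-in for calculation 2
    return out


def split(nx, ny):
    """Restructured: three separate passes over y, each free of cross-y dependencies."""
    # Pass 1: calculation 1 for every (y, x), stored for later passes.
    calc1 = [[x + y for x in range(nx)] for y in range(ny)]
    # Pass 2: one intermediate value per y.
    inter = [sum(row) for row in calc1]
    # Pass 3: calculation 2 for every (y, x).
    return [[inter[y] * x for x in range(nx)] for y in range(ny)]


print(fused(2, 5) == split(2, 5))
```

The price of the split is the extra storage for calc1 and inter, which the fused version only keeps for one y at a time.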

Related

Matrix (arrays) in Lua - swap rows and apply row operation

I have this code
--[[
Gaussian Elimination in Lua
--]]
-- print matrix
function printmatrix(m)
for r=1,#m do
for c=1,#m[r] do
io.write(m[r][c])
if c < #m[r] then io.write(", ") end
end
print() -- print new line
end
end
-- read matrix in CSV format
function readcsv()
local m = {}
while true do
local line = io.read("l") -- read line not including the end of line character
if line==nil or line=="" then break end -- blank line or bad input ends matrix
local row, index = {}, 0
-- the next line is tricky and goes over all entries in the row
for w in string.gmatch(line,"([^,]*),?") do
local v = tonumber(w) -- convert entry to a number
index = index+1
if v==nil then
row[index] = 0 -- default value if we couldn't read the number
else
row[index] = v -- if number is valid
end
end
m[ #m+1 ] = row
end
return m
end
-- determine the size of m and check it is rectangular
function dim(m)
local rows = #m -- number of rows
local cols = 0 -- number of columns
if rows > 0 then cols = #m[1] end
-- check that matrix is rectangular
for i=2,rows do
if cols ~= #m[i] then error("not rectangular!") end
end
return rows, cols
end
-- if m[r][c] is zero swap row r with some row i>r to make m[r][c] nonzero, if possible
function swap(m,r,c)
local nrows, ncols = dim(m)
if r<=0 or r>nrows or c<=0 or c>ncols then error("position out of range") end
if m[r][c] ~= 0 then
-- nothing to do
return
end
-- find a suitable row
local i=r+1
while i <= nrows do
if m[i][c] ~= 0 then break end
i = i+1
end
if i <= nrows then
-- swap rows i,r
-- DO IT!
end
end
-- if m[r][c] is nonzero apply row operations to make each m[i][c]==0 for i>r
function clear(m,r,c)
local nrows, ncols = dim(m)
if r<=0 or r>nrows or c<=0 or c>ncols then error("position out of range") end
if m[r][c] == 0 then
-- nothing to do
return
end
for i=r+1,nrows do
local f = m[i][c] / m[r][c] do
-- apply row_i = row_i - f*row_r
-- DO IT!
end
end
end
-- apply Gaussian elimination to m to get it into echelon form
function echelon(m)
local nrows, ncols = dim(m)
local r,c = 1,1 -- current position
while r<=nrows and c<=ncols do
-- try to get a nonzero value at this position
swap(m,r,c)
if m[r][c] == 0 then
-- can't, so move right
c = c+1
else
clear(m,r,c)
-- done, so move diagonally
r = r+1
c = c+1
end
end
end
m = readcsv()
print("original:")
printmatrix(m)
echelon(m)
print("echelon form:")
printmatrix(m)
I was hoping someone could show me how to write the code where it says -- DO IT! in Lua. I'm fairly new to this, thank you.
For some context, I'm experimenting with Gaussian elimination to speed up my work when computing echelon form with this method. I'm not too fussed about having 1s as the first non-zero element of each row.
It should return this:
original:
1, 3, 5, 7
2, 1, -1, 0
3, 4, 4, 7
5, 5, 3, 7
echelon form:
1, 3, 5, 7
0, -5, -11, -14
0, 0, 0, 0
0, 0, 0, 0

t.lua:
--[[
Gaussian Elimination in Lua
--]]
-- print matrix
local function printmatrix (m)
for _, row in ipairs (m) do
io.write (table.concat (row, ', ') .. '\n')
end
end
-- read matrix in CSV format
local function readcsv (file)
io.input (file)
local m = {columns = 0, rectangular = true}
for line in io.lines () do
local row = {}
-- the next line is tricky and goes over all entries in the row
for w in line:gmatch '[^,]+' do
row [#row + 1] = tonumber (w) or 0
end
m [#m + 1] = row
-- Update matrix dimensions
m.rectangular = m.rectangular and (#row == m.columns or #m == 1)
m.columns = #row > m.columns and #row or m.columns
end
return m
end
-- if m[r][c] is zero swap row r with some row i>r to make m[r][c] nonzero, if possible
local function swap (m, r, c)
local nrows, ncols = #m, m.columns
if r <= 0 or r > nrows or c <= 0 or c > ncols then error 'Position out of range' end
if m [r] [c] ~= 0 then
-- nothing to do
return
end
-- find a suitable row
local i = r + 1
while i <= nrows and m [i] [c] == 0 do
i = i + 1
end
if i <= nrows then
m [r], m [i] = m [i], m [r]
end
end
-- if m[r][c] is nonzero apply row operations to make each m[i][c]==0 for i>r
local function clear (m, r, c)
local nrows, ncols = #m, m.columns
if r <= 0 or r > nrows or c <= 0 or c > ncols then error 'Position out of range' end
if m [r] [c] == 0 then
-- nothing to do
return
end
for i = r + 1, nrows do
local f = m [i] [c] / m [r] [c]
for j = 1, #m [i] do
m [i] [j] = m [i] [j] - f * m [r] [j]
end
end
end
-- apply Gaussian elimination to m to get it into echelon form
function echelon (m)
local nrows, ncols = #m, m.columns
local r, c = 1, 1 -- current position
while r <= nrows and c <= ncols do
-- try to get a nonzero value at this position
swap (m, r, c)
if m [r] [c] == 0 then
-- can't, so move right
c = c + 1
else
clear (m, r, c)
-- done, so move diagonally
r = r + 1
c = c + 1
end
end
end
local m = readcsv (arg [1])
print 'Original:'
printmatrix (m)
if m.rectangular then
echelon (m)
print 'Echelon form:'
printmatrix (m)
else
error 'Matrix not rectangular!'
end
t.csv:
1,3,5,7
2,1,-1,0
3,4,4,7
5,5,3,7
lua t.lua t.csv:
Original:
1, 3, 5, 7
2, 1, -1, 0
3, 4, 4, 7
5, 5, 3, 7
Echelon form:
1, 3, 5, 7
0, -5, -11, -14
0, 0, 0, 0
0, 0, 0, 0
You can also use the standard input (lua t.lua; type the values, terminate with Ctrl + D).
To swap anything in Lua, just do a, b = b, a.
I used a straightforward approach to your second doit (apply row_i = row_i - f*row_r): for j = 1, #m [i] do m [i] [j] = m [i] [j] - f * m [r] [j] end
I made your code somewhat more concise and elegant. I also simplified the pattern used to parse the CSV lines, and the matrix's dimensions and rectangularity are now calculated while it is read.
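For comparison, here is a minimal Python sketch of the same swap and clear steps. It is not a translation of the Lua above: it assumes a rectangular list of lists, folds the swap and clear helpers into one hypothetical echelon function, and uses Fraction so that the row operations stay exact.

```python
from fractions import Fraction


def echelon(rows):
    """Return a row-echelon form of `rows` (assumed rectangular list of lists)."""
    m = [[Fraction(v) for v in row] for row in rows]
    nrows = len(m)
    ncols = len(m[0]) if m else 0
    r = c = 0
    while r < nrows and c < ncols:
        # Look for a row at or below r with a nonzero entry in column c.
        pivot = next((i for i in range(r, nrows) if m[i][c] != 0), None)
        if pivot is None:
            c += 1  # nothing to pivot on in this column, so move right
            continue
        m[r], m[pivot] = m[pivot], m[r]  # same idiom as Lua's a, b = b, a
        for i in range(r + 1, nrows):
            f = m[i][c] / m[r][c]
            # row_i = row_i - f * row_r
            m[i] = [a - f * b for a, b in zip(m[i], m[r])]
        r += 1  # done with this pivot, so move diagonally
        c += 1
    return m


matrix = [[1, 3, 5, 7], [2, 1, -1, 0], [3, 4, 4, 7], [5, 5, 3, 7]]
for row in echelon(matrix):
    print(", ".join(str(v) for v in row))
```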

Understanding control structure of nested loops in Python

All of the options below produce the same output, but I'm not quite understanding why. Is anyone able to explain why multiple values are printed for j on each line? I would think it would print either 0 every time when it is set equal to 0 or print 1, 2, 3, 4 instead.
Option 1:
for i in range(1, 6):
    j = 0
    while j < i:
        print(j, end = " ")
        j += 1
    print("")
Option 2:
for i in range(1, 6):
    for j in range(0, i):
        print(j, end = " ")
    print("")
Option 3:
i = 1
while i < 6:
    j = 0
    while j < i:
        print(j, end = " ")
        j += 1
    i += 1
    print("")
Output:
0
0 1
0 1 2
0 1 2 3
0 1 2 3 4

One or more digits are printed on a single line because of the inner while/for loop.
As i increments in the outer loop, the number of inner iterations increases with it: for each i, j runs from 0 up to i - 1.
The digits appear on the same line because of the end=" " argument to the first print call. The next sequence starts on a new line because the second print call, in the outer loop, has no such argument.
To gain a better understanding, make the following changes to your code one at a time and run it to see the effect of each:
Replace i in the inner loop with some constant value
Replace the space in end = " " with something else, e.g. end = "x"
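Another way to see what happens is to record the values j takes for each i instead of printing them directly. This is a small sketch of that idea, with a hypothetical helper name rows_of_j.

```python
def rows_of_j(limit):
    """Collect, for each i in 1..limit-1, the values j takes in the inner loop."""
    result = []
    for i in range(1, limit):
        # The inner loop restarts at 0 for every i and stops before i.
        result.append(list(range(0, i)))
    return result


for i, js in enumerate(rows_of_j(6), start=1):
    print(f"i={i}: j takes {js}")
```

Each inner loop starts again from 0, which is why every line begins with 0, and it stops just before i, which is why each line is one value longer than the previous one.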

How to find the number of times a group of a specific value is present in an array?

I have a 1-by-1000 matrix (1 row, 1000 columns) whose elements are all 0 or 1. How can I find how many times 1 is repeated 3 times consecutively?
If a run contains more than 3 ones, the count resets after every third one. So a run of 4 is 3+1 and counts as only one instance of 3 consecutive 1s, but a run of 6 is 3+3 and counts as two instances.

This approach finds the differences between when A goes from 0 to 1 (rising edge) and from 1 to 0 (falling edge). This gives the lengths of consecutive 1s in each block. Then divide these numbers by 3 and round down to get the number of runs of 3.
Padding A with a 0 at the start and end just ensures we have a rising edge at the start if A starts with a 1, and we have a falling edge at the end if A ends with a 1.
A = round(rand(1,1000));
% padding with a 0 at the start and end will make this simpler
B = [0,A,0];
rising_edges = ~B(1:end-1) & B(2:end);
falling_edges = B(1:end-1) & ~B(2:end);
lengths_of_ones = find(falling_edges) - find(rising_edges);
N = sum(floor(lengths_of_ones / 3));
Or in a much less readable 2 lines:
A = round(rand(1,1000));
B = [0,A,0];
N = sum(floor((find(B(1:end-1) & ~B(2:end)) - find(~B(1:end-1) & B(2:end))) / 3));
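The same rising-edge and falling-edge idea can be sketched in plain Python as a cross-check. This is an illustrative port, not the MATLAB code itself. The function name count_runs_of is hypothetical, and the sample vector is a short hand-made one rather than random data.

```python
def count_runs_of(a, n=3):
    """Count non-overlapping groups of n consecutive 1s in the 0/1 sequence a."""
    # Pad with a 0 on both sides so runs touching either end still get both edges.
    b = [0] + list(a) + [0]
    rising = [i for i in range(len(b) - 1) if not b[i] and b[i + 1]]
    falling = [i for i in range(len(b) - 1) if b[i] and not b[i + 1]]
    # Each rising edge pairs with the next falling edge; the gap is the run length.
    lengths = [f - r for r, f in zip(rising, falling)]
    return sum(length // n for length in lengths)


a = [1, 1, 1, 1, 0, 0, 1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1]
print(count_runs_of(a))  # runs of length 4, 3, 5, 7 give 1 + 1 + 1 + 2 = 5
```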

You can define custom functions like the ones below:
v = randi([0,1],1,1000);
% get runs in cell array
function C = runs(v)
C{1} = v(1);
for k = 2:length(v)
if v(k) == C{end}(end)
C{end} = [C{end},v(k)];
else
C{end+1} = v(k);
end
end
end
% count times of 3 consecutive 1s
function y = count(x)
if all(x)
y = floor(length(x)/3);
else
y = 0;
end
end
sum(cellfun(@count,runs(v)))

Here is another vectorized way:
% input
n = 3;
a = [1 1 1 1 0 0 1 1 1 0 0 0 1 1 1 1 1 0 1 1 1 1 1 1 1]
% x x x x x = 5
% output
a0 = [a 0];
b = cumsum( a0 ) % cumsum
c = diff( [0 b( ~( diff(a0) + 1 ) ) ] ) % number of ones within group
countsOf3 = sum( floor( c/n ) ) % groups of 3
You like it messy? Here is a one-liner:
countsOf3 = sum(floor(diff([0 getfield(cumsum([a 0]),{~(diff([a 0])+1)})])/n))

Incompatible rank 0 and 1 in fortran

ab is a 4 x 5 matrix i.e., ab(4,5) and x is an array of length 4 i.e., x(4)
x(4) = ab(4,5)/ab(4,4)
do i = 3, 1, -1
x(i) = ( ab(i,5) - ab(i,i+1:4) * x(i+1:4) ) / ab(i,i)
end do
The compiler reports incompatible ranks 0 and 1 for the assignment inside the do loop.

The product ab(i,i+1:4) * x(i+1:4) is an elementwise product of two array sections, so it is a rank-1 array, while x(i) is a scalar (rank 0). Use the SUM function to reduce the product to a scalar: x(i) = ( ab(i,5) - SUM(ab(i,i+1:4) * x(i+1:4)) ) / ab(i,i). Alternatively, DOT_PRODUCT(ab(i,i+1:4), x(i+1:4)) gives the same value.
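The reduction step is easier to see in a language without array syntax. The following Python sketch does the same back substitution on an augmented matrix, with the sum written out explicitly. The 4 x 5 matrix is a made-up upper-triangular example, and back_substitute is a hypothetical name.

```python
def back_substitute(ab):
    """Solve an upper-triangular system given as an n x (n+1) augmented matrix."""
    n = len(ab)
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        # Scalar equivalent of SUM(ab(i,i+1:n) * x(i+1:n)); empty for the last row.
        dot = sum(ab[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (ab[i][n] - dot) / ab[i][i]
    return x


ab = [
    [2, 1, 0, 1, 8],
    [0, 1, 1, 0, 5],
    [0, 0, 3, 1, 13],
    [0, 0, 0, 4, 16],
]
print(back_substitute(ab))
```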

How can I make a value in one array the value in another array (VBA)?

I want to write code that copies values from one array to another array of a different length. This is what I have so far.
A(1) = 0
A(2) = 0
A(3) = 6
A(4) = 5
A(5) = 7
n = 0
For i = 1 To 5
If A(i) <> 0 Then
n = n + 1
End If
Next i
ReDim B(1 To n) As Integer
For j = 1 To n
For i = 1 To 5
If A(i) <> 0 Then
B(j) = A(i)
End If
Next i
Next j
MsgBox B(2)
At the moment this returns 7, whereas it should return 5; every value in B is 7. How can I fix this code?

The fact that you have nested loops should alarm you: this would be executed n * 5 times, which cannot be correct.
Change the second part so it only uses one loop, like this:
ReDim B(1 To n) As Integer
j = 1
For i = 1 To UBound(A)
If A(i) <> 0 Then
B(j) = A(i)
j = j + 1
End If
Next i
Using UBound instead of 5 also makes your code more generic. This loop is very similar to the loop that calculates n; the only difference is that it assigns to B(j).
You could in fact combine the two loops if you dimension B twice, the second time with Preserve:
ReDim B(1 To UBound(A)) As Integer
n = 0
For i = 1 To UBound(A)
If A(i) <> 0 Then
n = n + 1
B(n) = A(i)
End If
Next i
' Shorten the array without losing data:
ReDim Preserve B(1 To n)
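To make the difference between the two loop shapes concrete, here is a small Python sketch with hypothetical function names. The first function mirrors the single-loop fix. The second reproduces the nested loops from the question, where each slot of B is overwritten by every non-zero value of A in turn.

```python
def copy_nonzero(a):
    """Single pass: append each non-zero value of a, in order."""
    b = []
    for value in a:
        if value != 0:
            b.append(value)
    return b


def copy_nonzero_nested(a):
    """Nested loops as in the question: every slot ends up holding the last non-zero value."""
    n = sum(1 for value in a if value != 0)
    b = [0] * n
    for j in range(n):
        for value in a:
            if value != 0:
                b[j] = value  # overwritten on every non-zero value of a
    return b


a = [0, 0, 6, 5, 7]
print(copy_nonzero(a))         # [6, 5, 7]
print(copy_nonzero_nested(a))  # [7, 7, 7]
```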

You are going to have to check B for the first empty array element and exit the loop so you do not continue to write.
Dim A() As Variant, B() As Variant
Dim i As Long, j As Long, n As Long
A = Array(0, 0, 6, 5, 7) '<~~ 0 to 4, not 1 to 5
n = 0
For i = LBound(A) To UBound(A)
If A(i) <> 0 Then
n = n + 1
End If
Next i
ReDim B(1 To n) '<~~ 1 to 3
For i = LBound(A) To UBound(A)
If A(i) <> 0 Then
For j = LBound(B) To UBound(B)
If IsEmpty(B(j)) Then
B(j) = A(i) '<~~ assigned a value; exit loop
Exit For
End If
Next j
End If
Next i
For j = LBound(B) To UBound(B)
Debug.Print B(j)
Next j
Given that arrays can be either zero-based or one-based, I prefer to use the LBound and UBound functions to define their scope.