Using OpenMP how can I apply function across array in Fortran? - arrays

using OpenMP with Fortran, I am trying to speed up the process of applying a function column-wise over an array. In the example code below, one 1 thread is getting used instead of all available 4. In the below example I am using a sum function (which is a simplification of the actual function that is much larger). What am I doing wrong here? Any pointers most welcome. Thanks in advance.
program prog2
use omp_lib
implicit none
integer :: i, j
integer, parameter :: imax = 1000
integer, parameter :: jmax = 1000
real, dimension(:,:), allocatable :: array ! input array
real, dimension(:), allocatable :: res_array ! column-wise sums
real :: tempval
integer :: nthreads
! setting up array
allocate(array(1:imax, 1:jmax))
allocate(res_array(jmax))
do i = 1, imax
do j = 1, jmax
array(i, j) = (real(i+10*j))**0.1
end do
end do
nthreads = OMP_GET_MAX_THREADS()
call OMP_SET_NUM_THREADS(nthreads)
res_array = 0.0
!$OMP PARALLEL DO private(tmp, i) shared(j)
do j = 1, jmax, 1
tmp = mysum(array(:, j))
res_array(j) = tmp
end do
!$OMP END PARALLEL DO
contains
! dummy function summing up column
real function mysum(x)
implicit none
integer :: i
real, dimension(1:imax), intent(in) :: x
real :: tmp = 0.0
do i = 1, imax
tmp = tmp + x(i)
end do
mysum = tmp
return
end function
end program

Related

Passing the allocatable array to subroutine in Fortran with MPI [duplicate]

This question already has answers here:
keeping array limits in fortran during subroutine call
(3 answers)
Closed 5 years ago.
I am trying to pass the allocatable array to the subroutine. When I am using the serial version as mentioned in How to pass allocatable arrays to subroutines in Fortran it is working fine. Below is my serial version of code.
module test
contains
subroutine func(a,sqa,n)
implicit none
integer, intent(in) :: n
integer, intent(in), dimension(:,:) :: a
integer, intent(out), dimension(:,:):: sqa
!local variables
integer :: i,j
do i= 1,n
do j = 1,2
sqa(i,j) = a(i,j)*a(i,j)
print *, 'i',i, 'j', j,'sqa(i,j)',sqa(i,j)
end do
end do
end subroutine func
end module test
program main
use test
implicit none
integer :: n,i,j
integer, dimension(:,:), allocatable :: a, sqa
print *, 'enter no of rows'
read *, n
allocate(a(1:n,2))
allocate(sqa(1:n,2))
do i = 1,n
do j = 1, 2
a(i,j) = i +j
print *, 'i =',i,'j =',j, a(i,j)
end do
end do
call func(a, sqa,n)
deallocate(a,sqa)
end program main
When I start to implement using MPI, my parallel version of code is
module test
contains
subroutine func(a,sqa,istart,iend)
implicit none
integer, intent(in) :: istart, iend
integer, intent(in), dimension(:,:) :: a
integer, intent(out),dimension(:,:) :: sqa
!local variables
integer :: i,j
do i= istart, iend
do j = 1,2
sqa(i,j) = a(i,j)*a(i,j)
print *, 'i',i, 'j', j,'sqa(i,j)',sqa(i,j)
end do
end do
end subroutine func
end module test
program main
use test
use mpi
implicit none
integer :: istart, iend, ierr,nproc, procnum, n,&
points_per_thread, i,j
integer, dimension(:,:), allocatable :: a, sqa
integer,dimension(mpi_status_size) :: status
call mpi_init(ierr)
call mpi_comm_size(mpi_comm_world, nproc, ierr)
call mpi_comm_rank(mpi_comm_world,procnum, ierr)
if(procnum == 0)then
print *, 'enter no of rows'
read *, n
end if
call mpi_bcast(n,1,mpi_integer,0,mpi_comm_world, ierr)
points_per_thread = (n + nproc - 1)/nproc
istart = procnum*points_per_thread + 1
iend = min((procnum + 1)*points_per_thread,n)
print *, 'istart ', istart, 'iend', iend, 'procnum', procnum
call mpi_barrier(mpi_comm_world, ierr)
allocate(a(istart:iend,2))
allocate(sqa(istart:iend,2))
do i = istart,iend
do j = 1, 2
a(i,j) = i +j
print *, 'i =',i,'j =',j, a(i,j)
end do
end do
call mpi_barrier(mpi_comm_world, ierr)
call func(a(istart:iend,:), sqa(istart:iend,:),istart,iend)
deallocate(a,sqa)
call mpi_finalize(ierr)
end program main
The above code gives the segmentation fault error. I don't understand the reason for this.
Next, when in my subroutine func I change the declaration of arrays a and sqa to
integer,intent(in):: a(istart:iend,2)
integer, intent(out)::sqa(istart:iend,2)
Now it works fine. I request to kindly help me understand the reason for the error.
Assumed shape dummy arrays make available the extension of the actual arguments inside the function but not their bounds. If the actual bounds are needed inside the function, explicit-shape dummy arrays must be used.

Passing array of unknown size (subroutine output) to another subroutine

I'm new to Intel MKL. Here's a problem I've come across -- apparently a problem not related to MKL itself, but to the problem of how to declare and pass an array of hitherto unknown size as an output of a subroutine to another subroutine.
I'm trying to use mkl_ddnscsr to convert a matrix to its CSR format suitable for calling by Pardiso:
CALL mkl_ddnscsr(job,Nt,Nt,Adns,Nt,Acsr,ja,ia,info)
CALL PARDISO(pt,1,1,11,13,Nt,Acsr,ia,ja,perm,1,iparm,0,b,x,errr)
Problem is, I have no idea what the length of the CSR form Acsr and the index vector ja before calling the mkl_ddnscsr subroutine. How should one declare Acsr and ja in the main program, or the subroutine where these two lines are located?
I tried something like
INTERFACE
SUBROUTINE mkl_ddnscsr(job, m, n, Adns, lda, Acsr, ja, ia, info)
IMPLICIT NONE
INTEGER :: job(8)
INTEGER :: m, n, lda, info
INTEGER, ALLOCATABLE :: ja(:)
INTEGER :: ia(m+1)
REAL(KIND=8), ALLOCATABLE :: Acsr(:)
REAL(KIND=8) :: Adns(:)
END SUBROUTINE
END INTERFACE
followed by
INTEGER, ALLOCATABLE :: ja(:)
REAL(KIND=8), ALLOCATABLE :: Acsr(:)
outside the INTERFACE, in the main program. But this configuration gives me the segmentation fault upon running.
On the other hand, if I try something like
INTERFACE
SUBROUTINE mkl_ddnscsr(job, m, n, Adns, lda, Acsr, ja, ia, info)
IMPLICIT NONE
INTEGER :: job(8)
INTEGER :: m, n, lda, info
INTEGER :: ja(:), ia(m+1)
REAL(KIND=8) :: Acsr(:), Adns(:)
END SUBROUTINE
END INTERFACE
and then
INTEGER, DIMENSION(:) :: ja
REAL(KIND=8), DIMENSION(:) :: Acsr
Then ifort would give me the following message:
error #6596: If a deferred-shape array is intended, then the ALLOCATABLE or POINTER attribute is missing; if an assumed-shape array is intended, the array must be a dummy argument.
Anyone got any idea how to work around this? What's the right way to declare ja and Acsr in the main program (or main subroutine) and pass them around?
Note that the subroutines are part of the Intel MKL package, not something I write on my own, so it appears that module would be out of the question.
You can find the interface for mkl_ddnscsr from the manual page, or from the include file mkl_spblas.fi in your MKL install directory (e.g., /path/to/mkl/include/).
INTERFACE
subroutine mkl_ddnscsr ( job, m, n, Adns, lda, Acsr, AJ, AI, info )
integer job(8)
integer m, n, lda, info
integer AJ(*), AI(m+1)
double precision Adns(*), Acsr(*)
end
END INTERFACE
Because this routine has only Fortran77-style dummy arguments (i.e., explicit-shape array AI(m+1) or assumed-size arrays like Adns(*)), you can pass any local or allocatable arrays (after allocated in the caller side) as actual arguments. Also, it is not mandatory to write an interface block explicitly, but it should be useful to include it (in the caller side) to detect potential interface mismatch.
According to the manual, it looks like mkl_ddnscsr (a routine for converting a dense to sparse matrix) works something like this:
program main
implicit none
! include 'mkl_spblas.fi' !! or mkl.fi (not mandatory but recommended)
integer :: nzmax, nnz, job( 8 ), m, n, lda, info, irow, k
double precision :: A( 10, 20 )
double precision, allocatable :: Asparse(:)
integer, allocatable :: ia(:), ja(:)
A( :, : ) = 0.0d0
A( 2, 3 ) = 23.0d0
A( 2, 7 ) = 27.0d0
A( 5, 4 ) = 54.0d0
A( 9, 9 ) = 99.0d0
!! Give an estimate of the number of non-zeros.
nzmax = 10
!! Or assume that non-zeros occupy at most 2% of A(:,:), for example.
! nzmax = size( A ) / 50
!! Or count the number of non-zeros directly.
! nzmax = count( abs( A ) > 0.0d0 )
print *, "nzmax = ", nzmax
m = size( A, 1 ) !! number of rows
n = size( A, 2 ) !! number of columns
lda = m !! leading dimension of A
allocate( Asparse( nzmax ) )
allocate( ja( nzmax ) ) !! <-> columns(:)
allocate( ia( m + 1 ) ) !! <-> rowIndex(:)
job( 1 ) = 0 !! convert dense to sparse A
job( 2:3 ) = 1 !! use 1-based indices
job( 4 ) = 2 !! use the whole A as input
job( 5 ) = nzmax !! maximum allowed number of non-zeros
job( 6 ) = 1 !! generate Asparse, ia, and ja as output
call mkl_ddnscsr( job, m, n, A, lda, Asparse, ja, ia, info )
if ( info /= 0 ) then
print *, "insufficient nzmax (stopped at ", info, "row)"; stop
endif
nnz = ia( m+1 ) - 1
print *, "number of non-zero elements = ", nnz
do irow = 1, m
!! This loop runs only for rows having nonzero elements.
do k = ia( irow ), ia( irow + 1 ) - 1
print "(2i5, f15.8)", irow, ja( k ), Asparse( k )
enddo
enddo
end program
Compiling with ifort -mkl test.f90 (with ifort14.0) gives the expected result
nzmax = 10
number of non-zero elements = 4
2 3 23.00000000
2 7 27.00000000
5 4 54.00000000
9 9 99.00000000
As for the determination of nzmax, I think there are at least three ways for this: (1) just use a guess value (as above); (2) assume the fraction of nonzero elements in the whole array; or (3) directly count the number of nonzeros in the dense array. In any case, because we have the exact number of nonzeros as output (nnz), we could re-allocate Asparse and ja to have the exact size (if necessary).
Similarly, you can find the interface for PARDISO from the include file mkl_pardiso.fi or from this (or this) page.

3d array assignment error in fortran segfault

My code is currently breaking when I try to pass an array to another array. I'm using gfortran as the compiler with no special flags. I'm also writing the file as a ".f95". Here's the backbone of the code that's causing the problem:
program foo
integer, parameter :: pp = 16
call fooyoo
contains
subroutine fooyoo
real(kind=pp), allocatable :: f(:,:,:), y_ix(:), r(:)
integer :: ii, jj, nn
nn = 32
r = zoo(nn)
y_ix = r ! this is done because I'm actually allocating two other 1D arrays
do ii = 1, nn
do jj = 1, nn
write(*,*) "crashes here"
f(ii,jj,:) = y_ix
write(*,*) "nothing here"
end do
end do
end subroutine
function zoo(nn) result(r)
integer :: i, nn
real(kind=pp) :: r(nn)
do i = 1, nn
r(i) = i
end do
end function
end program
Probably
do ii = 1, nn
do jj = 1, nn
write(*,*) "crashes on executing the next line because programmer forgot ", &
"to allocate f before trying to set values"
f(ii,jj,:) = y_ix
write(*,*) "nothing here"
end do
end do
In this line
y_ix = r
automatic allocation works because the expression on the lhs is an allocatable array. In
f(ii,jj,:) = y_ix
automatic allocation fails because the expression on the lhs is an array section, not an allocatable array.

finding specific indices with pointer array

I am relatively new to Fortran and break my head about one thing for hours now:
I want to write a subroutine for finding the indexes for specific elements in a real 1D array (given to the routine as input).
I have generated an array with 100 random reals, called arr, and now want to determine the indexes of those elements which are greater than a real value min, which is also passed to subroutine.
Plus, in the end I would like to have a pointer I'd allocate in the end, which I was said would be better than using an array indices containing the indexes once found.
I just didn't find how to solve that, I had following approach:
SUBROUTINE COMP(arr, min)
real, intent(in) :: arr(:)
real, intent(in) :: min
integer, pointer, dimension(:) :: Indices
integer :: i, j
! now here I need a loop which assigns to each element of the pointer
! array the Indices one after another, i don't know how many indices there
! are to be pointed at
! And I dont know how to manage that the Indices are pointed at one after another,
! like Indices(1) => 4
! Indices(2) => 7
! Indices(3) => 32
! Indices(4) => 69
! ...
! instead of
! Indices(4) => 4
! Indices(7) => 7
! Indices(32) => 32
! Indices(69) => 69
! ...
DO i = 1, size(arr)
IF (arr(i) > min) THEN
???
ENDIF
ENDDO
allocate(Indices)
END SUBROUTINE COMP
If succinctness (rather than performance) floats your boat... consider:
FUNCTION find_indexes_for_specific_elements_in_a_real_1D_array(array, min) &
RESULT(indices)
REAL, INTENT(IN) :: array(:)
REAL, INTENT(IN) :: min
INTEGER, ALLOCATABLE :: indices(:)
INTEGER :: i
indices = PACK([(i,i=1,SIZE(array))], array >= min)
END FUNCTION find_indexes_for_specific_elements_in_a_real_1D_array
[Requires F2003. Procedures that have assumed shape arguments and functions with allocatable results need to have an explicit interface accessible where they are referenced, so all well behaved Fortran programmers put them in a module.]
A simple way to get the indices of a rank 1 array arr for elements greater than value min is
indices = PACK([(i, i=LBOUND(arr,1), UBOUND(arr,1))], arr.gt.min)
where indices is allocatable, dimension(:). If your compiler doesn't support automatic (re-)allocation than an ALLOCATE(indices(COUNT(arr.gt.min)) would be needed before that point (with a DEALLOCATE before that if indices is already allocated).
As explanation: the [(i, i=...)] creates an array with the numbers of the indices of the other array, and the PACK intrinsic selects those corresponding to the mask condition.
Note that if you are doing index calculations in a subroutine you have to be careful:
subroutine COMP(arr, min, indices)
real, intent(in) :: arr(:)
real, intent(in) :: min
integer, allocatable, intent(out) :: indices(:)
!...
end subroutine
arr in the subroutine will have lower bound 1 regardless of the bounds of the actual argument (the array passed) (which could be, say VALS(10:109). You will also then want to pass the lower bound to the subroutine, or address that later.
[Automatic allocation is not an F90 feature, but in F90 one also has to think about allocatable subroutine arguments
I think you're on the right track, but you're ignoring some intrinsic Fortran functions, specifically count, and you aren't returning anything!
subroutine comp(arr, min)
real, intent(in) :: arr(:)
real, intent(in) :: min
! local variables
integer, allocatable :: indices(:)
integer :: i,j, indx
! count counts the number of TRUE elements in the array
indx = count(arr > min)
allocate(indices(indx))
! the variable j here is the counter to increment the index of indices
j=1
do i=1,size(arr)
if(arr(i) > min) then
indices(j) = i
j = j+1
endif
enddo
end subroutine comp
Then you can use the array indices as
do i=1,size(indices)
var = arr(indices(i))
enddo
Note that since you are not returning indices, you will lose all the information found once you return--unless you plan on using it in the subroutine, then you're okay. If you're not using it there, you could define it as a global variable in a module and the other subroutines should see it.

FORTRAN: out of bounds check fails in subroutines?

I have a rather simple piece of code (reduced to the essentials from bigger program).
I pass an array and the size of the array to a subroutine. I get no error if the passed size does not match the actual size of the array. And I can even manipulate the parts of the array that don't "exist" !!! (I overwrite memory that I shouldn't overwrite).
Here is the subroutine:
subroutine sub(arr, narr)
implicit none
integer, intent(in) :: narr
double precision, dimension(narr) :: arr
integer :: j
do j = 1, narr
! print all the values
write(*, '("Arr[",I0,"] = ",f0.10)') j, arr(j)
! change the values
arr(j) = -10d0
enddo
end subroutine
and here the main program
program main
implicit none
integer, parameter :: narr = 5
! the array is made smaller
double precision, dimension(narr - 2) :: array
integer :: j
! assign values to array
array = (/ (1d0*j, j = 1,narr - 2) /)
! print using the subroutine
print*, "inside subroutine"
call sub(array, narr)
! print outside the subroutine
print *, " "
print *, "outside subroutine"
do j = 1, narr
write(*, '("Arr[",I0,"] = ",f0.10)') j, array(j)
enddo
end program
If I compile with ifort and "-check all" it only catches the error in the main program, but not in the subroutine.
Is there a way to catch also the error in the subroutine ?
Yes. Declare array as dimension(:) in the subroutine -- assumed-shape array. Using this Fortran >90 declaration requires that the procedure interface be known to the caller -- the easiest way is to have the procedure in a module and use that module in the caller. You don't actually need to pass the size of the array to the subroutine -- you can determine it as size(arr). I have left the argument narr to retain the bug.
module MySub
contains
subroutine sub(arr, narr)
implicit none
integer, intent(in) :: narr
double precision, dimension(:) :: arr
integer :: j
do j = 1, narr
! print all the values
write(*, '("Arr[",I0,"] = ",f0.10)') j, arr(j)
! change the values
arr(j) = -10d0
enddo
end subroutine
end module MySub
program main
use MySub
implicit none
integer, parameter :: narr = 5
! the array is made smaller
double precision, dimension(narr - 2) :: array
integer :: j
! assign values to array
array = (/ (1d0*j, j = 1,narr - 2) /)
! print using the subroutine
print*, "inside subroutine"
call sub(array,narr)
! print outside the subroutine
print *, " "
print *, "outside subroutine"
do j = 1, narr
write(*, '("Arr[",I0,"] = ",f0.10)') j, array(j)
enddo
end program

Resources