OpenMP crashes with parameter-defined array bounds - arrays

I am having an issue with private arrays when using the !$OMP TASK construct. Arrays listed as PRIVATE for tasks are crashing/becoming corrupted when their bounds are given by input parameters in the subroutine. I am using static arrays to avoid the usual issues with allocatable arrays and !$OMP PARALLEL PRIVATE.
The following simplified code reproduces the issue, and crashes with SIGSEGV:
SUBROUTINE do_work(n_in)
USE omp_lib
IMPLICIT NONE
INTEGER, INTENT(IN) :: n_in
INTEGER :: i, counter
REAL, DIMENSION(n_in) :: a
REAL, DIMENSION(20) :: b
!$OMP PARALLEL PRIVATE(a,i)
!$OMP SINGLE
counter = 0
DO WHILE(counter .LE. 20)
!$OMP TASK FIRSTPRIVATE(counter) PRIVATE(a,i)
a(:) = 5.0
DO i = 1,n_in
a(1) = a(1) + a(i)
END DO
b(counter) = a(1)
!$OMP END TASK
counter = counter + 1
END DO
!$OMP END SINGLE
!$OMP END PARALLEL
END SUBROUTINE do_work
The issue, however, goes away simply by hardcoding the size of array a, i.e. REAL, DIMENSION(5) :: a. It is almost as if the task is not aware of the array size parameter n_in. However, I have verified the value of n_in inside the task construct, outside the task construct, and outside the parallel construct. Furthermore, if a is declared as a scalar, it works.
Is the usage of PRIVATE clauses incorrect or incomplete?
SIDE NOTES:
I've written this simplified code to reproduce the problem. In reality, I am parallelizing a series of linked lists, as you can probably tell from the structure
Any code calling this subroutine is serial. There is no parallel nesting, recursion, etc.

I have no clue why, but in my case it works fine if I include a random print statement (e.g. print*,'test' or print*,a or even just print*, ) somewhere in the parallel region. If I comment it out, I also get a SIGSEGV again ... strange. Sorry that this is not really an answer.
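A possible workaround, sketched under assumptions rather than offered as a diagnosis: move the task body into a procedure so that the automatic array is an ordinary local of that procedure and no PRIVATE clause on an automatic array is needed. The names task_work and accumulate are hypothetical. Note also that the original loop starts at counter = 0 and writes b(counter), which is out of bounds for an array declared DIMENSION(20); the sketch indexes from 1 to 20 instead.

```fortran
module task_work
  implicit none
contains
  ! Hypothetical helper: the automatic array a is local to each call,
  ! so every task that calls this function works on its own copy.
  function accumulate(n) result(total)
    integer, intent(in) :: n
    real :: total
    real :: a(n)          ! automatic array sized by the argument
    integer :: i
    a = 5.0
    do i = 1, n           ! same accumulation as in the question
      a(1) = a(1) + a(i)
    end do
    total = a(1)
  end function accumulate
end module task_work

program task_demo
  use task_work, only: accumulate
  implicit none
  integer, parameter :: ntasks = 20
  real :: b(ntasks)
  integer :: counter

  !$omp parallel
  !$omp single
  do counter = 1, ntasks
    ! counter is copied into the task; b is shared, each task writes one element
    !$omp task firstprivate(counter) shared(b)
    b(counter) = accumulate(5)
    !$omp end task
  end do
  !$omp end single
  !$omp end parallel

  print *, b
end program task_demo
```

Whether this avoids the crash in the original setting depends on the compiler and runtime; it only removes the privatized automatic array from the picture.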

Related

PGI FORTRAN90 allocating an array passed to a subroutine (seg fault) [duplicate]

The following code is returning a Segmentation Fault because the allocatable array I am trying to pass is not being properly recognized (size returns 1, when it should be 3). In this page (http://www.eng-tips.com/viewthread.cfm?qid=170599) a similar example seems to indicate that it should work fine in F95; my code file has a .F90 extension, but I tried changing it to F95, and I am using gfortran to compile.
My guess is that the problem should be in the way I am passing the allocatable array to the subroutine; What am I doing wrong?
!%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%!
PROGRAM test
!%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%!
IMPLICIT NONE
DOUBLE PRECISION,ALLOCATABLE :: Array(:,:)
INTEGER :: iii,jjj
ALLOCATE(Array(3,3))
DO iii=1,3
DO jjj=1,3
Array(iii,jjj)=iii+jjj
PRINT*,Array(iii,jjj)
ENDDO
ENDDO
CALL Subtest(Array)
END PROGRAM
!%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%!
SUBROUTINE Subtest(Array)
DOUBLE PRECISION,ALLOCATABLE,INTENT(IN) :: Array(:,:)
INTEGER :: iii,jjj
PRINT*,SIZE(Array,1),SIZE(Array,2)
DO iii=1,SIZE(Array,1)
DO jjj=1,SIZE(Array,2)
PRINT*,Array(iii,jjj)
ENDDO
ENDDO
END SUBROUTINE
!%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%!
If a procedure has a dummy argument that is an allocatable, then an explicit interface is required in any calling scope.
(There are numerous things that require an explicit interface, an allocatable dummy is but one.)
You can provide that explicit interface yourself by putting an interface block for your subroutine inside the main program. An alternative and far, far, far better option is to put the subroutine inside a module and then USE that module in the main program - the explicit interface is then automatically created. There is an example of this on the eng-tips site that you provided a link to - see the post by xwb.
Note that it only makes sense for a dummy argument to have the allocatable attribute if you are going to do something related to its allocation status - query its status, reallocate it, deallocate it, etc.
Please also note that your allocatable dummy argument array is declared with intent(in), which means its allocation status will be that of the associated actual argument (and it may not be changed during the procedure). The actual argument passed to your subroutine may be unallocated and therefore illegal to reference, even with an explicit interface. The compiler will not know this and the behaviour of inquiries like size is undefined in such cases.
Hence, you first have to check the allocation status of array with allocated(array) before referencing its contents. I would further suggest writing loops over the full array with lbound and ubound, since in general you can't be sure about the array's bounds:
subroutine subtest(array)
double precision, allocatable, intent(in) :: array(:,:)
integer :: iii, jjj
if(allocated(array)) then
print*, size(array, 1), size(array, 2)
do iii = lbound(array, 1), ubound(array, 1)
do jjj = lbound(array, 2), ubound(array, 2)
print*, array(iii,jjj)
enddo
enddo
endif
end subroutine
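The module route recommended above can be sketched as follows. The module name subtest_mod is an assumption made for illustration, and because the subroutine only reads the array, the dummy argument is written here as assumed-shape rather than allocatable.

```fortran
module subtest_mod
  implicit none
contains
  ! Placing the subroutine in a module gives every caller that
  ! USEs the module an explicit interface automatically.
  subroutine subtest(array)
    double precision, intent(in) :: array(:,:)   ! assumed-shape dummy
    integer :: iii, jjj
    print *, size(array, 1), size(array, 2)
    do iii = 1, size(array, 1)
      do jjj = 1, size(array, 2)
        print *, array(iii, jjj)
      end do
    end do
  end subroutine subtest
end module subtest_mod

program test
  use subtest_mod, only: subtest
  implicit none
  double precision, allocatable :: array(:,:)
  integer :: iii, jjj

  allocate(array(3, 3))
  do iii = 1, 3
    do jjj = 1, 3
      array(iii, jjj) = iii + jjj
    end do
  end do
  call subtest(array)   ! the allocated actual argument is passed with its shape
end program test
```

With the interface visible to the caller, the two SIZE inquiries should report the allocated extents of the actual argument.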
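This is a simple example in which the allocatable array lives in a module and is allocated inside a subroutine that uses that module, instead of being passed as a dummy argument.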
module arrayMod
real,dimension(:,:),allocatable :: theArray
end module arrayMod
program test
use arrayMod
implicit none
interface
subroutine arraySub
end subroutine arraySub
end interface
write(*,*) allocated(theArray)
call arraySub
write(*,*) allocated(theArray)
end program test
subroutine arraySub
use arrayMod
write(*,*) 'Inside arraySub()'
allocate(theArray(3,2))
end subroutine arraySub

Pass Common block array size to subroutine in Fortran

I would like to pass the array dimension as a dummy variable to a subroutine. The array itself is on a common block. Here is the code:
PROGRAM test
integer i, nn
integer PARAMETER(Nt=10)
real x(Nt), y(nt), z(Nt)
Common /Bdat/ z
nn=Nt
do i=1,Nt
x(i)=i+1
z(i)=i-1
enddo
call estimate(x,y,nn)
print*, y
return
end
subroutine estimate(x,y,jj)
integer i,jj
real x(jj), y(jj), zq(jj)
COMMON /Bdat/ zq
do i=1, jj
y(i)=x(i)+zq(i)
enddo
return
end
this is the error I get from the subroutine:
real x(jj), y(jj), zq(jj)
1
Error: Variable 'jj' at (1) in this context must be constant
I would really appreciate it if anybody could solve the issue.
You have a scope problem. Read: Scope in Fortran. An array in a common block must be declared with constant bounds, so zq(jj) is rejected when jj is a dummy argument (x(jj) and y(jj) are accepted because they are dummy arguments themselves). Your subroutine estimate therefore needs access to the constant Nt, for example by moving the entire subroutine inside your program using the contains statement. That addresses this particular error, but I highly encourage you to abstain from using common blocks. If you cannot avoid them due to legacy codes see: Improve your FORTRAN 77 programs using some Fortran 90 features
Try using modules instead:
module bdat
implicit none
private
public :: NT, z
integer, parameter :: NT = 10
real :: z(NT)
end module bdat
module my_sub
use bdat, only: &
zq => z ! You're free to rename the variable
implicit none
private
public :: estimate
contains
subroutine estimate(x,y)
! calling arguments
real, intent (in) :: x(:)
real, intent (out) :: y(:)
! local variables
integer :: i, jj
jj = size(x)
do i=1, jj
y(i)=x(i)+zq(i)
end do
end subroutine estimate
end module my_sub
program test
use bdat, only: &
NT, z
use my_sub, only: &
estimate
implicit none
integer :: i
real :: x(NT), y(NT)
do i=1,NT
x(i)=i+1
z(i)=i-1
end do
call estimate(x,y)
print *, y
end program test

Best way to handle large private arrays in openmp parallel region [duplicate]

When I try to parallelize my program in Fortran90 by OpenMP, I get a segmentation fault error.
!$OMP PARALLEL DO NUM_THREADS(4) &
!$OMP PRIVATE(numstrain, i)
do irep = 1, nrep
do i=1, 10
PRINT *, numstrain(i)
end do
end do
!$OMP END PARALLEL DO
I find that if I comment out "PRINT *, numstrain(i)" or remove openmp flags it works without error. I think it is because memory access conflict happens when I access numstrain(i) in parallel. I already declared i and numstrain as private variables. Could someone please give me some idea why it is the case? Thank you so much. :)
UPDATE:
I modified the previous version and this version can print out correct result.
integer, allocatable :: numstrain(:)
integer :: allocate_status
integer :: n
!$OMP PARALLEL DO NUM_THREADS(4) &
!$OMP PRIVATE(numstrain, i)
n = 1000000
do irep = 1, nrep
allocate (numstrain(n), stat = allocate_status)
do i=1, 10
PRINT *, numstrain(i)
end do
deallocate (numstrain, stat = allocate_status)
end do
!$OMP END PARALLEL DO
However if I move the numstrain accessing to another subroutine called by this subroutine (code attached below), 1. It always processes in one thread. 2. At some point (i=4 or 5), it returns Segmentation Fault:11. The variable i when it returns Segmentation Fault:11 is different when I have different NUM_THREADS.
integer, allocatable :: numstrain(:)
integer :: allocate_status
integer :: n
!$OMP PARALLEL DO NUM_THREADS(4) &
!$OMP PRIVATE(numstrain, i)
n = 1000000
do irep = 1, nrep
allocate (numstrain(n), stat = allocate_status)
call anotherSubroutine(numstrain)
deallocate (numstrain, stat = allocate_status)
end do
!$OMP END PARALLEL DO
subroutine anotherSubroutine(numstrain)
integer, allocatable :: numstrain(:)
do i=1, 10
PRINT *, numstrain(i)
end do
end subroutine anotherSubroutine
I also tried to both allocate/deallocate in help subroutine and main subroutine, and only allocate/deallocate in help subroutine. Nothing is changed.
The most typical reason for this is that not enough space is available on the stack to hold the private copy of numstrain. Compute and compare the following two values:
the size of the array in bytes
the stack size limit
There are two kinds of stack size limits. The stack size of the main thread is controlled by things like process limits on Unix systems (use ulimit -s to check and modify this limit) or is fixed at link time on Windows (recompilation or binary edit of the executable is necessary in order to change the limit). The stack size of the additional OpenMP threads is controlled by environment variables like the standard OMP_STACKSIZE, or the implementation-specific GOMP_STACKSIZE (GNU/GCC OpenMP) and KMP_STACKSIZE (Intel OpenMP).
Note that most Fortran OpenMP implementations always put private arrays on the stack, no matter if you enable compiler options that allocate large arrays on the heap (tested with GNU's gfortran and Intel's ifort).
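As a rough aid for the comparison described above, the size of one private copy can be estimated in Fortran itself. This is a minimal sketch that assumes numstrain holds n = 1000000 default integers, as in the question's update; the variable sample is a hypothetical stand-in for one array element.

```fortran
program array_bytes
  use, intrinsic :: iso_fortran_env, only: int64
  implicit none
  integer, parameter :: n = 1000000   ! element count taken from the question
  integer :: sample                   ! stand-in for one element of numstrain
  integer(int64) :: nbytes

  ! storage_size returns the element size in bits, hence the division by 8
  nbytes = int(n, int64) * storage_size(sample) / 8
  print *, 'each private copy needs about', nbytes, 'bytes'
end program array_bytes
```

The printed figure is what each thread needs on its stack for one private copy, to be compared with the output of ulimit -s for the main thread and with OMP_STACKSIZE for the other threads.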
If you comment out the PRINT statement, you effectively remove the reference to numstrain and the compiler is free to optimise it out, e.g. it could simply not make a private copy of numstrain, thus the stack limit is not exceeded.
After the additional information that you've provided, one can conclude that stack size is not the culprit. When dealing with private ALLOCATABLE arrays, you should know that:
private copies of unallocated arrays remain unallocated;
private copies of allocated arrays are allocated with the same bounds.
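The two rules above can be checked with a small test program. This is an illustrative sketch, with the array names a and b chosen arbitrarily; it assumes a compiler that follows these rules for private allocatable arrays.

```fortran
program private_alloc_demo
  implicit none
  integer, allocatable :: a(:), b(:)

  allocate(b(5))        ! b is allocated before the region, a is not

  !$omp parallel num_threads(2) private(a, b)
  !$omp critical
  ! Each thread reports the allocation status of its own private copies
  print *, 'a allocated:', allocated(a), ' b allocated:', allocated(b)
  if (allocated(b)) print *, 'size of private b:', size(b)
  !$omp end critical
  !$omp end parallel

  deallocate(b)
end program private_alloc_demo
```

The contents of the private copy of b are undefined on entry to the region, so the sketch only inspects its allocation status and size.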
If you do not use numstrain outside of the parallel region, it is fine to do what you've done in your first case, but with some modifications:
integer, allocatable :: numstrain(:)
integer :: allocate_status
integer, parameter :: n = 1000000
interface
subroutine anotherSubroutine(numstrain)
integer, allocatable :: numstrain(:)
end subroutine anotherSubroutine
end interface
!$OMP PARALLEL NUM_THREADS(4) PRIVATE(numstrain, allocate_status)
allocate (numstrain(n), stat = allocate_status)
!$OMP DO
do irep = 1, nrep
call anotherSubroutine(numstrain)
end do
!$OMP END DO
deallocate (numstrain)
!$OMP END PARALLEL
If you also use numstrain outside of the parallel region, then the allocation and deallocation go outside:
allocate (numstrain(n), stat = allocate_status)
!$OMP PARALLEL DO NUM_THREADS(4) PRIVATE(numstrain)
do irep = 1, nrep
call anotherSubroutine(numstrain)
end do
!$OMP END PARALLEL DO
deallocate (numstrain)
You should also know that when you call a routine that takes an ALLOCATABLE array as argument, you have to provide an explicit interface for that routine. You can either write an INTERFACE block or you can put the called routine in a module and then USE that module - both cases would provide the explicit interface. If you do not provide the explicit interface, the compiler would not pass the array correctly and the subroutine would fail to access its content.
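A module-based variant of the first case could look like the sketch below. The module name strain_mod and the value of nrep are assumptions; since the called routine only reads the array, its dummy argument is declared assumed-shape here instead of ALLOCATABLE, which still requires the explicit interface that the module supplies.

```fortran
module strain_mod
  implicit none
contains
  ! The module gives callers an explicit interface automatically
  subroutine anotherSubroutine(numstrain)
    integer, intent(in) :: numstrain(:)   ! assumed-shape dummy
    integer :: i                          ! local, so each thread has its own
    do i = 1, min(10, size(numstrain))
      print *, numstrain(i)
    end do
  end subroutine anotherSubroutine
end module strain_mod

program strain_demo
  use strain_mod, only: anotherSubroutine
  implicit none
  integer, parameter :: n = 1000000
  integer, parameter :: nrep = 8          ! hypothetical repetition count
  integer, allocatable :: numstrain(:)
  integer :: irep

  !$omp parallel num_threads(4) private(numstrain)
  allocate(numstrain(n))                  ! each thread allocates its own copy
  numstrain = 0
  !$omp do
  do irep = 1, nrep
    call anotherSubroutine(numstrain)
  end do
  !$omp end do
  deallocate(numstrain)
  !$omp end parallel
end program strain_demo
```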

Nested loop in Fortran with OPENMP

My (Fortran) code is very simple. All it does is filling up a large array, that depends on five (independent!) variables. Here is a brief example
do i = 1, imax
do j = 1, jmax
do k = 1, kmax
array(i,j,k) = ! some function of i,j,k
end do
end do
end do
I would like to use different threads to fill the values of array in a faster way.
I thought the simplest way to achieve that would be to enclose the loop in these commands
!$OMP PARALLEL DO
!$OMP PARALLEL END
However, if I do this I get completely different results from the serial case. I apologize if the question is too simple, but I couldn't really find a proper example to help solve my problem. Can you recommend a solution or provide an example?
I don't know exactly what is happening, but it could be a race condition or just an incorrect declaration of the directives. Try this and see if it works
!replace ... with variables that are constants as in shared(a,b,c)
!$omp parallel do default(private) shared(...)
do i=1,imax
do j=1,jmax
do k=1,kmax
array(i,j,k) = ! some function of i,j,k
end do
end do
end do
!$omp end parallel do
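A self-contained way to test the directives is to fill the array once serially and once in parallel and compare the two. In the sketch below the function f and the array extents are placeholders for the real computation, and default(none) is used so that the compiler insists on every variable in the region being classified explicitly.

```fortran
program fill_array
  implicit none
  integer, parameter :: imax = 20, jmax = 30, kmax = 40   ! placeholder extents
  real :: array(imax, jmax, kmax), reference(imax, jmax, kmax)
  integer :: i, j, k

  ! Serial reference result
  do k = 1, kmax
    do j = 1, jmax
      do i = 1, imax
        reference(i, j, k) = f(i, j, k)
      end do
    end do
  end do

  ! Parallel fill: the array is shared, all loop indices are private
  !$omp parallel do default(none) shared(array) private(i, j, k)
  do k = 1, kmax
    do j = 1, jmax
      do i = 1, imax
        array(i, j, k) = f(i, j, k)
      end do
    end do
  end do
  !$omp end parallel do

  if (any(array /= reference)) error stop 'parallel result differs'
  print *, 'parallel and serial results match'

contains

  ! Placeholder for "some function of i,j,k"
  pure function f(i, j, k) result(v)
    integer, intent(in) :: i, j, k
    real :: v
    v = real(i) + 10.0 * real(j) + 100.0 * real(k)
  end function f

end program fill_array
```

If the real computation uses temporary scalars inside the loop body, those would also need to appear in the private clause; with default(none) a forgotten variable shows up as a compile-time error instead of a wrong result.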

How to declare an array variable and its size mid-routine in Fortran

I would like to create an array with a dimension based on the number of elements meeting a certain condition in another array. This would require that I initialize an array mid-routine, which Fortran won't let me do.
Is there a way around that?
Example routine:
subroutine example(some_array)
real some_array(50) ! passed array of known dimension
element_count = 0
do i=1,50
if (some_array.gt.0) then
element_count = element_count+1
endif
enddo
real new_array(element_count) ! new array with length based on conditional statement
endsubroutine example
Your question isn't about initializing an array, which involves setting its values.
However, there is a way to do what you want. You even have a choice, depending on how general it's to be.
I'm assuming that the test in the loop that computes element_count is meant to be on some_array(i).
You can make new_array allocatable:
subroutine example(some_array)
real some_array(50)
real, allocatable :: new_array(:)
allocate(new_array(COUNT(some_array.gt.0)))
end subroutine
Or have it as an automatic object:
subroutine example(some_array)
real some_array(50)
real new_array(COUNT(some_array.gt.0))
end subroutine
This latter works only when your condition is "simple". Further, automatic objects cannot be used in the scope of modules or main programs. The allocatable case is much more general, such as when you want to use the full loop rather than the count intrinsic, or want the variable not as a procedure local variable.
In both of these cases you meet the requirement of having all the declarations before executable statements.
Since Fortran 2008 the block construct allows automatic objects even after executable statements and in the main program:
program example
implicit none
real some_array(50)
some_array = ...
block
real new_array(COUNT(some_array.gt.0))
end block
end program example
Try this
real, dimension(50) :: some_array
real, dimension(:), allocatable :: other_array
integer :: status
...
allocate(other_array(count(some_array>0)),stat=status)
At the end of this sequence of statements other_array will have one element for each element of some_array greater than 0; there is no need to write a loop to count the positive elements of some_array.
Following @AlexanderVogt's advice, do check the status of the allocate statement.
You can use allocatable arrays for this task:
subroutine example(some_array)
real :: some_array(50)
real,allocatable :: new_array(:)
integer :: i, element_count, status
element_count = 0
do i=lbound(some_array,1),ubound(some_array,1)
if ( some_array(i) > 0 ) then
element_count = element_count + 1
endif
enddo
allocate( new_array(element_count), stat=status )
if ( status /= 0 ) stop 'cannot allocate memory'
! set values of new_array
end subroutine
You need to use an allocatable array (see this article for more on it). This would change your routine to
subroutine example(input_array,output_array)
real,intent(in) :: input_array(50) ! passed array of known dimension
real, intent(out), allocatable :: output_array(:)
integer :: element_count, i
element_count = 0
do i=1,50
if (input_array(i).gt.0) element_count = element_count+1
enddo
allocate(output_array(element_count))
end subroutine
Note that the intents may not be necessary, but are probably good practice. If you don't want to call a second array, it is possible to create a reallocate subroutine; though this would require the array to already be declared as allocatable.
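If the goal is also to copy the qualifying values, counting, allocating and filling can be collapsed into a single assignment with the pack intrinsic. This is a sketch with made-up sample data, and it relies on Fortran 2003 allocation on assignment, which some older compilers only enable with an option.

```fortran
program pack_demo
  implicit none
  real :: some_array(6)
  real, allocatable :: new_array(:)

  some_array = [1.5, -2.0, 0.0, 3.0, -0.5, 4.25]   ! made-up sample values

  ! pack keeps the elements where the mask is true; the assignment
  ! allocates new_array to the matching size automatically
  new_array = pack(some_array, some_array > 0)

  print *, size(new_array)   ! three of the sample values are positive
  print *, new_array
end program pack_demo
```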
