How can a Fortran program that iterates over an array and distributes the corresponding entries among a set of subroutines be designed efficiently?

I have a Fortran subroutine that receives a large unsorted array of a certain type and needs to call other subroutines that are responsible for parsing and storing each item, depending on one of the values declared inside it.
In my previous post, I shared a program that does just that but had a few design flaws, like allocating a large array for every type that needs to be parsed and only filling out the required values, or calling if (.not. allocated()) multiple times for every array element.
I have created another version of this program that addresses these downsides, but entails some other design paradigm issues that need to be improved upon:
module animal_farm
integer :: &
RABBIT_ID = 1, &
DOG_ID= 2, &
BIRD_ID= 3, &
HORSE_ID= 4, &
current_animal_id
type :: Animal
character(256) :: animal_type
integer :: &
age
end type Animal
type(Animal), dimension(:), allocatable, target :: & ! temporary arrays storing all the entries from large_animal_list for each animal
rabbit_entries, &
horse_entries, &
bird_entries, &
dog_entries
type(Animal), dimension(:), pointer :: &
current_animal_list
integer, dimension(:), allocatable :: animal_list_mapping
! this type and array is defined for every available animal, but only Rabbit is defined here to keep this example as simple as possible
type :: Rabbit
integer :: &
age, &
estimated_carrots_eaten ! parameters like this are defined differently for each animal, requiring a new *_params array for each type
end type Rabbit
type(Rabbit), dimension(:), allocatable :: & ! list of rabbit entries alongside parameters calculated specifically for rabbits
rabbit_params
integer, dimension(4) :: & ! number of available animals is 4
animal_ids, &
animal_counts, & ! temporary array to count the number of animals in large_animal_list
individual_animal_indeces ! temporary array that stores the current index of one of the animal specific lists
contains
subroutine parse_animals(large_animal_list)
type(Animal), dimension(:), intent(in) :: large_animal_list
integer :: i
allocate(animal_list_mapping(size(large_animal_list)))
animal_counts = 0
do i = 1, size(large_animal_list)
select case(large_animal_list(i)%animal_type)
case('rabbit')
current_animal_id = RABBIT_ID
case('horse')
current_animal_id = HORSE_ID
case('bird')
current_animal_id = BIRD_ID
case('dog')
current_animal_id = DOG_ID
end select
animal_counts(current_animal_id) = animal_counts(current_animal_id)+1
end do
allocate(rabbit_entries(animal_counts(RABBIT_ID)))
allocate(horse_entries(animal_counts(HORSE_ID)))
allocate(bird_entries(animal_counts(BIRD_ID)))
allocate(dog_entries(animal_counts(DOG_ID)))
individual_animal_indeces = 1
do i = 1, size(large_animal_list)
select case(large_animal_list(i)%animal_type)
case('rabbit')
current_animal_id = RABBIT_ID
current_animal_list => rabbit_entries
case('horse')
current_animal_id = HORSE_ID
current_animal_list => horse_entries
case('bird')
current_animal_id = BIRD_ID
current_animal_list => bird_entries
case('dog')
current_animal_id = DOG_ID
current_animal_list => dog_entries
end select
current_animal_list(individual_animal_indeces(current_animal_id))%age = large_animal_list(i)%age
animal_list_mapping(i) = individual_animal_indeces(current_animal_id)
individual_animal_indeces(current_animal_id) = individual_animal_indeces(current_animal_id)+1
end do
if (animal_counts(RABBIT_ID)>0) call parse_rabbit_information(rabbit_entries)
! if (animal_counts(HORSE_ID)>0) call parse_horse_information(horse_entries)
! if (animal_counts(BIRD_ID)>0) call parse_bird_information(bird_entries)
! if (animal_counts(DOG_ID)>0) call parse_dog_information(dog_entries)
end subroutine parse_animals
subroutine parse_rabbit_information(rabbit_entries)
type(Animal), dimension(:), intent(in) :: rabbit_entries
integer :: i
allocate(rabbit_params(size(rabbit_entries)))
do i=1, size(rabbit_entries)
rabbit_params(i)%age = rabbit_entries(i)%age
rabbit_params(i)%estimated_carrots_eaten = rabbit_entries(i)%age*10*365
end do
end subroutine parse_rabbit_information
subroutine feed_rabbit(animal_list_index)
integer, intent(in) :: animal_list_index
integer :: rabbit_params_index
rabbit_params_index = animal_list_mapping(animal_list_index)
rabbit_params(rabbit_params_index)%estimated_carrots_eaten = rabbit_params(rabbit_params_index)%estimated_carrots_eaten+1
end subroutine feed_rabbit
end module animal_farm
Program TEST
use animal_farm
type(Animal), dimension(10) :: my_animal_list
my_animal_list(1)%animal_type = "rabbit"
my_animal_list(1)%age = 5
my_animal_list(2)%animal_type = "dog"
my_animal_list(2)%age = 6
my_animal_list(3)%animal_type = "horse"
my_animal_list(3)%age = 1
my_animal_list(4)%animal_type = "rabbit"
my_animal_list(4)%age = 3
my_animal_list(5)%animal_type = "bird"
my_animal_list(5)%age = 4
my_animal_list(6)%animal_type = "horse"
my_animal_list(6)%age = 6
my_animal_list(7)%animal_type = "rabbit"
my_animal_list(7)%age = 2
my_animal_list(8)%animal_type = "rabbit"
my_animal_list(8)%age = 2
my_animal_list(9)%animal_type = "dog"
my_animal_list(9)%age = 4
my_animal_list(10)%animal_type = "horse"
my_animal_list(10)%age = 7
call parse_animals(my_animal_list)
call feed_rabbit(1)
call feed_rabbit(4)
End Program TEST
This version only calls each subroutine responsible for handling the different item types once, and passes an array that already has the correct size and can simply be allocated in the target subroutine. If possible, I would like to improve the following points:
1. The current solution involves two loops: the first counts the number of occurrences of each item type, and the second fills the now-allocated arrays that are passed to the subroutines with the corresponding values. This requires helper arrays such as animal_counts or individual_animal_indeces, which in turn need to know how many different types of animals they have to account for (hardcoded to 4 in the example). I also tried using a linked-list structure to improve this, which allowed me to use only one loop, but the values corresponding to each type still need to be stored in an array of the correct size.
2. To address the issues from point 1, I thought about placing the defined *_ID variables in an array, so the helper arrays can be declared with integer, dimension(size(animal_id_array)). The defined *_ID variables are also used as array indices, which requires them to be defined by hand from 1 to x. It is not very clean to have to add and remove ids from a list like this and redefine the array where they are stored every time an id is added or removed. The ids can be generated with an enum, bind(c) block and its enumerator statements, but to get the number of ids you still need to create a separate array or hardcode the count somewhere.
How can this program be modified to improve its performance and memory-efficiency without making it needlessly difficult to read and maintain?

How can this program be modified to improve its performance and memory-efficiency without making it needlessly difficult to read and maintain?
Working towards all three of these goals at once is almost always difficult, and sometimes outright impossible. Unless you have specific reasons to do otherwise, I would recommend first focussing on making your code easy to read and maintain, and only then trying to improve its performance and memory-efficiency. The latter step should only be done after profiling your code to see which bits actually need optimising.
With that in mind, let's see if we can simplify your code a bit. Since you already have a number of types, let's go full object-oriented, and introduce some polymorphism.
If we're inheriting Rabbit from Animal, we can avoid storing the animal_type field, and instead generate it using a type-bound procedure, something like
module animal_mod
implicit none
! Define the base Animal type.
type, abstract :: Animal
integer :: age
contains
procedure(animal_type_Animal), deferred, nopass :: animal_type
end type
! Define the interface for the `animal_type` functions.
abstract interface
function animal_type_Animal() result(output)
character(256) :: output
end function
end interface
end module
and
module rabbit_mod
use animal_mod
implicit none
! Define the `Rabbit` type as an extension of the `Animal` type.
! Note that `Rabbit` has an `age` because it is an `Animal`.
type, extends(Animal) :: Rabbit
integer :: estimated_carrots_eaten
contains
procedure, nopass :: animal_type => animal_type_Rabbit
end type
contains
! Define the implementation of `animal_type` for the `Rabbit` type.
function animal_type_Rabbit() result(output)
character(256) :: output
output = "rabbit"
end function
end module
Now we want to be able to create an array of animals. Fortran doesn't allow the elements of a single array to have different dynamic types, so we need to define a type which contains an animal and which can be made into an array. Something like
module animal_box_mod
use animal_mod
implicit none
type :: AnimalBox
class(Animal), allocatable :: a
end type
end module
We can now create an array of animals, e.g.
type(AnimalBox) :: animals(3)
animals(1)%a = Rabbit(age=3, estimated_carrots_eaten=0)
animals(2)%a = Frog(age=3, estimated_bugs_eaten=4, length=1.7786)
animals(3)%a = Mouse(age=4, estimated_cheese_eaten=7, coat="Yellow")
Rather than using a procedure like feed_rabbit(7), you can use a type-bound procedure. If we add this as
module rabbit_mod
type, extends(Animal) :: Rabbit
... ! as above
contains
... ! as above
procedure :: feed
end type
contains
... ! as above
subroutine feed(this)
class(Rabbit), intent(inout) :: this
this%estimated_carrots_eaten = this%estimated_carrots_eaten + 1
end subroutine
end module
then we can call this using our animals array as
select type(a => animals(1)%a); type is(Rabbit)
call a%feed()
end select
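Putting the pieces above together, a minimal sketch might look like the following. It assumes everything lives in one module (animal_sketch_mod is a made-up name), defines only Rabbit, and uses allocate with source= instead of polymorphic assignment, which some older compilers handle more reliably.

```fortran
module animal_sketch_mod
  implicit none

  ! Abstract base type: every animal has an age and can report its type name.
  type, abstract :: Animal
    integer :: age
  contains
    procedure(animal_type_Animal), deferred, nopass :: animal_type
  end type

  abstract interface
    function animal_type_Animal() result(output)
      character(256) :: output
    end function
  end interface

  ! Rabbit extends Animal, so it inherits the age component.
  type, extends(Animal) :: Rabbit
    integer :: estimated_carrots_eaten
  contains
    procedure, nopass :: animal_type => animal_type_Rabbit
    procedure :: feed
  end type

  ! Wrapper so that one array can hold animals of different dynamic types.
  type :: AnimalBox
    class(Animal), allocatable :: a
  end type

contains

  function animal_type_Rabbit() result(output)
    character(256) :: output
    output = "rabbit"
  end function

  subroutine feed(this)
    class(Rabbit), intent(inout) :: this
    this%estimated_carrots_eaten = this%estimated_carrots_eaten + 1
  end subroutine

end module animal_sketch_mod

program animal_sketch
  use animal_sketch_mod
  implicit none
  type(AnimalBox) :: animals(2)
  integer :: i

  allocate(animals(1)%a, source=Rabbit(age=3, estimated_carrots_eaten=0))
  allocate(animals(2)%a, source=Rabbit(age=5, estimated_carrots_eaten=10))

  do i = 1, size(animals)
    ! select type gives access to the Rabbit-specific components and bindings.
    select type (a => animals(i)%a)
    type is (Rabbit)
      call a%feed()
      print *, trim(a%animal_type()), a%age, a%estimated_carrots_eaten
    end select
  end do
end program animal_sketch
```

Further animal types would each add their own `type is` branch, or a deferred `feed` binding on Animal could remove the need for select type altogether.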

Related

How to directly use the return of maxloc as index of an array [duplicate]

Is there possibility to use indexing directly on a function's return value? Something like this:
readStr()(2:5)
where readStr() is a function which returns a character string or an array. In many other languages it is quite possible, but what about Fortran? The syntax in my example of course does not compile. Is there any other syntax to be used?
No, that is not possible in Fortran. You could, however, alter your function to take an additional index array that determines which elements are returned. This example illustrates this possibility using an interface to allow for an optional specification of the indices (simplified greatly thanks to the comment by IanH):
module test_mod
implicit none
contains
function squareOpt( arr, idx ) result(res)
real, intent(in) :: arr(:)
integer, intent(in), optional :: idx(:)
real,allocatable :: res( : )
real :: res_( size(arr) )
integer :: stat
! Calculate as before
res_ = arr*arr
if ( present(idx) ) then
! Take the sub-set
allocate( res(size(idx)), stat=stat )
if ( stat /= 0 ) stop 'Cannot allocate memory!'
res = res_(idx)
else
! Take the whole array
allocate( res(size(arr)), stat=stat )
if ( stat /= 0 ) stop 'Cannot allocate memory!'
res = res_
endif
end function
end module
program test
use test_mod
implicit none
real :: arr(4)
integer :: idx(2)
arr = [ 1., 2., 3., 4. ]
idx = [ 2, 3]
print *, 'w/o indices',squareOpt(arr)
print *, 'w/ indices',squareOpt(arr, idx)
end program
No.
But if it bothers you, you can write your own user defined functions and operators to achieve a similar outcome without having to store the result of the function reference in a separate variable.
You can avoid declaring another variable if you use associate. Whether it is any better or clearer than a temporary variable must be decided by the user. The result has to be stored somewhere anyway.
associate(str=>readStr())
print *, str(2:5)
end associate
It will not be very useful for this specific case with a potentially long string but might be more useful for other similar cases that get linked here as duplicates.
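As a self-contained sketch of the associate approach, the program below applies it to both a character result and an array result. The body of readStr and the helper squares are assumptions made up for illustration.

```fortran
program associate_sketch
  implicit none

  ! Substring of a character function result.
  associate (str => readStr())
    print *, str(2:5)
  end associate

  ! Array section of an array-valued function result.
  associate (sq => squares(4))
    print *, sq(2:3)
  end associate

contains

  ! Hypothetical stand-in for the readStr function from the question.
  function readStr() result(s)
    character(len=12) :: s
    s = 'Hello, world'
  end function readStr

  ! Hypothetical helper returning the first n square numbers.
  function squares(n) result(res)
    integer, intent(in) :: n
    integer :: res(n)
    integer :: i
    res = [(i*i, i = 1, n)]
  end function squares

end program associate_sketch
```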

Fortran Array Size Allocation

I am looking to take data and create new arrays that are the same shape. However, I seem to be making Fortran unhappy with this syntax:
For instance I want to create an array that is the same size as the input data:
function allocate_size(input_data)
IMPLICIT NONE
REAL, INTENT(in) :: input_data
REAL, DIMENSION(SIZE(input_data, DIM=1), SIZE(input_data, DIM=2), SIZE(input_data, DIM=3), SIZE(input_data, DIM=4), &
SIZE(input_data, DIM=5)) :: new_data
end function allocate_size
or one that is slightly different when one of those SIZE(input_data, DIM=x) would be replaced with an integer. Is this an incorrect approach?
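One likely source of trouble in the snippet above is that input_data is declared as a scalar, while SIZE needs an array argument. A minimal sketch of the same idea follows, assuming the data is rank 5 and using made-up names (same_shape_mod, doubled); it is an illustration under those assumptions, not a confirmed diagnosis.

```fortran
module same_shape_mod
  implicit none
contains
  function doubled(input_data) result(new_data)
    ! Assumed-shape rank-5 dummy argument, so SIZE can query each dimension.
    real, intent(in) :: input_data(:,:,:,:,:)
    ! The result takes its extents from the input.
    real :: new_data(size(input_data, 1), size(input_data, 2), &
                     size(input_data, 3), size(input_data, 4), &
                     size(input_data, 5))
    new_data = 2.0 * input_data
  end function doubled
end module same_shape_mod

program same_shape_sketch
  use same_shape_mod
  implicit none
  real :: a(2,3,1,2,2)
  real, allocatable :: b(:,:,:,:,:)

  a = 1.0
  b = doubled(a)   ! b is allocated on assignment with the shape of the result
  if (any(shape(b) /= shape(a))) error stop 'shape mismatch'
  print *, shape(b)
end program same_shape_sketch
```

Replacing one of the size(input_data, DIM) expressions with an integer constant changes only that extent; the rest of the declaration stays the same.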

Fortran calling final routine on arrays [duplicate]

I am writing object destructors in Fortran, using the final keyword in the type definition. But the destructor is not called when an array of those instances leaves the scope.
module problem_module
type :: destructable_object
integer :: nr
contains
final :: destruct
end type destructable_object
type :: collection
type(destructable_object) :: single_member
type(destructable_object), dimension(3) :: multiple_members
end type collection
contains
subroutine destruct(instance)
type(destructable_object), intent(in) :: instance
write(*,*) "Destruct ",instance%nr
end subroutine destruct
end module problem_module
In this example module, any scalar of the destructable_object type will be deconstructed with the destruct routine induced by the final keyword. Arrays of the destructable_object type will, however, not be deconstructed. Destructable objects that are components of other types are likewise only deconstructed properly if they are scalar (in this example, single_member gets deconstructed properly, multiple_members does not). This is independent of whether the containing object is itself in an array. So, for example
program main
! Destructors are only called at end of subroutines, not at the end of the program.
! Therefore, I move the entire program to a subroutine.
call main_execute
contains
subroutine main_execute
use problem_module
implicit none
type(destructable_object) :: single_instance
type(destructable_object), dimension(3) :: multiple_instances
type(collection) :: single_collection
type(collection), dimension(3) :: multiple_collections
single_instance%nr = 1
multiple_instances(1)%nr = 2
multiple_instances(2)%nr = 3
multiple_instances(3)%nr = 4
single_collection%single_member%nr = 5
single_collection%multiple_members(1)%nr = 6
single_collection%multiple_members(2)%nr = 7
single_collection%multiple_members(3)%nr = 8
multiple_collections(1)%single_member%nr = 9
multiple_collections(1)%multiple_members(1)%nr = 10
multiple_collections(1)%multiple_members(2)%nr = 11
multiple_collections(1)%multiple_members(3)%nr = 12
multiple_collections(2)%single_member%nr = 13
multiple_collections(2)%multiple_members(1)%nr = 14
multiple_collections(2)%multiple_members(2)%nr = 15
multiple_collections(2)%multiple_members(3)%nr = 16
multiple_collections(3)%single_member%nr = 17
multiple_collections(3)%multiple_members(1)%nr = 18
multiple_collections(3)%multiple_members(2)%nr = 19
multiple_collections(3)%multiple_members(3)%nr = 20
end subroutine main_execute
end program main
returns
Destruct 1
Destruct 5
Destruct 9
Destruct 13
Destruct 17
These are exactly the scalar instances of the destructable object and none of the arrays, regardless of where they occur. If I want an array of objects with a destructor, I can work around this by adding a layer of indirection with a container object. That seems clumsy, requiring nested %-constructions or a bunch of pointers. Is there a more elegant way to force the destruction of arrays?
Different final subroutines must be specified for scalar and for each rank of the array of the type to finalize. E.g.:
module problem_module
type :: destructable_object
integer :: nr
contains
final :: destruct, destruct_array
end type destructable_object
type :: collection
type(destructable_object) :: single_member
type(destructable_object), dimension(3) :: multiple_members
end type collection
contains
subroutine destruct(instance)
type(destructable_object), intent(in) :: instance
write(*,*) "Destruct ",instance%nr
end subroutine destruct
subroutine destruct_array(instance)
type(destructable_object), intent(in) :: instance(:)
write(*,*) "Destruct array ",instance%nr
end subroutine destruct_array
end module problem_module
To avoid multiple definitions, you can add the elemental attribute to the subroutine:
module problem_module
type :: destructable_object
integer :: nr
contains
final :: destruct
end type destructable_object
type :: collection
type(destructable_object) :: single_member
type(destructable_object), dimension(3) :: multiple_members
end type collection
contains
impure elemental subroutine destruct(instance)
type(destructable_object), intent(in) :: instance
write(*,*) "Destruct ",instance%nr
end subroutine destruct
end module problem_module
(I also added impure so that a write can be used in an elemental procedure.)
I strongly suggest using a more recent compiler. Full support for final procedures may not be available in older compilers.

Better way to mask a Fortran array?

I am wanting to mask a Fortran array. Here's the way I am currently doing it...
where (my_array <=15.0)
mask_array = 1
elsewhere
mask_array = 0
end where
So then I get my masked array with:
masked = my_array * mask_array
Is there a more concise way to do this?
Use the MERGE intrinsic function:
masked = my_array * merge(1,0,my_array<=15.0)
Or, sticking with where,
masked = 0
where (my_array <=15.0) masked = my_array
I expect that there are differences, in speed and memory consumption, between the use of where and the use of merge but off the top of my head I don't know what they are.
There are two different approaches already given here: one retaining where and one using merge. In the first, High Performance Mark mentions that there may be differences in speed and memory use (think about temporary arrays). I'll point out another potential consideration (without making a value judgment).
subroutine work_with_masked_where(my_array)
real, intent(in) :: my_array(:)
real, allocatable :: masked(:)
allocate(masked(SIZE(my_array)), source=0.)
where (my_array <=15.0) masked = my_array
! ...
end subroutine
subroutine work_with_masked_merge(my_array)
real, intent(in) :: my_array(:)
real, allocatable :: masked(:)
masked = MERGE(my_array, 0., my_array<=15.)
! ...
end subroutine
That is, the merge solution can use automatic allocation. Of course, there are times when one doesn't want this (such as when working with lots of my_arrays of the same size: there are often overheads when checking array sizes in these cases): use masked(:) = MERGE(...) after handling the allocation (which may be relevant even for the question code).
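The two variants can be compared side by side in a small sketch. The sample values are made up, and the check only confirms that both approaches give the same masked array; it says nothing about speed or temporaries.

```fortran
program mask_sketch
  implicit none
  real :: my_array(5)
  real, allocatable :: masked_where(:), masked_merge(:)

  my_array = [3.0, 20.0, 15.0, 42.0, 7.5]

  ! where: the target must already be allocated with the right shape.
  allocate(masked_where(size(my_array)), source=0.0)
  where (my_array <= 15.0) masked_where = my_array

  ! merge: assigning to an unallocated allocatable array allocates it.
  masked_merge = merge(my_array, 0.0, my_array <= 15.0)

  if (any(masked_where /= masked_merge)) error stop 'results differ'
  print *, masked_merge
end program mask_sketch
```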
I find it useful to define a function where which takes an array of logicals and returns the integer indices of the .true. values, so e.g.
x = where([.true., .false., .false., .true.]) ! sets `x` to [1, 4].
This function can be defined as
function where(input) result(output)
logical, intent(in) :: input(:)
integer, allocatable :: output(:)
integer :: i
output = pack([(i, i=1, size(input))], input)
end function
With this where function, your problem can be solved as
my_array(where(my_array>15.0)) = 0
This is probably not the most performant way of doing this, but I think it is very readable and concise. This where function can also be more flexible than the where construct, as it can be used e.g. for specific dimensions of multi-dimensional arrays.
Limitations:
Note however that (as @francescalus points out) this will not work for arrays which are not 1-indexed. This limitation cannot easily be avoided, as performing comparison operations on such arrays drops the indexing information, e.g.
real :: my_array(-2:2)
logical, allocatable :: mask(:)
my_array(-2:2) = [1,2,3,4,5]
mask = my_array>3
write(*,*) lbound(mask), ubound(mask) ! Prints "1 5".
For arrays which are not 1-indexed, in order to use this where function you would need the rather ugly
my_array(where(my_array>15.0)+lbound(my_array,1)-1) = 0
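A possible variation, sketched here under the assumption that the caller is willing to pass the lower bound explicitly, hides the offset arithmetic inside the function. The names true_indices and first are hypothetical, chosen to avoid reusing the where keyword.

```fortran
module true_indices_mod
  implicit none
contains
  ! Returns the positions of the .true. elements of mask.
  ! first (optional) is the lower bound of the array the mask was built from.
  function true_indices(mask, first) result(idx)
    logical, intent(in) :: mask(:)
    integer, intent(in), optional :: first
    integer, allocatable :: idx(:)
    integer :: i, offset

    offset = 0
    if (present(first)) offset = first - 1
    idx = pack([(i + offset, i = 1, size(mask))], mask)
  end function true_indices
end module true_indices_mod

program true_indices_sketch
  use true_indices_mod
  implicit none
  real :: my_array(-2:2)

  my_array = [10.0, 20.0, 5.0, 30.0, 1.0]
  ! Zero every element above 15, using indices in the array's own numbering.
  my_array(true_indices(my_array > 15.0, lbound(my_array, 1))) = 0.0
  if (any(my_array /= [10.0, 0.0, 5.0, 0.0, 1.0])) error stop 'unexpected result'
  print *, my_array
end program true_indices_sketch
```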

fortran loop a list of 2D arrays using pointers

I have allocated a lot of 2D arrays in my code, and I want each array to be read from a file named after the array. The problem is that each array has a different size, so I am looking for the most efficient way. The code is like this:
Module Test
USE ...
implicit NONE
private
public:: initializeTest, readFile
real(kind=8),dimension(:,:),allocatable,target:: ar1,ar2,ar3,ar4,ar5,...,ar10
real(kind=8),dimension(:,:),pointer:: pAr
CONTAINS
!
subroutine initializeTest
integer:: k1,k2,k3,k4,k5
integer:: ind1,ind2
allocate(ar1(k1,k1),ar2(k1,k2),ar3(k2,k4),ar4(k5,k5),...) !variable sizes
! here needs automatization - since its repeated
pAr => ar1
ind1 = size(pAr,1)
ind2 = size(pAr,2)
call readFile(par,ind1,ind2)
pAr => ar2
ind1 = size(pAr,1)
ind2 = size(pAr,2)
call readFile(par,ind1,ind2)
!....ar3, ... , ar9
pAr => ar10
ind1 = size(pAr,1)
ind2 = size(pAr,2)
call readFile(par,ind1,ind2)
end subroutine initializeTest
!
!
subroutine readFile(ar,row,col)
integer:: i,j,row,col
real(kind=8),dimension(row,col):: ar
! it should open the file with same name as 'ar'
open(unit=111,file='ar.dat')
do i = 1, row
read(111,*) (ar(i,j),j=1,col)
enddo
close(111)
end subroutine readFile
!
!
end module Test
If your arrays ar1, ar2, etc. had the same dimensions you could put them all in a 3-dimensional array. Since they have different dimensions, you can define a derived type, call it a "matrix", with an allocatable array component and then create an array of that derived type. Then you can read the i'th matrix from a file such as "input_1.txt" for i=1.
The program below, which works with g95 and gfortran, shows how the derived type can be declared and used.
module foo
implicit none
type, public :: matrix
real, allocatable :: xx(:,:)
end type matrix
end module foo
program xfoo
use foo, only: matrix
implicit none
integer, parameter :: nmat = 9
integer :: i
character (len=20) :: fname
type(matrix) :: y(nmat)
do i=1,nmat
allocate(y(i)%xx(i,i))
write (fname,"('input_',i0)") i
! in actual code, read data into y(i)%xx from file fname
y(i)%xx = 0.0
print*,"read from file ",trim(fname)
end do
end program xfoo
As far as I know, extracting the name of the variable from the variable at runtime isn't going to work.
If you need lots of automation for the arrays, consider using an array of a derived type, as one other answer suggests, in order to loop over them both for allocation and reading. Then you can enumerate the files, or store a label with the derived type.
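A rough sketch of the "store a label with the derived type" idea follows. The label component, the .dat suffix and the read_matrix routine are assumptions for illustration, and the program writes its own small demo files first so that it runs without external input.

```fortran
program labelled_matrices
  implicit none

  type :: matrix
    character(len=20) :: label          ! hypothetical: base name of the data file
    real, allocatable :: xx(:,:)
  end type matrix

  type(matrix) :: mats(2)
  integer :: k, i, fd

  mats(1)%label = 'ar1'; allocate(mats(1)%xx(2,2))
  mats(2)%label = 'ar2'; allocate(mats(2)%xx(2,3))

  ! Create demo input files, one row per line, then clear the arrays.
  do k = 1, size(mats)
    mats(k)%xx = real(k)
    open(newunit=fd, file=trim(mats(k)%label)//'.dat', status='replace', action='write')
    do i = 1, size(mats(k)%xx, 1)
      write(fd, *) mats(k)%xx(i, :)
    end do
    close(fd)
    mats(k)%xx = 0.0
  end do

  ! The same loop handles every matrix, whatever its shape.
  do k = 1, size(mats)
    call read_matrix(mats(k))
    if (abs(sum(mats(k)%xx) - real(k*size(mats(k)%xx))) > 1.0e-4) error stop 'read mismatch'
    print *, trim(mats(k)%label), shape(mats(k)%xx)
  end do

contains

  ! Reads m%xx row by row from the file named after m%label.
  subroutine read_matrix(m)
    type(matrix), intent(inout) :: m
    integer :: i, fd
    open(newunit=fd, file=trim(m%label)//'.dat', status='old', action='read')
    do i = 1, size(m%xx, 1)
      read(fd, *) m%xx(i, :)
    end do
    close(fd)
  end subroutine read_matrix

end program labelled_matrices
```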
Sticking to specific array names, an alternative is to just read/write the files with the required name as an argument to the routine:
Module Test
...
! here needs automatization - since its repeated
call readFile(ar1,'ar1')
call readFile(ar2,'ar2')
!....ar3, ... , ar9
call readFile(ar10,'ar10')
end subroutine initializeTest
subroutine readFile(ar,label)
real(kind=8) :: ar(:,:)
character(len=*) :: label
integer:: i,j,nrow,ncol,fd
nrow=size(ar,1)
ncol=size(ar,2)
open(newunit=fd,file=label)
do i = 1, nrow
read(fd,*) (ar(i,j),j=1,ncol)
enddo
end subroutine readFile
end module Test
Some unsolicited comments: I don't really see why (in this example) readFile is public, or why the pointers are needed. Also, kind=8 shouldn't be used (see Fortran 90 kind parameter).
