Arrays and derived types - arrays

For my new project, I have to use an array instead of a scratch file to store information from users. To do this, I need to create derived types, too.
However, I don't yet understand what an array or a derived type is, how to use them, what they can do, or other basic ideas.
Can anyone give me some information about arrays and derived types?
I wrote code for them, but I don't know whether it is written correctly.
If anyone can check this for me, I would appreciate it.
Here are my array and derived types:
! derived type
TYPE Bank
INTEGER :: acNumber, acChecks
REAL :: acBlance, acRate
CHARACTER :: acType*1, acLName*15, acFName*15
END TYPE
! array
INTEGER, PARAMETER :: MaxRow, MaxColum = 7
INTEGER, DIMENSION(MaxRow:MaxColum) :: AccountData

If you are a Fortran programmer, you have probably seen a subroutine accepting 10 to 15 arguments. If you think about it, that's insane (there are too many, and you run the risk of swapping them), and you quickly realize that some arguments always travel together.
It would make sense to pack them into a single entity that carries everything around as a whole, not as independent entities. This reduces the number of arguments considerably, leaving you only the burden of finding the proper grouping. This single entity is the type.
In your code, you say that a Bank is an aggregate of those pieces of information. You can now declare a concrete variable of that type, which represents and provides access to the individual components acNumber, acChecks and so on. To access them, you use the % symbol. So if your Bank variable is called b, you can say for example
b%acNumber = 5
You can imagine b as a closet containing different shelves. When you move the closet, all the shelves and their contents move together.
An array is a group of entities of the same type (say, integer, or Character(len=1024), or Bank) stored one after another, so you can access each of them with a numeric index. Remember that, unless specified otherwise, array indexes in Fortran start at 1 (in many other major languages, such as C, the first index is zero instead).
As for your code, I suggest that you:
write
INTEGER, DIMENSION(MaxRow:MaxColum) :: AccountData
as
INTEGER :: AccountData(MaxRow,MaxColum)
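It is the same style of declaration, but you write less. Please also note that there is a difference between using the : and the ,. With the colon, MaxRow:MaxColum gives the lower and upper bound of a single dimension, so you get a one-dimensional array. If you want to define a matrix (your case), which is a two-dimensional array, you have to use the comma. What you wrote is wrong. Note also that every PARAMETER needs a value, so MaxRow must be given one as well as MaxColum.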
for the strings, it's better if you write
CHARACTER :: acType*1, acLName*15, acFName*15
as
CHARACTER(LEN=1) :: acType
CHARACTER(LEN=15) :: acLName
CHARACTER(LEN=15) :: acFName
In this case you write more, but the *length notation is the older Fortran 77 style and the LEN= form is the modern one (I believe the old style is at least partly obsolescent, but I could be wrong).
Also, remember that it's better to write one member variable per line in a type. It's a matter of taste, but I prefer to see the full size of a type by having one line per member variable.
For MaxRow and MaxColum, I would write MAX_ROWS and MAX_COLUMNS. By tradition, parameters and other constants are given all-capital, underscore-separated names in many languages.
Edit: to answer your comment, here is an example of the use of an array
$ more foo.f90
program test
integer :: myarray(10)
myarray = 0 ! equivalent to zeroing the single elements one by one
myarray(2) = 5
myarray(7) = 10
print *, myarray
end program
$ g95 foo.f90 -o foo
$ ./foo
0 5 0 0 0 0 10 0 0 0
An array is just like multiple variables with the same name, identified by an index. It is very useful for expressing vectors or matrices.
You can of course make an array of an aggregate type you define, instead of a predefined type (e.g. integer).
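As a rough sketch of that last point, here is an array whose elements are of a derived type. The type is a cut-down version of the Bank type from the question; the component name acBalance, the capacity MAX_ACCOUNTS and all the values are made up for illustration.

```fortran
program bank_demo
  implicit none

  ! Cut-down version of the question's Bank type, one component per line.
  type :: Bank
    integer :: acNumber
    real :: acBalance
    character(len=15) :: acLName
  end type Bank

  integer, parameter :: MAX_ACCOUNTS = 3   ! hypothetical capacity
  type(Bank) :: accounts(MAX_ACCOUNTS)     ! an array whose elements are Banks
  integer :: i

  ! Fill each element; % selects a component of one element.
  do i = 1, MAX_ACCOUNTS
    accounts(i)%acNumber = 1000 + i
    accounts(i)%acBalance = 100.0 * i
    accounts(i)%acLName = 'Customer'
  end do

  ! A structure constructor sets all components of one element at once.
  accounts(2) = Bank(2002, 250.0, 'Smith')

  do i = 1, MAX_ACCOUNTS
    print '(I6, F10.2, 1X, A)', accounts(i)%acNumber, accounts(i)%acBalance, &
          trim(accounts(i)%acLName)
  end do
end program bank_demo
```

The index picks the element and % picks the component, so accounts(2)%acBalance reads as "the balance of the second account".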

An array is an ordered list of variables, all of the same type, indexed by integers. See Array in Wikipedia. Note that Fortran array indexing is more flexible than in most other low-level languages: instead of a single index per dimension, you can give an index triplet consisting of lower bound, upper bound, and stride. In that case the expression designates a subarray (an array section) rather than a single element.
A derived type is a composite type defined by the user, made up of multiple components which can be of different types. In some other languages these are known as structs, structure types, or record types. See Record in Wikipedia.
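A small sketch of the index triplet may help here. The array v and its contents are only for illustration.

```fortran
program section_demo
  implicit none
  integer :: v(10), i

  v = [(i, i = 1, 10)]      ! v holds 1, 2, ..., 10

  print *, v(3)             ! a single element
  print *, v(2:5)           ! elements 2 to 5
  print *, v(1:10:3)        ! stride 3: elements 1, 4, 7, 10
  print *, v(10:1:-2)       ! negative stride: elements 10, 8, 6, 4, 2

  v(2:10:2) = 0             ! a section can also be assigned to
  print *, v
end program section_demo
```

Each bound or the stride can be omitted, in which case the declared bound or a stride of 1 is used, so v(:) is the whole array.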
You can also make an array of a derived type, or you can have a derived type where one or more components are themselves arrays, or for that matter, other derived types. It's up to you!
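To illustrate the combinations mentioned above, here is a minimal sketch of a derived type that contains an array of another derived type. The types Point and Polygon are hypothetical and not taken from the question.

```fortran
program nested_demo
  implicit none

  type :: Point
    real :: x, y
  end type Point

  type :: Polygon
    character(len=10) :: label
    integer :: nVertices
    type(Point) :: vertex(4)      ! array component of another derived type
  end type Polygon

  type(Polygon) :: square
  integer :: i

  square%label = 'unit'
  square%nVertices = 4
  square%vertex = [Point(0.0, 0.0), Point(1.0, 0.0), Point(1.0, 1.0), Point(0.0, 1.0)]

  ! Chain % and the index to reach a component of an element of a component.
  do i = 1, square%nVertices
    print '(2F6.2)', square%vertex(i)%x, square%vertex(i)%y
  end do
end program nested_demo
```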
The easiest way to check your code is to try to compile it. Making it past the compiler is of course no guarantee that the program works as expected, but it certainly is a required step.

Related

Is there any reason to pass array by first item?

In a rather old style Fortran project, I often see this pattern, when an array is passed by its first item:
program test
implicit none
integer :: a(10)
a(:) = 1
call sub(a(1), 10) ! here
contains
subroutine sub(a, length)
integer, intent(in) :: length
integer, intent(in) :: a(length)
print *, a
end subroutine
end program
where it could be:
call sub(a, 10) ! here
which is valid even in Fortran 77.
Note that the size of the array has to be passed and used explicitly; this will not work for an assumed-shape array:
subroutine sub(a)
integer, intent(in) :: a(:)
print *, a
end subroutine
For me this is confusing, as the call suggests a scalar is passed to the subroutine. I suppose it works because the array is passed by reference.
Is there any reason to pass arrays like this, especially now?
One uses this feature when using the older interface to the non-blocking MPI routines.
Say you want to pass subarray A(10,10:19) which is a part of a bigger array A(1:100,1:100).
If you pass
use mpi
call MPI_Isend(A(10,10:19), 10, MPI_REAL, ...
the compiler may pass a temporary copy of the array section, and the address of that temporary copy will no longer be valid at the time of the MPI_Wait. Therefore, instead, you create an MPI derived type that describes the offsets of the elements to be sent, and you use it as
use mpi
call MPI_Isend(A(10,10), 1, derived_type, ...
Of course, with the most modern MPI libraries and compilers you can use the mpi_f08 module instead. However, most HPC codes in the wild do not use it yet.
Another solution is to use an MPI derived type that includes the absolute address of the subarray and just pass A. Sometimes it is practical, sometimes it is not. It depends on how much the subarrays passed vary throughout the code.
Be aware that there are other issues with non-blocking MPI in the old interface, and it helps if you explicitly mark the buffers as ASYNCHRONOUS.
Consider the following example:
implicit none
integer a(2,2)
a = RESHAPE([1,2,3,4],[2,2])
call sub(a(2,1))
print '(2I3)', TRANSPOSE(a)
contains
subroutine sub(b)
integer, intent(out) :: b(2)
b = -1
end subroutine sub
end
Here, the element sequence represented by the actual argument a(2,1) is a(2,1), a(1,2), a(2,2), and the first two elements are associated with the dummy argument b.
For arrays of rank greater than one, this use of an element sequence can make it (much) easier to specify certain consecutive elements of the actual argument. For a rank-1 array we can write a(3:) instead of a(3), say, as the actual argument. For higher-rank arrays we don't have that option.
I won't express an opinion on whether this is a good use of this feature.
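For the rank-1 case, the two spellings can be compared side by side. This is only a sketch; the subroutine fill and its arguments are invented for the comparison.

```fortran
program seq_assoc
  implicit none
  integer :: a(5)

  a = 0
  call fill(a(3), 2, 7)    ! element form: the dummy covers a(3) and a(4)
  call fill(a(4:), 2, 9)   ! section form: the dummy covers a(4) and a(5)
  print '(5I3)', a         ! prints  0  0  7  9  9
contains
  subroutine fill(b, n, val)
    integer, intent(in) :: n, val
    integer, intent(out) :: b(n)   ! explicit-shape dummy, so sequence association applies
    b = val
  end subroutine fill
end program seq_assoc
```

Both calls hand the subroutine a run of consecutive elements; the section form simply makes it visible at the call site that an array is being passed.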

Native Fortran type signature for "list of variable length strings"?

In Fortran, is there any way to declare an "array of allocatable arrays", that doesn't require wrapping the allocatable arrays into a derived type?
My main use-case would be to allow invoking a function with an array of variable-length strings, i.e. I am looking for a type signature matching the literal
["Hello", &
"World.", &
"How are you?"]
Motivation
In Fortran strings are natively represented as fixed-size character arrays, padded with blanks on the right. Arrays of strings are normally represented as arrays of (equal-length) character arrays, which I assume is in order to make them behave like a matrix of characters.
However, this also means that doing something like
CALL mySubroutine(["Hello","World.","How are you?"])
will result in a compiler error like
Error: Different CHARACTER lengths (5/4) in array constructor at (1)
A commonly suggested workaround (see e.g. Return an array of strings of different length in Fortran) is to use an array of derived types instead, e.g.
TYPE string
CHARACTER(LEN=:), ALLOCATABLE :: chars
END type string
! ...
CALL myStringSubroutine([string("Hello"), &
string("World."), &
string("How are you?")])
However, since there is no standardized string type of this kind, I am much more frequently seeing APIs using natively supported "workarounds" such as using fixed-size character strings, and trimming them when used. In this case the invocation would look like
CALL myFixedSubroutine(["Hello       ", &
"World.      ", &
"How are you?"])
While this is no problem in principle, it can become awkward and inefficient, if one of the strings is much longer than the others. It also has implications for version control, where now changing "... you?" to "... you??" means that the padding of the other lines has to be changed, creating spurious diffs.
(In the comments, a suggestion was given that at least automates the whitespace-padding.)
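One way to automate the padding, possibly what that comment had in mind (this is an assumption), is a Fortran 2003 array constructor with a type specification. The sketch below uses a hypothetical length of 12; every literal is blank-padded to that length by the compiler.

```fortran
program padded_strings
  implicit none
  character(len=12) :: words(3)
  integer :: i

  ! The type-spec gives every element length 12; shorter literals are blank-padded.
  words = [character(len=12) :: "Hello", "World.", "How are you?"]

  do i = 1, size(words)
    print '(A,I0)', '"' // trim(words(i)) // '" trimmed length ', len_trim(words(i))
  end do
end program padded_strings
```

The common length still has to be at least as long as the longest string, so this removes the manual padding but not the fixed-length representation itself.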
No, there is no way bar the wrapper type.
A fundamental concept in the language is that elements within an array may only vary in value. The allocation status of an object is not part of the value of that object.
(The allocation status of a component is part of the value of the object that has the component.)
A varying length string type is described in Part 2 of the Fortran standard.

Fortran 90 - Algebra operation with scalar and arrays

I am working with a Fortran 90 program that, amongst many others, has the following variables declared:
real(r8) :: smp_node_lf
real(r8), pointer :: sucsat(:,:)
real(r8), pointer :: h2osoi_vol(:,:)
real(r8), pointer :: watsat(:,:)
real(r8), pointer :: bsw(:,:)
And at some point in the program, there is an algebra operation that looks like this:
do j = 1,nlevgrnd
do c = 1,fn
...
smp_node_lf = -sucsat(c,j)*(h2osoi_vol(c,j)/watsat(c,j))**(-bsw(c,j))
...
end do
end do
I am trying to "translate" a dozen lines of this program to R, but the above excerpt in particular made me confused.
What is the dimension of smp_node_lf? Is it a scalar? Does it inherit the dimensions of the arrays sucsat, h2osoi_vol, watsat and bsw?
smp_node_lf has no dimensions because it is a scalar. Each pass through the loop assigns it the result of a scalar expression, overwriting the previous value unless something saves that value to an array.
It never inherits the dimensions of those arrays: sucsat(c,j) and the other operands are single elements, so everything it receives is scalar.
If you need to retrieve its values, and assuming the original code does so, there should be another statement inside this same loop that saves the value before the next pass overwrites it.
If there is no such statement, add one; you might be dealing with incomplete code that does not do what it was claimed to do.
I've dealt with my fair share of "perfect code" that "did miracles when I used last time" with not a single miracle to be found within its lines of code.
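A minimal sketch of saving the scalar on every pass might look like the following. The array smp_all, the kind definition for r8, the extents and the input values are all assumptions made so the example runs; they are not from the original program.

```fortran
program keep_values
  implicit none
  integer, parameter :: r8 = selected_real_kind(12)  ! assumed kind definition
  integer, parameter :: fn = 2, nlevgrnd = 3         ! hypothetical extents
  real(r8) :: sucsat(fn,nlevgrnd), h2osoi_vol(fn,nlevgrnd)
  real(r8) :: watsat(fn,nlevgrnd), bsw(fn,nlevgrnd)
  real(r8) :: smp_node_lf                 ! scalar, overwritten on every pass
  real(r8) :: smp_all(fn,nlevgrnd)        ! hypothetical array that keeps each value
  integer :: c, j

  ! Made-up input values, only so the example runs.
  sucsat = 100.0_r8
  h2osoi_vol = 0.2_r8
  watsat = 0.4_r8
  bsw = 5.0_r8

  do j = 1, nlevgrnd
    do c = 1, fn
      smp_node_lf = -sucsat(c,j)*(h2osoi_vol(c,j)/watsat(c,j))**(-bsw(c,j))
      smp_all(c,j) = smp_node_lf          ! save before the next pass overwrites it
    end do
  end do

  print *, 'last scalar value:', smp_node_lf
  print *, 'shape of saved array:', shape(smp_all)
end program keep_values
```

In a translation to R, the scalar inside the loop would correspondingly be a length-one value, and only something like smp_all would be a matrix.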

Fortran: Array of arbitrary dimension?

If I want to create an allocatable multidimensional array, I can say:
program test
real, dimension(:,:), allocatable :: x
integer :: i,j
allocate(x(5, 5))
do i = 1,size(x,1)
do j = 1,size(x,2)
x(i,j) = i*j
end do
end do
write(*,*) x
end program test
However, what if I don't know how many dimensions x will have? Is there a way to accommodate that?
Newer compilers allow the use of assumed-rank objects for interoperability.
I think that is what you are looking for. But this applies only to calls to functions or subroutines. The function or subroutine declares the dummy argument as assumed-rank, and the actual rank is passed along with the actual argument at runtime.
Example from IBM website:
REAL :: a0
REAL :: a1(10)
REAL :: a2(10, 20)
REAL, POINTER :: a3(:,:,:)
CALL sub1(a0)
CALL sub1(a1)
CALL sub1(a2)
CALL sub1(a3)
CONTAINS
SUBROUTINE sub1(a)
REAL :: a(..)
PRINT *, RANK(a)
END
END
follow this or that for more details
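Inside the routine, an assumed-rank dummy can only be used in limited ways until its rank is resolved. Assuming a compiler that supports the Fortran 2018 SELECT RANK construct (recent gfortran releases do), a sketch could look like this; the subroutine name describe is hypothetical.

```fortran
program rank_demo
  implicit none
  real :: s
  real :: v(4)
  real :: m(2,3)

  s = 1.0
  v = 2.0
  m = 3.0
  call describe(s)
  call describe(v)
  call describe(m)
contains
  subroutine describe(a)
    real, intent(in) :: a(..)          ! assumed-rank dummy argument
    select rank (a)
    rank (0)
      print *, 'scalar, value', a
    rank (1)
      print *, 'rank 1, sum', sum(a)
    rank (2)
      print *, 'rank 2, shape', shape(a)
    rank default
      print *, 'unhandled rank', rank(a)
    end select
  end subroutine describe
end program rank_demo
```

Within each rank block the dummy behaves as an ordinary array of that rank, so each rank still needs its own code.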
It looks to me like you're trying to carry out stencil computations across an array of rank-1, -2 or -3 -- this isn't quite the same as needing arrays of arbitrary rank. And assumed-rank arrays are really only applicable when passing an array argument to a routine, there's no mechanism even in the forthcoming standard for declaring an array to have a rank determined at run-time.
If you're impatient to get on with your code and your compiler doesn't yet implement TS 29113:2012 perhaps the following approach will appeal to you.
real, dimension(:,:,:), allocatable :: voltage_field
if (nd == 1) allocate(voltage_field(nx,1,1))
if (nd == 2) allocate(voltage_field(nx,ny,1))
if (nd == 3) allocate(voltage_field(nx,ny,nz))
Your current approach faces the problem of not knowing, in advance of knowing the number of dimensions in the field, how many nearest-neighbours to consider in the stencil, so you might find yourself writing 3 versions of each stencil update. If you simply abuse a rank-3 array of size nx*1*1 to represent a 1D problem (mutatis mutandis a 2D problem) you always have 3 sets of nearest-neighbours in each stencil calculation. It's just that in the flattened dimensions the nearest neighbour is, well, either a ghost cell containing a boundary value, or the cell itself if your space wraps round.
Writing your code to work always in 3 dimensions but to make no assumptions about the extent of at least two of them will, I think, be easier than writing rank-sensitive code. But I haven't given the matter a lot of thought and I haven't really thought too much about its impact on your f-d scheme.
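As a rough sketch of the always-rank-3 idea, here is a periodic nearest-neighbour (Laplacian-like) stencil applied to a 1D problem stored as nx*1*1. The array names and sizes are made up, and the periodic boundary is an assumption chosen so that cshift can be used.

```fortran
program stencil_demo
  implicit none
  integer, parameter :: nx = 8, ny = 1, nz = 1   ! a 1D problem stored as rank 3
  real, allocatable :: u(:,:,:), lap(:,:,:)
  integer :: d

  allocate(u(nx,ny,nz), lap(nx,ny,nz))
  u = 0.0
  u(4,1,1) = 1.0

  ! Sum both neighbours in every dimension, minus six times the centre.
  ! Along a dimension of extent 1, cshift returns the cell itself, so that
  ! dimension contributes 2*u - 2*u = 0.
  lap = -6.0 * u
  do d = 1, 3
    lap = lap + cshift(u, 1, d) + cshift(u, -1, d)
  end do

  print '(8F6.1)', lap(:,1,1)
end program stencil_demo
```

The same loop works unchanged if ny or nz is made larger than 1, which is the point of not writing rank-specific versions.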

Find a string value in an array of strings

I have an array of strings in a Fortran program. Not a single character string. I know that one of the values in the array is "foo". I want to know the index of the array that contains "foo". Is there a way to find the index other than a brute force loop? I obviously can't use the "minloc" routine since I'm not dealing with numerics here. Again, just to make sure: I am not searching for a substring in a string. I am searching for a string in an array of strings.
implicit none
integer i
character*8 a(100)
do i = 1,100
a(i)='foo'
enddo
a(42)='bar'
call f(a,len(a(1)),size(a)*len(a(1)),'bar ')
end
subroutine f(c,n,all,s)
implicit none
integer n,all
character*(*) s
character*(all) c
write(*,*)index(c,s)/n+1
end
a.out -> 42
Note that this code treats the entire array as one big string and searches for substrings, so it will also find matches that are not aligned with the element boundaries.
E.g. a false match occurs with adjacent entries such as:
a(2)='xxbar '
a(3)=' yyy'
Some additional work is required to ensure that the position you find is the start of an element, i.e. that (position - 1) is an integer multiple of n (of course, by the time you do that, a simple loop might look preferable).
Well, after thinking about it, I came up with this. It works if "foo" is known to be either absent from the array, or located in one and only one place:
character(len=3) :: tags(100)
integer :: test(100)
integer :: str_location, i
! populate "tags" however needed. Then search for "foo":
test=(/(i,i=1,100)/)
where (tags.ne."foo") test=0
str_location = sum(test)
I am guessing this is actually slower than the brute force loop, but it makes for compact code. I thought about filling "test" with ones and using maxloc, but that doesn't account for the possibility of "foo" being absent from the array. Opinions?
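For comparison, and assuming a compiler with Fortran 2008 support (the findloc intrinsic was added in that standard), the search can be written directly. The sketch below also shows the plain loop as a fallback; the test data are made up.

```fortran
program find_tag
  implicit none
  character(len=3) :: tags(100)
  integer :: str_location, i

  tags = 'bar'
  tags(42) = 'foo'

  ! findloc returns 0 when no element matches.
  str_location = findloc(tags, 'foo', dim=1)
  print *, 'findloc:', str_location

  ! Equivalent explicit loop for older compilers.
  str_location = 0
  do i = 1, size(tags)
    if (tags(i) == 'foo') then
      str_location = i
      exit
    end if
  end do
  print *, 'loop:   ', str_location
end program find_tag
```

Both variants return the first match and give 0 when "foo" is absent, so they do not depend on the value occurring exactly once.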

Resources