I have to write a subroutine in Fortran 77 (I'm using Intel Fortran) that reads measured values from a text file and stores them in a matrix.
Since the number of measured values is always variable, I must dynamically allocate the matrix.
I know that dynamic allocation is only available from Fortran 90 onward, but people had the same problem before then, so there must have been a way to do it.
How would you proceed?
I do not want to simply reserve an oversized matrix, because that approach is impractical for me.
If you really are restricted to Fortran 77, you do not do dynamic allocation. Instead, declare an array that is larger than what you think you will likely need, without it being too large to prevent the program from running on your target system. Then store your values in that large array, separately keeping track of how many elements of the large array that you use. If your choice of array size was not large enough, let the user know and terminate the program.
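The fixed-capacity approach described above can be sketched as follows, in fixed-form Fortran 77. The capacity MAXN, the unit number 10, and the file name values.txt are assumptions made for this example, and the file is assumed to hold one value per line.

```fortran
      PROGRAM RDVALS
C     Sketch: oversized buffer plus a separate count of used elements.
C     MAXN, unit 10 and 'values.txt' are assumptions for this example.
      INTEGER MAXN
      PARAMETER (MAXN = 10000)
      REAL VALS(MAXN), X
      INTEGER N, IOS
      N = 0
      OPEN (UNIT=10, FILE='values.txt', STATUS='OLD')
   10 CONTINUE
C     A nonzero IOSTAT means end of file or a read error: stop reading.
      READ (10, *, IOSTAT=IOS) X
      IF (IOS .NE. 0) GO TO 20
C     Refuse to overflow the buffer; tell the user and terminate.
      IF (N .GE. MAXN) THEN
         WRITE (*, *) 'Too many values; increase MAXN'
         STOP
      END IF
      N = N + 1
      VALS(N) = X
      GO TO 10
   20 CONTINUE
      CLOSE (10)
      WRITE (*, *) 'Read ', N, ' values'
      END
```

Only VALS(1) through VALS(N) hold data afterwards, so any later code should loop to N rather than MAXN.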
People found the lack of dynamic allocation in Fortran 77 very restrictive, so they often resorted to using non-standard language extensions. If you decide to go down the path of language extensions, then these days the best extension to Fortran 77 to use in this situation is the allocatable array feature introduced with Fortran 90. I think it is fair to say that all actively maintained compilers that can handle Fortran 77 will also handle Fortran 90 allocatable arrays (and then some).
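A minimal sketch of the allocatable-array route, assuming a file named values.txt with one value per line and unit number 10 (both are assumptions for this example). It reads the file twice: once to count the values, once to store them.

```fortran
program read_values
  implicit none
  real, allocatable :: vals(:)
  real :: x
  integer :: n, i, ios

  ! 'values.txt' and unit 10 are assumptions for this example.
  open (unit=10, file='values.txt', status='old')

  ! First pass: count the values until end of file or a read error.
  n = 0
  do
    read (10, *, iostat=ios) x
    if (ios /= 0) exit
    n = n + 1
  end do

  ! Allocate exactly as much space as is needed.
  allocate (vals(n))

  ! Second pass: go back to the start and store the values.
  rewind (10)
  do i = 1, n
    read (10, *) vals(i)
  end do
  close (10)

  print *, 'Read ', n, ' values'
  deallocate (vals)
end program read_values
```

Reading the file twice trades some I/O time for never over-allocating, which matches the requirement in the question.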
As many people have pointed out, you don't have to stick to Fortran 77, even if much of what is already written is Fortran 77 compatible. Even with the few features that were deleted in Fortran 95 (see Wikipedia for a list), your compiler will probably still work fine, as long as you don't mix fixed form and free form in the same file.
Pre-F90, what people would probably do is to declare arrays that are (hoped to be) big enough for any use case, then only use the first elements of that array.
One thing that I am not certain about, but which might be useful, is the change of scope. Short example:
subroutine main(n)
implicit none
integer n
integer a(n)
print*, "Please enter the ", n, " numbers"
read*, a
print*, "Sum is ", sum(a)
end subroutine main
program dynamic
implicit none
integer n
print*, "Enter size of array:"
read*, n
call main(n)
end program dynamic
I'm curious to know whether this would be Fortran77 compliant. I honestly don't know. #francescalus has convinced me that it isn't.
I was trying to write a library for linear algebra operations in Haskell. In order to be able to define safe operations for matrices and vectors I wanted to encode their dimensions in their types. After some research I found that using DataKinds one is able to do that, similar to the way it's done here. For example:
data Vector (n :: Nat) a
dot :: Num a => Vector n a -> Vector n a -> a
In the aforementioned article, as well as in some libraries, the size of the vector is a phantom type and the vector type itself is a wrapper around an Array. While trying to figure out whether the standard library has an array type with its size at the type level, I started wondering about the underlying representation of arrays. From what I could gather from this commentary on GHC memory layout, arrays need to store their size on the heap, so a 3-dimensional vector would take up one more word than necessary. Of course we could use the following definition:
data Vector3 a = Vector3 a a a
which might be fine if we only care about 3D geometry, but it doesn't allow for vectors of arbitrary size and also it makes indexing awkward.
So, my question is this. Wouldn't it be useful and a potential memory optimization to have an array type with statically known size in the standard library? As far as I understand, the only thing it would need is a different info table, which would store the size, instead of the size being stored in each heap object. Also, the compiler could choose between Array and SmallArray automatically.
Wouldn't it be useful and a potential memory optimization to have an array type with statically known size in the standard library?
Sure. I suspect if you wrote up your use case carefully and implemented this, GHC HQ would accept a patch. You might want to do the writeup first and double-check that they're into it to avoid wasting time on a patch they won't accept, though; I certainly don't speak for them.
Also, the compiler could choose between Array and SmallArray automatically.
I'm not an expert here, but I kinda doubt this. Usually supporting polymorphism means you need a uniform representation.
Is it possible to use real numbers as iterators and array indices when compiling with gfortran? Here is some example code:
program test
real i
real testarray(5)
testarray = 0.
do i=1,5
write(*,*) testarray(i)
end do
end program
I want to run some code that I did not write. It compiles fine with the Intel compiler on Windows, but I want to compile and run it on Linux with the gfortran compiler. I'm currently getting errors from using real numbers as array indices and do loop iterators.
Thanks!
Why would you want to use real numbers as array and loop indices?
If you need to use the real value of the index, do something like:
program test
integer i
real testarray(5)
testarray = 0.
do i=1,5
testarray(i) = REAL(i)
end do
end program
And of course you could go the other direction if you needed to,
integer j
do j = 1, INT(testarray(1))
...
end do
for example. The standard doesn't allow non-integer indices. They don't make sense either -- what is the 1.5 index in your array?
It appears that real array indexing is an extension that should be available if you compile with -std=gnu. But support for it may not always be there, as it is not part of the standard.
If you don't want to see the warnings, then try -std=legacy. Otherwise use "gnu", as already suggested. The gfortran manual states:
As an extension, GNU Fortran allows the use of REAL expressions or
variables as array indices.
and
The default value for std is ‘gnu’, which specifies a superset of the
Fortran 95 standard that includes all of the extensions supported by
GNU Fortran, although warnings will be given for obsolete extensions
not recommended for use in new code. The ‘legacy’ value is equivalent
but without the warnings for obsolete extensions, and may be useful
for old non-standard programs.
Using real variables as loop indices was deleted from the language standard with Fortran 95. Because of the amount of legacy code that uses this, it is likely to remain in compilers for decades.
Another possibility is to implement this as a function or subroutine. The user experience would be similar, since tab(x) looks the same whether tab is an array or a function, but a function allows more control (for example, you can check whether x is within eps of some value x0 for which you have defined a value).
In general the idea seems dangerous due to rounding errors.
If you are working with rational numbers or, let's say, square roots of integers, then it is again an ideal case for f(x) as a function (with x being, e.g., a derived type that contains a numerator and a denominator).
So my final answer is: write it as a function.
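A minimal sketch of such a function, assuming a small table of known points. The tolerance eps, the tables x0 and y0, the module name tab_mod, and the choice to return 0.0 when nothing matches are all assumptions made for this example.

```fortran
module tab_mod
  implicit none
  ! Tolerance and table contents are assumptions for this example.
  real, parameter :: eps = 1.0e-5
  real, parameter :: x0(3) = (/ 0.5, 1.0, 1.5 /)
  real, parameter :: y0(3) = (/ 10.0, 20.0, 30.0 /)
contains
  ! Looks like an array reference at the call site, but compares
  ! x against each tabulated point within a tolerance.
  function tab(x) result(y)
    real, intent(in) :: x
    real :: y
    integer :: k
    y = 0.0                      ! returned when no point matches
    do k = 1, size(x0)
      if (abs(x - x0(k)) < eps) then
        y = y0(k)
        return
      end if
    end do
  end function tab
end module tab_mod

program demo
  use tab_mod
  implicit none
  print *, tab(1.0)
end program demo
```

Because the comparison uses a tolerance, a value such as 0.5 reached through accumulated arithmetic can still find its entry, which direct real indexing cannot guarantee.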
Quite often when I look at legacy Fortran code for linear algebra subroutines, I notice this:
Instead of working with a bunch of separate arrays, they will concatenate all their arrays into one big workspace array and use pointers to demarcate where a variable begins.
They even concatenate independent non-array variables into arrays. Are there benefits to doing this, and should I be doing this if I want to write optimized code?
No, don't do that if you want to stay sane. This is a practice from the 1960s-1980s, when no dynamic allocation was possible and people wanted only a small number of working arrays in the argument list.
In old subroutines you had a long list of arguments and then one or two working arrays:
call SUB(N1, N2, N3, M1, M2, M3, A, B, C, WRK, IWRK)
If you needed to pass 10 working arrays instead of one, the routine would be too difficult to call.
But in the 21st century the most important thing is to keep your code readable and clear, and only after that to optimize it.
By the way, having some quantities too close together in memory can even be detrimental due to false sharing.
That does not mean you should fragment your memory too much; it makes sense to keep data together when you will indeed access it sequentially. That's why structures of arrays are used instead of arrays of structures.
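As a contrast to a caller-supplied WRK array, here is a sketch of a routine that allocates its own scratch space. The module name work_demo, the routine smooth, and its three-point averaging are hypothetical and chosen only for illustration.

```fortran
module work_demo
  implicit none
contains
  ! Replaces each interior element by the mean of itself and its
  ! two neighbours. The scratch array is private to the routine,
  ! so the caller passes no workspace argument.
  subroutine smooth(n, a)
    integer, intent(in) :: n
    real, intent(inout) :: a(n)
    real, allocatable :: tmp(:)
    integer :: i
    allocate (tmp(n))
    tmp = a                      ! keep the original values
    do i = 2, n - 1
      a(i) = (tmp(i-1) + tmp(i) + tmp(i+1)) / 3.0
    end do
    deallocate (tmp)
  end subroutine smooth
end module work_demo

program demo
  use work_demo
  implicit none
  real :: a(5)
  a = (/ 1.0, 4.0, 2.0, 8.0, 3.0 /)
  call smooth(size(a), a)
  print *, a
end program demo
```

The call site now lists only the data the caller cares about, which is the readability gain the answer argues for.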
In general (independent of the programming language used), having "consecutive" blocks of, well, anything is often helpful.
The operating system, or even the hardware, might be able to benefit from having a single huge section of memory to deal with, compared to looking at 50 or 100 different locations.
A good starter for such discussions would be this question for example.
But I agree 100% with the other answer: unless you get massive performance gains out of using such techniques, you should always prefer to write "clean" (aka readable) code. And that translates to avoiding such practices.
I need to perform both single precision and double precision arithmetic on a variable in different parts of my code. So basically, I declare the variable as single precision first. Then I call subroutine sub_a, which makes use of a double precision version of the variable and performs double precision operations on it:
program main
implicit none
integer,parameter :: single = selected_real_kind(p=6,r=37)
integer,parameter :: double = selected_real_kind(p=15,r=307)
real(single),allocatable,dimension(:) :: A
real(double),allocatable,dimension(:) :: B
allocate(A(3),B(3))
A=2 ! single precision
A=A+3 ! single precision
print '(a,1x,3(f20.15))','sqrt(A),single:',sqrt(A)
print '(a,1x,I15)','mem_address of A before sub_a:',loc(A)
call sub_a(real(A,kind=double),B) ! double precision
print '(a,1x,3(f20.15))','sqrt(A),double:',B
contains
subroutine sub_a(a,b)
real(double),dimension(:),intent(in) :: a
real(double),dimension(:),intent(inout) :: b
print '(a,1x,I15)','mem_address of A in sub_a:',loc(a)
b=sqrt(a)
end subroutine sub_a
end program main
As seen in the code, I also obtained the memory address of A prior to calling sub_a and the version of A inside sub_a and they are expectedly different.
My questions are:
1. Is the version of A inside sub_a allocated in heap memory, so that I should not be worried about the size limitation?
2. Is there any potential issue/bug in writing this example?
3. Is there a better way of achieving the purpose described in this example, especially for larger arrays?
Many Thanks
Update:
I haven't experienced any Memory issue for very large arrays, when using gfortran4.6 / ifort13.1 as compiler.
I plan to use suggestion by #innoSPG as an alternative approach.
By the nature of the call, the version of A that you have in sub_a is a temporary array created by code that the compiler inserts. That is not a good idea if you will be manipulating very large arrays.
For question 2: to my knowledge, there is no bug. The only issue is the temporary array, which may be a problem if you have large arrays and limited memory on your system.
For question 3: if there is a memory issue, you can write sub_a to accept single precision, and then convert each element inside sub_a before using it in the computation.
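The element-by-element conversion suggested for question 3 might look like the sketch below, which reuses the kind parameters and names from the question. The explicit loop is one possible way to write it, not the only one.

```fortran
program main
  implicit none
  integer, parameter :: single = selected_real_kind(p=6, r=37)
  integer, parameter :: double = selected_real_kind(p=15, r=307)
  real(single), allocatable :: a(:)
  real(double), allocatable :: b(:)

  allocate (a(3), b(3))
  a = 5
  call sub_a(a, b)               ! a is passed as is, without conversion
  print '(a,1x,3(f20.15))', 'sqrt(A),double:', b
contains
  subroutine sub_a(a, b)
    real(single), intent(in)  :: a(:)
    real(double), intent(out) :: b(:)
    integer :: i
    ! Convert one element at a time, so only a scalar is ever
    ! promoted to double precision.
    do i = 1, size(a)
      b(i) = sqrt(real(a(i), kind=double))
    end do
  end subroutine sub_a
end program main
```

Since the actual argument is the single precision array itself, the call no longer contains a conversion expression that would need a full double precision copy of A.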
The temporary array at the call site can be created on the stack or on the heap; it is implementation dependent. Compilers usually have options to control the behavior.
With Intel Fortran it is the option -heap-arrays n (n should not be omitted or set too low, because performance would be bad).
Gfortran would put them on the heap automatically if it doesn't know the size beforehand. It is good to use -fstack-arrays for better performance (it is included in -Ofast).
You will find similar options in other compilers' documentation.
I would not fear and avoid the temporary arrays at all cost. The possibility of much shorter (and more readable!) code is sometimes more important. Many codes have portions that are done only once during initialization or final output where the performance penalty is often irrelevant. I personally use it to make my code cleaner in these parts (not in the computational core).
I have a project that is half in C and half in Fortran 77. [No, not Fortran 90 or 03, Fortran 77.] The code would be much cleaner if I could pass pointers generated on the C side back to Fortran, which would then pass them back as necessary for handling in other C functions. As it is, the C code is filled with global variables that shouldn't be global, and is otherwise on the verge of becoming an unstructured mess. So are there any reasonably reliable ways to pass an opaque pointer between C and Fortran?
If you are on a 32-bit platform, consider casting the pointers to integers and passing those integers to the Fortran code. When the Fortran code passes them back, convert the integer back into a pointer, cross your fingers, and use it.
From what I remember (from 25+ years ago), Fortran 77 tends to pass everything to C by pointer anyway - and character strings get passed with a length, and arrays get passed with their dimensions.
If you're on a 64-bit platform, you'll have to work out whether the Fortran 77 compiler provides any 8-byte integers (INTEGER*8?) - my suspicion is that it won't (largely confirmed by looking at the GNU documentation; if you were using Fortran 2003, you'd be in better shape, it seems). If it does, the same trick works. If it does not, you are into much dodgier territory.
You could try - against recommendations - using a union of a double and a pointer. In the C, you'd set the pointer in the union from your C code pointer, then copy the double out of the union into a Fortran REAL*8, and as long as no-one touches that except to copy it or pass it back, maybe you will be OK if the gods smile favourably upon your endeavours. Most likely though, the whole thing will explode - this sort of union has an incredible ability to detect when the customer will be most annoyed if something doesn't work and will then proceed to explode at exactly the right moment - part way through the demo, or fifteen minutes after the program goes live.
An alternative to consider (still with gritted teeth) is a union of a 64-bit pointer and an array of two 32-bit integers, and then requiring the Fortran code to pass an array of two integers when you need to return a (64-bit) pointer. Clearly, an array of one integer would work for 32-bit code; maybe just require the calling code to pass an array of two integers in all cases, zeroing the unused integer value in the 32-bit pointer case? That gives you forward migratability.
You can do this with the (non-standard) Cray pointer extension:
http://gcc.gnu.org/onlinedocs/gfortran/Cray-pointers.html