Next: Assignment-2 Up: The Code Previous: fourier.f

### Discussion of fourier.f

• In order to understand what really happens inside get_diffusion_matrix and expand_temp_profile, we must learn more about PESSL subroutine fft, which has some very special requirements.
• In order to evaluate

and

we must provide subroutine fft with values of T0(x,y) and D(x,y) on a sufficiently dense grid. We could use the same grid for this purpose as the grid which we have already defined in main in order to compute values of T(x,y, t) at those points. But those two grids don't have to be the same. Furthermore, there are special requirements imposed by subroutine fft, as to the dimensions of its own grid. And these are as follows:
1.
Let nx and ny be the number of grid points in x and y directions.
2.
nx and ny must be a multiple of a number of processors requested for the computation.
3.
nx and ny must be no greater than 37748736 and must be of the form:

where h, i, j, k, and m are integers and

• The purpose of subroutine factor_nodes and function minpower2 is to ensure that these very special conditions are satisfied and that a suitable grid can be generated taking our original numbers nx and ny, which correspond to the requested numbers of harmonics in x and y, as a starting point. We shall discuss how this is done exactly towards the end of this section. For the time being let us simply assume that it's done and that we have tweaked our input nx and ny so that they're now suitable. In other words, for the FFT computation we generate a physical grid which has the same number of points in x and y as the requested number of harmonics in these directions.
• Subroutine fft computes the following:

Now you can see why we have scaled our problem so that it would fit into .
• How to relate this double sum to:

The answer is in the scaling factor , which must be set so that

We have nx little segments between 0 and and ditto for ny. Therefore

Consequently

If we make our then we'll evaluate

• This is exactly what happens below:
    scale_f = 1.d0/ REAL(nx*ny, long)
ALLOCATE(df(nx,ny))
DO j = 1, ny
DO i =1,  nx
df(i,j) = diff_coef((twopi*(i-1))/nx,(twopi*(j-1))/ny)
ENDDO
ENDDO
CALL fft(df,scale=scale_f,transpose='N')

After having determined the scaling factor (our , theirs scale_f), we evaluate D(xi, yj) for and , save the result on an auxiliary variable df (which is distributed the same way as f, i.e., (*,BLOCK), but has different dimensions, i.e., , whereas f has dimensions ), and load the lot into fft.
• Subroutine fft will return the numbers that correspond to the Fourier transforms for and , i.e.,

on the same matrix, df.
• Our next task is to evaluate matrix F following the formula:

while at the same time collapsing the double indexes into a single index according to

we build two arrays ixi and iyi which simply remember values of j and k that correspond to a given l. This is how it's done:
    num_errors=0
ALLOCATE(ixi(dif_nx*dif_ny), stat=istat)
IF( istat .NE. 0 ) num_errors = num_errors + 1
ALLOCATE(iyi(dif_nx*dif_ny), stat=istat)
IF( istat .NE. 0 ) num_errors = num_errors + 1
DO ix = 1, dif_nx
DO iy = 1, dif_ny
i = (iy-1)* dif_nx + ix
ixi(i) = ix
iyi(i) = iy
ENDDO
ENDDO

• Now we can use these arrays in calculating Fij. The code below has a double-nested DO loop, whose outer index runs over j and the inner index runs over i. At the beginning of every loop we find a pair (k, l) that corresponds to a given j or i by looking up arrays iyp and ixp:
    DO j = 1, dif_nx*dif_ny
iyp = iyi(j)
ixp = ixi(j)
DO i = 1, dif_nx*dif_ny
iy = iyi(i)
ix = ixi(i)

• Then we evaluate k - k' and j - j':
          ixdiff = iabs(ix-ixp) + 1
iydiff = iabs(iy-iyp) + 1

Observe that we evaluate the absolute values of these (remember the argument that F is symmetric) and then add 1. Why? That's because the fft subroutine returns results for and , but in our case and . So, for this reason we're shifting the indexes in the matrix returned by fft by 1.
• The matrix Fij is now calculated as follows:
          f(i,j) = ( ( ix*ixp + iy*iyp*dif_ly_ratio ) *   &
REAL(df(ixdiff, iydiff)    - df(ix+ixp+1,iy+iyp+1)) &
+     ( ix*ixp - iy*iyp*dif_ly_ratio ) * &
REAL(df(ix+ixp+1,iydiff )  - df(ixdiff,iy+iyp+1)))

• Before returning from the subroutine we free space that was allocated for auxiliary variables:
    DEALLOCATE(df)
DEALLOCATE(ixi)
DEALLOCATE(iyi)
RETURN
END SUBROUTINE get_diffusion_matrix

• Let us now have a look at the next subroutine, which evaluates a0, expand_temp_profile.
• The formula for a0 is:

• This means that we have to scale our parameter in order to get the factor. Setting gave us , so multiplying by should yield exactly what's needed:
    scale_f = -twopi / REAL( nx*ny,long)

• Now we create a temporary array, called atmp, which we're going to initialise to on a grid in the (x, y) space with dimensions already made to suit subroutine fft:
    ALLOCATE(atmp(nx,ny), stat=istat )
IF( istat .NE. 0 )  THEN
WRITE(*,*) 'allocation failure in expand_temp_profile'
STOP
ENDIF
DO i = 1, nx
DO j = 1, ny
atmp(i,j) = init_temp((twopi*(i-1))/nx, (twopi*(j-1))/ny)
ENDDO
ENDDO

• The next step is simply to call subroutine fft passing atmp and scale_f to it:
    CALL fft(atmp,scale=scale_f,transpose='N')

The coefficients akj(0) will be returned on atmp.
• The last step is to collapse the indexes i and j onto a single index and transfer appropriate array entries from atmp to a:
    DO j =  1, dif_nx * dif_ny
jx = MOD(j-1, dif_nx) + 2
jy = (j-1) / dif_nx + 2
a(j) = REAL(atmp(jx,jy),long)
ENDDO

Observe that this time we do it differently than in subroutine get_diffusion_matrix.
• Before returning to the caller we free space used by atmp:
    DEALLOCATE(atmp)
RETURN


Next: Assignment-2 Up: The Code Previous: fourier.f
Zdzislaw Meglicki
2001-02-26