A Porting sapt2012 to different platforms

To port the SAPT codes to an architecture that is not supported in the official release, three strongly architecture-dependent parts of the codes must be adapted. Note that the issues described below have to be resolved separately for each program (e.g., tran, cc, and sapt.x), as the programs do not currently share a common library of system-dependent routines.

  1. The memory allocation routines. Each program in the SAPT suite allocates a single REAL*8 array when it starts. All matrices used by the program, both real and integer, are then defined within this core array, so no further calls to architecture-dependent allocation routines are needed (the programs assume that integer variables are 4 bytes long by default). The CC program automatically determines the core size it needs and allocates that much memory; all other programs require the core size to be declared in the appropriate namelist unless the default of 40000000 words is sufficient (see Sec. 10.2 for details). Currently, the SAPT codes use the following mechanisms to allocate the core arrays:
    • The Fortran 90 ALLOCATE statement for SUN, HPUX, and Linux with the PGF90 or Intel compilers,
    • The Fortran malloc intrinsic for SGI and Linux with the PGF77 compiler,
    • A C function memget supplied by SAPT for the G77 compiler under Linux and for IBM AIX.

    For a new platform, a proper way of handling memory allocation must be chosen. The Fortran 90 ALLOCATE is recommended where available.
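
The single-core-array scheme described above can be pictured with a minimal sketch in C (the language of the memget helper). This is not the SAPT implementation: the names core_t, core_init, core_real, core_int, and core_free are hypothetical, and the sketch assumes an 8-byte double and 4-byte integers, so that two integers fit into one word of the core.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bookkeeping for one block of 8-byte words ("the core"). */
typedef struct {
    double *base;   /* start of the core array */
    size_t  nwords; /* total size in 8-byte words */
    size_t  used;   /* words handed out so far */
} core_t;

/* Allocate the core once at start-up; returns 0 on success, -1 on failure. */
int core_init(core_t *c, size_t nwords) {
    /* Assumption of this sketch: two 4-byte integers per 8-byte word. */
    assert(sizeof(double) == 2 * sizeof(int32_t));
    c->base = malloc(nwords * sizeof(double));
    c->nwords = c->base ? nwords : 0;
    c->used = 0;
    return c->base ? 0 : -1;
}

/* Reserve n real (8-byte) elements inside the core; NULL if it is exhausted. */
double *core_real(core_t *c, size_t n) {
    if (n > c->nwords - c->used) return NULL;
    double *p = c->base + c->used;
    c->used += n;
    return p;
}

/* Reserve n 4-byte integers inside the core, rounding up to whole words. */
int32_t *core_int(core_t *c, size_t n) {
    size_t words = n / 2 + n % 2;
    return (int32_t *)core_real(c, words);
}

/* Release the core at program exit. */
void core_free(core_t *c) {
    free(c->base);
    c->base = NULL;
    c->nwords = 0;
    c->used = 0;
}
```

Only core_init and core_free touch the system allocator, so in a scheme of this kind they are the only places that would need to change on a new platform.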

  2. The timing routines. The routines timit in tran/main.F, second in cc/whole.F, timing in e2d/ccbits.F, and timt in sapt/m.F return time (in seconds) elapsed since some fixed moment and are called throughout the program. These routines call system timing routines which are architecture-dependent:
    • the etime routine for SUN and HPUX,
    • the mclock routine for SGI, IBM, and Linux with Portland or Intel compilers (note that this one returns time in hundredths of seconds),
    • the second routine for Linux and the G77 compiler.

    Again, a proper system timing routine needs to be chosen for a new architecture.
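
Because the system timers report in different units (seconds for some, hundredths of a second for mclock), it can help to keep the conversion in one small wrapper. The following C sketch is only an illustration: it assumes the standard C clock() function as a stand-in for the platform timer, and the names ticks_to_seconds and cpu_seconds are hypothetical. Note that clock() reports processor time used by the program, which may differ from what a given system routine measures.

```c
#include <assert.h>
#include <time.h>

/* Convert a raw tick count to seconds, given the timer resolution.
   For a timer counting hundredths of a second, ticks_per_second is 100. */
double ticks_to_seconds(long ticks, long ticks_per_second) {
    assert(ticks_per_second > 0);
    return (double)ticks / (double)ticks_per_second;
}

/* Hypothetical wrapper: processor time in seconds used so far,
   or -1.0 if the C library cannot provide it. */
double cpu_seconds(void) {
    clock_t t = clock();
    if (t == (clock_t)-1) return -1.0;
    return (double)t / (double)CLOCKS_PER_SEC;
}
```

Elapsed time for a section of code is then the difference between two cpu_seconds() calls, and only the body of the wrapper has to change when a different system routine is chosen.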

  3. The packing/unpacking routines which manipulate the integral indices. The current basis set size limit for SAPT is 1023 functions, so each orbital index fits in 10 bits and 40 bits are needed to store the four orbital indices of a transformed two-electron integral. These indices are stored on disk packed into one 4-byte integer and one 1-byte integer (32 + 8 = 40 bits). The routines that pack and unpack the integral indices are located in tran/unpack.F, cc/unpack.F, e2d/ccbits.F, and sapt/unpack10.F and perform the following conversions:
    • unpack10: (INTEGER*4,INTEGER*1) → 4 orbital indices
    • pack10: 4 orbital indices → (INTEGER*4,INTEGER*1)
    • unpack10a: (INTEGER*4,INTEGER*1) → 2 pair indices
    • pack10a: 2 pair indices → (INTEGER*4,INTEGER*1)
    • spltindx: INTEGER*8 → (INTEGER*4,INTEGER*1)
    • joinindx: (INTEGER*4,INTEGER*1) → INTEGER*8
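
The 40-bit arithmetic behind these conversions can be illustrated with a short C sketch. The bit layout used here (first index in the lowest 10 bits, low 32 bits in the 4-byte part, remaining 8 bits in the 1-byte part) is an assumption made for illustration and is not taken from the SAPT sources. The names pack4, unpack4, split40, and join40 are hypothetical, and unsigned C types are used for simplicity, whereas the Fortran integers are signed.

```c
#include <assert.h>
#include <stdint.h>

/* Join a (4-byte, 1-byte) pair into one 64-bit value (compare joinindx). */
uint64_t join40(uint32_t lo, uint8_t hi) {
    return ((uint64_t)hi << 32) | lo;
}

/* Split the low 40 bits of v into a (4-byte, 1-byte) pair (compare spltindx). */
void split40(uint64_t v, uint32_t *lo, uint8_t *hi) {
    *lo = (uint32_t)(v & 0xFFFFFFFFu);
    *hi = (uint8_t)((v >> 32) & 0xFFu);
}

/* Pack four indices, each in 0..1023, into 40 bits (compare pack10).
   Assumed layout: idx[n] occupies bits 10*n .. 10*n+9. */
void pack4(const unsigned idx[4], uint32_t *lo, uint8_t *hi) {
    uint64_t v = 0;
    for (int n = 0; n < 4; n++) {
        assert(idx[n] <= 1023u);
        v |= (uint64_t)idx[n] << (10 * n);
    }
    split40(v, lo, hi);
}

/* Recover the four indices from the packed pair (compare unpack10). */
void unpack4(uint32_t lo, uint8_t hi, unsigned idx[4]) {
    uint64_t v = join40(lo, hi);
    for (int n = 0; n < 4; n++)
        idx[n] = (unsigned)((v >> (10 * n)) & 0x3FFu);
}
```

With this layout the fourth index straddles the boundary between the two parts: its two lowest bits end up in the 4-byte integer and its eight highest bits in the 1-byte integer. A round trip through pack4 and unpack4 with the extreme values 0 and 1023 is a quick way to check a port of routines of this kind.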

    The packing/unpacking routines are implemented using the intrinsic functions for bitwise operations: ishft, ibits, and iand. The efficiency of packing and unpacking noticeably affects the total calculation time, and the implementation through bitwise operations has been found to be optimal for several architectures. Note, however, that the syntax of these intrinsics differs subtly between architectures and some of the packing/unpacking routines are platform-dependent.

    Apart from the set of routines described above, the tran program needs another set of unpacking procedures to access two-electron integral indices written by various integral and SCF front-ends, including gaussian, gamess, and dalton (see the tran/unpack.F file for details). These routines are highly architecture-dependent, since the structure of the integral indices depends on the endianness (big-endian or little-endian) of the machine (for gaussian) and on the options selected when compiling gaussian or gamess [see Sec. 8.1 (gaussian) and Sec. 9.4 (gamess) for more on this subject]. When porting to a new architecture, one must make sure that both the internal SAPT packing/unpacking routines listed above and the routines used to unpack integral indices written by a particular front-end program work correctly.
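
When checking a port, a first step is often to confirm the byte order of the new machine. The C sketch below is a generic illustration, not part of SAPT: the names is_little_endian, load_be32, and load_le32 are hypothetical, and the actual record layout written by each front-end must still be taken from tran/unpack.F and the front-end documentation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 on a little-endian host, 0 on a big-endian one. */
int is_little_endian(void) {
    const uint32_t probe = 1;
    unsigned char first;
    memcpy(&first, &probe, 1);  /* inspect the byte at the lowest address */
    return first == 1;
}

/* Assemble a 32-bit word from 4 bytes stored most significant byte first,
   independently of the host byte order. */
uint32_t load_be32(const unsigned char *b) {
    assert(b != NULL);
    return ((uint32_t)b[0] << 24) | ((uint32_t)b[1] << 16) |
           ((uint32_t)b[2] << 8)  |  (uint32_t)b[3];
}

/* Same, for bytes stored least significant byte first. */
uint32_t load_le32(const unsigned char *b) {
    assert(b != NULL);
    return ((uint32_t)b[3] << 24) | ((uint32_t)b[2] << 16) |
           ((uint32_t)b[1] << 8)  |  (uint32_t)b[0];
}
```

Decoding through explicit byte loads of this kind gives the same result on either type of machine, which makes it a convenient reference when comparing against a platform-specific unpacking routine.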