A Porting SAPT2012 to different platforms
To port the SAPT codes to an architecture that is not supported in the
official release, three main parts of the codes that are strongly
architecture-dependent must be adapted. Note that the issues described below
have to be resolved for each program (e.g., tran, cc, and sapt.x) separately,
as the programs do not currently share a common library of
system-dependent routines.
- The memory allocation routines.
Each of the programs in the SAPT suite uses a single REAL*8 array that is allocated when the program starts.
All matrices, both real and integer, used by the program are then defined
within the allocated core array so that no further calls to any
architecture-dependent allocation routines are needed (the programs
assume that the integer variables are 4-byte by default). Note that
whereas the CC program automatically determines the core size needed and
allocates that much memory, all other programs need to have the requested
core size declared in an appropriate namelist unless the default value of
40000000 words is sufficient (see Sec. 10.2 for details). Currently, the SAPT codes
use the following mechanisms to allocate the core arrays:
- The Fortran 90 ALLOCATE statement for SUN, HPUX, and Linux with the PGF90 or Intel compilers,
- The Fortran malloc intrinsic for SGI and Linux with the PGF77 compiler,
- A C function memget supplied by SAPT for the G77 compiler under Linux and for IBM AIX.
For a new platform, a proper way of handling memory
allocation must be chosen. The Fortran 90 ALLOCATE statement is recommended where available.
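The single-core-array scheme described above can be sketched in C, the language of the memget helper. This is only an illustration under stated assumptions: the names core_t, core_init, core_real, core_int, and core_free are hypothetical and do not appear in the SAPT sources, and malloc stands in for whichever allocation mechanism the target platform offers. The sketch shows one system-dependent allocation at start-up, after which real and 4-byte integer arrays are carved out of the same block of 8-byte words.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bookkeeping for one core array of 8-byte words. */
typedef struct {
    double *base;   /* start of the core array            */
    size_t  size;   /* total size in 8-byte words         */
    size_t  used;   /* number of words handed out so far  */
} core_t;

/* The only system-dependent call: obtain the core array once at start-up.
   Returns 0 on success and -1 if the allocation fails. */
static int core_init(core_t *c, size_t nwords)
{
    c->base = malloc(nwords * sizeof(double));
    c->size = (c->base != NULL) ? nwords : 0;
    c->used = 0;
    return (c->base != NULL) ? 0 : -1;
}

/* Reserve n 8-byte reals; returns NULL if the core is exhausted. */
static double *core_real(core_t *c, size_t n)
{
    if (n > c->size - c->used) return NULL;
    double *p = c->base + c->used;
    c->used += n;
    return p;
}

/* Reserve n 4-byte integers, rounded up to a whole number of 8-byte words
   so that the next sub-array stays aligned on a word boundary. */
static int32_t *core_int(core_t *c, size_t n)
{
    size_t nwords = n / 2 + n % 2;
    if (nwords > c->size - c->used) return NULL;
    int32_t *p = (int32_t *)(c->base + c->used);
    c->used += nwords;
    return p;
}

/* Release the core array at program exit. */
static void core_free(core_t *c)
{
    free(c->base);
    c->base = NULL;
    c->size = 0;
    c->used = 0;
}
```

With this layout, porting the allocation to a new platform would mean changing only the body of core_init; the code that reserves sub-arrays is unaffected. The rounding in core_int relies on the 4-byte integer assumption stated above.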
- The timing routines. The routines
timit in tran/main.F, second in
cc/whole.F, timing in e2d/ccbits.F, and
timt in sapt/m.F return time (in seconds) elapsed since some
fixed moment and are called throughout the program. These routines call
system timing routines which are architecture-dependent:
- the etime routine for SUN and HPUX,
- the mclock routine for SGI, IBM, and Linux with Portland or Intel compilers (note that this one returns time in hundredths of seconds),
- the second routine for Linux and the G77 compiler.
Again, a proper system timing routine needs to be
chosen for a new architecture.
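One way to isolate this dependence is a thin wrapper that always reports seconds, whatever unit the underlying system routine uses. The following C sketch is an assumption-laden illustration, not the SAPT implementation: the names ticks_to_seconds and timer_seconds are hypothetical, and the standard C clock() function (CPU time used by the program) stands in for etime, mclock, or second.

```c
#include <assert.h>
#include <time.h>

/* Convert a raw tick count to seconds. For a counter that reports
   hundredths of a second (as mclock does), ticks_per_second is 100. */
static double ticks_to_seconds(long ticks, long ticks_per_second)
{
    return (double)ticks / (double)ticks_per_second;
}

/* Hypothetical portable timer: processor time in seconds consumed by the
   program so far, obtained from the standard C clock() function.
   Returns -1.0 if the processor time is not available. */
static double timer_seconds(void)
{
    clock_t t = clock();
    if (t == (clock_t)-1) return -1.0;
    return (double)t / (double)CLOCKS_PER_SEC;
}
```

A caller would record timer_seconds() before and after a section of work and print the difference. On a new platform only the body of timer_seconds would need to change, together with the tick rate passed to ticks_to_seconds if the system routine does not report seconds.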
- The packing/unpacking routines, which manipulate the integral indices. As the current basis set size
limit for SAPT is 1023 functions, each orbital index fits in 10 bits, so 40 bits are needed to store the four
orbital indices for a transformed two-electron integral. These indices
are stored on disk packed into one 4-byte integer and one 1-byte integer (32 + 8 = 40 bits),
and the routines that take care of the integral packing and unpacking are
located in tran/unpack.F, cc/unpack.F, e2d/ccbits.F,
and sapt/unpack10.F. These routines have the
following functions:
- unpack10: (INTEGER*4,INTEGER*1)→4 orbital indices
- pack10: 4 orbital indices→(INTEGER*4,INTEGER*1)
- unpack10a: (INTEGER*4,INTEGER*1)→2 pair indices
- pack10a: 2 pair indices→(INTEGER*4,INTEGER*1)
- spltindx: INTEGER*8→(INTEGER*4,INTEGER*1)
- joinindx: (INTEGER*4,INTEGER*1)→INTEGER*8
The packing/unpacking routines are implemented
using the intrinsic functions for bitwise operations: ishft, ibits, and
iand. The efficiency of packing and
unpacking influences the total calculation time quite notably, and the
implementation through bitwise operations has been found to be optimal
for several architectures. Note, however, that subtle differences in
the syntax of these intrinsics exist for different architectures and
some of the packing/unpacking routines are platform-dependent. Apart
from the set of routines described above, the tran program needs another set of unpacking procedures
to access two-electron integral indices written by various integral and
SCF front-ends, including gaussian,
gamess, and dalton (see the
tran/unpack.F file for details). These
routines are highly architecture-dependent since the structure of the
integral indices depends on the endianness (big-endian or
little-endian) of the machine (for gaussian) and on
the options selected when compiling gaussian or
gamess [see Sec. 8.1 (gaussian) and Sec. 9.4 (gamess) for more on this subject].
When porting to a new architecture, one must make
sure that both the internal SAPT packing/unpacking routines listed
above and the routines used to unpack integral indices written by a
particular front-end program work correctly.
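A round-trip test is a simple way to check the internal packing on a new platform. The C sketch below illustrates the idea under explicit assumptions: the bit layout (first index in the lowest 10 bits, the low 32 bits in the 4-byte word, the top 8 bits in the byte) is chosen for illustration and is not claimed to match the layout used by pack10 and unpack10, and the names split_index, join_index, pack4, unpack4, and roundtrip_check are hypothetical. Unsigned C types are used in place of the signed Fortran INTEGER*4 and INTEGER*1 to keep the shifts well defined.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define IDX_BITS 10u      /* bits per orbital index                  */
#define IDX_MASK 0x3FFu   /* 1023, the largest index that fits       */

/* Split a 40-bit value into a 32-bit word and an 8-bit byte
   (loosely analogous in purpose to spltindx; layout is assumed). */
static void split_index(uint64_t v, uint32_t *word, uint8_t *byte)
{
    *word = (uint32_t)(v & 0xFFFFFFFFu);
    *byte = (uint8_t)((v >> 32) & 0xFFu);
}

/* Rebuild the 40-bit value from the word and the byte
   (loosely analogous in purpose to joinindx). */
static uint64_t join_index(uint32_t word, uint8_t byte)
{
    return ((uint64_t)byte << 32) | (uint64_t)word;
}

/* Pack four 10-bit indices; idx[0] ends up in the lowest 10 bits. */
static void pack4(const uint32_t idx[4], uint32_t *word, uint8_t *byte)
{
    uint64_t v = 0;
    for (int k = 3; k >= 0; --k)
        v = (v << IDX_BITS) | (uint64_t)(idx[k] & IDX_MASK);
    split_index(v, word, byte);
}

/* Recover the four indices from the packed word and byte. */
static void unpack4(uint32_t word, uint8_t byte, uint32_t idx[4])
{
    uint64_t v = join_index(word, byte);
    for (int k = 0; k < 4; ++k) {
        idx[k] = (uint32_t)(v & IDX_MASK);
        v >>= IDX_BITS;
    }
}

/* Pack and unpack combinations of boundary values.
   Returns 0 if every combination survives the round trip, 1 otherwise. */
static int roundtrip_check(void)
{
    static const uint32_t s[] = {0u, 1u, 511u, 512u, 1022u, 1023u};
    const size_t n = sizeof s / sizeof s[0];
    for (size_t i = 0; i < n; ++i)
    for (size_t j = 0; j < n; ++j)
    for (size_t k = 0; k < n; ++k)
    for (size_t l = 0; l < n; ++l) {
        const uint32_t in[4] = {s[i], s[j], s[k], s[l]};
        uint32_t out[4], word;
        uint8_t byte;
        pack4(in, &word, &byte);
        unpack4(word, byte, out);
        for (int m = 0; m < 4; ++m)
            if (out[m] != in[m]) return 1;
    }
    return 0;
}
```

A check of this kind exercises only the internal consistency of a pack/unpack pair. It does not verify that indices written by a front-end program are decoded correctly, which still has to be tested against files produced on the target machine.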