10 How to run sapt2012

To perform a sapt2012 calculation for one dimer geometry, one has to run a dozen or so programs: integral/SCF calculations for the monomers and possibly for the dimer, interface programs (in most cases) rewriting integral/SCF files into different forms, one- and two-electron integral transformations, MBPT/CCSD calculations for monomers, and finally, the “proper” SAPT calculations. All of this is performed automatically using the script SAPT from ./SAPT2012/bin directory (note, however, that a special script doSAPT_CADPAC has to be used if cadpac is the front-end SCF program). This script calls other executables and scripts which can be found in the same place. Calculation of the interaction potential energy surface of a dimer involves multiple invocations of the SAPT script for different dimer geometries. This process can be simplified and automated with the help of the Runlot utility scripts, described in Sec. 10.6 .

One of the scripts called by SAPT is the script bin/Clean that cleans up unnecessary files after a SAPT run (it will erase the SAPT-related files from the directory in which it is run). The files are not automatically erased at the end of of each run in order to enable restarts (which has to be done on a case-by-case basis by modifying the SAPT script except when starting from the transformation step). Therefore, bin/Clean is called at the beginning of the SAPT script, so that a consecutive calculation can be performed in the same directory (do not forget to change the name of the output file). This is necessary since several temporary files are named with no reference to the job name. Thus, two simultaneous calculations cannot be done in the same directory (but can be run, of course, in separate directories). It is also prudent to run bin/Clean itself (just by executing ./SAPT2012/bin/Clean) to release unnecessary disk space after finishing calculations in a given working directory.

The actual running of the program when using the SAPT script is very simple. On most installations runs are performed in a “working” or “scratch” directory designed to hold large temporary files. We find it simplest to make a subdirectory there, copy the input files (created by the user or taken from the ./SAPT2012/examples) to this subdirectory, and either execute the SAPT script in this directory using the full path (e.g., /home/local/SAPT2012/bin/SAPT ...) or simply copy the SAPT script to the working directory (several other possibilities exist, for example users can add the ./SAPT2012/bin directory to their PATH environment variable). However, the SAPT script requires the vars.cfg file present in the same directory so this file must be copied accordingly if the script is not run from the SAPT2012/bin directory.

During the compilation, the script Compall updates vars.cfg file with proper paths to executables and the scripts should run properly on any installation without changes if vars.cfg is present in the same directory. If the paths need to be changed for some reasons, the file vars.cfg should be edited and the variable MAIN_SAPT_DIR and the SCF_HOME_DIRECTORIES changed to reflect user’s directory structure (if gamess is used, it concerns also additional directories relevant for this program).

Also note that the large core memory requested typically by the sapt2012 programs requires on some systems the use of the ulimit command to change the default user resources. This command, built into the SAPT script, is currently commented out but it may be reactivated and adjusted as needed.

The SAPT script is written in ksh although on Linux platforms, some of which are not equipped in ksh, it is actually executed under bash. A sapt2012 calculation is launched by typing SAPT with appropriate options. Typing SAPT without any options will produce a brief description of the necessary input. A typical run statement might be something like (in Unix ksh):

SAPT jobname [opt1] [opt2] >output.file 2>&1 &

The keyword jobname has to be the same as the beginning of the name of the input files for the system considered (we use a naming scheme to reference the needed input files). For example, let the keyword be name. Then, as the script is running, it will look for the nameP.data file, which contains the input data to the programs tran, ccsdt, and sapt.x. Similarly, the SAPT script will look for input files to the integral/SCF parts of a calculation starting with name. The number and full names of such files depend on the type of basis set used in calculations and on the choice of the integral/SCF code.

For all SCF front-ends except gamess, the keyword opt2 can be skipped and opt1 is optional. Using the string scfcp for opt1 will request, in addition to the standard SAPT calculation, also the CP-corrected supermolecular SCF interaction energy in the dimer-centered basis set. If such a calculation is not required, leave opt1 blank or, better yet, use noscfcp instead. Using the keyword gototran as opt1 will result in restarting a sapt run from the transformation step, i.e., from the program tran. Both parameters, opt1 and opt2, are required when gamess is used as the SCF program. opt1 can assume one of the values listed above (but noscfcp must now be used instead of a blank), while opt2 must be the full path of the scratch directory in which the whole calculation is taking place. opt2 has the same meaning when dalton is used. However, for dalton opt2 is optional and the place where SAPT is launched will be assumed as the scratch directory if this parameter is omitted.

The output from all programs executed by SAPT is written to a file output.file (this name can be set by the user to an arbitrary string, usually connected with the system being computed and its geometry). The last two elements in the command line given above are Unix ksh constructs which indicate that standard error messages will be written to the same file as the standard output and that the process will be run in the background.

Most modern computer clusters do not allow command line submission of jobs. Instead, one has to use a queuing system (batch processor). In this case, the sequential SAPT should be submitted in a way analogous to that described in Sec. 14 on psapt2k2, of course, requesting just one core.

10.1 Calculations of integrals and SCF energies

A sapt2012 run starts with calculations of “atomic” integrals, i.e., integrals involving the basis functions, and proceeds to calculations of the SCF orbitals and orbital energies for monomers. Optionally, if the opt1 keyword is set to scfcp, a counterpoise (CP) corrected supermolecular calculation of the HF-SCF interaction energy is also performed. The number and type of SCF calculations depend on the chosen approach to the basis set construction and on the selection of the supermolecular calculation. For a description of possible choices of basis sets, see the following subsections and Ref. 54.

For information on the input to the integral/SCF programs please refer to the original documentation for those codes. Some remarks on this subject can also be found in Sec. 9. For an atmol manual see

http://www.theochem.ru.nl/~pwormer
. Please note that symmetry considerations are not built into the sapt2012 codes, so the SCF packages should be asked to output the integrals with no symmetry assumptions.

The sapt2012 codes make an implicit assumption that for each monomer the number of occupied orbitals is smaller or equal to the number of virtual orbitals and a crash will occur if this condition is not met. It is thus not recommended to run sapt2012 calculations in minimal basis sets.

Regardless of the interface program, sapt2012 is limited to 65535 molecular orbital indices (or 1024, if the setting BEYOND1024=YES was not used in the Compall script).

10.1.1 DCBS and DC+BS approaches

If a dimer-centered basis set (DCBS), possibly including “bond” functions (denoted then by DC+BS), is used and the supermolecular SCF interaction energy is not needed, then only two integral/SCF calculations, for monomer A and for monomer B, will be performed. The DCBS/DC+BS approach means that the calculations for monomer X are performed using the orbital basis set consisting of all functions, those “belonging” to monomer X (i.e., centered on the nuclei of monomer X), those of the interacting partner (i.e., centered at the positions where the partner’s nuclei are in the dimer), and the bond functions (in DC+BS case). In the inputs, the bond functions and the functions of the interacting partner are typically connected with zero-charge centers and are sometimes called “ghost” orbitals. The script will then look for files nameA.* and nameB.* for the respective monomers. The number and full names of the files depend on the integral/SCF code used. Here is a list of files needed for some of the front-end codes:

atmol1024: 
nameA.intinp, nameA.scfinp, nameB.intinp, and nameB.scfinp
gamess: 
nameA.inp and nameB.inp
gaussian: 
nameA.data and nameB.data
dalton: 
nameA.dal, nameA.mol, nameB.dal, and nameB.mol
molpro: 
name.molpro

In each case, the file nameP.data will also be needed containing input for the post-SCF part of the calculation (tran, cc, and sapt.x stages). In a DC+BS run, the variable DIMER in the namelist TRN in this file must be set to T (or .TRUE.).

If DCBS or DC+BS is used and a supermolecular SCF interaction energy calculation is requested (the scfcp option is set), the script will additionally look for a dimer input file (or files) name.* (notice the convention: nameA, nameB, and name for the monomers A, B, and for the dimer, respectively). Thus, for example for atmol1024, the complete set of input files necessary for this type of run are: name.intinp, name.scfinp, nameA.intinp, nameA.scfinp, nameB.intinp, nameB.scfinp, and nameP.data.

In the DCBS or DC+BS approach, the basis set specifications for all integral/SCF calculations are identical except for different charges set to zero in different files and for different numbers of electrons in each run. If the scfcp option is used, the file of two-electron integrals can be computed only once, during the dimer run, and the subsequent SCF calculations for monomers A and B can then use this file (most integral/SCF programs are flexible enough to allow such a route). Note, however, that the one-electron integrals must be computed separately for monomers A and B. If the scfcp option is not used, only the monomer A calculation needs to compute the two-electron integrals.

10.1.2 MCBS and MC+BS approaches

An alternative, and strongly recommended, way of performing sapt calculations is to use the so-called monomer-centered ‘plus’ basis set (MC+BS) [54]. To understand this approach, first notice that the conceptually simplest (and most natural) method is to use in SAPT the monomer-centered basis set (MCBS), i.e., a basis that includes only functions that one would have used if the calculations involved only the energies and properties of a given monomer (calculations for monomer X involve only basis orbitals centered on this monomer). In the limit of infinite orbital basis set, the MCBS approach converges to the exact values for each SAPT correction. However, for several reasons this convergence is slow for some corrections [54]. The simplest cure for this problem is to use the DCBS or, even better, the DC+BS approach. The advantage of such a method is a close relation to the supermolecular approach with the CP correction. The disadvantage is that the basis set is increased by a factor of two (between MCBS and DCBS methods for identical monomers). Reference [54] has shown that many of the functions used in DCBS/DC+BS are not needed for the SAPT convergence. If a DCBS/DC+BS is reduced to some intermediate size, it is called an MC+BS. It has been recommended in Ref. 54 that MC+BS’s are constructed from an MCBS by adding all the midbond functions and only a part of the basis set on the interacting monomer (so-called ‘farbond’ functions). The simplest choice for the farbond part is the “isotropic” part of the basis set, i.e., orbitals with symmetries appearing in the occupied orbitals of constituent atoms (i.e., s part of the basis for H and He and sp part of the basis centered on the first and second row atoms). In practice, MC+ basis sets match the accuracies of DC+ bases and at the same time reduce several times the costs of SAPT calculations.

An MC+BS approach requires a little more work with setting up the basis sets than a DC+BS calculation. The reason is that in the former approach the basis functions appear in three different roles: a given basis function can belong only to monomer A, only to monomer B, or to both monomers. A calculation of the integrals in a set with repeated functions would be unnecessarily time and disk space consuming, but a method has been developed to avoid this problem. Below, the individual files needed for an MC+BS calculation are listed first and then the dependence between the structure of these files and the control parameters in the file nameP.data is described.

If MC+BS calculations are performed and the supermolecular SCF interaction energy is not requested (scfcp keyword not present), the SAPT script will look for input files nameMA.*, nameMB.* to perform integral/SCF calculations for monomers A and B, respectively. The input files for these runs should contain the basis of a given monomer plus the midbond and farbond functions on the “ghost” centers. Since the basis sets are different for A and B, the two-electron integrals from A cannot be reused for B. The eigenvalues and eigenvectors from these two runs are saved, whereas both one- and two-electron integrals are discarded. The script will next look for an input file nameA.* to calculate one- and two-electron integrals for monomer A in the DC+BS equivalent of the MC+BS used (this basis is identical to the one described in the previous subsection, but in some cases the functions have to be ordered in a special way or the proper “tag” parameters have to be specified in the file nameP.data, as discussed below). No SCF calculations are needed (if performed, the results will be discarded). This step is needed to produce the integrals for the sapt calculations. Next the script will look for a file nameB.* to calculate one-electron integrals only for the monomer B in the DC+ basis set (the two-electron integrals need not be calculated since these are identical as for monomer A). Again, no SCF step is needed. Thus, if for example atmol1024 is used as the integral/SCF program, the needed input files are: nameMA.intinp, nameMA.scfinp, nameMB.intinp, nameMB.scfinp, nameA.intinp, nameB.intinp, and nameP.data. In the last of these files, the variable DIMER in namelist TRN should be set to F (or .FALSE.).

If MC+BS calculations are performed and the supermolecular HF-SCF interaction energy is requested (the scfcp option is used), the script will, as in the previous case, first look for input files nameMA.* and nameMB.* to perform integral/SCF runs for monomers A and B in an appropriate MC+BS. Next, the script will look for the file(s) name.* to perform integral/SCF calculations for the dimer in the DC+BS equivalent of the MC+BS used. In the next two steps the files nameA.* and nameB.* containing the same DC+ basis set will be used, as in the calculations described in the previous paragraph. However, now neither the monomer A nor B calculation need to compute the two-electron integrals since these are already available from the dimer calculation. Also, in contrast to the case of the previous paragraph, the SCF calculations are performed for each monomer since the total SCF energies are needed to compute the supermolecular HF-SCF interaction energy. The SCF orbital energies and coefficients from these runs are discarded. Thus, if for example atmol1024 is used as the integral/SCF program, the following input files are needed: nameMA.intinp, nameMA.scfinp, nameMB.intinp, nameMB.scfinp, name.intinp, name.scfinp, nameA.intinp, nameA.scfinp, nameB.intinp, nameB.scfinp, and nameP.data. If gaussian is used as the integral/SCF program, the input files needed are: nameMA.data, nameMB.data, name.data, nameA.data, nameB.data, and nameP.data. If dalton is used, the following input files are needed: nameMA.dal, nameMA.mol, nameMB.dal, nameMB.mol, nameA.dal, nameA.mol, nameB.dal, nameB.mol, nameP.data, name.dal, and name.mol.

The possibility of skipping parts of integral/SCF calculations varies between programs. In the case of atmol1024, the calculations of two-electron integrals are omitted by simply including the BYPASS TWO statement in the proper *.intinp file and an SCF calculation is omitted by not executing the corresponding binary (scf). The two-electron files from a previous atmol1024 run will be recognized by a subsequent SCF run due to the naming convention. For dalton, the appropriate keyword is .NOTWO and it can be used in the file nameB.dal in MC+BS calculations. It appears that some integral/SCF programs, e.g., gamess, do not have an input option to skip the calculation of two-electron integrals.

There are two ways of arranging basis functions in an MC+BS (or MCBS) run. The first way is known to work with atmol1024 and to fail with gaussian, dalton, and molpro. It may work with other packages, but it has not been tested. This method is chosen by specifying the keyword BLKMB=T in the TRN namelist read from the file nameP.data (this is also the default and therefore this keyword can be omitted). If this path is used, the basis functions in the DC+BS-type input files specified above (name.*, nameA.*, and nameB.*) have to be ordered as follows:

polA isoA mid isoB polB

where isoX is the isotropic part of the basis set of monomer X (as defined above), polX are the polarization functions on monomer X (orbitals with symmetries p and higher for H and He and d and higher for the first and second-row atoms), and mid are midbond functions (optional). In the calculations producing the orbital energies and coefficients for monomers A and B (files nameMA.* and nameMB.*), the basis set should be ordered as:

polA isoA mid isoB

and

iso mid iso pol , A B B

respectively. The method is in fact more general than the names isoX and polX indicate. As it should be clear from the above, the basis set for each monomer can be divided into two arbitrary subsets. Examples of MC+BS input files set up using this strategy can be found in the directories ./SAPT2012/examples/ATMOL1024/HF2_MCBS, ./SAPT2012/examples/ATMOL1024/CO2D_MCBS, and also ./SAPT2012/examples/ATMOL1024/ArH2O_MCBS.

A more general method that works for all integral/SCF front-ends is the “tags” method. In this method, chosen by BLKMB=F, the ordering of functions is arbitrary, except that it has to be the same in all input files. Of course, the “MCBS files”, nameMA.* and nameMB.* will contain only subsets of the whole basis. It is, however, required that the basis specifications in these files differ from the specification in the files name.*, nameA.*, and nameB.* by only the deletion of functions which belong exclusively to the other monomer, i.e., the sequence of functions in the subset remaining after such deletion is not altered with respect to the whole DC+BS set. The role of a given function is specified by “tagging” it in the TRN namelist read from the file nameP.data. This is achieved by first setting the variable BASIS to a string containing letters spdf..., one letter for each shell in the DC+BS-type basis (thus, for a part of the basis with 3s2p1d, the string should be sssppd). Since the program expands each symbol into a number of orbitals, one has to provide information on the type of basis set, spherical or Cartesian, which is done by setting the variable SPHG to .TRUE. or .FALSE., respectively. Note that when running sapt2012 with gamess as a front-end, SPHG must always be set to .FALSE., even if the ISPHER=1 option was applied during the SCF calculations to enforce removal of spurious spherical components from the Cartesian basis used by gamess. The other string variable to be specified is TAGS. It has to contain exactly the same number of characters as the BASIS variable. The allowed characters are a, b, and m, denoting basis functions appearing only in the MC+BS of monomer A (which can be the same as in polA part of the basis discussed above), only in the MC+BS of B (like polB), and in the MC+BS sets of both monomers, respectively. Each of the variables BASIS and TAGS should end with a + sign denoting the end of a string.

For an example of using tags with gamess, see the files in the directory
./SAPT2012/examples/GAMESS/HF_NH3_MCBS. This example does not use the iso∕pol splitting, to show the more general character of the tags method. Here, in the files HF_NH3.inp, HF_NH3A.inp, and HF_NH3B.inp, the DC+BS basis functions are entered in the order F (5s3p2d), H of HF (3s2p), midbond (2s1p), N (5s3p2d), H1 of NH3 (3s2p), H2 of NH3 (3s2p), and H3 of NH3 (3s2p). Now, in this example, the MC+BS for monomer A (HF) is constructed by deleting the last two s, the last p, and the last d functions of N, as well as the last s and p functions on each H of NH3 (see the file HF_NH3MA.inp). The functions deleted here happen to be the most diffuse ones for each symmetry, although in principle other choices could have been made as well. Similarly, the MC+BS for monomer B (NH3) is constructed from the whole DC+BS set by deleting the last two s, the last p, the last d functions of F, and the last s and p functions of HF’s hydrogen (see the file HF_NH3MB.inp). Note that the sequence of basis functions in the input files is determined by the sequence of atomic centers and the sequence of functions within each center. Thus, simply deleting a contraction does not change the relative sequence in the remaining set, as required. The part of the TRN namelist corresponding to this setup is

  BLKMB=F, SPHG=F,  
  BASIS=’ssssspppddsssppsspssssspppddsssppsssppssspp+’,  
   TAGS=’mmmaammamammamammmmmmbbmmbmbmmbmbmmbmbmmbmb+’

Note that the variable SPHG has been set to .FALSE. since gamess always generates integrals in the Cartesian basis (even if the SCF calculations are performed in variational space spanned by only pure spherical components, as it is done with the option ISPHER=1 in gamess input files *.inp). If the “tags” method is used with some other SCF program, make sure that the value of SPHG matches the actual type of basis set for which the atomic integrals are generated.

Other examples of MC+BS input files using the BASIS and TAGS options with gamess can be found in directories ./SAPT2012/examples/GAMESS/*_MCBS. For atmol1024, gaussian, and molpro, the examples are in ./SAPT2012/examples/ATMOL1024/ArHF_MCBS, ./SAPT2012/ examples/GAUSSIAN/CO2D_MCBS, and ./SAPT2012/examples/MOLPRO/CO2D_MCBS, respectively.

10.2 Input for post-Hartree-Fock part

The input for the transformation tran, the MBPT/CC code cc, and the proper SAPT program sapt.x is supplied in the file nameP.data. This input consists of a title line and three namelist sets. The first line is reserved for the title of the run (up to 80 characters). The three namelists that follow are TRN, which is for the input to the transformation program, CCINP, which passes information to the MBPT/CC program, and INPUTCOR, which informs the perturbation program which corrections are to be computed.

It should be noted that the exact syntax of a namelist statement depends on the platform. For example, on SGI and Linux platforms, each namelist should start with the name preceded by & (e.g., &TRN) and end with &END. On the other hand, IBM compilers require the forward slash (/) as the namelist end marker. Throughout this manual, the former convention will be followed.

10.2.1 Namelist TRN

The major function of this namelist is to tell the tran program which integral/SCF package is in use. Currently supported codes are listed in Table 1. In addition, a number of legacy codes should still work, although several of them have not been used for years.

Out of the 10 possible integral/SCF selections the user should select only one by setting the corresponding variable equal to .TRUE. and all the other variables equal to .FALSE. The selection variables are given by:

Of course, selecting a given SCF package in the TRN namelist will work only if sapt2012 has been compiled for use with this package (see Sec. 8).

The variable DIMER is used to determine whether the transformation type is dimer (default) or monomer (DIMER=.FALSE.). The dimer-type transformation is chosen for DCBS/DC+BS calculations and the monomer-type one for MC+BS calculations.

If monomer type of transformation is set, several other options become available. These are BLKMB, BASIS, TAGS, and SPHG variables connected with the tags system. The use of these variables was described in Sec. 10.1.2. Also, a description is provided in comments of the subroutine mkoffset of the module trans.F (in subdirectory ./SAPT2012/tran).

Setting the OUT variable to .TRUE. will give more extensive printing from tran, showing mainly memory partitioning and the input vectors.

The variable TOLER sets the threshold for writing the transformed integrals to disk. The input is an integer n which is then used to discard all integrals which are less than 10-n in absolute value. An input of –1 gives a threshold of identically 0. We recommend values slightly tighter than normally used for similar purposes in isolated molecules calculations, of about 10-12 around the van der Waals minimum and progressively smaller as one proceeds farther away from the minimum. Note that the usual default thresholds in most integral/SCF programs will be larger than this value and must therefore be lowered so that the atomic integral files which are the input to the transformation contain sufficiently small integrals. It is also advisable to tighten the threshold for the convergence on the density matrices in order to increase the accuracy of the SCF eigenvectors.

If the SCF program used is atmol1024 and if the two-electron integrals produced by this program are stored in more than one file, the number of such files has to be given in TRN as the variable NMFILES. For example, NMFILES=3 will tell the transformation to look for atmol1024 integrals in the files MT3, MT4, and MT5. The most likely use of this option is on old Linux platforms which do not support files larger than 2 Gbytes so that the integrals have to distributed over several smaller files.

Finally, the MEMTRAN variable can be used to dynamically allocate memory for the transformation section (tran) of sapt2012. As a default, the memory for the tran program is set to 40 Mwords (320 Mbytes). If possible, the memory should be set large enough so that the transformation program uses the faster “in core” path. When DIMER=.TRUE., an even faster algorithm is switched on if twice as much memory as for the ordinary “in core” algorithm is available. The amount of memory needed for the available transformation paths can be calculated beforehand using the program memcalc in ./SAPT2012/bin (see Sec. 10.7 ). Declaring more than the amount needed for the “fast in core” algorithm when DIMER=.TRUE., or more than needed for the standard “in core” approach for DIMER=.FALSE., makes no difference for the performance of transformation (and may increase the probability of crashing due to requesting more memory than available on a system at a given time). The transformation will work in smaller memory, except that the slower “out of core” pathway will be chosen (see the source module memory.F for more details). One can read from the transformation output of a test run whether the “out of core” or “in core” (faster) pathway has been chosen and what was the largest size of memory used. This method can also be used to adjust the MEMTRAN variable appropriately (look for lines containing phrases similar to:

... Mem: 8182228 CPU: 463.7

or

AABBOVVV Integrals. Out of core. 2 passes Mem: 29848693).

The minimal information absolutely necessary for the namelist TRN is given by: &TRN ISITx=T &END, where x is one of the symbols denoting the integral/SCF program listed above.

10.2.2 Namelist CCINP

The purpose of this namelist is to pass information to the MBPT/CC program which generates the required cluster amplitudes for the intermolecular perturbation theory program sapt.x. The namelist variable CCPRINT tells the program whether or not to do extra printing. This printing involves information about the memory partitioning and the integral class which is currently being read in. The namelist variable VCRIT gives the tolerance for retaining cluster amplitudes. Here we recommend a tolerance of 1 × 10-10.

The variables TOLITER and TOLAMP determine the convergence criteria of the CC iterations if the converged amplitudes are required, i.e., if the variable CONVAMP (see namelist INPUTCOR) is set to .TRUE. In such a calculation, the CC program continues the iterations until one of the following two conditions is fulfilled:

  1. The iteration number, k, exceeds 39 (no convergence).
  2. All of the conditions listed below are met:
     | | (k) ||E-(kC)C --E-(kC-C1)|| DE = || E(k) - E || < TOLIT ER ∘ ----------CC----SCF-- ∑ijab(tabij(k) - taibj(k- 1))2 D(D )(k) = ---------o2v2--------- < TOLAM P ∘ ∑------------------ (k) --ia(tai(k) --tai(k--1))2 D(S) = ov < TOLAM P

where tijab and tia are the double- and single-excitation amplitudes, respectively. Note that DE(k) is the relative correlation energy change. The default values are TOLITER=10-5 and TOLAMP=10-6.

DIIS convergence acceleration in the coupled cluster iterations can be turned off by the DIIS=F directive. DIIS is turned on by default (recommended). The variables RPA, CCD, CCSD, and several others can be set to .TRUE. or .FALSE. to choose one of the possible forms of CC theory. At this moment only CCSD=T and CCD=T are guaranteed to work and are needed for the SAPT corrections. Since CCSD=T and all other variables of this type are false by default, these keywords can be omitted from the namelist. The variables SITEA and SITEB allow to perform MBPT/CC calculations for one of the monomers only (for testing purposes). The default values are .TRUE. and therefore these variables can be omitted from the list. The variable WRTEACH allows writing of the cluster amplitudes from each iteration into separate disk files for future use. It is .FALSE. by default and is not needed for the ordinary SAPT run; however, it must be set to .TRUE. if one wants to calculate the dispersion energy at the CCD level - see Sec. 10.4 . Finally, the logical variable AOCC switches on/off the atomic-orbital (AO) based algorithm for calculating the most time-consuming four-virtual contribution to the CCSD amplitudes. This slows down the CC calculations slightly, but the tran program requires much less time since no four-virtual transformation is needed. At present, AOCC=.TRUE. works only when atmol1024 or molpro has been used as the integral/SCF interface, and when either the DCBS/DC+BS approach is used, or, in case of an MC+BS calculation, the basis functions are properly grouped (i.e., the variable BLKMB in the TRN namelist is equal to its default value of .TRUE.). Note that this implies that the AOCC=.TRUE. algorithm with the molpro interface works only for the DCBS/DC+BS approach. The default value of AOCC is .TRUE. if the above conditions are satisfied and .FALSE. otherwise, so an explicit specification of this variable is not needed unless one wants to turn off the AO-based pathway for atmol1024 or molpro interfaces. The defaults should be sufficient for this namelist, so all that is absolutely necessary to specify would be a statement of the form: &CCINP &END, i.e., no input.

10.2.3 Namelist INPUTCOR

This namelist tells the SAPT program which of the perturbation theory corrections are to be computed. All the variables are by default set to .FALSE., so only those corrections that one wants to be computed have to be mentioned in the namelist. The list of variables and of the associated currently available corrections is as follows:

  1. E1TOTEelst(10) and Eexch(10). By default, the so-called S2 approximation to Eexch(10) is also computed. This computation can be turned off by setting E1S2=.FALSE. in the INPUTCOR namelist.
  2. E2INDEind(20)
  3. E2INDREind,resp(20)
  4. EEX2IEexch-ind(20)
  5. EEX2IREexch-ind,resp(20)
  6. EEX2DEexch-disp(20)
  7. EEX2Eexch-disp(20) and Eexch-ind(20), i.e., EEX2=.TRUE. implies EEX2I=.TRUE. and EEX2D=.TRUE.
  8. EEX2REexch-disp(20) and Eexch-ind,resp(20), i.e., EEX2R=.TRUE. implies EEX2IR=.TRUE. and EEX2D=.TRUE.
  9. E12Eelst(12)
  10. E12REelst,resp(12)
  11. E13PLEelst(13)
  12. E13PLREelst,resp(13)
  13. E11Eexch(11)
  14. E111Eexch(111)
  15. E12XEexch(120) + Eexch(102)
  16. CONVAMPEexch(1)(CCSD)
  17. E2DSPEdisp(20)
  18. E21DEdisp(21)
  19. E22DEdisp(22)
  20. EMP2E(02) = MBPT2 correction to monomer’s correlation energy (for tests only)
  21. E22ItEind(22) (see Ref. 2)
  22. E300DEdisp(30)
  23. E3INDEind(30)
  24. DSPINDEind-disp(30)
  25. E30XDEexch-disp(30)
  26. E30XDSDQEexch-disp(30) without the most time-consuming ‘triples’ term
  27. E30XIEexch-ind(30)
  28. DSPIXEexch-ind-disp(30)
  29. E3INDREind,resp(30) and a scaled approximation to Eexch-ind,resp(30)

Notice that the electrostatic energies Eelst(1i) are sometimes denoted as ERS(1i) as these terms result from the Rayleigh-Schrödinger perturbation theory that in this context is often called polarization theory. The acronym “pol” in several places in the program, as well as the letters “PL” in E13PL, are resulting from this terminology. If tEind(22) is requested, its exchange counterpart will be estimated from the formula of Eq. (7 ). The triple superscripts appearing for example in the correction Eexch(111) denote the order with respect to the operators V , WA, and WB, respectively.

The programs calculating the following corrections either have not been sufficiently tested, or are known to contain errors. Therefore, calculations of the following corrections are not recommended:

  1. E122Eelst(122)
  2. E14PLREelst,resp(140) + Eelst,resp(104) (without triples)
  3. TE14 — Triples for Eelst,resp(14)
  4. E1CC — calculates the electrostatic energy from the formulas analogous to those appearing in Eelst(12) using the converged CCSD amplitudes in place of the second-order ones. Instead, use Eelst,resp(1)(CCSD) described in Sec. 10.5.

Six variables are provided to select groups of corrections: SAPT0, SAPT2, SAPTNOCC, DELTASCF, SAPTKS, and SAPT, and only one of them should be set to .TRUE. The settings SAPT2=T and SAPT=T select the groups of corrections defined by Eqs. (6) and (8), respectively. The choice SAPTNOCC=T is equivalent to SAPT=T except that the intramonomer correlation contribution to Eexch(1) is approximated as Eexch(11) + Eexch(12) instead of ϵexch(1)(CCSD). The choice SAPT=T is approximately equivalent in accuracy of predictions to using supermolecular MBPT through fourth order. It has been used for most published recent SAPT calculations for small and medium-size systems. The choice SAPT2=T, equivalent to MBPT2, requires significantly less computer time and can be recommended for larger dimers if time of calculations at the SAPT=T level becomes an issue. In most cases, we recommend that the δEint,respHF term defined by Eq. (5 ) is included as implied by Eq. (6) (this requires the scfcp keyword as a command-line parameter to the SAPT script). Only for nonpolar or nearly nonpolar monomers this term should be omitted, see Ref. 38 and the text below for a discussion of this issue.

The choice SAPT0=T selects all available SAPT corrections of the first and second order in V and of zeroth order in W, i.e.,

ESAinPt T0 = E(e1ls0)t + E (1ex0c)h + E (2in0d),resp + E(e2x0)ch- ind,resp + E(d2is0)p + E (2ex0c)h-disp.
(10)

This level is recommended for calculations for very large systems when a higher level of theory would be too time-consuming (however, note that SAPT(DFT) described in Sec. 16 would offer much better accuracy at similar computational costs). One can expect that this level of theory may introduce about 20-30% errors with respect to the exact interaction energies, but such an accuracy can be acceptable for large systems. Also the errors due to the basis set truncation will most likely be of the same order of magnitude for systems that large. For increased accuracy, we recommend the addition of the δEint,respHF term.

The choice DELTASCF=T selects all SAPT corrections necessary for computing the δEint,respHF term, i.e. Eelst(10), Eexch(10), Eind,resp(20), and Eexch-ind,resp(20). This setting should be selected when only δEint,respHF is required since it skips calculations of the dispersion and exchange-dispersion corrections. Such separate calculations of δEint,respHF may be required for certain SAPT(DFT) jobs, see Sec. 16.3 . Since δEint,respHF is calculated automatically also for SAPT0, SAPT2, SAPTNOCC, and SAPT (as long as the keyword scfcp has been specified upon the submission of the SAPT script), it is not necessary to select DELTASCF=T when the former options are used. For additional keywords required for SAPT(DFT), see Sec. 16.2 .

The SAPTx variable takes precedence over the settings of individual corrections in the sense that a correction included in a given level will be computed even if it is explicitly set to .FALSE. However, a correction not included will be computed if it is explicitly set to .TRUE.

The variable CONVAMP selects the use of the converged CCSD amplitudes in the expressions analogous to those appearing in the corrections Eexch(111), Eexch(120), and Eexch(102). Setting this variable to .TRUE. results in the calculation of the ϵexch(1)(CCSD) correction regardless of whether Eexch(111), Eexch(120), and Eexch(102) terms are calculated or not. The logical variable DCONVAMP selects the calculation of the dispersion energy with the inter- and intramonomer correlation treated at the CCD level. However, this energy must be computed in a separate run, see Sec. 10.4 .

The six leading components of the third-order SAPT energy, Eq. (9 ), are calculated when the variables E3IND, DSPIND, E300D, E30XI, DSPIX, and E30XD, respectively, are set to .TRUE. It is also possible to calculate the relaxed third-order induction correction Eind,resp(30) [35] by setting E3INDR=.TRUE. In this case, the corresponding relaxed exchange-induction energy Eexch-ind,resp(30) has to be estimated by a scaling of the nonrelaxed quantity,

 (30) (30) E-(3in0d),resp Eexch-ind,resp = Eexch-ind ⋅ E(30) . ind
(11)

Note that the polarization corrections, especially Eind(30) (or Eind,resp(30)), tend to cancel to a large extent with the corresponding exchange corrections, so calculating also the latter is highly recommended. The effects described by Eind,resp(30) and Eexch-ind,resp(30), together with the higher-order induction and exchange-induction contributions, are approximately included in the δEint,respHF term resulting from the supermolecular Hartree-Fock interaction energy. For the interactions of polar molecules, where the induction component of the interaction energy plays an important role, δEint,respHF tends to provide a more accurate description of the high-order induction interactions than the third-order approximation. For nonpolar and nearly nonpolar systems, the inclusion of Eind(30) and Eexch-ind(30), or of Eind,resp(30) and Eexch-ind,resp(30), usually gives more accurate results than the addition of δEint,respHF. Whereas the corrections Eind(30), Eind,resp(30), Eind-disp(30), Eexch-ind(30), and Eexch-ind-disp(30) are fairly inexpensive, the terms Edisp(30) and Eexch-disp(30) scale like o2v4 and o3v4, respectively, with the numbers (o,v) of occupied and virtual orbitals for a given monomer. Moreover, these two corrections require the mixed-monomer four-virtual integrals over the molecular orbital basis that are expensive to produce. Because of this, SAPT has a built-in mechanism to calculate these terms in a semi-direct way, utilizing the integrals over AOs and removing the need for a mixed-monomer four-virtual transformation. Currently, this mechanism, just like the AO-based algorithm for the four-virtual diagram in the monomer CCSD calculations (controlled by the variable AOCC in the CCINP namelist), works only when atmol1024 or molpro are employed as integral and SCF front-ends, and when either the DCBS/DC+BS approach is used, or, in case of an MC+BS calculation, the basis functions are properly grouped (i.e., the variable BLKMB in the TRN namelist is equal to its default value of .TRUE.). If this is the case, the semi-direct calculation is chosen by default; the user can change this behavior by setting DIRECTE3=.FALSE. in the INPUTCOR namelist. The computation of the most expensive o3v4 contribution to the Eexch-disp(30) correction can be omitted by specifying E30XD=.FALSE. and E30XDSDQ=.TRUE. This contribution usually constitutes about 10% of Eexch-disp(30) and converges fast with the basis set size [38]. The remaining part of Eexch-disp(30) scales like o2v4 with the number of orbitals.

The variables FROZEN, NFOA, NFOB, NFVA, and NFVB control the range of orbitals used for electron excitations [40]. If FROZEN is set to .FALSE., an all-electron calculation is performed and the values of NFOA, NFOB, NFVA, and NFVB are irrelevant; this is the default setting. If FROZEN is set to .TRUE., NFOA lowest orbitals on A and NFOB lowest orbitals on B are treated as core and no excitations out of these orbitals are taken into account. Additionally, NFVA highest virtual orbitals on A and NFVB highest virtual orbitals on B are also excluded from the excitation space. One may set NFVA=NFVB=0 to perform a conventional frozen-core calculation with excitations onto all virtual orbitals allowed. Setting NFVA=NFOB and NFVB=NFOA is another reasonable choice. Note that setting FROZEN=.TRUE. affects not only the SAPT program, but also the integral transformation and the calculation of monomer CC amplitudes. However, the parameters FROZEN, NFOA, NFOB, NFVA, and NFVB need, and can, be set only in the INPUTCOR namelist.

sapt2012 can be used with effective core potentials (ECPs). So far, this option has only been tested with the molpro front end, however, using other ECP-enabled integral front ends interfaced with sapt2012 should be pretty straightforward. Contrary to a frozen-core SAPT run, the ECP effects are taken care of at the integral program level and no special input or options are needed for SAPT. In particular, the variable FROZEN in the INPUTCOR namelist should be set to its default value of .FALSE. unless freezing of a larger core on top of a small-core ECP is requested. The frozen-core and ECP SAPT approaches are described in Ref. 40. Note that when using SAPT with ECPs, the inclusion of the δEint,respHF term, Eq. (5 ), is strongly recommended [40] (this requires the scfcp keyword as a command-line parameter to the SAPT script).

The relativistic contributions to the SAPT interaction energy can be estimated by using relativistic ECPs. Alternatively, one can calculate a relativistic SAPT interaction energy by employing the second-order Douglas-Kroll-Hess relativistic one-electron Hamiltonian as implemented in molpro. This requires a special syntax of the molpro integral and SCF input: please see the example in the directory MOLPRO/ArHF_AVDZ for details.

The variable PRINT is used to print more information about intermediate results and memory partitioning. The variable MEMSAPT can be used to dynamically allocate memory for the perturbation theory stage (sapt.x) of sapt2012. As a default, the memory for the sapt.x program is set to 20 Mwords. The amount of memory required by the sapt.x program may be computed using the memcalc utility, as described in Sec. 10.7 .

10.3 How to read the output

The values of all the calculated SAPT corrections are summarized at the end of the output file in the section entitled Summary Table. An example of the Summary Table can be found in Appendix D. The first part of the table lists the numbers of orbitals, the Cartesian geometry of the dimer, and the SCF energies of the monomers and the dimer (if computed) obtained in the full DC+BS. What follows is a set of low-order SAPT corrections which, when summed up, approximate the supermolecular SCF interaction energy (also printed as E^{HF}_{int} if the keyword scfcp was used on job submission). The quantity SAPT SCF_{resp} is equal to the sum of E^{(10)}_{elst}, E^{(10)}_{exch}, E^{(20)}_{ind,resp}, and E^{(20)}_{ex-ind,r}, i.e., the first 4 terms on the rhs of Eq. (5 ). The quantity \delta^{HF}_{int,r} represents the last term in this equation. If the third-order induction and exchange-induction corrections have been calculated, the quantity SAPT SCF(3)_{resp}, equal to a sum of SAPT SCF_{resp}, E^{(30)}_{ind}, and E^{(30)}_{exch-ind}, is displayed, and the appropriate third-order effects are subtracted from \delta^{HF}_{int,r} to form the \delta3^{HF}_{int,r} quantity. If the non-response corrections E^{(20)}_{ind} and E^{(20)}_{ex-ind} have been computed, the corresponding non-response approximation to the SCF interaction energy, SAPT SCF, and \delta^{HF}_{int} are also given.

The CORRELATION part of the Summary Table contains all the computed SAPT corrections of order higher than zero in the intramonomer correlation operator, and the dispersion and induction-dispersion energies, as defined in Sec. 3 . Two partial sums are also provided, SAPT_{corr,resp} and SAPT_{corr}. The first of these quantities is the sum of \eps^{(1)}_{elst,r}(k),
\eps^{(1)}_{exch}(k) or, if available, \eps^{(1)}_{exch}(CCSD), ^tE^{(22)}_{ind},
^tE^{(22)}_{ex-ind}, E^{(2)}_{disp}(k), E^{(20)}_{exch-disp}, E^{(30)}_{disp},
E^{(30)}_{ind-disp}, E^{(30)}_{exch-disp}, and E^{(30)}_{exch-ind-disp}. The definition of the second partial sum is completely analogous, except that \eps^{(1)}_{elst}(k) is used. Note that if any of the corrections appearing in the definition of a partial sum is not computed (for example, because it has not been requested in the namelist INPUTCOR), this partial sum will not contain this correction. For example, if only SAPT=T is requested in the INPUTCOR namelist, the non-response electrostatic corrections, as well as the components of E(30), will not be computed and the quantity SAPT_{corr} will not contain any correlation contribution to electrostatics, and neither SAPT_{corr} nor SAPT_{corr,resp} will include any contribution of the third order in the intermolecular interaction operator.

If the elst(1)(CCSD) energy is requested by setting E1CC=T in the namelist INPUTCOR, the quantity ϵelst(1)(CC) = elst(1)(CCSD) -Eelst(10) will be reported in the correlation section of Summary Table, but it will not be counted as a part of SAPT_{corr,resp}. An example of using the E1CC option can be found in examples/GAMESS/HF2_MCBS.

Finally, the last set of entries in the Summary Table gives the hybrid interaction energies, i.e., the sums of the supermolecular SCF and the correlation parts, represented by SAPT_{corr,resp} or SAPT_{corr}. If all the relevant corrections have been computed (such as when one of the SAPTx variables is set to .true. in INPUTCOR), the quantity SCF+SAPT_{corr,resp} should be considered the recommended SAPT interaction energy. If the third-order induction and exchange-induction energies have been computed, the sums SAPT SCF(3)+SAPT_{corr} and SAPT SCF(3)_{resp} +SAPT_{corr,resp}, with the terms \delta^{HF}_{int} and \delta^{HF}_{int,r}, respectively, replaced by their third-order component E^{(30)}_{ind}+E^{(30)}_{exch-ind}, are also displayed.

10.4 Calculations of dispersion energy at CCD+ST(CCD) level

An algorithm for calculating the second-order dispersion energy with the inter- and intramonomer correlation effects included at the CCD level has been developed in Ref. 39. This energy cannot be calculated along with the other SAPT corrections: instead, a separate run of the SAPT script must be performed. For this run, the user needs to supply a set of integral and SCF inputs just like for the ordinary SAPT calculation, and the file nameP.data must contain, apart from the TRN, CCINP, and INPUTCOR namelists, a namelist E2DINP which contains the necessary input for the e2disp program.

The logical variables RPA, MPENER, CCD, and CCSD in the E2DINP namelist control the level of theory employed in the e2disp program. Currently, only the CCD=.TRUE. version works, so this variable should be set to .TRUE. and all others to .FALSE. Note that the default values are CCD=.TRUE. and RPA=MPENER=CCSD=.FALSE. so that no explicit specification of these variables in the E2DINP namelist is necessary. The logical variable PT governs the perturbative/nonperturbative methodology of the iterations; currently, only the first one works so this variable should be always set to .TRUE. The logical variables PMEM, PNRM, PKL, PSEP, and PCOMP control the printing of various intermediate quantities. Finally, the variable TOLITER states the (relative) convergence threshold for the dispersion iterations; the recommended default is TOLITER=1.0d-5.

The CCD+ST(CCD) path of the calculation is turned on by specifying DCONVAMP=.TRUE. in the namelist INPUTCOR. Note that this turns off all SAPT corrections other than the second-order dispersion energy since the values of some of these corrections would be incorrect when CCD, rather than CCSD, amplitudes for monomers are calculated. Note also that the frozen core (FROZEN=.TRUE.) option has not been implemented for the CCD+ST(CCD) dispersion energy.

The namelist CCINP controlling the behavior of the CC program also needs to be adjusted when one performs the calculation of the dispersion energy at the CCD+ST(CCD) level. First of all, the single-excitation contribution needs to be omitted from the calculation, i.e. both CCD=.TRUE. and CCSD=.FALSE. need to be specified in the CCINP namelist. Furthermore, partial amplitudes from each coupled-cluster iteration need to be saved for the further use in the e2disp program. To save these amplitudes, one needs to set the option WRTEACH to .TRUE. Finally, a sufficiently tight convergence threshold for the CCD equations is required: we recommend setting TOLITER=1.0d-9 in the CCINP namelist.

The computation of the CCD+ST(CCD) dispersion energy proceeds as follows. After the integral transformation, the CC program is invoked to produce CCD amplitudes for both monomers. Next, the e2disp program is run which perturbatively calculates the converged intermolecular dispersion amplitudes and the CCD part of the dispersion energy. Finally, the amplitudes are used by the SAPT program, with the option DCONVAMP=.TRUE. specified in the INPUTCOR namelist, to calculate the singles and triples contribution to the dispersion energy with converged doubles amplitudes, i.e., the ST(CCD) term. The final result for the CCD+ST(CCD) dispersion energy is then listed in the SAPT summary table.

10.5 Calculation of electrostatic energy from relaxed CCSD densities

In order to compute the energy Eelst,resp(1)(CCSD) from the relaxed CCSD monomer densities, it is necessary to use a special script ccsddSAPT from the bin subdirectory. At present, such a calculation can only be performed using gamess as the SCF program.

10.5.1 DCBS calculation

In the DCBS or DC+BS approach, the user needs to supply a set of gamess SCF input files, just like for the ordinary SAPT calculation in DCBS or DC+BS basis. The namelist TRN in the file nameP.data has to set ISITGAMS=T and DIMER=T. In the namelist INPUTCOR, the option CONVAMP=.TRUE. must be requested and the namelist CCINP has to be set up so that the CCSD calculations for the monomers are performed (i.e., the variables CCSD and CCD should be either omitted or set to CCSD=.TRUE. and CCD=.FALSE.). The computation of the Eelst,resp(1)(CCSD) electrostatic energy using the ccsddSAPT script proceeds as follows. After the integral transformation, the ccsdt program is invoked to produce the CCSD amplitudes for both monomers. Next, the ccsdm program is run twice (for each monomer) to calculate the relaxed CCSD densities. The latter are stored in the AO representation in the unformatted sequential files ccsd_dena.out.A and ccsd_dena.out.B. The obtained densities are then used by the program e1dcbs, together with the atomic one- and two-electron integrals, to compute Eelst,resp(1)(CCSD), reported in the output as CCSD electrostatics:. The correction Eelst(10) is computed here as well and reported as SCF electrostatics:. The script ccsddSAPT then proceeds to invoke the sapt.x module, which calculates the other SAPT corrections, as specified by namelist INPUTCOR.

An example of a DCBS calculation of the electrostatic energy from the CCSD densities can be found in examples/GAMESS/CO_E1DEN_DCBS.

10.5.2 MCBS calculation

In the (pure) MCBS basis set (i.e., without midbond or farbond functions), it is possible to precompute the electron densities for both monomers and then reuse them, after suitable translations and rotations, for a whole set of dimer geometries. In this way, the expensive calculation of CCSD densities has to be done only once per monomer. A calculation of this type can be accomplished using the ccsddSAPT script described above and a separate program elstdenrot, which computes the electrostatic energies using the AO density files produced by ccsdm and basis set information extracted for this purpose from gamess output files. The script ccsddSAPT follows the appropriate path once the option DIMER=F is detected in the TRN namelist. The steps involved in the calculation are the following:

  1. Assuming that the name of the job is name, prepare the files nameA and nameB containing the specifications of the geometries and basis sets of monomers A and B in gamess(us) format. Remember to put blank lines after the basis set of each atom, including the blank line at the end of the file. From these files, the ccsddSAPT script will construct the gamess input files needed to calculate the SCF vectors and generate integrals. A spherical basis set will be assumed (i.e., ISPHER=1) and the name of the job will be used as the comment line. Geometries of the monomers specified in nameA and nameB should correspond to whatever is considered to be the initial configuration of a given monomer, i.e., the COM (or other characteristic point around which rotations will be performed) should coincide with the origin of the coordinate system, and all Euler angles describing the orientation of the monomer will be assumed zero at this geometry.
  2. Prepare the file nameP.data, as for the regular MCBS SAPT calculation (note that in the namelist TRN, the variables basis and tags have to be specified for such a calculation, as well as DIMER=F). In the namelist INPUTCOR, the option CONVAMP=.TRUE. must be requested and the namelist CCINP has to be set up so that the CCSD calculations for the monomers are performed (i.e., the variables CCSD and CCD should be either absent or set to CCSD=.TRUE., CCD=.FALSE.). The corrections specified in INPUTCOR will be calculated and printed in the Summary Table for one dimer geometry, obtained by shifting the position of monomer B by 10 bohr along the positive z axis with respect to the position specified in the nameB file. The Eelst(10) result from the Summary Table may be later compared to what comes out from the density algorithm.
  3. Prepare the file input.edi, specifying the dimer geometries for which the electrostatic energies are to be calculated by elstdenrot out of the precomputed monomer densities. The file input.edi consists of lines (one for each dimer geometry) containing, in this order, the COM-COM separation (in Å), the β and γ Euler angles (in degrees) for monomer A (α is assumed as zero), followed by the α,β, and γ Euler angles of monomer B. It is assumed that the COM of monomer A coincides with the origin of the coordinate system and that monomer B is shifted in the positive direction of the z axis. For example,
    5.29177249 0.          0.            0.          0.           0.  
    2.38521611 86.80692336 90.00000000 180.00000000 93.37890335 360.00000000  
    2.63658901 87.13779711 90.00000000 180.00000000 81.94039222 360.00000000  
    2.89422970 87.42491802 90.00000000 180.00000000 75.80434134 359.99999915

    would be a valid input.edi file containing 4 geometries. The first of these geometries is exactly the same as the one for which the interaction energies are computed by the sapt.x module during the ccsddSAPT run.

  4. Run the ccsddSAPT job the same way a “regular” job would be run. For example, to run in /scratch/mydir from ksh, one would type
    ccsddSAPT name noscfcp /scratch/mydir > name.out 2>&1 &

    Note that the noscfcp keyword has been used since the supermolecular SCF interaction energy is of no interest here.

The result of running the ccsddSAPT script in MCBS will be the file name.out containing the standard output from the whole run. In this file, the SAPT corrections requested in the nameP.data file will be reported (in the Summary Table section) for one specific dimer configuration, described earlier. Furthermore, the calculated monomer densities will be packed for a further use into a file name.den.tar.gz, which, after running

   gzip -d name.den.tar.gz  
   tar -xvf name.den.tar

will decompress info the following files:

  1. unformatted sequential files ccsd_dena.out.A and ccsd_dena.out.B, containing the SCF and relaxed CCSD densities of monomer A and B, respectively,
  2. formatted files infoa_m.data and infob_m.data (in the format readable by the code elstdenrot, run subsequently) containing geometry and basis set info for the monomers.

After collecting the files listed above and input.edi in one scratch directory, one can run the elstdenrot program by issuing a command similar to

elstdenrot > dimers.out 2>&1

modifying, of course, the paths and the output file name appropriately. As usual, the executable elstdenrot can be found in SAPT2012/bin directory. The electrostatic energies at the Eelst(10) and CCSD levels, calculated from the monomer densities translated and rotated into appropriate positions, will be reported in the standard output from elstdenrot (is this example—in dimers.out).

An example of MCBS CCSD densities calculation using the ccsddSAPT script, followed by a batch of electrostatic energy calculations with the elstdenrot program, can be found in examples/ GAMESS/CO_E1DEN_MCBS.

10.6 Submitting a sequence of sapt2012 jobs

The directory ./SAPT2012/bin contains three utility scripts, RunlotATMOL, RunlotCADPAC, and RunlotDALTON, which allow automatic calculations for multiple points on a potential energy surface in a single submission of a computer task. Some members of our group found it also convenient to utilize RunlotATMOL2, a more elaborate and flexible version of RunlotATMOL. Currently available only for atmol1024, cadpac, and dalton, these scripts can be fairly easily extended to other integral/SCF programs. An example of using RunlotATMOL can be found in the directory

./SAPT2012/examples/ATMOL1024/ArH2O_MCBS.

The idea behind the Runlot* scripts is to construct the input files (appropriate for a given SCF front-end) by combining universal, geometry-independent templates containing basis set and all the necessary integral/SCF options with Cartesian geometries calculated from a minimal set geometric data. Once all the input files are generated, the SAPT script is invoked to calculate the interaction energy for a given dimer geometry and the results are recorded in a file with a name unique for this geometry. Then, the input files are prepared and the calculation performed for the next geometry from the set, and the process continues until the set of geometries is exhausted.

Below, using the ArH2O_MCBS example, we describe steps needed to set up a RunlotATMOL job.

  1. Prepare the files containing the basis sets in atmol1024 format:
    • iso.d: “isotropic” part of the whole (DCBS) basis, i.e., the functions which will be used in SCF calculations for both monomers; the sequence should be A, midbond (if present), and B;
    • polA.d and polB.d: polarization functions for A and B, respectively; functions for polA.d will not be used in expansion of molecular orbitals of B, and vice versa;
    • head.d: a simple file containing a header common to all *.intinp files;
    • end.d, endA.d, endB.d, endMA.d, and endMB.d: end parts of *.intinp files, specific for a given system (e.g., end.d corresponds to the dimer, endMA.d – to monomer A in its MC+BS, endA.d – to monomer A in DC+BS, and so on; note that in this example the files endA.d and endB.d contain the atmol1024 directive BYPASS TWO which allows to avoid recalculation of two-electron integrals generated during the dimer SCF run);
    • *.scfinp files needed for SCF calculations on all subsystems;
  2. Prepare the file dimer.cnf which specifies the Cartesian geometry (in bohr) of monomers A and B in their “initial” configurations, i.e., for all Euler angles equal to zero. The charge, atomic mass, and an up to two-character symbol are also given for each atom.
  3. Prepare the file geoparm.d which contains dimer geometries for which the sapt calculations are to be performed. Each such geometry is specified by a single line containing the separation (in Å) between the centers of mass (as defined by the masses and geometries supplied in dimer.cnf) of the monomers, the z -y -z Euler angles [55] βA and γA of monomer A, and the z -y -z Euler angles αB, βB, and γB of monomer B. These angles (to be supplied in degrees) are measured with respect to the coordinate system in which the z axis points from A to B. The Euler angle αA is not needed, since the intermolecular potential depends only on the difference αB - αA, so that αA may be assumed zero without loss of generality. At the end of each line of geoparm.d there is also a string ‘DOIT‘ (including the apostrophes). If for some reason you do not need the calculation for a given geometry but still want to keep it in the geoparm.d file, put ’DONE’ instead. The file geoparm.d may contain any number of lines.

Once all these files are prepared, copy them, along with the RunlotATMOL script itself, to your scratch directory where the whole calculation is to take place. Then start the job with the command

nohup time RunlotATMOL name >name.con 2>&1 &

in ksh or bash, or

time RunlotATMOL name >& name.con &

in csh, where, as usual, name is how the job is called and how the names of all input files start.

The calculation will then proceed as follows. First, the program getgeoATMOL is invoked by RunlotATMOL (this program is automatically compiled during the installation of sapt2012 and the path to it is inserted into RunlotATMOL) which analyzes the geoparm.d file looking for the first line containing the ’DOIT’ directive. When such a line is found, getgeoATMOL reads the COM separation and Euler angles and uses them, along with the data of dimer.cnf, to produce a set of files geo*.d containing Cartesian coordinates of all atoms comprising the given dimer configuration with appropriately set charges. A ghost (i.e., zero-charge) site called Mb is also inserted halfway between the COMs of the monomers. A copy of geoparm.d is also produced (fort.7) in which the just processed input line is given the label ’DONE’. The RunlotATMOL script then proceeds to overwrite the old geoparm.d with this modified copy and create all the needed *.intinp files out of the previously prepared pieces (head.d, end*.d, iso.d, polA.d, polB.d), and the Cartesian geometry files geo*.d generated by getgeoATMOL. Once all the *.intinp files are created, the SAPT script is called (the path pointing to this script is automatically set up during installation) to do the proper sapt2012 calculation for this geometry. The output is written to the file name1.out. After the SAPT script completes its task, the whole cycle is repeated, i.e., geoparm.d is searched for the new geometry labeled ’DOIT’, the input files for this geometry are generated and the SAPT script invoked. The process continues automatically until all the geometries in geoparm.d are taken care of. Consecutive output files are named name1.out, name2.out, name3.out, ..., in accordance with the position of geometries in geoparm.d. In this way, using the Runlot utility, the whole surface (or a significant portion of it) can be filled up with data points with a minimum interference on the user’s part.

For larger systems, placing of the midbond ‘ghost’ atom halfway between monomer COMs can result in significant linear dependencies since the location chosen in this way can be close or overlapping with the monomers. In such cases the ’DOR6’ directive can be used instead of ’DOIT’. With the ’DOR6’ directive, getgeoATMOL places the midbond according to the 1∕r6 weights based on distances from all atoms. With this scheme, the midbond is placed in a more balanced way [56]. The rest of the atoms and the behavior of the getgeoATMOL program is identical to the ’DOIT’ directive. The ’DOR6’ scheme is recommended for all non-trivial monomers.

The philosophy behind the scripts RunlotCADPAC and RunlotDALTON (and the related getgeoCADPAC geometry conversion program) is very similar, consult the scripts themselves for details. The structure of both scripts is simple enough to allow relatively easy modifications.

10.7 Memory and disk requirements

When setting up a sapt2012 computational project, it is imperative to know how much computational resources this project will consume and whether or not it will fit into the memory and disk of the machine at hand.

Memory requirements of sapt2012 can be estimated based on the total number of basis functions and the numbers of occupied and virtual orbitals. The transformation code tran should work even with small memory, although it may then choose the “out-of-core” path which significantly increases the time of this step. To allow the transformation to work entirely in-core, the amount of available memory should be slightly more than on32 8-byte words, where n is the dimension of the basis set and o is the number of occupied orbitals of the larger monomer. For DCBS or DC+BS calculations, it is advantageous to set memory (if possible) to about twice as much, i.e., slightly more than on3 words, as a somewhat faster algorithm is chosen when additional memory is available. For the cc part, memory requirements can be roughly estimated as 2o2v2 + 2o2n2 or o2v2 + 2o2n2 + n3 words, whichever is greater (v is the number of virtual orbitals for the larger monomer.) The sapt.x code will ask for about 5o2v2 or v3 + 3o2v2 words, whichever is greater, and max(2o2v2,v3 + ov) + 6000000 words will be needed to generate the relaxed CCSD densities using the ccsdm code.

More precise calculation of the memory required for an efficient run can be done using the program memcalc (built automatically during installation). In order to use this program, prepare the regular nameP.data file with the first (title) line containing five numbers, in the following order: the total number of DC+BS basis functions, the total number of functions for monomer A (will be different than the previous number in a MC+BS-type run), the number of occupied orbitals for monomer A, and then the same information for monomer B. Then run the program by typing

memcalc < nameP.data > memcalc.out

The file memcalc.out will contain detailed information about memory requirements of different parts of the code. The variables MEMTRAN from namelist TRN and MEMSAPT from namelist INPUTCOR should then be adjusted accordingly. The memory needed to run the cc program is calculated there explicitly so that no separate namelist variable is needed.

The memory requirements of the SCF programs are much smaller than those of sapt2012. For specific information, consult the manuals distributed with these programs.

Disk space requirements of sapt2012 are more difficult to estimate as these depend not only on the size of the monomers and basis set dimensions, but also on the assumed integral thresholds and the dimer configuration. For larger intermonomer separations, more two-electron integrals will be neglected, and the integral files will be smaller than for “close” configurations. While providing an accurate estimate of the required disk space is not possible, some guidance can be obtained by considering the upper limits on sizes of different scratch files involved and their scaling with the system and basis sizes.

With the dimension of the atomic basis (DC+BS) equal to n, the size of the raw integral file produced by an SCF run scales as n48. This should be multiplied by 12 to 16 bytes (each integral is represented by an 8-byte real number plus 4 to 8 bytes for index storage) to give an upper bound (in bytes) for the file size. In practice, with usual integral thresholds, this maximum size is often reduced by about 50%.

In the transformation step of the calculation, the atomic integral file has to coexist with files produced by the transformation (it is deleted after the tran program completes unless the AO-based calculation of Edisp(30) or Eexch-disp(30) has been requested). During the most demanding four-virtual transformations (performed in the beginning), the upper bound on the disk space becomes roughly (3n48 + v2n24 + v48) × 13 bytes. Once the four-virtual transformations are completed, the files (including raw integrals) will take at most about (n48 + v44) × 13 bytes, which will have to coexist with new “incoming” integrals being generated subsequently. At the end of a typical transformation for a dimer consisting of identical monomers, the disk space taken up by the raw and transformed integrals will scale approximately as 2.5o4 + 8o3v + 8o2v2 + 2.5ov3 + 0.25v4, which should be multiplied by 13 to give the maximum number of bytes. Again, due to nonzero thresholds, this number has a good chance to be reduced by about 50%.

In the subsequent stages of the calculation, the ccsdt and sapt.x codes will generate additional scratch files of the order of o2v2, ov3, and smaller, which will have to share disk space with the transformed integrals. Moreover, unless the non-default algorithm has been switched on at compile time by adding the -DTPDRVN definition to the EXTRADEFS variable, one more file of the order v44 will be temporarily created by the ccsdt program. However, the original file of raw atomic integrals (n48) will be removed beforehand unless the corrections Edisp(30) and/or Eexch-disp(30) (employing the AO integrals) are to be calculated.