New methods for ab initio quantum mechanical calculations in molecular and crystalline systems - CaltechTHESIS
CaltechTHESIS
A Caltech Library Service
About
Browse
Deposit an Item
Instructions for Students
New methods for ab initio quantum mechanical calculations in molecular and crystalline systems
Citation
Langlois, Jean-Marc
(1994)
New methods for ab initio quantum mechanical calculations in molecular and crystalline systems.
Dissertation (Ph.D.), California Institute of Technology.
doi:10.7907/j75s-5f43.
Abstract
NOTE: Text or symbols not renderable in plain ASCII are indicated by [...]. Abstract is included in .pdf document.
This thesis deals with the development of new methods for doing ab initio quantum mechanical calculations of electronic wavefunctions of large molecules and crystalline systems with the emphasis on inclusion of electronic correlation or many body effects using generalized valence-bond (GVB) wavefunctions.
Chapters 1 and 2 describe two necessary steps for using the generalized valence-bond (GVB) formalism in large molecular systems. In Chapter 1 a fast method for generating GVB trial wavefunctions is described. The method is based on piecewise atomic and diatomic localization and makes possible calculations with large numbers of GVB pairs. The efficacy of the method is illustrated by application to several cases including GVB wavefunctions with up to 26 pairs. In Chapter 2 the pseudospectral (PS) method for self-consistent-field calculations is applied to the GVB formalism. In the GVB perfect pairing approximation, the PS method is shown to reduce the scaling cost of the calculation from [...] to [...], where N is the number of basis functions. This makes possible the calculation of GVB wavefunctions for large molecular systems.
Chapter 3 describes a density-functional method for calculations on crystalline systems using Gaussian type orbitals. Accurate and efficient strategies were developed for computing both the Hamiltonian matrix elements and the Coulomb field. The Hamiltonian matrix elements are computed by decomposing the multicenter numerical integrations into single-center integrations via a projection technique and the Coulomb field is evaluated analytically using a dual-space approach based on the Ewald method. The self-consistent field is obtained by a fast conjugate gradient method which uses both first and second derivative information and an efficient preconditioning strategy. Illustrative calculations are performed on two allotropes of carbon: diamond and [...] crystals.
Item Type:
Thesis (Dissertation (Ph.D.))
Degree Grantor:
California Institute of Technology
Division:
Engineering and Applied Science
Major Option:
Applied Physics
Thesis Availability:
Public (worldwide access)
Research Advisor(s):
Goddard, William A., III
Thesis Committee:
Goddard, William A., III (chair)
Defense Date:
7 October 1993
Record Number:
CaltechETD:etd-12042007-081615
Persistent URL:
DOI:
10.7907/j75s-5f43
Default Usage Policy:
No commercial reproduction, distribution, display or performance rights in this work are provided.
ID Code:
4780
Collection:
CaltechTHESIS
Deposited By:
Imported from ETD-db
Deposited On:
06 Dec 2007
Last Modified:
19 Apr 2021 22:36
Thesis Files
Preview
PDF (Langlois_jm_1994.pdf)
- Final Version
See Usage Policy.
5MB
Repository Staff Only:
item control page
CaltechTHESIS is powered by
EPrints 3.3
which is developed by the
School of Electronics and Computer Science
at the University of Southampton.
More information and software credits
New Methods for Ab Initio
Quantum Mechanical Calculations in
Molecular and Crystalline Systems
Thesis by
Jean-Marc Langlois
In Partial Fulfillment of the Requirements
for the Degree of
Doctor of Philosophy
California Institute of Technology
Pasadena, California
1994
(Submitted October 7, 1993)
ii
To Tania Lesk
iii
Acknowledgments
I would to thank my advisor Bill Goddard for his encouragement and his
enthusiasm in science. His support was always appreciated and his vast knowledge
in chemistry, physics and materials science has been a constant source of inspiration.
I would also like to thank Murco Ringnalda for being a good friend and col-
laborator. Our many discussions have given momentum to my work. Xiaojie Chen
has been a very valuable collaborator on the latter part of this thesis and his con-
tributions were beneficial. I would like to thank Terumasa Yamasaki and Rick
Muller for discussions and contributions to the first part of this thesis. I also had
fruitful interactions in the earlier days with Yuejin Guo, Guanhua Chen and Naoki
Karasawa.
I also want to acknowledge the following members of the group: Siddharth
Dasgupta, Gilles Ohanessian, Kevin Plaxco, Jason Perry, Charles Musgrave, Anne
Miller, Terry Coley and especially Jamil Tahir-Kheli for his own special way of
mixing science with life.
iv
Abstract
This thesis deals with the development of new methods for doing ab initio
quantum mechanical calculations of electronic wavefunctions of large molecules and
crystalline systems with the emphasis on inclusion of electronic correlation or many
body effects using generalized valence-bond (GVB) wavefunctions.
Chapters 1 and 2 describe two necessary steps for using the generalized
valence-bond (GVB) formalism in large molecular systems. In Chapter 1 a fast
method for generating GVB trial wavefunctions is described. The method is based
on piecewise atomic and diatomic localization and makes possible calculations with
large numbers of GVB pairs. The efficacy of the method is illustrated by applica-
tion to several cases including GVB wavefunctions with up to 26 pairs. In Chapter
2 the pseudospectral (PS) method for self-consistent-field calculations is applied to
the GVB formalism. In the GVB perfect pairing approximation, the PS method is
shown to reduce the scaling cost of the calculation from N* to N*, where N is the
number of basis functions. This makes possible the calculation of GVB wavefunc-
tions for large molecular systems.
Chapter 3 describes a density-functional method for calculations on crystalline
systems using Gaussian type orbitals. Accurate and efficient strategies were devel-
oped for computing both the Hamiltonian matrix elements and the Coulomb field.
The Hamiltonian matrix elements are computed by decomposing the multicenter
numerical integrations into single-center integrations via a projection technique and
the Coulomb field is evaluated analytically using a dual-space approach based on
the Ewald method. The self-consistent field is obtained by a fast conjugate gradi-
Vv
ent method which uses both first and second derivative information and an efficient
preconditioning strategy. Illustrative calculations are performed on two allotropes
of carbon: diamond and Cp crystals.
vi
Table of Contents
Acknowledgments ......... 0... cece cece eee e ene e een e ene ene ili
Abstract ..... Lecce ee ee eee tere nant eeenes cece eee ee tenet tenet eeeene iv
Introduction ....... 0... ene ence tee eee eens eee e beeen eens 1
References ......... cece cece cee cece e nnn nee eee een enes 11
Chapter 1. Generating Trial Functions for Generalized Valence-Bond
Wavefunctions ........ 0... cece cence eee eee teen nee nee eees 12
I. Introduction ............ eee e cece eee eee Se 14
Il. The GVB-PP Wavefunction ......... 0... cece cece cee nee aes 15
III. Initial Guesses for GVB Wavefunctions ............. 0... c cece cece ees 17
IV. Results and Discussion ............ 0... e cece eee eee eee eet ene ees 26
V. Conclusions .......... 0.0 c cece cece eee ete e nent e nent eens 33
References 2.0... 0. cece ccc ene eee ence tent eee nnn tenes 35
Chapter 2. Pseudospectral Generalized Valence-Bond Calculations: Application
to Methylene, Ethylene and Silylene ......................000085 43
Chapter 3. Local Density Approximation Electronic Structure Calculations
using Gaussian Type Orbitals ............. 0. ccc cece cece eee 54
I. Introduction 2.0... 0... ccc cece eee eee cnet een tenn ee tn eee 56
II. Density Functional Theory ........... 0... cece cece eee eee eee 59
III. Dual-Space Approach ........... ccc cece cee ene eee e eee e ene ees 68
IV. Grid Generation .......... 0. cece cece een eben eee e eet eenas 77
V. Self-Consistent Procedure ............ 0. cece cece eee teen ene e nen ens 81
VI. Results and Discussion ........ 0. ccc cece ccc eee e eect eee e eee en eeeee 86
vii
VIL. Conclusions .......... 00. cece ec eee cee een ee eee eens eee eeeeeeneeennenee
Appendices ....... 0. cece cece cee cee n een tenn ee nn teen eens
References ........ 0c ccc cee cece cence ee cee eee eee bee ence eee eeeeneeees
Appendix 1. Electronic Structure and Valence-Bond Band Structure
of Cuprate Superconducting Materials ....................005.
Appendix 2. Superconducting Properties of Copper Oxide
High-Temperature Superconductors ...............ec seen eens
Introduction
The relentless increase in computer power is making ab initio computer sim-
ulations a more viable tool for predicting the properties of new materials. Along
with these hardware advances, dramatic improvements in theoretical methods and
associated computer software are essential for prediction of accurate and interesting
properties. This thesis describes some progress in developing these new methods
for calculating electronic wavefunctions of large molecules and crystals.
In molecular systems, the Hartree-Fock method (HF), based on a simple single
Slater determinant of spin orbitals, is the most common approach for doing ab initio
calculations. In HF theory the Euler-Lagrange equation for optimum orbitals leads
to a set of equations :
HEY, = €a Va (1)
for the optimum orbitals {~o,a =1,...Nors = Netec/2}. Each Fock operator:
Noro
H¥F =h+S° (2Ja- Ka) (2)
a=}
involves the field due to the electrons in occupied orbitals where J, and Ka are
referred to as the Coulomb and exchange operators. The HF approximation is use-
ful for many purposes, however, it leaves out the correlation energy so important
to chemical reactions, excited states, transition metals, etc. Ab initio approaches
for estimating this correlation energy involve including additional configurations to
the HF determinental wavefunction (each configuration involves various number of
electrons excited from the occupied orbitals of the HF wavefunction to various un-
occupied orbitals). For large systems, cost consideration demands that the number
of such configurations be kept small. Thus it is important to choose a method which
gets the best description of the system with the fewest configurations.
The simplest generalization to the HF method achieving such a compact de-
scription is the Generalized Valence Bond method!~3 (GVB), where each pair of
electrons of interest is described by a pair of orbitals instead of a single orbital as in
HF. In most cases each pair of GVB orbitals localizes near a bond or an atom, and
each GVB pair contributes to two configurations which describe the local correlation
in that bond or atom.
The orbitals, %.,%4, of a GVB pair resemble hybridized valence bond orbitals
and provide useful interpretation for complicated molecules (e.g., transition metals).
However for computational purposes it is convenient to combine the GVB orbitals
(which overlap) into orthogonal orbitals called GVB natural orbitals (NO) :
pa t+ vo and wp _ ba —- Vd
/2(1 + §) " ./2(1 —S)
where w, resembles a localized HF bond orbital and ¥, resembles a localized an-
by = (3)
tibonding orbital. The two GVB orbitals lead to two Fock operators and the self-
consistent equations to solve consist of a system of Ngvg +1 coupled eigenvalue
equations where Ngyg is the number of GVB orbitals.
Much success has been obtained with the application of the GVB method to
small molecules. Examples of this can be found in Appendix 1 and 2 where the
GVB method was applied to cluster of atoms describing the important sites of the
high Tc materials. The GVB formalism was used to study the local spin properties
of the Cu atoms and to determine physical parameters relevant to understanding
the origin of the superconductivity.
The extension of the GVB method to larger systems has been slowed by several
difficulties:
(i) the need for an automatic method for generating good trial wavefunctions,
(ii) the need for a fast and reliable iterative approach for achieving self-consistency,
and
(iii) the need for much faster construction of the GVB Fock operators.
We have addressed each of these issues.
The solution to the first problem is discussed in Chapter 1. For small molecules
the initial GVB orbitals have traditionally been constructed by hand using chemical
intuition. This was an acceptable solution because of the simplicity of the wave-
functions and the small number of degrees of freedom involved. For large molecules,
this simple approach becomes cumbersome and may lead to orbitals too different
from the self-consistent GVB orbitals. Chapter 1 describes an automatic method
for generating GVB initial orbitals in large molecules using piecewise localization.
For a pair of GVB orbitals localized on a bond, the first GVB natural orbital (i-e.,
the bonding orbital) is obtained by finding the pair of atomic orbitals that leads
to the largest overlap onto the bond space. The second GVB natural orbital is the
orthogonal anti-bonding combination. For GVB lone pairs, the first GVB orbital
consists of a lobe-like orbital and the second GVB orbital is obtained by applying
a localized gradient operator to the first GVB orbital. This procedure is automatic
and only requires the selection of the bond or lone pair of interest.
The second issue is that for larger systems it becomes critical to have an
efficient iterative approach to converge the GVB wavefunctions. For HF wavefunc-
tions, the method of choice is the Direct Inversion in the Iterative Subspace (DIIS)
method.* This method produces rapid convergence because it makes use of the
gradient information obtained for all the SCF iterations. In the DIIS method an
effective Fock operator is constructed from a linear combination of the previous
Fock operators where the coefficients are obtained by minimizing the corresponding
linear combinations of the gradient vectors. Previously the method had been lim-
ited to wavefunctions with only one GVB pair because of the difficulty in finding
an appropriate effective Fock operator. By incorporating the proper orbital mixing
terms into a multi shell Fock operator, the DIIS method has now been generalized
to wavefunctions with any number of closed shell orbitals, open shell orbitals, and
GVB orbitals.®
The third issue to be resolved is the efficient construction of the Fock opera-
tors. The molecular orbitals or band orbitals, {y.}, are described in term of basis
functions, {@i}:
Wo = S> Cia Pi (4)
where the expansion coefficients, {c;,q}, determine the optimum orbitals. In molec-
ular systems, the most common basis functions are atomic Gaussian-type orbitals
(GTO). The use of a basis set converts the Fock operator into Fock matriz, H;;.
In standard HF theory, the Fock matrix is evaluated in term of one-electron and
two-electron integrals as
Hy; = hij + Qi; — Kay (5)
where h;; is a one-electron matrix element (kinetic energy plus nuclear attraction).
The Coulomb J;; and exchange K;; matrices are given by :
kl
Ki; = S| < a||kj > Dri (7)
ki
where Dy: = 9), CkaCia is the density matrix and the two-electron integrals
< ij||kl > are defined as :
roel
In this formulation of HF theory, the overall computational effort scales as N* where
N is the number of GTOs used to represent the orbitals.
In contrast to ab initio methods, the application of density functional theory
(DFT) to molecular systems® leads to a scaling of N* when using GTOs . This is
achieved by fitting the density p(r) and exchange-correlation potential V,,(r) with
an auxiliary set of GTOs. This approach cannot be applied to HF theory because
of the the non-local nature of the exchange operator, K.
A solution to this N* problem for ab initio theory is the pseudospectral (PS)
method.’~* With PS both numerical grids and Gaussian basis functions are used
to describe the orbitals.2 The Coulomb and exchange matrices are given by :
Jg= 0 Qsles) O(a) Q, 2Dx1 Axi(ry)) (9)
Ki= >) Aslr9) Qe Ain(ra) 2Di1 $i(79))) (10)
where the sum > g 18 over the physical space gridpoints and the one-electron field
integrals are given by:
eee
The matrix Q;(r,) is used to transform a function represented on the grid onto a
function represented by the basis set. In practice, the matrix Q;(r,) is a least squares
fitting operator, which is designed to fit any product of the form A,i(r,) ¢;(7,).
With PS the construction of the Fock matrices (5) scales as N* rather than N‘*.
The number of grid points scales as the number of atoms and hence as N. This leads
to N° work in evaluating Coulomb and exchange matrix elements in equations (9)
and (10). Since the various integrals decrease with distance, the use of an integral
cutoff reduces the scaling to N?. The application of the PS method to the GVB
formalism is similar to its application to HF and the details are given in Chapter 2.
In Chapter 2 the least squares fitting operators Q;(r,) are constructed with
the use of a fitting basis X,(rq) :
Qi(ra) =D) Sin [Spe™* Xa(ra) w(r9) (12)
Pg
where w(r,) is the weight of the gridpoint, 5;, is the overlap integral between basis
function ¢; and fitting function Xp, and S}?™ is the numerical overlap integral
between two fitting functions :
Soa = > Xp(tq) Xq(7g) w(7). (13)
When the basis function ¢; is short range (SR) it is advantageous to restrict
the fitting to the region of space where the basis function is non-zero. A very
complete basis set localized in the same region is used as the fitting basis set. The
fit is constrained by multiplying the weight by an envelope function centered on the
same atom as basis function ¢,.
When the basis function is long range (LR), then the fit needs to be done
over the whole molecule. Hence the fitting basis set includes functions on the whole
molecule as well. For small molecules this is not a problem but for large molecules
this becomes prohibitive because of the matrix inverse operation required in forming
Qi(r,) which scales as the number of fitting functions to the third power, M®. Since
M >N this step can easily dominate the overall calculation.
The solution to this problem results from the realization that the fitting op-
erator is always more precise for SR functions because the fit can be of higher
quality. This allows the Coulomb and exchange matrices, equations (9) and (10),
to be rewritten in such a way that the basis function index of Q(r,) is a SR func-
tion. This can always be done except when all four basis functions in a two-electron
integral are LR. In that case the fitting operator Q(r,) needs only to include LR
fitting functions since all the SR components have been removed from the quantity
to be fit. This makes the cost of constructing this LR fitting operator practical
because the number of LR fitting functions is of the same order as the number
of basis functions. This length scale algorithm has been implemented successfully.
Coupled with the use of diatomic corrections introduced in Chapter 2, this leads to
an efficient method for doing ab initio calculations with HF or GVB wavefunctions
on large molecules.®
Unfortunately the added complexity required by these improvements have
precluded the application of the PS method to crystalline systems. Because of the
length scale algorithm and the diatomic corrections, the exchange and Coulomb
fields are not constructed completely and this greatly complicates the handling of
the lattice sums in crystalline systems.
For the Coulomb field the lattice sums do not converge in real space because
of the long range 1/r tail. This is usually dealt with by screening the Coulomb field.
However one cannot do this in the present framework of the PS method because
the Coulomb field cannot be separated into short range and long range pieces.
For the exchange field, the lattice sums are convergent but the exchange field
becomes more problematic because it decays slowly. Because of its non-local nature,
the exchange field cannot be screened and hence it decays more slowly than a
screened Coulomb field where the long range 1/r component has been removed.
At this point it is useful to reexamine the features required in order to apply
the GVB formalism to crystalline systems.
(i) use of localized basis functions (GTOs) to describe the GVB orbitals,
(ii) accurate evaluation of the Fock matrix elements without using density or field
fitting in order to handle the non-local HF exchange operator, and,
(iii) effective screening of the Coulomb field.
The simplest method that encompasses the first two requirements is a scheme where
the fields are evaluated on a grid and the matrix elements of the Fock matrix are
evaluated by numerical integration over the grid points. Historically, numerical
integrations in ab initio calculations have been avoided because of the difficulty in
reaching sufficient accuracies. It is only recently!°~1? that good accuracy has been
achieved for reasonable grid sizes in molecular and crystalline systems.
The solutions proposed for crystalline systems!!~1* have been based on the
division of space into regions of different shapes over which independent integrations
are performed. Three different regions are used: spherical regions around atomic
nuclei, regular polyhedra in the interatomic region and truncated polyhedra to
make the transition between the other two regions. The location and weight of
the gridpoints are relatively easy to select in the case of the spherical regions and
regular polyhedra but they are very difficult to choose for the truncated polyhedra.
A different approach, which avoids the truncated polyhedra regions, con-
sists in transforming the integration over space into atomic integrations by using
atomic projection functions.'° The method has previously been employed for small
molecules. We have generalized its use to crystalline systems. In crystalline sys-
tems, the larger number of neighbors makes the integration less accurate. By adding
a new functional form to the projection function, the accuracy was significantly im-
proved. The atomic integrations were also modified to better take advantage of the
Gaussian nature of the basis set used. The description of grid generation along with
accuracy results are given in Chapter 3.
In Chapter 3, a solution to the evaluation of the Coulomb field in crystalline
systems is presented. Unlike density fitting approaches which tends to spherically
average out the field around each atom, the current approach evaluates the Coulomb
field by an exact method based on a perfect screening of the field.1* For each pair
of GTOs a smooth screening field is defined which exactly cancels out the long
range part of the field. Like in the Ewald method, the screening field is added and
subtracted and this leads to an efficient dual-space approach. The combined normal
field minus screening field now rapidly converges in real space, whereas the field of
the opposite screening density converges rapidly in reciprocal space. This dual-
space method for evaluating the Coulomb field scales as N° just like the Ewald
method.
The soundness of previous methods for evaluating the Fock matrix elements
and the Coulomb field in crystalline systems were tested on crystalline calcula-
tions of diamond and Cg. At this point the HF exchange field was replaced by
10
the exchange-correlation field of density functional theory in the local density ap-
proximation (LDA) and comparison with other LDA calculations using plane waves
(PW) and GTOs were made.
VII.
10.
11.
12.
13.
11
References
F. W. Bobrowicz and W. A. Goddard III, in Modern Theoretical Chemistry:
Methods of Electronic Structure Theory, edited by H. F. Schaefer III (Plenum,
New Yok, 1977), Vol. 3, p. 79.
E. A. Carter and W. A. Goddard III, J. Phys. Chem. 92, 2109 (1988).
§. K. Shin and W. A. Goddard III, J. Chem. Phys. 93, 4986 (1990).
T.P. Hamilton and P. Pulay, J. Chem. Phys. 84, 5728 (1986).
R.P. Muller, J.-M. Langlois, M.N. Ringnalda, R.A. Friesner and W.A. God-
dard III, J. Chem. Phys., (in press).
J. Andzelm, in Density Functional Methods in Chemistry, J. Labanowski and
J. Andzelm, eds., Springer-Verlag, Berlin, 1992.
. S.A. Orszag, Stud. Appl. Math. 51, 253 (1972); D. Gottlieb and S. Orszag,
Numerical Analysis of Spectral Methods; Theory and Application (SIAM,
Philadelphia, 1977).
R.A. Friesner, J. Chem. Phys. 86, 3522 (1986); R.A. Friesner, Ann. Rev.
Phys. Chem 42, 341 (1991)
B.H. Greeley, T.V. Russo, D. Mainz, R.A. Friesner, J.-M. Langlois, W.A.
Goddard III and M.N. Ringnalda, (in preparation).
A.D. Becke, J. Chem. Phys. 88, 2547 (1988).
M.R. Pederson and K. A. Jackson, Phys. Rev. B 41, 7453 (1990).
G. te Velde and E.J. Baerends, J. Comp. Phys. 99, 84 (1992)
C. Pisani, R. Dovesi and C. Roetti, “Hartree-Fock Ab Initio Treatment of
Crystalline Systems,” Lecture Notes in Chemistry Vol. 48 (1988).
12
Chapter 1
Generating Trial Functions for
Generalized Valence Bond Wavefunctions
13
Abstract
A common difficulty with obtaining wavefunctions for Generalized Valence
Bond (GVB) and other multiconfigurational self-consistent field (MC-SCF) wave-
functions is finding suitable initial guesses or trial functions. We present a general
method (GVB-INIT) for obtaining such initial guesses. This method uses pseudo-
HF molecular orbitals formed from HF atomic orbitals. The occupied HF orbitals
are projected onto atomic basis functions to obtain GVB first natural orbitals and
unoccupied HF orbitals are projected to obtain GVB second natural orbitals. We
find for a variety of systems that these trial functions lead to excellent convergence
and exhibit good overlap with the optimum GVB orbitals. Constructing the GVB-
INIT wavefunction is fast because HF wavefunctions are not calculated and the
localization is piecewise atomic.
In conjunction with the recently developed GVB-DIIS method for converg-
ing GVB wavefunctions and new pseudospectral programs (PS-GVB) for the Fock
matrix elements, GVB-INIT makes calculation of highly correlated GVB wavefunc-
tions now quite practical. The efficacy of GVB-INIT is illustrated by application
to several cases including GVB wavefunctions with up to 26 pairs. It can also be
used for other MC-SCF wavefunctions such as CAS-SCF.
14
I. Introduction
Essentially all methods for calculating the electronic wavefunctions (i.e., solv-
ing the Schrodinger equation) of atoms, molecules and solids use a first step involv-
ing optimization of orbitals (atomic orbitals, molecular orbitals, or band orbitals).
The most common starting point for accurate wavefunctions is the Hartree-Fock
(HF) method in which a product of orbitals and spins is antisymmetrized (using
a Slater determinant) and the orbitals optimized self-consistently (SCF for self-
consistent field). This might then be used as the starting point for obtaining more
accurate wavefunctions [configuration interaction (CI) or perturbation theory] by
including excitations where various numbers of electrons are excited from the occu-
pied orbitals of the HF wavefunction to unoccupied or virtual orbitals.
An alternative approach is the Generalized Valence Bond method!~* (GVB)
in which each pair of electrons is described with two orbitals (rather than one,
as in HF), the wavefunction is generalized from a Slater determinant to a form
that ensures that both the Pauli principle and spin symmetry are satisfied, and the
orbitals are optimized self-consistently. In general, this GVB approach leads to twice
the number of orbitals to be optimized as HF. Generally each pair of GVB orbitals
localize uniquely near a bond or an atom and can be viewed as involving one orbital
(the first NO for natural orbital) very similar to a localized occupied HF orbital
and a second orbital (the second NO) that corresponds to a localized unoccupied
(virtual) HF orbital. As with HF, many GVB calculations are followed by some
kind of CI. However, since the dominant correlating orbitals have been solved self-
consistently, the GVB wavefunction often allows!’ an accurate description of the
wavefunction to be obtained with very compact CI’s.
15
The two-electron interactions lead to nonlinearity in the optimization of wave-
functions. Critical to such SCF methods as HF or GVB is a prediction of good trial
functions followed by an iterative approach that converges quickly and reliably to
the desired state. |
For HF wavefunctions the trial function is generally taken from an approxi-
mate or semi-empirical HF-based method such as extended Hiickel, CNDO/INDO,
MINDO, etc. This calculation is very fast and generally leads to adequate trial
orbitals. However, for GVB there is not yet an analogous semi-empirical method.
Previously GVB calculations have involved constructing localized HF orbitals (both
occupied and unoccupied). However, these methods tend to be unreliable and not
automatic.
In addition, the HF and localization calculations can be almost as expensive
as the GVB calculations, making the total process unacceptably costly.
We present here an approach, GVB-INIT, that solves the problem of making
reliable initial guesses, providing excellent trial functions for general GVB wavefunc-
tions. Combined with the GVB-DIIS method for converging these wavefunctions,*
this leads to an automatic procedure for calculating GVB wavefunctions.
II. The GVB-PP Wavefunction
The GVB?® wavefunction generally involves a product of orbitals with a general
spin function x that is an eigenfunction of the total spin operator [$?x = $(S+1)x]
@OVB — Ag; .-- dyx] (2.1)
where both the orbitals and spin functions are optimized. More commonly, the
Perfect Pairing (PP) restriction® is made in which each pair of electrons is written
16
as a valence bond (VB) pair
(dado + do$a) (a8 — Ba). (2.2)
The GVB-PP formalism allows any number of electrons (2ncore) to be described
with doubly-occupied orbitals (®.ore) and any number of electrons, (Nopen) to be
high-spin coupled (S = nopen/2), leading to the composite wavefunction
w=A [W core VopenY pair] , (2.3)
where A is the antisymmetrizer,
Wore = I] (d5a)($78) (2.4)
t=1
W open = Il (pia) (2.5)
w= 1
Wyair = Il (vaio: + Privai) (a8 ™ Ba) ? (2.6)
t=1
where {#°}, {$2} are orthogonal spatial orbitals and where a and # are spin func-
tions. Here 4; and %; denote the GVB orbitals in pair 1. Within a pair, the
GVB orbitals overlap, S; =< tails; >. However, in the PP restriction we take
the orbitals of pair i as orthogonal to all other pairs (and to {df}, and {¢3}). For
computational purposes it is convenient to rewrite the GVB wavefunction (2.6) as
Tepair
Wpair = Il (Coibgidgi + Cuibuidbui) (a8 — Ba), (2.7a)
w=1
where
(bai + Pri) (dai — Poi)
Pai = /2(1 + Si) and ti = 2(1 — S;) (2.76)
17
are orthogonal GVB natural orbitals. Here the GVB Configuration Interaction (CI)
coefficients satisfy
Cui 1— S$;
Go = ite (2.8)
and
C2,4+ C2, =1. (2.9)
IIT. Initial Guesses for GVB Wavefunctions, GVB-INIT
A. Introduction
The generation of the trial wavefunction is composed of two steps.
(2) In the first step a HF trial wavefunction is generated based on atomic SCF
orbitals.
(12) In the second step, the GVB trial orbitals are obtained by piecewise localiza-
tion of the HF trial wavefunction.
This is our standard GVB initial guess. Alternatively, one could use (7) as the
starting point for a molecular HF-SCF calculation, which could in turn be used
in place of the HF trial function for step two. However, use of the HF optimum
orbitals in (2) generally saves only about one or two iterations in the GVB-SCF
calculation, and sometimes gives worse results for highly polar or stretched bonds.
Hence the extra expense of the HF SCF calculation is not justified and is not a part
of GVB-INIT.
We find that the HF-SCF orbitals of the individual atoms can be used to form
an HF trial wavefunction for a molecule. This is plausible since the molecular density
is not dramatically different from the sum of the atomic densities. The problem is
18
how to go from a set of atomic SCF orbitals (SCF-AO) to a set of molecular orbitals
(MO) without actually solving the HF equations for the molecular orbitals.
(a):
(8):
The procedure we use is:
Atomic HF orbitals {84}: For each atomic basis set {x7}, where A is the
atom label, we define orthonormal HF atomic orbitals {84} obtained from
occupation-averaged atomic Hamiltonians (so that orbitals of the same sym-
metry are equivalent). Thus an oxygen atom would have 4/3 electrons in each
p orbital. This atomic calculation is done once and stored with the basis set.
The atomic occupation number {f4} is associated with each HF-AO. The or-
bitals are then sorted into core orbitals, {65°74}, valence orbitals, {82%},
and unoccupied orbitals, {6%"°°4}. The core orbitals are (1s) for first row
elements and (1s,2s,2p) for second row elements. The valence orbitals are (1s)
for hydrogen atoms, (2s,2p) for the first row elements and (3s,3p) for second
row elements.
VB Assignment: We next partition the electrons of the system in terms of bond
pairs, lone pairs, core orbitals just as in a simple valence bond structure. This
can be done in several ways: (7) automatically using the autotyping features
of commercial molecular dynamics software, (e.g., BIOGRAF/POLYGRAF’),
(iz) by drawing a molecule (with bonds) on a screen, or (ii) by inputting a
list of atoms to be involved in various bonds, lone pairs, etc. GVB-PP is
oriented toward molecules with one dominant valence bond structure, and we
consider only this case herein. GVB does not require that every bond pair be
correlated. In this case one designates which bonds are to be correlated and
second natural orbitals (step 64 below) are formed only for these.
(7):
(71):
(72):
(73):
(74):
19
Build occupied HF-MO’s {67 "}:
Core HF MO’s: each core HF-AO {0¢°°*} is considered as a doubly-occupied
HF MO {@9°""}
Valence HF MO’s: The remaining valence HF-AO’s {6%*'} are used to con-
struct a subset of HF-MO’s {¢22'} as follows: We construct the overlap matrix
over all valence HF-AO’s {62%},
Sig = (68*"|09", (3.1)
this matrix is diagonalized and the eigenvectors with the largest eigenvalues
are selected. The number to be selected, noc, is obtained by adding the atomic
occupation numbers for the various atoms and correcting by the total charge
of the molecule. The resulting {¢%°'}, which are not orthogonal, include the
combinations of HF-AO having the largest overlap, a good first approximation
to the best bonding orbitals. We refer to this as the HF-INIT space. It is
an approximation to the HF wavefunction of the molecule (which we do not
calculate).
Orthonormalize the HF-MOs {¢,}. Since the core orbitals were not included
in the diagonalization of the overlap matrix in y2 , we now order the {¢,}
with the core orbitals first (ordered by atomic orbital energy) and then the
valence orbitals (ordered by eigenvalue from diagonalizing the valence AO
overlap matrix). This set of occupied orbitals is then Schmidt orthogonalized.
Create a new set of orthogonal orbitals {74} on each atom. Here we want the
atomic orbitals ordered according to their projection onto the initial HF-MO
space (the occupied HF orbitals). To do this we construct the atomic density
20
matrix over AO
PSA = (xf lenrlx?), (3.2)
where py F is the electron density of the occupied HF-MO’s
TNoce
pur => a#*)(gHF |. (3.3)
n=1
The associated eigenvalue problem to be solved is:
P44 ¢ = S44 ¢ 744 (3.4a)
where
Sty = (xa lxe) (3.46)
is the overlap matrix over atomic basis functions. The HF-projected orthog-
onal orbitals {74} are associated with the eigenvectors {cn}. Near unity
eigenvalues indicate large projections onto the HF occupied orbital space.
At this point the orbitals in {n4} are sorted by projections into core orbitals,
{noere:4} | valence orbitals, {7224}, and unoccupied orbitals, {n%"°°o4}. The
valence atomic orbitals will define the space used to create the GVB orbitals
for both bond pairs (section III.B) and lone pairs (section II.C). For the 2nd
GVB natural orbitals, the unoccupied atomic orbitals will be used (section
III.C).
B. Bond Pairs
GVB-INIT is based on piecewise localization approach to obtain trial GVB
orbitals. Each pair of GVB orbitals is generated independently. They are later
combined by performing a symmetric orthogonalization.
21
The most common case is a GVB bond-pair localized between two nearest-
neighbor.atoms. In this case there are two GVB orbitals [%.; and ys; of (2.6)], one
localized on each atom, which overlap. We then combine these to obtain the GVB
natural orbitals [¢,; and ¢u; of (2.7)], which are orthogonal.
The task here is to choose the localized GVB natural orbitals as suitable linear
combinations of HF-MO’s. The most important characteristic of the GVB orbitals
forming a bond pair, (wa; and ¥;), is the large overlap between them. The GVB
orbitals can thus be found by searching for the linear combination of the valence
atomic projected HF orbitals {n2%4} (which will form %;) and {2%} (which
will form ;) so that wa; and y; have maximum probability of sharing electrons.!®
The procedure is as follows:
(6): Consider one by one each bond and its pair of atoms (A and B). We will obtain
one pair of orbitals {y%;} and {¥»;} for each bond between A and B.
(61): Evaluate the off-diagonal density matrix elements of the occupied HF-MO’s in
the basis of the orthogonal valence atomic orbitals {n?*"“} and {n?*"?}. This
defines the bond space from which we create the overlapping GVB orbitals:1!
val, LB
Pi? = (n; ana lpxF|n;" ). (3.5)
(62): Find the pair of unitary transformations, U4 and U®, that makes each orbital
on atom A overlap with only one orbital on atom B. This biorthogonalization
process!” transforms the density matrix P4® into a diagonal matrix A4?
where the largest diagonal element of A indicates the largest overlap for the
pair of localized orbitals (one each on atom A and B). Since the matrix PAB
(63):
(64):
22
is always non-symmetric, equation (3.6) is solved by singular-value decompo-
sition.
Starting with the largest singular-value of A4®, select GVB-like orbitals (a;
and %5;) as biorthogonal pairs. [Here the bar indicates that the #,; contains
only basis functions on center A.|] For multiple bonded cases, select the appro-
priate number of pairs. Generally the highest A“? corresponds to a o-bond,
the next two largest eigenvalues correspond to a-bonds, the next two to 6
bonds etc.
The two overlapping GVB orbitals of each pair (a; and y;) from 63 are
combined to form the GVB natural orbitals [,; and ¢,; of (2-7)]. The actual
GVB orbitals (wai and »;) will be linear combinations of the atomic Wai
and 5; (Wai = Pai + Aa%oi and voi = Poi t+ Asai where A; ~ 0.1) and
thus the GVB-NO’s, (¢,; and ¢,;), will be combinations of bai and yy; as
in (2.7) but with coefficients that are not + 1. The lst GVB-NO is written
as bg = CaWai + cov pi where the coefficients c,,cy are chosen to produce the
largest projection onto the occupied HF space. The 2nd GVB-NO, ¢ui, is
just the orthogonal linear combination. This is done by solving the 2 x 2
corresponding eigenvalue problem:
PEVB ¢ = GOVE, \GVB (3.7a)
where
PS? = (boi lpuF| dpi) (3.76)
and
Sop” = (Pailba;)- (3.7c)
23
Generally the projection of the 1st GVB-NO is very large (AG&Y? > 0.95) and
we replace the GVB-NO by its projection to the occupied HF-MO space. On
the other hand the projection of the 2nd GVB-NO is very small. Thus the
1st GVB-NO can be viewed as a linear combination of the initial occupied
HF-MO’s whereas the 2nd GVB-NO corresponds to some linear combination
of the initially unoccupied MO’s.
C. Lone Pairs
For the GVB wavefunction of a molecule well described in terms of a single
valence bond structure, the above procedure leads to the starting guess for a full
GVB wavefunction. In this case lone pair orbitals (and core orbitals) of the wave-
function remain doubly occupied (not correlated). However, for many purposes it
is also desirable to correlate the lone pairs. In this case, the 1st GVB-NO consists
of a localized lobe-like orbital (say, an sp*-like hybrid corresponding to an orbital
centered off the atom). The 2nd GVB-NO has the same overall direction and lo-
cation, but with a nodal plane passing through the middle of the first NO, leading
to an off-center p-like orbital. With GVB-INIT we choose the lobe 1st GVB-NO
from a linear combination of the atomic valence orbitals {72°} not previously used
in making GVB bond-pair orbitals. The lobe 2nd GVB-NO is chosen as the lin-
ear combination of unoccupied atomic orbitals {7%"°°°} having the largest gradient
matrix element with the lobe 1st GVB-NO. The overall method is summarized as
follows:
(e1): Evaluate the overlap matrix between the valence orthogonal atomic orbitals
{rary in the subspace of the GVB bond-pair orbitals involved with this
(€2):
24
atom:
Rpair
So Wan) Won
n=1
PAA — 6; — (nf n}) (3.8)
where {ba} are the bond-pair 1st GVB-NO’s involved with atom A. The
matrix P44 is diagonalized and the appropriate number of lobe Ist GVB-
NO’s, {plore}, are chosen as the eigenvectors with the largest eigenvalue. If
there is more than one (e.g., O, F), they are Boys localized to obtain sp*-like
orbitals.
The lobe 2nd GVB-NO’s, {w!2>*}, are formed by linear combinations of un-
unocc,A
occupied orthogonal atomic orbitals {n;
} } having the maximum overlap
with the gradient of the 1st GVB-NO’s:
Riobe -V
( lobe
|r _ Riobe|®
ut
plate (3.9)
gt
where the Riobe are the centroids of the lobe 1st GVB-NO’s:
Riobe = (piaee |r| pierre). (3.10)
Empirically, the 1/|r — Riose|* term was found useful in restricting the effect
of the gradient to a small region of space near the centroid location. For
computational simplicity the 1/r* term was approximated by an expansion
over eight Gaussian s functions.
D. Symmetric Orthogonalization
So far we have described how to obtain each pair of GVB natural orbitals in-
dependently. We must now merge these GVB orbitals with the existing HF orbitals.
The procedure is as follows.
25
(A1): Symmetrically orthogonalize all the 1st GVB-NO’s {#,;}; then symmetrically
orthogonalize all the 2nd GVB-NO’s {wui}.
(A2): Form the remaining occupied orbitals by projecting out the bond and lone
pair lst GVB-NO’s {¢317} from the initial set of HF-MO’s, {¢"}. This is
done by selecting the near unity eigenvalues of the following matrix:
pair
n=1
PHY = 6:5 —(oF* oF”). (3.11)
(A3): Combine all occupied GVB orbitals. The sequence is:
i, all doubly occupied orbitals (core, uncorrelated lobe and bonding or-
bitals),
zi. all the lst GVB-NO’s {eve ;
wt. any singly occupied (high-spin) molecular orbitals,
iv. all 2nd GVB-NO’s {po}.
(A4): Schmidt orthogonalize the 2nd GVB-NO’s to all doubly occupied and redo
the symmetric orthogonalization of the 2nd GVB-NO’s to remove any overlap
between the 2nd GVB-NO’s introduced by the Schmidt orthogonalization.
In steps Al and A4 it is important to use symmetrical orthogonalization in
order to preserve any symmetry between the GVB orbitals. The lst GVB-NO’s are
symmetrically orthogonalized independently of the 2nd GVB-NO since the former
have much larger occupation.
In summary, the notation used in the above steps is as follows:
i, The atomic basis functions are {x,}.
i. The atomic HF-SCF orbitals are {64} where f4 are the occupation numbers.
Step a.
26
447. The molecular orbitals of the HF-INIT wavefunction are {¢,}. This contains
only the occupied MO’s for the HF wavefunctions. Steps 1, y2, 73.
iv. Atomic projections of the HF-MO’s {74}, sorted into core, valence and unoc-
cupied orbitals according to occupations. Step 4.
v. Bond Pairs (if correlated); Ist and 2nd GVB-NO {¥~8V"} and {pGVvB
formed from valence orbitals {74} and {72} with maximum overlap. Steps
61, 62, 63, 64.
vi. Lone Pairs (if correlated); 1st GVB-NO {yp'2?A) selected from valence or-
gi
bitals {n2*44} after eliminating {¥2" 7} on center A; 2nd GVB-NO {piorerAy
obtained from unoccupied orbitals {%”°°°4} by maximizing overlap with gra-
dient operator applied to (wry. Steps el and 2.
vii. Symmetrically orthogonalize with highest occupations first (steps 1, 2, 3, 4).
Computationally, the method described above is very efficient. The most
time consuming steps are the construction of the projection operator over the HF
orbitals, the construction and diagonalization of the matrix P¥* of (3.11), and
the final Schmidt orthogonalization. All three steps scale as M*. Note that this
method does not require a global localization of all the orbitals of the molecule’®
(an iterative process) but only piecewise noniterative localization based on simple
chemical bonding ideas.
IV. Results and Discussion
The GVB-INIT method as described above has been used as the starting point
for a number of studies of various wavefunctions. In particular, a recent paper* on
GVB-DIIS utilizes it for a number of systems, including glycine correlated with
up to 10 GVB pairs. In this paper we have concentrated on making comparisons
27
between the GVB-INIT wavefunctions and the converged wavefunctions in order to
assess the soundness of the approach proposed.
A. Hartree-Fock SCF Calculations
In Table 1 we compare the efficacy of the HF initial guess method defined in
section III.A (called HF-INIT hereafter) with that of the standard initial guess from
GAUSSIAN90 (G90-INIT). All calculations used the 6-31G** basis sets with the
geometries specified in Table 1. As a measure of the efficacy, we evaluate the overlap
population between the initial guess wavefunction and the SCF wavefunction. The
overlap population, Nguess, is defined as follows:
Noce
Nouess = S 5 (genet |giP SCF? (4.1)
ty
where the {#9"°**} are occupied orbitals of the initial guess wavefunction and the
{p7¥-SCF) are occupied orbitals of the final HF-SCF wavefunction. Thus 0 <
Nguess S Noce-
The test cases shown in Table 1 are for systems up to intermediate size such
as porphine (with 81 occupied orbitals and 430 basis functions). Both methods
do quite well with overlaps between the initial guess orbitals and the final SCF
orbitals that are always greater than 98%. The average error per orbital is 1.2% for
GAUSSIANQ0, which is slightly better than the value of 1.7% for HF-INIT.
Given the simplicity of the HF-INIT method it is encouraging that the accu-
racy is almost as good as the semi-empirical method used in GAUSSIANQQ, partic-
ularly since HF-INIT involves no adjustable parameters.
The success of HF-INIT validates the basic hypotheses:
2. the molecular density is not significantly different from the sum of the atomic
28
densities in covalent systems, and
ut. the HF-INIT orbitals can be selected out of the space spanned by the valence
atomic orbitals by choosing the linear combination of valence atomic orbitals
with maximal overlap. .
It should be noted that the HF-INIT orbitals are different from the orbitals
obtained by diagonalizing the sum of atomic densities. In that case the sum of
atomic densities is given by
Natoms TMocc(A)
Paverage = >, >, |On) far (82 (4.2)
A=1 9 n=l
where Mocc(A) is the number of occupied HF-AO on atom A. Here the associated
initial guess orbitals are the eigenvectors of the corresponding eigenvalue problem:
Pe=Scy (4.3a)
with
Puy = (Xp |Paverage| Xv), (4.35)
and
where the {x,} are the molecular basis functions. The occupied orbitals are chosen
according to their eigenvalue which correspond to projection onto the space formed
by Paverage- Those density selected orbitals are different from the the HF-SCF or-
bitals by the same amount as the HF-INIT orbitals, i.e., an average orbital error of
1.7% when compared with the converged HF-SCF orbitals, but the average differ-
ence per orbital between the HF-INIT wavefunction and the wavefunction coming
from the density selected orbtials is 1.3 %.
29
B. GVB-SCF Calculations
The GVB-INIT approach has been applied to a variety of molecules and wave-
functions, as summarized in Table 2. We use two criteria to assess the soundness
of the approach.
1. Orbital Space:
a) how well does GVB-INIT separate the space of core orbitals from the
space of lst GVB-NO’s,
b) how well does it separate the space of 2nd GVB-NO’s from the space of
unoccupied orbitals.
As a measure we compare the initial guess orbitals to the final SCF orbitals.
In this orbital space test we consider only the space of lst GVB-NO’s (or 2nd
GVB-NO’s) and do not distinguish whether or not the 1st GVB-NO’s (or the
2nd GVB-NO’s) are mixed together by some unitary transformation.
2. Orbital Pairs
a) how well each 1st GVB-NO and each 2nd GVB-NO from GVB-INIT
compares with the self-consistent GVB orbitals.
The orbital space efficacy (separating core orbitals from the Ist GVB-NO’s
and the 2nd GVB-NO’s from the unoccupied orbitals) is given in Table 3 for all
the systems studied. Here we consider both the GVB-INIT approach and also for
a Boys localization method. The efficacy for each set of orbitals (core orbitals , 1st
GVB-NO’s, 2nd GVB-NO’s) was defined as Nguess/Nords Where Norbs is the number
of orbitals in that set and nguess is defined in equation (4.1).
In order to have as close as possible a comparison between GVB-INIT and the
Boys method, the Boys orbitals were obtained as follows:
30
1. a Boys localization was performed on the HF-INIT guess orbitals with the
unoccupied orbitals localized independently from the occupied orbitals and
the inner core orbitals (i.e., 1s orbital for C,N,O atoms and 1s,2s,2p for S
atoms) kept frozen from HF-INIT,
2. the separation of the core orbitals from the lst GVB-NO’s and the 2nd GVB-
NO’s from the unoccupied orbitals was done by selecting the Boys orbitals
having the maximum overlap with the SCF orbitals.
For Boys localization one needs some form of analysis of the location and
composition of each Boys orbitals in order to properly assign which are core or-
bitals and which are lone pairs or 1st GVB-NO’s. In contrast GVB-INIT does this
assignment automatically. For these tests we did this analysis for Boys by hand.
For all systems studied here the assignment for the Boys localized orbitals is simple
since the electron distributions are consistent with simple valence bond structures.
Assigning the 2nd GVB-NO’s with a Boys localized orbital becomes very difficult
since the 2nd GVB-NO’s must be localized in the same region as the corresponding
1st GVB-NO.
Table 3 shows that the Boys method is slightly better than GVB-INIT for
orbital space separation of the core orbitals from the lst GVB-NO’s. The Boys
efficacies vary from 81% to 98% for the 1st GVB-NO’s, whereas the efficacies vary
from 77% to 98% for GVB-INIT. The advantage of the Boys method is largest
when only some valence orbitals are correlated. If all possible bond and lone pair
orbitals are correlated, then the core space is reduced to the inner core orbitals and
the Boys and GVB-INIT methods become equivalent for the core orbitals and the
~ Ist GVB-NO’s. The reason for the slight disadvantage for GVB-INIT is that the
31
1st GVB-NO’s are formed from projections without any mechanism to deliberately
make the orbital localized. Of course GVB-INIT does automatic assignment of the
orbitals whereas we had to do this by hand for Boys.
For the 2nd GVB-NO orbital space, the GVB-INIT method is markedly supe-
rior to the Boys method. Here the Boys efficacies range from 46% to 75%, whereas
the GVB-INIT efficacies range from 82% to 96%. Clearly, this indicates that the
2nd GVB-NO’s should be obtained directly from the 1st GVB-NO’s as in the GVB-
INIT method. In the Boys method, the localization principle does not necessarily
lead to 2nd GVB-NO’s that correspond well with the 1st GVB-NO’s.
More relevant for convergence is the orbital pair criteria, that is how well
each GVB-NO is predicted. In order to assess this, we evaluate the sum of the
populations over the 1st GVB-NO’s or the 2nd GVB-NO’s.
Thpair
NSumPop. = S- (p9"°** |gGVB-SCP 2 (4.4)
where the {¢2"°**} are GVB-NO’s of the initial guess wavefunction and the
{pGV8-SCF) are GVB-NO’s of the final HF-SCF wavefunction. The results are
shown in Table 4.
Clearly, for both the 1st GVB-NO’s and 2nd GVB-NO’s, the GVB-INIT
method gives much better orbital pair results than the Boys method for all cases
except for the lst GVB NO’s of water. One difference with Boys localization is
that double bonds and triple bonds look like ”banana” bonds with two or three
equivalent bonds, whereas the optimum GVB-NO’s generally have the character
of sigma and pi orbitals. A similar situation occurs for the lone pair orbitals on
oxygen and sulphur atoms. GVB-INIT always gives lobe orbitals as in the final
32
GVB wavefunction, whereas Boys localization sometimes leads to o and z lone pair
orbitals.
The fact that Boys localization is better for the orbital space of lst GVB-
NO’s for water indicates that choosing the most localized orbitals within the space
of Ist GVB-NO’s might lead to better trial GVB orbitals. To test whether this is
true we performed a series of calculations where the Boys localization is followed
by simple linear combination of the 1st GVB-NO’s in order to reproduce the GVB-
INIT orbitals as closely as possible. Those results are tabulated in Table 4 under
the GVB-Boys heading. They consistently give the best results.
The most difficult GVB orbitals to generate are the 2nd GVB-NO’s for lone
pairs. For example in the case of glycine with 15 GVB pairs, the average error
per orbital is about 8.1 % for bond pair 1st GVB-NO’s, 13.0 % for lone pair 1st
GVB-NO’s, 11.7 % for bond pair 2nd GVB-NO’s and 48.2 % for lone pair 2nd
GVB-NO’s. Unlike the bond pair 2nd GVB-NO’s, the lone pair 2nd GVB-NO’s
are not constructed directly from the Ist GVB-NO’s. In the bond pair case, the
1st GVB-NO and 2nd GVB-NO are obtained from the set of two localized atomic
orbitals centered on the nearest neighbor atoms. The lst GVB-NO corresponds
to the bonding orbital and the 2nd GVB-NO to the anti-bonding orbital. For
lone pair, we have only one atomic-centered orbital involved from which one can
construct only the lst GVB-NO. The 2nd GVB-NO must be obtained by some extra
criteria. The guiding principle is that the lst GVB-NO is typically s-like and the
2nd GVB-NO is p-like and pointing away from the nucleus in the direction of the
1st GVB-NO centroid. Two simple operators, the dipole moment and the gradient,
can be used to transform an s-like orbital to a p-like orbital. If the s-like orbital is
33
composed of a single s Gaussian function, then the two operators generate the same
p-like orbital. If the s-like function contains many different Gaussian functions, then
the p-like orbital generated by the gradient operator emphasizes tighter Gaussian
functions than the dipole moment operator. The effect of the operators can also be
limited to a small region of space around the 1st GVB-NO centroid by multiplying
the operator by 1/|r — Riose|”. Different values of the 1/r exponent were tried (see
Table 5) and the best results were obtained with n = 3 for the gradient operator.
So far we have compared directly the initial guess wavefunction to the final
SCF wavefunction. Another important criteria is to compare the SCF convergence
obtained with the initial guess wavefunctions. In Table 6 the GVB-INIT guess is
compared to the Boys guess for three different convergence methods (for formalde-
hyde with four GVB pairs). The convergence criteria requires that the sums of
squares of changes in the orbital coefficients be less than:10~°
Tebasis Thocc
S| So (AC)? <107°. (4.5)
In all cases the GVB-INIT initial guess leads to much better convergence than Boys
localization. The fastest convergence was obtained with the GVB-DIIS method.
V. Conclusions
The GVB-INIT method of obtaining initial guesses for GVB wavefunctions is
very inexpensive, using only piecewise localization, and yet provides trial guesses
which are very close to the final orbitals and have the correct localization properties
for GVB wavefunctions.
Together with GVB-DIIS, these two methods make it now possible to use
physically accurate wavefunctions to calculate force-fields, to describe bond distor-
34
tion and dissociation processes, and to obtain highly converged wavefunctions for
the purpose of calculating more exact molecular properties such as charges, dipole
moments and solvation energies.
The GVB-INIT automatic scheme for GVB trial wavefunctions and the GVB-
DIIS convergence scheme have already been implemented in the PS-GVB electronic
structure program. We expect that the pseudospectral approach, when combined
with the methods detailed here, should allow GVB calculations on much larger
systems than have been possible before.
35
References
1. F. W. Bobrowicz and W. A. Goddard III, in Modern Theoretical Chemistry:
Methods of Electronic Structure Theory, edited by H. F. Schaefer III (Plenum,
New Yok, 1977), Vol. 3, p. 79. |
2. L. G. Yaffe and W. A. Goddard III, J. Chem. Phys. 67, 1777 (1977).
3. W. J. Hunt, W. A. Goddard III, and T. H. Dunning, Jr., Chem. Phys. Lett.
6, 147 (1970). See also Chem. Phys. Lett. 4 231 (1969) and Chem. Phys.
Lett 3,606 (1969).
4. R. P. Muller, J-M. Langlois, M. N. Ringnalda, R. A. Friesner, and W. A. God-
dard III, “A Generalized Direct Inversion of the Iterative Subspace Approach
for Converging Generalized Valence Bond Wavefunction (GVB-DIIS),” to be
published.
5. R. C. Ladner and W. A. Goddard III, J. Chem. Phys. 51, 1073 (1969). W.
A. Goddard II, T. H. Dunning, Jr., W. J. Hunt, and P. J. Hay, Accts. Chem.
Res. 6, 368 (1973). W. A. Goddard III and L. B. Harding, Ann. Rev. Phys.
Chem. 29, 363 (1978). W. A. Goddard III, Science 227, 917 (1985).
6. W. J. Hunt, P. J. Hay, and W. A. Goddard III, J. Chem. Phys. 57, 738
(1972). P. J. Hay, W. J. Hunt, and W. A. Goddard ITI, J. Am. Chem. Soc.
94, 8293 (1972).
7. BIOGRAF/POLYGRAF from Molecular Simulation Inc., Burlington, Mass.
8. M. J. Frisch, et al, GAUSSIAN 90, Gaussian, Inc., Pittsburgh, Pennsylvania,
1990.
9. R. Krishnan, J. S. Binkley, R. Seeger, and J. A. Pople, J. Chem. Phys. 72,
650 (1980).
36
10. GAMESS program suite, M. W. Schmidt, K. K. Baldridge, J. A. Boatz, J. H.
Jensen, S. Doseki, M. S. Gordon, K. A. Nguyen, T. L. Windus, S. T. Elbert.
11. R. Polack, Int. J. Quant. Chem. 4, 271 (1970).
12. A. T. Amos and G. G. Hall, Proc. R. Soc. A263, 483 (1961).
13. J. M. Foster and S. F. Boys, Rev. Mod. Phys. 32, 300 (1960).
14 C. Edmiston and K. Ruedenberg, Rev. Mod. Phys. 35, 457 (1963).
15 T. F. Koetzle, M. N. Frey, M.S. Lehmann and W. C. Hamilton, Acta Cryst.
B29, 2571 (1973).
16 B. M. L. Chen and A. Tulinsky, J. Am. Chem. Soc. 94, 4144 (1972).
17. E. A. Carter and W. A. Goddard III, J. Phys. Chem. 92, 2109 (1988). S. K.
Shin and W. A. Goddard III, J. Chem. Phys. 93, 4986 (1990).
18. H. Fujimoto, N. Koga, K. Fukui, J. Am. Chem. Soc., 103, 7452 (1981).
Table 1. Population difference nocc — Nguess [see (4.1)] between initial and con-
verged HF wavefunctions.
37
Molecule No. of _HF-INIT GAUSSIAN90 Geometry
orbitals
water 5 0.149 0.066 STO-3G optimized
formaldehyde 8 0.167 0.084 STO-3G optimized
ethylene 8 0.141 0.077 Reference 9
SioHe * 17 0.297 0.249 STO-3G
glycine 20 0.455 0.224 STO-3G optimized
glutamine 39 0.808 0.454 Reference 15
TTF? 52 0.819 0.707 6-31G** optimized
porphine 81 1.213 0.832 Reference 16
* Staggered geometry.
’ Tetra-thio-fulvelene.
38
Table 2. Description of GVB-PP wavefunctions used for tests.
Molecule No. of Pair distribution | Pair correlated
GVB pairs o «lone
water 2 2 0 0 all bond pairs
water 4 2 0 2 all bond and lone pairs
formaldehyde 2 1 1 0 CO double bond pairs
formaldehyde 4 3 1 0 _all bond pairs
formaldehyde 6 3 1 2 all bond and lone pairs
ethylene 2 1 1 0 CC double bond pairs
ethylene 6 5 1 0 all bond pairs
glycine 5 4 1 0 all NC,CC,CC,CO bond pairs
glycine 10 9 1 0 all bond pairs
glycine 15 9 1 5 all bond and lone pairs
TTF 18 15 3 0 all bond pairs
TTF 26 15 3 8 all bond and lone pairs
porphine 22 ll il 0 all double bond pairs
39
Table 3. Orbital space efficacies.
Molecule GVB Core Orb. lst GVB-NO 2nd GVB-NO
pairs Boys GVB-INIT Boys GVB-INIT BOYS GVB-INIT
water 2 0.9201 0.8632 0.8813 0.7862 0.4607 0.8432
water 4 0.9998 0.9998 0.9629 0.9629 0.4904 0.8935
formaldehyde 2 0.9570 0.9272 0.8992 0.8085 0.7039 0.8978
formaldehyde 4 0.9551 0.9008 0.9379 0.8831 0.6562 0.9084
formaldehyde 6 0.9998 0.9998 0.9737 0.9737 0.5983 0.9180
ethylene 2 0.9732 0.9522 0.9630 0.8980 0.7508 0.9654
ethylene 6 0.9999 0.9999 0.9766 0.9766 0.6004 0.9375
glycine 5 0.9244 0.9118 0.8154 0.7715 0.6622 0.8906
glycine 10 0.8866 0.8827 0.8750 0.8696 0.6014 0.9005
glycine 15 0.9998 0.9998 0.9693 0.9693 0.5549 0.9124
TTF 18 0.9594 0.9485 0.9162 0.8954 0.7077 0.9249
TTF 26 0.9996 0.9996 0.9670 0.9670 0.6399 0.8208
porphine 22 0.9647 0.9435 0.9242 0.8667 0.6946 0.9445
40
Table 4. Orbital pair efficacies. Sum of populations between initial and con-
verged GVB Natural Orbitals.
Molecule GVB 1ts GVB-NO 2nd GVB-NO
pairs Boys GVB-INIT GVB-Boys Boys GVB-INIT
water 2 1.756 1.547 1.756 0.919 1.681
water 4 3.724 3.435 3.724 1.419 3.349
formaldehyde 2 0.899 1.617 1.798 0.704 1.795
formaldehyde 4 2.836 3.530 3.744 1.918 3.633
formaldehyde 6 4.807 5.523 5.761 2.501 5.269
ethylene 2 0.963 1.796 1.926 0.782 1.930
ethylene 6 4.867 5.858 5.851 2.883 5.624
glycine 5 3.168 3.853 4.470 3.252 4,447
glycine 10 7.816 8.647 9.112 5.944 8.975
glycine 15 11.662 13.542 14.053 7.139 13.105
TTF 18 13.501 16.061 16.434 10.447 16.623
TTF 26 17.749 23.891 24.095 13.897 20.921
porphine 22 10.839 19.021 19.626 9.040 20.737
41
Table 5. Lone pair 2nd natural orbital efficacies. Sum of populations between initial
and converged 2nd GVB Natural Orbitals.
Molecule No. of Operator Power of 1/r”
GVB pairs n=0 n=1 n=2 n=3
water 4 Dipole 1.857 1.762 1.627 1.760
water 4 Gradient 1.809 2.007 2.826 3.334
formaldehyde 6 Dipole 3.856 3.840 3.730 3.746
formaldehyde 6 Gradient 3.766 3.882 3.949 4.147
glycine 15 Dipole 9.383 9.292 9.010 9.401
glycine 15 Gradient 9.221 9.630 10.722 11.477
TIF 26 Dipole 17.132 16.609 17.492 16.549
TTF 26 Gradient 16.972 18.244 20.452 20.724
42
Table 6. Number of iterations for converging the GVB wavefunctions of
formaldehyde.
No. of Convergence Method*
Trial Function GVB pairs GVB-DIIS GVB2P5 GAUSSIAN90
GVB-INIT 2 15 28 34
Boys 2 21° 83 95
GVB-INIT 4 15 58 69
Boys 4 21° | >300° >300°
*Convergence criteria of 1079.
’Converged to nearest root.
“Unconverged after 300 iterations.
43
Chapter 2
Pseudospectral Generalized Valence-Bond Calculations:
Application to Methylene, Ethylene and Silylene
44
Pseudospectral generalized valence-bond calculations: Application
to methylene, ethylene, and silylene
Jean-Marc Langlois, Richard P. Muller, Terry R. Coley,
and William A. Goddard III
Contribution No. 8101 from the Arthur Amos Noyes Laboratory of Chemical Physics, California Institute
of Technology, Pasadena, California 91125
Murco N. Ringnalda, Youngdo Won, and Richard A. Friesner
Department of Chemistry, The University of Texas at Austin, Austin, Texas 78712
(Received 2 February 1990; accepted 28 February 1990)
The pseudospectral (PS) method for self-consistent-field calculations is extended for use in
generalized valence-bond calculations and is used to calculate singlet-triplet excitation
energies in methylene, silylene, and ethylene molecules and bond dissociation and twisting
energies in ethylene. We find that the PS calculations lead to an accuracy in total
energies of <0.1 kcal/mol and excitation energies to <0.01 kcal/mol for all systems. With
effective core potentials on Si, we find greatly improved accuracy for PS.
|. INTRODUCTION
Recently, we have shown that the pseudospectral (PS)
numerical method provides an attractive alternative to the
usual all-integral methods of calculating Hartree-Fock
(HF) wave functions for molecules.'* Thus, for a valence
double-zeta plus polarization description of glycine (100
basis functions), PS calculations’ leading to an accuracy in
the total energy of 0.04 kcal/mol now take about 75 CPU
seconds on a Cray X-MP (recalculating integrals at every
self-consistent iteration) vs 200 CPU seconds using GRAD-
SCF, a fast program for the traditional all-integral approach
(which calculates integrals only once). For HF wave func-
tions the PS approach scales as M> (where M is the num-
ber of basis functions), whereas the all-integral approaches
scale as M*, so that PS should become even more advan-
tageous for very large systems. Using integral cutoffs, we
have recently achieved M? scaling for PS for systems as
small as benzene and uracil.® In this paper we extend the
PS approach to generalized valence-bond (GVB) wave
functions (which include electron correlation) and to
second-row compounds (Si) treated either with all elec-
trons or with core effective potentials.
Earlier attempts to use purely numerical methods for
electronic wave functions of polyatomic molecules have
failed to provide uniform accuracy, primarily because the
accurate description of the cusps at atomic centers requires
an impractical number of grid points. The PS method uses
both a basis set and a numerical grid and preserves the
advantages of both approaches. Using a basis set for the
calculation of one-electron and selected two-electron inte-
grals accurately reproduces the description of the wave
function near atomic centers. Use of a grid for the evalu-
ation of the majority of the two-electron terms both (a)
preserves the M° scaling of the numerical calculation with
basis-set size and (b) increases the speed of calculating the
integrals that must be evaluated.
In previous papers the general algorithm has been pro-
posed and applied to atomic,’ diatomic,? and polyatomic
molecules.’ A suitable automatic grid generation scheme
7488 J. Chem. Phys. 92 (12), 15 June 1990
0021-9606/90/127488-10$03.00
has been developed,’ and the method has been applied to
glycine.’ It has been demonstrated that the method prom-
ises significant reduction of computation time (even when
compared with the GRADSCF program that was specifically
designed for Cray computers). In all cases the PS method
has shown accurate reproduction of total and relative en-
ergies to within 0.1 kcal/mol, as compared with all-integral
approaches.
There remain, however, many issues to be addressed.
Only a few basis sets have been used, and it has not yet
been determined whether the agreement with all-integral
method energies will degrade with different basis sets.
Since calculations have only been performed on first-row
elements, the ability to replace the core electrons with an
effective core potential (ECP)—a virtually essential con-
dition to effectively model systems with heavy atoms—has
not yet been established. The calculations have not yet
attempted to solve for excited electronic states, or for the
broken bonds of stretched or twisted molecular conforma-
tions. Finally, no amount of electron correlation has en-
tered into any of the calculations. In each of these areas it
is necessary to test the efficacy of the PS method.
In this paper we examine the above issues for methyl-
ene, ethylene, and silylene molecules. In the methylene and
silylene molecules we investigate (a) the effect of basis sets
where s, p, and d orbitals have different exponents, (b) the
use of diffuse functions, and (c) multiple sets of polariza-
tion functions. In silylene we also examine the use of an
effective potential to replace the core electrons in the wave
function. For the ethylene system we determine how well
the PS method reproduces all-integral results for broken
bonds and twisted geometries. In all of these systems we
have examined both restricted HF wave functions and gen-
eralized valence-bond (GVB) multiple configuration wave
functions for both the singlet and triplet electronic states.
We find excellent agreement with all-integral results for
these more complex systems, suggesting that the PS
method can be applied to a variety of chemically relevant
systems.
45
Langlois ef al.: Pseudospectral generalized valence bond
In Secs. II and III we review the relevant aspects of
GVB wave functions and the PS method and indicate how
the PS approach is used with GVB. In Sec. IV we report
the results on CH}, SiH,, and C,H, while Sec. V discusses
some key points.
il. OVERVIEW OF THE GVB FORMALISM
A. The GVB-PP wave function
The GVB wave function for an N electron system has
the general form”®
VE = bd SM Osu (N)], (2:1)
where the orbitals are singly occupied and allowed to over-
lap, ©{4;"" is a general spin eigenfunction
SOs y=S(S+1)Osu, SOsu,=MsOs.m,» (2.2)
for an N electron system, and « is the antisymmetrizer.
There are fys ways of combining the spin to satisfy Eqs.
(2.2) so that Os M, is written as
O= > ri},
in
where each Q; satisfies Eqs. (2.2). (Thus f= 2 for N= 4
and S=0, and f=5 for N=6 and S=0.) The GVB
orbitals are obtained by simultaneously optimizing the or-
bitals ¢; and spin function © [i.e., the y; for Eq. (2.3)].
A restricted form of GVB that is much more tractable
for computation is the GVB-PP (for perfect-pairing) wave
function, in which the spin function in Eq. (2.3) is re-
stricted to the simple valence-bond form
Ovs=9,= (aB — Ba) (aB — Ba): (aB — Ba)aa-:-a
(2.4)
with (1/2)N—S singlet pairs and 2S electrons coupled
high spin.
In this case the orbital functions in Eq. (2.1) corre-
sponding to the jth singlet pair in Eq. (2.4) have the
valence-bond form
(dy; ba + $2; by)- (2.5)
For computational purposes it is convenient to rewrite Eq.
(2.5) as
(2.3)
Cj be; bgj— Cy; buj buj (2.6)
using the GVB natural orbitals
bj= (Py + by)/ y20 +S),
(2.7)
buj= ($1; — $4)/ y201 — S)),
where S; = (¢,;|¢2)) and C2, + Ci; = 1. Taking natural
orbitals of different pairs to be orthogonal leads to the
general energy expression for GVB-PP wave functions:
occ occ
E=2 > fihgt D (ay Jj + 6K; (2.8)
i ij
where
7489
hi= (bil h| di, (2.9)
Jj= Gil) = fen $1) 7 (461), (2.10)
“a (2)6,(2
I= fer, (2.11)
T12
Kj= (|= fan $1) K(1)6(), (2.12)
Ryay= f Pr FOU (2.13)
7,;= transposition operator
are standard expressions for the one-electron, Coulomb,
and exchange energies. (Here A may include core effective
potentials in addition to kinetic energy and nuclear poten-
tial terms.)
The coefficients f;, aj, and 5, in Eq. (2.8) depend on
the wave function, and the general prescription is given in
Ref. 9. Some special cases are that doubly occupied orbitals
lead to f; = 1, while a GVB natural orbital leads to /;
= C3. For two orbitals of the same GVB pair, a; = fis
bi = 0, a2) = 0, bij2j = Cy Cy For orbitals ij (i=4/)
that are doubly occupied or of different GVB pairs, aj
= 2f,fjand by = —Si fy
With Eq. (2.8), the general condition for the optimum
orbitals is that the first-order change in the energy due to
changes in the various orbitals is zero, leading to
D (5$i\ Fi) ¢;) =0, (2.14)
where F; is the Fock operator
n~ “ ~ ~
F=f, h + & (ay J j+ byK)- (2.15)
Because F; depends on the orbitals, solution of the varia-
tional equation (2.14) is nonlinear, requiring an iterative
procedure to solve. The general conditions for the orbitals
to be.optimal is that the matrix
Ayg= U| (Fy — Fi) |) (2.16)
be zero,
Aj;=0, (2.17)
for all pairs of orbitals: If j is unoccupied, then F; = 0 is
used in Eq. (2.16). The iterative GVB equations are based
on expanding (2.8) through second order in the orbital
changes and lead to the general equation
DY BuiyAv= —Api> (2.18)
where v>j for index vj and where j is occupied and v is
occupied or virtual. The antisymmetric matrix A is the
generator of the orthogonal transformation matrix
T=e (2.19)
J. Chem. Phys., Vol. 92, No. 12, 15 June 1990
46
7490
that transforms an initial set of orthogonal orbitals {4°}
into a final set of orthogonal functions {¢!} which mini-
mizes the energy (2.8) expanded to second order in the
changes. The general formulas for A,; and B,,;,; can be
found in Refs. 9-11.
The diagonal elements of B have the form
where the a, and 8,,; depend on the ay and by. Since only
Coulomb and exchange operators are involved, the quan-
tities required for the diagonal terms can be evaluated from
a list of integrals (an M* procedure) without carrying out
integral transformations (an M procedure). On the other
hand, the off-diagonal terms generally involve more com-
plicated integrals requiring integral transformations. In the
GVB2PS program we ignore any terms that cannot be eval-
uated with Coulomb and exchange operators, and in the
present calculations we used only the diagonal elements of
B. As a result, Eq. (2.18) reduces to
Aui= _ Ani /Baipi .
(2.21)
In addition to A, the configuration-interaction (CI)
coefficients in Eq. (2.6) must be optimized each iteration,
leading to the two-by-two diagonalizations®
2 Yigg Cy= Ex; Cry (2.22)
where I=g or u and
oce
Vigag=Aniag + 3 kik + dS fiig— Kins
taj uy
(2.23)
¥g juj=i Kg ius:
An important feature of the GVB-PP wave function is that
the calculations require only Coulomb and exchange inte-
grals [see Eqs. (2.8), (2.15), (2.20), and (2.23)].
B. Basis-set expansions
The standard approach to solving for the GVB-PP or-
bitals is to expand each orbital ¢, as .
M M
¢= y XuCuir 8b= »y XulSC,i) (2.24)
ue “=
in a finite basis set {y,; = 1---M} and solve for the
coefficients {C,,,} leading to orbitals satisfying Eqs. (2.17)
and (2.22). With a basis set, all quantities needed for op-
timizing the orbitals can be calculated in terms of the one-
electron integrals,
huy=(BAlv), (2.25)
the matrix elements of the Coulomb operators,
J4,,=(u| Flv) = (uri), (2.26)
Langlois ef a/,: Pseudospectral generalized valence bond
and the exchange operators,
K1,,=(u|Kj|v) =(uily)- (2.27)
These operators can be calculated from the fundamen-
tal list of two-electron integrals by a simple two-index
transformation:
Jiy= DL (wom Cony» Kiw= L (uolvn) Ca Cy
A g,
on ” (2.28)
so that it is never necessary to transform the two-electron
set of integrals. This contrasts with more general
multiconfiguration-self-consistent-field (MC-SCF) proce-
dures where a full transformation of the two-electron inte-
grals (a M? procedure) is necessary every iteration.
HL PS GVB
A. Overview
In the usual all-integral GVB calculations, a set of
M*/8 integrals (uv|o7) is evaluated and used to calculate
the matrix elements for various Coulomb and exchange
operators as in Eq. (2.28). In contrast, for PS GVB cal-
culations, the Coulomb and exchange operators are evalu-
ated directly using a numerical grid. This involves evalu-
ating the M’/2 quantities
of 2)Xy(2)
Ag te) = { nT (3.1)
directly for a set of grid points {7,}. With Gaussian-type
basis functions, y, = re—*"Y,,,(@,¢), on various centers,
the potentials 4,,(r,) can be evaluated analytically for
each grid point.
Given the potentials (3.1), the fundamental Coulomb
and exchange matrices (2.26) and (2.27) could be calcu-
lated numerically using
J4,= (ul Flv) = Lxuedxla) 2D Coy Cy Aonl8),
g on
(3.2)
Ki=(u| Rv) = Dxuls) Lxals) DL Coj Cy Aovl(e)-
& n c
However, Eq. (3.2) leads to large numerical errors that
can be overcome only with huge grids. The reason is that
the functions
Ftv and K; v (3.3)
lead to components outside the basis space {y,} even
though ¢, is described within this space. Use of Eq. (3.2) is
equivalent to expanding J /¢; and Kg; defined over the
grid in terms of the basis {y,,}. If the grid were infinite, this
would be acceptable, but with finite grids it leads to large
errors in (3.2), referred to as the alias.* This manifests
itself as noise in the wave function.* The solution to this
difficulty is to expand Eq. (3.3) in terms of a larger deali-
asing basis {y%;0 = 1---M*}:
J. Chem. Phys., Vol. 92, No. 12, 15 June 1990
47
Langlois et a/.: Pseudospectral generalized valence bond 7491
uM
Jigiy(g= XL xt(e)W4,,
(3.4)
a” )
Ki(g)y(g= D xt(g) Ww,
used only for evaluating Eq. (3.2)
Me
Jiy=(w\Jivy= 2 SEW,
ge
(3.5)
Mm
Ki,=(u|Ki|v)= D SEW,
oa
where the Sf, = (v,|x%) is the overlap matrix element be-
tween the normal basis and the augmented dealiasing set
(normal basis functions plus dealiasing functions). This set
of dealiasing functions then filters out the noise due to
incompleteness of (y,,). The coefficients W#, and WX, in
Eq. (3.4) are obtained by a least-squares fit over the grid
points.”
The normal basis (y,,) is augmented to form the deali-
asing basis (yf) by adding new radial wave functions so
that the augmented basis has functions whose exponents
differ by a factor of at least 1.5 and at most 2.5. When two
exponents are any closer than this, linear dependencies
start to appear, making the final energy unreliable. When
two consecutive exponents are spaced by more than this,
then the least-squares fit is less accurate, again making the
final energy less reliable. In addition, the angular functions
of the basis are augmented so that the d basis contains
exponents corresponding to the p functions of the normal
basis, and so there are f functions corresponding to the d
functions of the normal basis.
_ With the Coulomb and exchange operators J dy and
Ki, in Eq. (3.5), we proceed to calculate the wave func-
tions as in normal GVB-PP calculations.
B. The grid
The general grid generation approach is outlined in ~
Ref. 4. For each atom in the molecule, there are angular
and radial grids using Legendre polynomials and hyper-
bolic tangent functions, respectively. These atomic grids -
are then merged to form a molecular grid using Lagrange
interpolation to ensure that the grid produces functions
that dissociate smoothly as atoms are separated to infinity.
We use a multigrid strategy as outlined in Ref. 5. This
Strategy involves generating three different grids denoted
as medium, fine, and ultrafine with about 200, 400, and
1100 points per carbon atom. The medium grid is used for
the vast majority of the SCF iterations. Once the wave
function begins to converge, the ultrafine grid is used for
one iteration, followed by the fine grid for several iterations
(usually three). After this reconsolidation, all subsequent
iterations can be carried out using the medium grid. Fock
matrix updating is used subsequent to the ultrafine grid
iteration, so that the remaining iterations require increas-
ingly less accuracy in the integrals.
The great advantage in using such a multigrid strategy
is that one can obtain wave functions whose energies are
more accurate than those obtained from using the medium
grid alone, while using the medium grid for all but about
four of the iterations. The integrals for the fine and ul-
trafine grids need never be stored, and consequently the
storage requirement for the PS algorithm is proportional to
the medium grid size. More recently, we have switched
entirely to a direct SCF scheme in which no disk space is
required for integral storage. Regeneration of the medium
grid integrals is very inexpensive, so that with an optimized
integral package and efficient cutoffs there is only a ~ 15%
increase in CPU time for recalculation.
C. Monoatomic and diatomic corrections
A major challenge in using numerical methods for the
wave functions of molecules is the accurate description of
the core orbitals. These orbitals vary rapidly near the nu-
cleus, requiring a large number of grid points and dealias-
ing functions for reasonable accuracy. An efficient way to
deal with this problem is to calculate selected two-electron
integrals analytically for these core orbitals. This method
has been applied on two levels, which we denote as mono-
atomic (1C) and diatomic (2C) corrections.
1. 1C corrections
In the 1C approach the two-electron integrals having
all four of the atomic orbitals centered on one atom are
evaluated analytically. Such an approach is computation-
ally inexpensive and is easy to implement. The number of
two-electron integrals required is M7°/8, where 7 is the
number of basis functions per atom and M the total num-
ber of basis functions. For large molecules this scales as M,
with a prefactor of 7°. Since 7 is always small (10-20) and
a large fraction of the 1C two-electron terms are zero by
symmetry, the 1C calculations are always inexpensive.
The procedure is as follows. The Coulomb and ex-
change operators containing all one-center contributions
are calculated as in Eqs. (2.26) and (2.27), but with the
sum restricted to one-center terms only, to obtain
je dle
Ji, and Ki.
mY
The time required to evaluate analytically the necessary 1C
two-electron integrals and to assemble these operators is
negligible.
Next we must correct for these 1C terms in evaluating
J; xX. and K; x2 over the grid. This is done by calculating
the 1C operators on the grid using
JE (g)= LY Aan(E) Ca; Cay» (3.6a)
o,yea
KIM (g)xv(8)= 2 Aer (8) Coy Cy Xq(8), (3.6)
o,nea
where the sums in Eq. (3.6) are over all basis functions of
atoms a. In Eq. (3.6b) the v, is only for basis functions on
a.
These functions Eq. (3.6) are then described with the
antialiasing functions:
J. Chem. Phys., Vol. 92, No. 12, 15 June 1990
48
7492
Mw
JEN (e)yy (g)= L Whe yk (a),
(3.7)
uM
Ki (g)y, (g)= L Wie lyF(g).
Using these quantities the total Coulomb and exchange
operators are calculated as
sil : .
= L SRW, — 80,04 ee) + Ii
Go
Juv
(3.8)
Me
iK jK. 1
Kuy= x Sirol Wey - 5,,0,¥ ov “y+ Ku »
Cc
where 8,,0, ensures that basis functions p: and v be on the
same center a (Ji¢ and K}<, are zero for y: and v on different
centers). :
2. 2C corrections
In the diatomic corrections approach, the two-electron
integrals having all four of the atomic orbitals centered on
one or two atoms are evaluated analytically and used to
calculate
J, and K%, (3.9)
The number of such terms is Mn’, and for large molecules
this scales as M* with a prefactor of 7. The implementa-
tion of the diatomic correction is much more complicated
than for the monatomic correction. New correction oper-
ators need to be defined for every pair of atoms,
Ji2(g)= >
Agn(8) Cay Cny
o,nea,b
(3.10)
K¥(g)y(g)= LL Aov(8) Ca; Cutan (8)
o,nea,b
(where v,, indicates v on either a or b). These are in turn
described with antialiasing functions as in Eq. (3.7),
Mu
JE (xu, (8) = 2 Wain’ x4(8),
(3.11) -
KS (g)xv,(g) = 2 WEA y8(g).
oC
The Coulomb and exchange operators are then defined as
H= Z Sto |W — (TD) wee) e7B
(3.12)
where the sum over all the atoms, £,, needs to be done if
a, = a,. The number of correction operators that need to be
evaluated scales as the square of the number of atoms in
the molecule. The time required to include these correc-
Langlois et a/.: Pseudospectral generalized valence bond
tions is not as trivial as in the monoatomic corrections, but
the diatomic corrections give results that are significantly
more accurate than the monoatomic corrections.
IV. RESULTS
We picked three systems designed to test how effective
the PS method would be for GVB calculations. In each
case the quantity of interest is an energy difference, either
between two excited states or for breaking a bond. For CH,
and SiH, the quantities of interest are the excitation energy
to the excited triplet state. In addition, a core effective
potential was tested for Si where an issue is the accuracy of
PS approaches for such cases. For C,H, the quantities of
interest include the 7-to-7* excitation energy, the rota-
tional barrier for twisting the molecule by 90° about the CC
bond, and the CC bond energy (energy to double the CC
bond length).
A. Methylene
For CH, the ground state has 7B, symmetry with o
and 7 nonbonding orbitals as in Eq. (4.1a), whereas the
singlet has the two nonbonding electrons spin paired as in
Eq. (4.1b):
H-»,,,,) H.,,
Hm — od
(4.18) (4.1b)
Because of the difference in the nonbonding orbitals, the
geometries are quite different. Thus, with GVB
configuration-interaction (GVB-POL-CI), Harding and
Goddard” calculated Roy = 1.084 A, Qyucy = 133.2” for
the triplet state, and Roy = 1.113 A, @ycy = 101.8” for
the singlet. The important quantity here, the triplet-singlet
excitation energy, is Asp = 9.09 kcal/mol = 0.0145
hartrees = 14.5 mhartrees.
In studying methylene, our goals are (a) to test the
accuracy of the PS method for good quality basis sets [va-
lence double-zeta plus polarization (VDZp)] and for basis
sets where additional polarization or diffuse functions are
added, and (b) to test the change in accuracy when GVB
wave functions are used rather than HF wave functions.
All-integral HF calculations lead to excitation energies
ranging from 26.1 to 25.1 kcal/mol for various basis sets,
while GVB-PP leads to 9.1 to 7.8 kcal/mol. As described
below, the PS-1C leads to a maximum error of 0.09 kcal/
mol for both HF and GVB calculations, while PS-2C leads
to a maximum error of 0.01 kcal/mol. We conclude that
the PS method is quite adequate for these systems.
J. Chem. Phys., Vol. 92, No. 12, 15 June 1990
49
Langlois ef a/.: Pseudospectral generalized valence bond 7493
TABLE I. Dealiasing set for methylene. TABLE Il. Total energy for CH, using GVB-PP wave functions.
Atom Basis Lbiock Exponent* Basis sets (carbon/hydrogen)
Cc vDZ, ~~ SP 10.24 20.48 Method State’ = VDZ,/VDZ, - VDZ2,/TZ2, VDZ2,,/TZ2,
SPD 0.32 0.64 1.28 2.56 5.12
. Total energy (hartrees)
H VDZ, SP 4.0 8.0 '
SPD 01775 032 064 10 20 Alleints® 'A, —38.938766 ~38.942557 — 38.943 274
All-ints 3B, — 38.953 285 — 38.955 554 ~~ 38.955 770
c vDZ2, SP 8.0 16.0
VDZ2,, SPD 0.06 0.422 0.971 2.0 4.0 Energy difference (kcal/mol)*
H TzZ2, SP 4.0 8.0 PS-1C* ‘A, 0.015 0.032 0.024
SPD 0.0822 0.2246 0.6 1.38 2.0 PS-2C* 1A, — 0.005 0.005 0.004
PS-1C 3B, — 0.054 0.117 0.112
"Exponents that already exist in the basis set are not included in the PS-2C 5B, < 0.001 0.013 0.013
dealizing set.
1. Geometry and basis sets
For CH,, we used the geometries optimized by Hard-
ing and Goddard’ with GVB-POL-CI calculations (see
below).
Three different basis sets of increasing size were used:
(a) VDZp/DZp: The C uses the Dunning—Huzinaga
valence double-zeta basis set!?!* [based on (95/5p)] with
one set of d functions on C (a = 0.64). The H uses the
Dunning—Huzinaga basis (based on four s functions con-
tracted to two, with the exponents scaled by’ ¢ = 1.2) and
one set of p functions on H (a = 1.0).
(b) VWDZ2p/TZ2p: Here we replaced the 4s H basis
with the 6s (unscaled) basis'* contracted to 3s triple zeta.
The d and p polarization functions from VDZp/DZp were
replaced by a pair of functions scaled by the factors ‘2.3
and 1/12.3 from the above values.
(c) VDZ2pn/TZ2p: To the basis in 6 was added a
diffuse set of s and p functions on the C atom (a,
= 0.45, a, = 0.34) optimized"? for the negative ion of C.
2. Grids and dealiasing basis sets
Table I shows the dealiasing basis sets used for meth-
ylene. As in previous work,° the dealiasing basis sets are
formed of SP and SPD blocks of atomic orbitals with the
same exponents. This makes the assembly of the least-
squares-fitting matrix more efficient. The molecular grid
for CH, is generated using atomic grids similar to the ones
used in studying glycine.° Thus, for methylene the sizes of
the grids are 181, 409, and 1114 grid points. .
3. Discussion
For the triplet state, the two C-H bonds are correlated,
leading to the GVB-PP (2/4) wave function. For the sin-
glet state, the two C-H bonds and the singlet lone pair are
all correlated, leading to the GVB-PP (3/6) wave func-
tion. These wave functions have the same total number of
orbitals, leading to a consistent level of description for the
two states.
The results for methylene are summarized in Tables II
and III. In Table II we compare the total energies for the
various calculations. The all-integral results are expressed
in hartrees, and the PS results are expressed as the differ-
"GVB-PP (3/6) wave function for the ‘A, state and GVB-PP (2/4) for
the 3B, state.
*Allintegrals result from the GvB2PS program.
‘Energy difference, Eps — Ex).
4PS results with analytic two-electron integrals for monoatomic terms.
‘PS results with analytic two-electron integrals for diatomic terms,
ence from the all-integral results. These are quoted in kcal/
mol to simplify comparison with bond energies and other
chemical data.
For the singlet state using the PS-1C correction, the
errors in total energy (GVB) range from 0.015 to 0.032
kcal/mol, which drops to a maximum error of 0.005 kcal/
mol for PS-2C. In the triplet state the errors are a factor of
3 to 5 larger, ranging from 0.054 to 0.117 kcal/mol for
PS-1C and up to 0.013 kcal/mol for PS-2C. The larger
error for the triplet arises because the grid is less complete
in the directions perpendicular to the molecule. The grid is
more complete in directions parallel to the plane since each
atom contributes its own grid. Such difficulties could be
ameliorated by a more sophisticated grid design procedure.
Similar energy differences are obtained for the HF wave
functions, again a maximum error of about 0.120 kcal/mol
was found for the triplet-state geometry with monoatomic
corrections.
Table IIT shows the singlet-triplet gap obtained for the
HF and GVB-PP wave functions. For both wave functions,
the error is smaller than 0.09 kcal/mol using PS-1C cor-
rections and smaller than 0.02 kcal/mol for the PS-2C:
corrections.
TABLE III. The singlet-triplet gap (AZs,) for CH."
Basis sets (carbon/hydrogen)
Method Wave function VDZp/VDZ, VDZ2p/TZ2, VDZ2pn/TZ2,
All-ints® HF 26.11 25.48 25.13
PS-1C* HF 26.18 25.41 25.06
PS-2C¢ HF 26.11 25.48 25.12
All-ints GVB-PP 9.11 8.16 7.84
PS-1C GVB-PP 9.18 8.07 7.75
PS-2C GVB-PP 9.11 8.15 7.83
*All energies in kcal/mol. Experimental value 9.09 kcal/mol.
See footnote b of Table II.
See footnote c of Table I.
SSee footnote d of Table II.
J. Chem. Phys., Vol. 92, No. 12, 15 June 1990
7494
An important result is that the errors do not increase
as the size of the basis set is increased. This shows that the
PS method does not have any intrinsic problems in manip-
ulating the diffuse functions present in the largest basis set.
More significant is the fact that the same grid was used for
all three basis sets, so that the number of one-electron
potentials (3.1) only increases as AMf*. This scaling is much
better than the traditional all-integral method, where the
number of two-electron integrals grows as M*. The PS
method is thus expected to be particularly advantageous
for larger basis sets.
B. Silylene
For SiH, the two low-lying states have the character of
the two states in CH, as in Eq. (4.1); however, the singlet
state is the ground state (see Ref. 12 for a discussion). The
studies of silylene emphasize two points: (a) to test the PS
method for a second-row element, and (b) to test the use
of effective core potentials for PS calculations. Two sets of
calculations were carried out, one with all the core elec-
trons present and the other using the shape and Hamil-
tonian consistent (SHC) effective core potential to replace
them.
As discussed below, we find that use of the ECP leads
to an enormous improvement of the accuracy for PS with-
out atomic corrections (i.e., so that 1C and 2C corrections
are not necessary). For the all-electron calculations, use of
1C and 2C corrections leads to results for SiH, comparable
with those of CH), suggesting that PS will be effective for
the later rows of the Periodic Table.
1. Geometry and basis sets
For the '4, state, we used Resin = 1.508 A and
@usin = 92.4", while for the 3B, state, we used
Rg = 1.471 A and @ysy = 118.2°. [These were based on
geometry optimizations using second-order Méller—Plesset
perturbation theory (MP2) and unrestricted MP2 with the
6-31G** basis (valence double zeta with polarization func-
tions on Si and H)].
When no effective core potential is used, a Huzinaga
valence double-zeta basis set (11s8p/4s3p) was used for
silicon and a Huzinaga unscaled double-zeta basis set (4s/
2s) was used for hydrogen. A set of d polarization func-
tions (a = 0.42) was used with the silicon basis, and a set
of p polarization functions (a = 0.6) was used with the .
hydrogen basis.
The SHC effective core potential (Rappé et al. 16) was
used to eliminate the |” core electrons of Si. The corre-
sponding VDZ basis set (3s3p/2s2p) was augmented with
the same 3d polarization functions as before. The same
VDZp basis was used for hydrogen.
2. Grids and dealiasing basis sets
The dealiasing basis sets are chosen by the same
method described for methylene, but slightly different
atomic grids are used. For the silicon atom, the size of the
bonding region is changed from 4.5 to 6.5 bohrs for the
ultrafine grid. This increases the size of the grids from 1114
of carbon to 1389 for silicon. This was found to be neces-
sary to achieve good accuracy.
50
Langlois et a/.: Pseudospectral generalized valence bond
TABLE IV. Energies comparison for SiH).
PSs Ps-1c# PS-2C°
Calculation*® All-ints® error error error
State Energy! (kcal/mol) (kcal/mol) (kcal/mol)
. (a) All-electron
HF 3B, — 289.920 507 22.7 0.065 0.007
HF 14, — — 289.929 006 22.6 0.080 0.025
AEs;* 0.008 499 0.152 — 0.015 — 0.019
GVB" 3B, — 289.940 694 22.7 0.062 0.008
GvBi ola! — 289.972 813 22.9 0.075 0.025
AEs; 0.032 119 0.150 — 0.013 — 0.017
(b) Si core electrons replaced by the effective potential
HF 3B, = — 289.994 255 0.003 0.002 0.010
HF 14, — 290.003 034 0.066 0.063 0.024
AEsy 0.008779 0.064 — 0.061 ~ 0.014
GVB 3B, —290.014773 0.003 0.001 0.004
GVB ‘A, — 290.048 020 0.068 0.063 0.024
AEsr 0.033 247 = — 0.066 — 0.061 ~— 0.020
"Experimental geometry with the VDZp basis set for Si and H atoms.
>All-integral calculations.
°PS results with no atomic corrections.
“ps results with analytic bielectronic integrals for monoatomic terms.
‘PS results with analytic bielectronic integrals for diatomic terms.
‘AN total energies in hartrees.
*Singlet-triplet gap = Erin — Esingter
*GVB-PP (2/4) wave function for the 3B, state.
iGVB-PP (3/6) wave function for the 'A, state.
3. Discussion
The errors obtained for SiH, are of the same order as
those for CH,. The maximum error is 0.079 kcal/mol
when PS-1C corrections are included and 0.014 kcal/mol
with PS-2C corrections. As can be seen in Table IV, the
errors are always similar for the HF and GVB wave func-
tions at a given geometry.
The presence of an effective core potential improves
the accuracy of the PS method. This is because the largest
errors in the PS method are caused by the tight core or-
bitals. Thus, when no 1C or 2C corrections are made, the
error in the PS result is 22 kcal/mol for all electrons, which
drops to less than 0.07 kcal/mol when the core electrons
are replaced by the effective potential. On the other hand,
with 1C or 2C corrections, the accuracy is about the same
for all electrons as for ECP; this shows that the 1C and 2C
corrections are very effective in removing large errors as-
sociated with core orbitals.
Note that for the triplet state with an effective core
potential, the error using the PS method is unusually small,
0.001-0.002 kcal/mol. We believe this results from a for-
tuitous cancellation of errors arising from the particular
geometries and has nothing to do with either the effective
core potential or the nature of the wave function.
Cc. Ethylene
In the ethylene system we shift our focus away from
electronic criteria (basis sets and effective core potentials)
and concentrate on the effect of larger geometric changes
on the accuracy of the PS method. We considered four
geometries:
J. Chem. Phys., Vol. 92, No. 12, 15 June 1990
51
Langlois et a/.: Pseudospectral generalized valence bond 7495
TABLE V. Summary of energy differences for C,Hy.
PS energy error (kcal/mol)*
All-integral total energy (hartrees)° PS-1C° PS-2c4
State HF GVB° HF GVB HF GVB
Experimental geometry (Rec = 1.338 A, ¢ = 0")
34, — 77.931 907 — 77.942 874 — 0.137 — 0.119 — 0.036 — 0.038
14, — 78.060 723 — 78.099 240 — 0.243 — 0.209 — 0.045 — 0.045
AEs! 0.128 816 0.156 366 0.106 0.089 0.008 0.007
Twisted geometry (Roc = 1.338 A, ¢ = 90°)
34, ~ 71.973 417 — 77.984 305 — 0.183 — 0.022 0.003 0.004
M4, — 77.888 273 — 77.981 006 0.016 — 0.025 0.007 0.004
AEs — 0.085 144 — 0.003 299 — 0,199 0.003 — 0,004 0.000
Barrier® 74, — 0.041 510 — 0.041 431 0.046 — 0.097 - — 0.039 — 0.042
Barrier 1A, 0.172 450 0.118 234 — 0.259 — 0.184 — 0.053 — 0.049
Stretched geometry (Roc = 2.676 A, ¢ = 0")
3A, — 71.772 934 — 77.884 587 — 0.199 — 0.196 — 0.013 — 0.013
‘4, — 71.726 946 — 77.847 384 — 0.128 — 0.209 — 0.038 — 0.016
AEs — 0.045 988 0.002 797 — 0.070 0.013 — 0.009 0.003
Bond energy °4; 0.158 973 0.098 287 0.062 0.077 0.023 0.025
Bond energy 7A, 0.333 777 0.251 856 0.115 0.000 0.007 0.029
Stretched/twisted geometry (Roc = 2.676 A, ¢ = 90°)
4, — 71.773 688 — 77.845 388 — 0.128 — 0.104 0.003 — 0.013
A, — 77.727 351 —~ 77.845 284 — 0.139 — 0.104 0.012 0.013
AEsr — 0.046 337 — 0.000 104 0.011 0.000 — 0.006 — 0,000
“Energy difference, Eps — Egy.
>All total energies in hartrees.
SPS results with analytic bielectronic integrals for diatomic terms.
°PS results with analytic bielectronic integrals for monoatomic terms.
°“GVB-PP (3/6) wave function for 1A, state and GVB-PP (2/4) for 4A, state.
‘Single-triplet gap Evripier ~ Exingier
*E(¢ = 90°) — E(¢ = 0).
(a) Planar: The ground-state geometry with a torsion
angle of ¢ = 0°, and Rec = 1.338 A. The other geometric
parameters are Rey = 1.085 A, Oey = 117.8".
(b) Twisted: The torsion angle is ¢ = 90° but no other
parameters are changed. (The equilibrium twisted geome-
try would have a larger CC bond and HCH angle.)
(c) Stretched: With ¢ = 0° the CC bond distance is
doubled to Rec = 2.676 A (all other quantities fixed). At
this distance the bond is broken,
(d) Stretched/twisted: With Roc = 2.676 A, we twist *
to ¢ = 90°. Since there is essentially no bond, the energy
changes should be small.
We find that with PS-1C the errors in the GVB wave
functions have a maximum error of 0.09 kcal/mol, while
PS-2C leads to a maximum error of 0.007 kcal/mol. These
are quite acceptable results and indicate that PS-GVB has
sufficient accuracy for chemical applications.
1. Basis sets, grids, and wave functions
For all of the calculations on the ethylene system we
use the VDZ,/VDZ, basis described above for CH, and the
same grid and dealiasing set.
In these calculations we correlate the two pairs of elec-
trons associated with the CC bond, leading to a GVB-PP
(2/4) wave function for the singlet and a GVB-PP (1/2)
wave function for the triplet state.
2. Discussion
Table V summarizes the calculations performed on
ethylene. The first two columns tabulate the standard all-
integral results for both HF and GVB-PP wave functions
for each of the four geometries. Included are the absolute
singlet and triplet energies, the singlet-triplet gap, and the
rotational barriers.
The neat four columns detail the corresponding results
for PS calculations. Columns three and four include results
with 1C corrections, and columns five and six include re-
sults with 2C corrections. The PS results are expressed as
a difference from the all-integral results and are quoted in
kcal/mol to indicate the significance for bond energies and
other chemical data.
We note that with 1C corrections the absolute energies
for HF wave functions agree with the analytic ones to
within 0.25 kcal/mol, and the relative energies agree to
within 0.2 kcal/mol. For GVB wave functions the absolute
energies agree to within 0.2 kcal/mol, and the relative en-
ergies to within 0.1 kcal/mol.
J. Chem, Phys., Vol. 92, No. 12, 15 June 1990
7496
For calculations using the 2C corrections, the absolute
energies for HF wave functions agree with the analytic
ones to within 0.05 kcal/mol, and the relative energies to
within 0.008 kcal/mol. For GVB wave functions the sim-
ilar results are 0.05 and 0.007 kcal/mol.
For singlet-triplet excitation energies (same geome-
try), the PS-1C calculations lead to errors of 0.01 to 0.20
kcal/mol for HF and 0.00 to 0.09 kcal/mol for GVB.
These errors are quite acceptable. With PS-2C the error
drops to a maximum of 0.009 kcal/mol, which is negligi-
ble.
In general, for ethylene, the PS method does a better
job at reproducing the results for GVB wave functions than
it does for HF wave functions. We presume this to be an
artifact to the particular grids and dealiasing sets used,
since we see no physical reason for this effect.
V. DISCUSSION
We find that the PS method with 1C corrections leads
to an accurate description of the GVB-PP wave functions
and that the energy differences for singlet-triplet excita-
tions and bond dissociation are well described. In addition,
we find that with use of the effective potential (ECP) on Si,
the uncorrected PS results are quite accurate. This suggests
that PS approaches will be useful in calculations of chem-
ically interesting phenomena, such as excitation energies
and bond dissociation processes.
The code used to carry out the calculations reported
above was obtained by interfacing the standard Gvs2Ps
program!® with the original PS HF program. The com-
bined code has not yet been optimized for efficiency, so we
do not present timing results in this paper. An optimized
implementation is now nearing completion so that timing
tests should be available soon. Preliminary results indicate
that substantial reductions in CPU time and disk storage
similar to those obtained for HF studies of large systems
can be achieved for GVB calculations.
With regard to higher levels of electron correlation, the
PS method can be applied to general CI calculations if the
CI matrix can be expressed in terms of Coulomb and ex-
change operators (the integrals are evaluated each itera-
tion, a direct method). In these cases, a factor of M (as in
the HF and GVB algorithms) can be gained in formulating
the action of the CI matrix on a trial vector. Of course, the
actual performance of the method for such purposes is best
tested with calculations.
A particularly attractive application would be to some
of the less computationally demanding forms of CI that
could be implemented for very large systems. Among these
are restricted CI, which properly spin couples the GVB-PP
wave function, and correlation-consistent CI, in which
only a few electron pairs are allowed to enter the virtual
space.'”!° We plan to begin implementation of these meth-
ods in the near future.
It is straightforward to incorporate analytic first and
second derivatives into PS calculations and such work is in
progress.
52
Langlois et a/: Pseudospectral generalized valence bond
These studies have given us some additional insight
into the role of the grids in determining accuracy, and we
believe that it will be possible to develop similar sized grids
with even higher accuracy.
The 2C corrections give results that are more accurate
than the 1C corrections by a factor of about three or more.
However, 2C corrections require greater computation time
than is required for 1C corrections, and we have not de-
termined whether adding diatomic corrections is faster
than simply increasing the grid size.
VI. CONCLUSIONS
We find that the PS algorithm for the direct evaluation
of Coulomb and exchange operators leads to accurate re-
sults for GVB wave functions, suggesting a wide range of
applications in quantum chemistry. The PS method pro-
vides uniform accuracy for a range of basis sets (including
diffuse functions of s, p, and d with various exponents), for
systems with effective core potentials, and for a wide range
of geometric conformations (including bond dissociation).
Additional studies are underway using the PS algorithm
for transition-metals complexes. In summary, we believe
that the PS algorithm has a future in ab initio quality stud-
ies of molecular systems.
ACKNOWLEDGMENTS
This work was supported in part by grants to R.A.F.
from the National Institutes of Health and the National
Science Foundation, and in part by grants to W.A.G. from
the National Science Foundation (Grant No. CHE-
8318041) and the Air Force Office of Scientific Research
(Grant No. AFOSR-88-0051). R.A.F. is a Camille and
Henry Dreyfus Teacher-Scholar and the recipient of a Re-
search Career Development Award from the National In-
stitutes of Health, Institute of General Medical Sciences.
W.A.G. acknowledges the support of the Endowment
Fund for the Charles and Mary Ferkel Chair.
'R. Friesner, Chem. Phys. Lett. 116, 39 (1985).
?R. Friesner, J. Chem. Phys. 85, 1462 (1986).
3R. Friesner, J. Chem. Phys. 86, 3522 (1987).
“R. Friesner, J. Phys. Chem. 92, 3091 (1988).
5M. Ringnalda, Y. Won, and R. Friesner, J. Chem. Phys. 92, 1163
(1990).
®M. Ringnalda, M. Belhadj, and R. A. Friesner (unpublished).
7R. C. Ladner and W. A. Goddard III, J. Chem. Phys. 51, 1073 (1969);
W. A. Goddard HII, J. Am. Chem. Soc. 94, 793 (1972).
"W. A. Goddard IIT and L. B. Harding, Annu. Rev. Phys. Chem. 29,
363 (1978).
°F, W. Bobrowicz and W. A. Goddard III, in Modern Theoretical Chem-
istry: Methods of Electronic Structure Theory, edited by H. F. Schaefer
III (Plenum, New York, 1977), Vol. 3, p. 79.
OL. G. Yaffe and W. A. Goddard III, Phys. Rev. A 13, 1682 (1976).
Nw. J. Hunt, W. A. Goddard III, and T. H. Dunning, Jr., Chem. Phys.
Lett. 6, 147 (1970).
“LB. Harding and W. A. Goddard HI, J. Chem. Phys. 67, 1777 (1977).
DT. H. Dunning, Jr. and P. J. Hay, in Modern Theoretical Chemistry:
Methods of Electronic Structure Theory, edited by H. F. Schaefer UI
(Plenum, New York, 1977), Vol. 3, Chap. 1.
4S. Huzinaga, J. Chem. Phys. 42, 1293 (1965).
'5 When discussing scale factors for basis functions, the use of ¢ denotes
an effective Slater exponent, whereas a denotes a Gaussian exponent.
Thus, for H, ¢ = 1.2 implies that the standard exponents for the Gauss-
J. Chem. Phys., Vol. 92, No. 12, 15 June 1990
53
Langlois ef a/.: Pseudospectrai generalized vaience bond 7497
ian expansion have all exp scaled by 1.44. (to be submitted).
‘6. K. Rappé, T. A. Smedley, and W. A. Goddard III, J. Phys. Chem. "Sw. A. Goddard III, R. A. Bair, F. W. Bobrowicz, W. R. Wadt, L. G.
85, 1662 (1981); P. J. Hay and W. R. Wadt, J. Chem. Phys. 82, 270 Yaffe, and A. K. Rappé, Ref. 9; R. A. Bair, Ph.D. thesis, California
(1985). Institute of Technology, 1975 (unpublished).
"7S. K. Shin, W. A. Goddard III, and J. L. Beauchamp, J. Chem. Phys. "EA. Carter and W. A. Goddard III, J. Chem. Phys. 86, 862 (1987).
J. Chem. Phys., Vol. 92, No. 12, 15 June 1990
54
Chapter 3
Local Density Approximation Electronic Structure
Calculations using Gaussian Type Orbitals
55
Abstract
We present a method for large-scale electronic calculations in the framework of
local density approximation. The Bloch orbitals are expanded in terms of real-space
atom-centered Gaussian type orbitals. The Coulomb potential is evaluated over a
set of gridpoints using an analytical dual-space approach. The approach is based
on an extension of the Ewald method to electron density where every pair of basis
functions in the density is screened independently. The Fock matrix elements are
evaluated by replacing the three-dimensional analytical integrations over the whole
space by a set of atom-centered numerical integrations via a projection technique.
The self-consistent field is obtained by a modified conjugate-gradient method with
preconditioning which also avoids expensive minimization.
This method was used to calculate the band structures of the diamond and Cgo
allotropes of carbon. Standard double zeta basis sets were used and the results are
compared to other recent theoretical calculations and experimental measurements.
56
I. Introduction
The standard approach to calculating electronic wavefunctions of solids uses
plane waves as basis functions. To make such calculations practical it is necessary to
use pseudopotentials in order to avoid the large number of plane waves which would
otherwise be required to describe the core electrons. This allows the focus to be on
valence electrons which are usually well described as band orbitals. For larger unit
cells the bottleneck of such calculations becomes Fock matrix diagonalization, which
scales as N* where N is the number of plane waves at each point in the Brillouin
zone. For example, about 27000 plane waves were required for calculations on
Ceo crystal.! In global minimization approaches like the Car-Parrinello method? or
the conjugate-gradient method, the diagonalization step is avoided and the time
limiting factor becomes the calculation of forces, which scales as N*n where n is
the number of occupied orbitals.
An alternative approach is to use localized atom-centered basis functions. This
is essential if the pseudopotential approximation is to be avoided. Such localized
basis sets can be much smaller, typically a factor of 5 to 10 for a close packed
structure. For Cgo, the atoms are far from being compactly packed and the atom-
centered basis may be 20 to 40 times smaller than the plane wave basis.
Gaussian type orbitals (GTO’s) have been used extensively and successfully
in molecular systems. Their popularity is based on the ease with which multicenter
integrals can be computed analytically (as compared to Slater type orbitals) and
on the accuracy for treating the bonding in molecules. Many approaches have been
explored to speed up the evaluation of multicenter integrals over GTO’s.*~® The
most powerful methods involve recursion relations to improve the evaluation of
57
integrals involving high angular momentum orbitals.’
In our approach the Fock matrix is constructed by first calculating the
Coulomb and exchange-correlation field on a set of gridpoints and then integrating
the field with a pair of basis functions. The integration is done numerically over
the gridpoints. It is only recently that numerical integrations with reliable accuracy
have been developed.8~!° One variation consists of dividing the space into regions
of different shapes over which independent integrations are performed. The most
important regions are the regions around the atomic nuclei, where atomic fields
are very large, requiring high accuracy. This leads naturally to defining a spheri-
cal region around each atom inside which the integration is performed in spherical
coordinates. The remaining space is divided into regular polyhedra and truncated
polyhedra having a spherical boundary. The location and weight of the gridpoints
are relatively easy to select in the case of the spherical regions and regular polyhedra.
The difficulty is in choosing gridpoints for the truncated polyhedra. Proposed solu-
tions have been to map the truncated polyhedra back onto a regular polyhedra® and
to use transformed Gaussian-quadrature mesh with a continuous parameter which
is adjusted variationaly to minimize an objective function.’°
A different approach, which avoids the truncated polyhedra regions, is based
on transforming the integration over all space into atom centered integrations via
a projection technique.® The atomic projection functions have the properties that
they have unit value on the atom on which they are centered, and that they go
to zero on all other atoms. This is done by defining the projection function as
a product of diatomic functions. The challenge in this case is to find the best
diatomic functions. In Becke’s original work, the diatomic functions were iterative
58
in nature which means that only a discrete set of functions were obtained. We have
modified the Becke approach to include a continuous degree of freedom which can
be adjusted to improve the accuracy. The atom centered integrations use spherical
grids decomposable in radial and angular components. The angular integrations use
Gaussian-type formulas for selecting the location and weight of the gridpoints.!! For
the radial integrations, we use a geometrically scaled mesh with the weights chosen
to exactly integrate Gaussian functions. This takes advantage of the Gaussian
nature of the basis set.
In GTO based methods, the computationally most intensive step is not the
Fock diagonalization (as in plane wave methods), but rather the evaluation of the
Coulomb field over the grid. We evaluate the Coulomb field efficiently by using
a dual-space approach similar in spirit to the Ewald method.!? In the dual-space
approach, the density is written as a sum of pairs of GTOs where each pair is
screened independently. The screening density for each pair of functions is defined
in such a way that the potential generated by the combined density (normal plus
screening density) goes rapidly to zero away from the product center of the GTO.
This perfect screening is possible because we are using GTOs.* The Coulomb field
of the combined density now converges rapidly in real space whereas the field of the
opposite screening density converges rapidly in reciprocal space. Like in the Ewald
method, the screening density is more slowly varying than the normal density. By
choosing the screening exponent appropriately, the evaluation of the Coulomb field
scales as N}-> with proper cutoffs. This approach contrasts with the popular density
fitting procedure!* which has the disadvantage of requiring more expensive three-
center integrals of GTO’s instead of the two-center integrals of the current approach.
59
The density fitting procedure also leads to extra components in the atomic forces
when doing geometry optimization’® which are subject to instabilities due to aliasing
problems.
A critical component of any self-consistent field (SCF) theory is the iterative
procedure used to achieve convergence. A simple technique is the linearization
technique of te Velde and Baerends,!® denoted VB. In this method a new Fock
matrix is expressed as a linear combination of the Fock matrices from previous
cycles. If the fit is accurate, then the new density is obtained by the same linear
combination of the corresponding density matrices. This could be improved by
implementing the Direct Inversion in the Iterative Subspace (DIIS) method?” used
in molecular Hartree-Fock (HF) calculations. In DIIS the new Fock matrix is a
linear combination of the current and previous Fock matrices where the coefficients
of the matrices are obtained by minimizing the corresponding linear combinations of
the error matrices (deviation from convergence). The major drawback of both VB
and DIIS is the large amount of disk space required to store the information from
the previous iterations. A simpler approach is the conjugate-gradient technique?®
but this often exhibits unacceptably slow convergence. By preconditioning the
conjugate-gradient vectors the number of iterations required for self-consistency can
be greatly reduced.® In our approach the preconditioning is obtained by a Green
function-like operator. Essentially, the expensive-to-evaluate second-order term,
required for the fast convergence of Newton-Raphson type methods, is replaced by
an inexpensive but effective approximation.
II. Density Functional Theory (DFT)
Hohenberg and Kohn!® showed that the electron density can be used as the
60
basic variable in minimizing the energy of an electronic system. However,the anti-
symmetry implied by the Pauli principle leads to a kinetic energy functional whose
character precludes a simple and accurate variational method based only on the
electron density. The simplest solution is the Kohn-Sham (KS) method?° where
an orbital representation of the density is used to obtain a simple expression for
the kinetic energy evaluation. The KS method consists of mapping the original
system of interacting electrons onto a system of independent electrons moving in
an effective external potential:
[-5V? + Vers(r)] dnel(r) = ent nel?) (1)
where ¢n;(7) is the one electron state with wavevector k, band index n and energy
€nk; the kinetic energy operator is given by -;V? in atomic units.
The effective potential in equation (1) is written as:
Vers(t) = Vauc(r) + J(r) + Vecl(r) (2)
where Vnu-(7) is the Coulomb potential generated by the nuclear charges, J(r) is the
Coulomb potential generated by the electron density and everything else is lumped
into the term V,,(r) referred to as the exchange-correlation potential. The nuclear
charge potential is given by:
Vauc(?) = » poe (3)
where R labels the unit cells in the crystal and a labels the atoms in the primitive
unit cell. The atom locations are given by € and the atomic charges by Z,. The
electronic Coulomb potential is given
(0) = fate p(r') 5 (4)
r=]
61
where p(r) is the electron density. Both of these electrostatic potentials involve
infinite sums which are dealt with by a dual-space approach in the spirit of the
Ewald method.
The exchange-correlation potential V,,(r) written as the functional
Vee(t) = bEzclp|/5p(r), (5)
where E,, is referred to as the exchange-correlation energy. The exact functional
form of the exchange-correlation energy is unknown, but local approximations,
Belo] = | @r olr)ecclo(r) (6
have been developed over the years. In the above equation, €z,[p] is the exchange-
correlation energy per electron, which can be obtained numerically from studies
of interacting electron system of constant density. We have used the exchange-
correlation potential as parametrized by Perdew and Zunger”! from the Quantum
Monte Carlo simulations of Ceperley and Alder.?? We also included the relativis-
tic correction of Bachelet.23 The exact form of the exchange-correlation potential
is given in Appendix A. We also used other exchange-correlation functionals for
comparison purposes. The overall theory described above is referred to as Density
Functional Theory (DFT).
In DFT, the total energy per unit cell can be evaluated as follows:
Etotal = Exin + Exc + Ee—e + Een + En-n (7)
where Ej;n is the kinetic energy, E,, is the exchange-correlation energy defined
above and the last three terms are the various electrostatic energies. The kinetic
62
energy is given by
Exin = > 2 O(€nk —_ €Fermi) < Onk| _ 5V Ione >, (8)
nk
where O(€nki — €Fermi) is the Heavyside step function with €rerm: the energy of
the Fermi level. Here k denotes the wavevector and n is the band index. The
expressions for the electrostatic energies are:
I(r )e(r')
= 3p 3!
Bae= 55 [ee fate’ SUP (9)
for the electron-electron repulsion,
for the electron-nuclear attraction, and
_ ZaZs
En-n= 1) TR OR’ (11)
a b,R’
for the nuclear-nuclear repulsion.
The orthogonal one electron band orbitals ¢,;(7r), can be expanded in linear
combinations of Bloch basis functions Xus(r):
dnk(t) = S> Cun(k)Xur(r). (12)
The expansion coefficients Cun(k) obey the orthogonality constraint:
Enm = > Cin(k)Suv(k)Cum(k),; (13)
where the overlap matrix elements S,,(k) between Bloch basis functions are given
by:
Suv(k) =< XuzlXoz >= / dr X%,(r)Xya(r). (14)
63
A. Basis Set Expansion
The expansion of the Bloch basis functions in terms of atomic orbitals(AO)
is:
Xuk(r) = =e > etFR XY R(r), (15)
where X,R denotes an AO composed of one or many Gaussian functions centered
on atom &, in unit cell R:
primitives
Xur(r)= > caiNus(@ — bur — Re)!*(y — buy — Ry)"
x (z — bug — Rz) ere “BD (16)
Here i labels the primitive Gaussian functions, Iz,ly,lz label the exponents of the
x,y,z component of the primitive function. The coefficients of the contraction are
denoted by c,; and the normalization constant for each Gaussian function is given by
Nui. The number of primitives is typically large (~ 5— 10) for AO representing core
orbitals and is small (~ 1 — 2) for AO representing valence orbitals or unoccupied
orbitals.
The Bloch basis functions have translational symmetry:
Xun(r + R) = eX, (r) (17)
and
Xurta(r) = OX, (7), (18)
where equation (17) is the Bloch theorem and equation (18) comes from the phase
factor e***" in equation (15). The Bloch basis functions are orthogonal for different
wavevectors, k # k':
64
However, Bloch basis functions with the same wavevector are not generally orthog-
onal,
Suv(k) # buy. (20)
The spatial electronic density is given by:
p(r) = 25— Wi ldne(r)|" O(€nk _ €Fermi) (21.a)
nk
p(r)= >> Dur D> Xu(r—F’) Xvrn(r-F’), (21.b)
uvk R'
where the factor of 2 is for the spin degeneracy and W, is the wavevector-dependent
weight used in the integration over the Brillouin zone. In equation (21.b) the density
matrix in real space, Duyr, is defined as:
DuwR = S| eth(te eu) etkR Duv(F), (22)
where the k space density matrix is given by:
Duv(k) = 25) We Cly(4) Con(k) O(€nk — Erermi)- (23)
For the evaluation of the spatial density over gridpoints, p(r,), equation (21.b)
is less expensive computationally than equation (21.a) because efficient cutoffs can
be applied to the construction of the atomic orbitals X,(r,) and X,r(r,) over the
gridpoints.
B. Evaluation of Effective Hamiltonian
In the Kohn-Sham method the matrix elements of the effective Hamiltonian
between a pair of Bloch basis functions, Xy;(r) and Xy,(r), are given by:
Huv(k) = Tuv(k) + Vuol(k). (24)
65
The kinetic energy matrix elements T;,,, are given by
Tuv(k) =< Xup| _ 5 VY |Xoue > (25)
which can be evaluated analytically. The effective potential matrix elements V,_(k)
include the exchange-correlation and electrostatic potentials. They are evaluated
numerically over a set of gridpoints within the first primitive cell:
cell
Vuk) = > Wg Xuk(t9) [Vac(tg) + J(7g) + Vnucltg)] Xve(tg) (26)
with W, being the weight of gridpoint ry.
In equation (26) the matrix elements of the nuclear attraction potential
Vnuc(?g) are evaluated numerically so that the 1/r tails of Vnuc(r,) and J (rg) cancel
exactly. In molecular application of DFT, we found that this cancellation is crucial
in Minimizing numerical integration errors.
For a purely covalent system, such as Co, each atom is neutral and the 1/r tail
is effectively eliminated. This is not the case for ionic systems. For those systems
we add and subtract the field created by the net atomic charge:
Qner = Za + Vine (27)
where Qt) ; is the effective atomic electronic charge. This effective electronic charge
is evaluated from the Mulliken population:
on(a) all
Qe) = S| > O(€nk —_~ €Fermi) Suv(k) Du(k), (28)
u v ok
where D,»(k) is the density matrix for each wavevector k. We should emphasize that
this definition of the effective electronic charge is not unique. However, Mulliken
populations are simple and well behaved.
66
In crystalline systems the Coulomb potential and the nuclear attraction po-
tential are both evaluated using an Ewald-like method . This guarantees that the
two potentials have zero average value:**
0= / d’r J(r) (29)
0= / dr Vauc(?). (30)
This means that the potential Vnu-(r) + J(r) does not tend to go to zero in the in-
teratomic regions but instead go to to some constant value Vai; introduced by the
Ewald reference point of the potential. Since the numerical integration is least accu-
rate precisely in the interatomic regions, we want to subtract the constant potential
shift, Vsaift, from the numerical integration and add its contribution analytically.
Taking into account the two previous points, the final expression for construct-
ing the Hamiltonian is given by:
Huy (k) = Tuv(k) + Vret + Suv(k)Vsnife (31)
cell
+ Ss; Wy Xua(tg) (Vee(tg) + I(tg) + Vaucltg) — Vnee(tg) — Venise) Xve(rg);
where Vnez(r,) is the field of the net atomic charges gg”, and V,n¢t is the analytical
matrix of the same potential evaluated for a pair of basis functions.
C. Evaluation of Total Energy
The total energy is given by:
Etotal = Exin + Exc + Ee-e + Ee-n + En-n (7)
67
where the exchange-correlation energy Ez, and the Coulomb energy E,_. are eval-
uated by numerical integration over the gridpoints:
cell
Exc= ) | Wy €ec(ry) a(t); (32)
1 cell
Eee = 9 > W, I(r) (ty). (33)
The same accuracy enhancing features applied to the construction of Hy»(k) can
also be applied to the Coulomb energy evaluation. This is done most simply by
evaluating the Coulomb energy as part of the sum of the one electron orbital ener-
gies:
Eone = S| 2 O(€nk — €Fermi)Enk- (34)
nk
The total energy is then recast in term of Eone:
Etotal = Exc + En-n + 5 (Hone + Ekin + Een _~ Veum,zc): (35)
The subtraction of the exchange-correlation potential from Eone is done by:
cell
Veum,ec = > W, Vaclrg) p(t). (36)
The electron-nuclear attraction energy is evaluated analytically as:
Een = Ss; VuoR Dur (37)
uvR
and the matrix elements of the electron-nuclear attraction are given by:
VueR = jer Xu(r)Vauc(r)Xvr(r), (38)
68
which can be evaluated analytically using the dual-space approach. The kinetic
energy is also evaluated via the real space density matrix:
Exin = )_ Tur Dur: (39)
uvR
Finally the nuclear-nuclear repulsion is evaluated using the standard Ewald method.
The overlap and kinetic energy matrix elements in k space are calculated by
Fourier transform of the matrix elements in R space:
Suo(k) = eFREr“&) NT eR Gp (40)
ae
and
Tuo(k) = ef E-&4) Se FFT, (41)
where
Suk = / d®r Xu(r)XvpR(r) (42)
and
Tavn =—3 / d®rX y(r)V?Xyr(r). (43)
The evaluations of S,,z and T,,R are performed analytically using recursion rela-
tions described in Appendix B.
III. Dual-Space Approach
A. Ewald Method
In calculating the electrostatic forces we need to sum over the infinite set of
lattice vectors. In order to make this sum convergent, we have used the Ewald
method and its extension to a continuous distribution of charges.
69
Following the derivation of Tosi,?* a non-converging lattice sum like the nuclear
attraction potential:
atoms lattice
. Za
Vauc(t) = Woe? 3
O° 2s Peo ®
is made convergent by subtracting and adding a screening field Vgauss(7) created
by an infinite set of atom centered functions:
auss = Za Sy! galt ge :
Veaues(r) > fe yf SX ate (44)
If the screening functions are chosen as normalized Gaussian densities, g.(r) defined
by:
a3 _o-?
then the field can be rewritten as:
Veauss(") = 5. Za rf wee Rl) (46)
aR a
in term of the error function, erf(z),
2 ft 2
erfie) = = | dte™'. (47)
The combined field of the nuclear charges minus the screening density is given by:
‘= 1 ~ erf(Valr ~ 4 ~ R))
Vieum( )= 22 ( 3, Ee RI ). (48)
This is now convergent since for large value of R the field created by the Gaussian
density ga(r' — &. — R) goes as 1/|r —& — R| in the limit that |r—€, — R| >> 1//a.
70
The compensating screening field contribution is transformed to reciprocal space by
using the relationship:
1 An eiGr
_ 2
2 lr— Rl Veell Geo G
where the G vectors are reciprocal lattice vectors and Vzeq is the volume of the unit
cell. The result is
eG? /4a piG(r—£a)
4n
Vasum(r) = ) Ze Veett a G? (50)
a #0
which converges rapidly for large G vectors because of the exponential factor.
The screening exponent a is chosen such that both the real space sum and
reciprocal space sum converge equally rapidly. The relevant scale for choosing a is
1/Veeti
By omitting the G =0 term in the sum over reciprocal vectors we have ne-
glected two contributions. The first contribution is the 1/G? singularity which is
easily removed by adding a constant background density of value a Za/Veelt
that compensates the original nuclear charges. The second contribution is a con-
stant whose value depends on the exact nature of the charge distribution at the
surface of the crystal. Since in real crystals the surface charge distribution (nu-
clear plus electronic) is generally unknown, the constant is set to zero. Thus the
reciprocal potential is only defined up to a constant value. The constant value is
conventionally chosen such that the average value of the potential Vnu-(r) is equal
to zero. From equations (48) and (50), the average value of Vosum(r) is zero be-
cause the G=0 has been omitted. The average value of Vrsum is evaluated by using
equation (49) into equation (48) and keeping only the G=0 term which gives the
71
average value when divided by Veeu:7°
< Vrsum(r) >= [@rVavem(r)
Veell
: 1-— eC /40 piG(r—£.)
~ = (220 re ell os, G?
< Vasum(r) > = os Za)
Waa (51)
The final expression for evaluating the potential generated by the nuclear
charges is:
_ l—erf(Valr—f.-R\)
Vauc() = X Za (x: E -_ ba - Ri — a)
—G? /4a ,iG(r—£a)
+e (i ee ) (52)
Veell G0
In a similar fashion, the potential Vnet(r) from the net atomic charges Q’,
is obtained from equation (52) by replacing the nuclear charges Z, by the charges
Qe
The nuclear-nuclear repulsion energy is most simply written as E,_n =
> « ZaVnuc(€a). However, the singular self-repulsion terms for R = 0 need to be
removed in the sum over real space vectors. The final expression is:
_ 1—erf(/alfa—-&—R\) Va us
Buon = fate “ fale — =F) _ VB a
en” /4a 6iG(éa —&s)
G? ,
(53)
Veell G#0
which is the standard form using Ewald method.*®
72
B. Application to Electron Density
So far we have applied the Ewald method to the nuclear point charges. We now
describe how the lattice sums are evaluated for the continuous electron distribution.
As defined previously the electron-nuclear attraction energy per cell is given by:
Ee-n = > VuoR DuvR (37)
uvR
with
Vier = / dr Xu(r)Xor(r)Veue(n). (38)
In this form the matrix element V,,,R can be interpreted as the energy of the basis
function pair density, puyr(r) = Xu(r)Xvr(r), in the field Viu-(r) of the periodic
nuclear charges. By using equation (52), Vu»r can be rewritten as:
VuwR = S> Za ((Goo(r’—€. -R') ||PuvR(r)) 7 (Ja(r'—€,—R')||PuvR(r)))
aR’
tye “(4 19) fate paynlr) (54)
Veell G#0
where goo(r’-fa—R) is a delta function density centered at £,+R and the Coulomb
interaction between the two densities, g(r’) and (7), is given by
(9a(r')||Po¢r)) = / d°r / dr Solr’ )es(r a(n )polr). (55)
Ire"
Because the density puyr(r) does not contain a delta function-like singularity,
the sum over reciprocal vectors in equation (54) is always convergent, independent
of the value of the screening exponent a. In practice there is no need to introduce
73
the screening densities if puyr(r) is slowly varying on the length of the unit cell. In
that case equation (54) reduces to
4 Zae Fhe
VueR = a S> dua a jer eo oy R(r). (56)
Veelt G0
The criteria used in choosing between equation (54) and equation (56) is the ex-
ponent Yuy=Q,+a, of the pair density pu,R where a, and a, are the Gaussian
exponents of the two atomic orbitals. (For simplicity of discussion, we will ignore
the fact that the atomic orbitals are contracted basis functions involving many
primitive Gaussian functions.) So for the long range (LR) densities when yu» equation (56) will be used, whereas for short range densities (SR) when yuy > a@
equation (54) will be used instead.
C. Alternate Expression for Viy7
In practice we found it better to rewrite equations (54) by using an alternative
interpretation for the matrix elements V,,r. The matrix elements can be interpreted
as the energy of the nuclear charges Z, sitting in the potential V[puyr|(r) created
by the periodic density }° p, puvr(r — FR’):
VuvR = > ZaV [PuvR| (Ea) (57)
where
V [puvrl(r) = / dr i arra —*) (58)
|r —r'|
When the density puyr(7r) is slowly varying then equation (56) is obtained again.
When the density puyr(r) is rapidly varying then we need to introduce some screen-
ing. But instead of screening the nuclear point charges, we now need to screen the
74
continuous electron density puyr(r). Because the density puyr(r) can be expressed
in terms of a single Gaussian function
nn la py?
PuvR(r) = ) Cizk ; 7 e7 Yu (r—P)
ijk OP; OP OP;
(59)
where the vector P denotes the product center of two basis functions (see Appendix
B), then there exists an efficient screening density given by:!%
ao a AF 2, Ova
n _ - C.. —2(r—P) 3
Puvn(?) we OP: OP} OPE ° Ca)
ijk
(60)
where 1 is the screening exponent. Both densities have the same integrated value
Cooo = jer PuoR(?) = jer PuvR(): (61)
Also the combined density, puvr(r)—p2,,2(r), generates a non-zero field only inside
the sphere where the combined density is non-zero. Outside the sphere the field
generated is exactly zero. This can be seen by evaluating the field of the combined
density written in the following form
a 8) OF
ijk
2 Yu |r—P|
x i 4 | due, (62)
lp — P| (Yu)? Jajr—P|
where we used the fact that 0<-yu,. In the limit where the density is small, i.e.,
e AUr-P)? cg 1, then the main contribution left is of order e~%("-)’,
Having defined the screening density p®,,, we can add and subtract its con-
tribution from V|puvR](r) to give the alternate expression for VuyR(r):
Vaur = > Ze (Sse +R!) — A aléa + R')) + (— a3)
n van 2 v
+302 an > et dr FT p® 2(r) (63)
a v G2 PuvR
G0
75
where
Aven(r) = fate Peet (64)
and
Asa(0) = fate! Fee (65)
Effective recursion formulas have been derived’ to evaluate the potential Ay, R(r)
of the density puvr(r). For the potential A®,p(r) generated by the screening den-
sity, the recursion formulas need to be modified as shown in Appendix B. Also in
Appendix B, the recursive formulas for the Fourier transform of the density and the
screening density are obtained by decomposing the integrals into one-dimensional
integrals.
Comparing equation (54) with equation (63), it is obvious that the two equa-
tions should be identical. Clearly by definition we have that
(Goo(r—c)||PuvR(r')) = Auvr(C), (66)
and by inspection we must also have that
(Ga(r—c)||Puvr(w)) = Avya(C), (67)
[ar ere Fl pran(r) = [abr Fp ,n() (68)
and
1 1 1
2-50 + (69)
Equation (68) is easily derived from the knowledge that the only dependence on 2
of fdr ec! p® .(r) is of the form e~@'/*%, Equation (67) effectively says that the
potential A®, ,(7r), created by the screening density p®, p(r) is equal to the Coulomb
76
repulsion between the original density pyyr(r) and a normalized screening Gaussian
function ga(r), whose exponent is given by equation (69). Note that those two
identities, which are proven in Appendix B, hold only because the densities are
expanded in term of Gaussian functions. .
In actual calculations we find it more desirable to fix as the screening expo-
nent instead of a. This makes equation (63) more desirable than equation (54). As
shown previously, the combined density puvr(r)— p%, p(r) has the property that the
generated field falls off as e~("-P )” from the center P of the density. This makes
possible the implementation of an efficient cutoff scheme in constructing VR as in
equation (63).
The Coulomb field J(r) created by the total electron density p(r) can be
obtained from:
SR LR
I(r) = 3) VaERDur + >) VisRDur (70)
uvR uvR
where V,5% is equation (63) used for the SR contributions of the density and VL%
is equation (56) used for the LR contributions of the density. Replacing V3% and
VLE by their respective formulas, the final expression for the Coulomb field is:
SR
I(r) = S> So (Auvr(r — B') — AB a(t — B'))Duvr (71)
uvR R!
SR
1 1.
+ 2, SevnDeor( _ Van
4n iar P?*(G) + p*®(G)
Y ° G?
cell Go
where the FT of the SR component of the density is:
SR
p*(G) = So Dave fdr! pS a(r' (72)
uvR
77
and the FT of the LR component of the density is:
DR
pL®(G) = S° DuwR per eO' 5 aR(r’). (73)
uvR
IV. Grid Generation
The basic principle used in designing grids for numerical integration has been
the divide and conquer strategy. The particular strategy we used was first intro-
duced by Becke. The idea is to replace three-dimensional analytical integrations
over the whole space by a set of single center numerical integrations. The single
center integrations are performed over atom centered grids using spherical coordi-
nates. For example, the Coulomb matrix element involving basis functions, X,(r)
and Xyr(r),
Juor = / Pr Xu(r)I(r)Xer(r), (74)
is replaced by a numerical integration:
JuuR = > Waltag a R')Xu(tTeg ~ R')J (rag ~ R')Xvr(rag ~ R’), (75)
R'ag
where R' labels the unit cells, a the atoms in the unit cell and g the gridpoints
associated with that atom. The gridpoints ra, are given by:
Tag = batty, (76)
where €, is the atom position and r, the gridpoint location relative to the atom
position. The weights W, can be separated into two components:
(1) the projection weight, W2,,;(Tag — R’), which depends on all the other atoms
and which controls how the multicenter integration is reduced into atomic
integration by projection,
78
(2) the atomic weight W?,,,,(r,), which only depends on the location of the grid-
point with respect to the atom center.
A. Atomic Projection Function
In order to ensure that the volume is conserved, the atomic projection functions
w are normalized at each of the gridpoints:
proj
So wer lr) = 1. (77)
aR
This only ensures volume conservation in the limit of infinite number of gridpoints,
but it remains a very close approximation for finite grids. Thus the Becke weight
of point r with respect to atom a is given by:
wal(r) = P,(r)
: dear Pa(r — R)’
(78)
where the P,’s are the unnormalized atomic projection function. We want the
projection function P,(r) to have the following properties:
(1) near unit value close to the atom a,
(2) near zero value close to every other atom.
In Becke’s approach this is done by choosing P, as a product of pairwise functions:®
atoms
Pa(r) = [] 25(Ha0(r)). (79)
ba
The pair projection functions are given by:
s9(4) = 5(1— pa(n)) (80)
where
ps(u) = (e(e(u))) plu) = 5-50? (81)
79
and the ja, are the hyperbolic coordinates:
a el
Had(r) = ‘lea — 6)” (82)
where rz =r — €, and r, = r — ). This approach is motivated as follows: the pas
function goes linearly from —1 on atom a to +1 on atom b along the a — b axis.
With the use of the p(j) function this is transformed into a function which has the
same value at atoms a and b but with zero derivatives. The function p(y) is the
simplest polynomial function with those properties but it varies too slowly. The
recursive application of function p is used to obtain the desired sharpness in the
intermediate region (i.e., z ~ 0). Finally equation (106) is used to obtain a function
varying between [0,1] in going from atom b to a.
We have improved on the above scheme in two ways:
(1) As defined, equation (82) doesn’t decouple atoms a and b from each other
as the distance |£, — &| becomes large. Physically we expect that the two
atomic grids should decouple totally when they are far away from each other.
Empirically we have found that limiting the denominator of equation (82) toa
maximum value of 5.6 Bohr has the desired result. Thus we used the following
expressions
[ral — Irs]
Reut
Has(r) = —1 tf |ra| —|rs| < —Reut,
Bat(r) = if [ral —|ral| < Reut, (83)
Has(r) =1 if |ral— |r| > Reut,
whenever |f, — & | > Recut = 5.6 Bohr.
(2) In equation (81) the recursive function can only be applied an integer number
of times. Becke has shown that only the function p3() has the right sharpness.
80
We have found it desirable to add to s3() a function that is non-zero only
near pp = 0 but that also contains an adjustable parameter. So we use the
following expression for s3(j) instead of equation (80):
2s(4) = 5(1 — pals) + 5 sin(p(u)n), (84)
where reasonable values for § are in the range 0.03 — 0.07.
By adjusting Rcut and @ for a series of molecules, we have improved the accuracy
by more than an order of magnitude with our choice of atomic grids.
B. Atomic Spherical Grid
The atomic grids used for the single center integration have spherical symme-
try. The grid is formed as the product of a radial grid times an angular grid. The
angular distribution of gridpoints is based on quadratures developed by Lebedev.1?
The Lebedev grids have O;, symmetry and are very efficient with respect to inte-
grating spherical harmonics on a sphere. The radial gridpoints are obtained from
an exact quadrature for Gaussian function in one dimension. By choosing the radial
positions, r;, and the radial weights, w;, as follows:
it = 77» Wi = PP log(y), (85)
then the following numerical integration:
—ar?
Iai = So wiritte ‘, (86)
can be made very accurate. Accuracy better than one in part in 107° is obtained
for a wide range of a and I ( a between 0.08 and 1000.0, and J from 0 to 5 ) by
81
choosing r; = 0.00001 , y = 1.18 and using 72 radial points. The final atomic grid
have on the order of 3000 to 4000 gridpoints.
In order to have an efficient cutoff scheme for skipping negligible contributions
the grid must be cut into small blocks of about 200 gridpoints each. This is done
by cutting the atomic grids into 6 wedges (one along each six directions). Each
wedge is then cut into a group of shells with thickness less than some maximum
value (4.0 Bohr). This scheme is well adapted to incorporate the effect of symmetry
(see Appendix C).
V. Self-Consistent Procedure
A. Initial Kohn-Sham Orbitals
The self-consistent procedure consists of the following two steps:
(1): form a new effective potential from the current density and generate the as-
sociated Hamiltonian,
(2): obtain the new Kohn-Sham orbitals from diagonalization of the Hamiltonian
and construct a new density.
The cycle can be started by having a trial guess for either the potential or the
density. Typically the sum of the atomic densities is used to construct the initial
density. We have used a different approach where the initial Kohn-Sham orbitals
are chosen as the best bonding orbitals. This was found to be a better starting
point for the SCF procedure. The initial Kohn-Sham orbitals are constructed as
follows:
(1): On each atom in the first unit cell a set of atomic orbitals, {6,,}, is created
from an atomic SCF calculation using the occupation-averaged Hamiltonian.
82
The {6,,} are partitioned into core and valence orbitals {8°°"*}, {6°%"}.
(2): The Bloch coefficients for the core orbitals, C6°"*(k), are obtained directly
from the coefficients, 64n, of the core atomic SCF orbitals
Cauees(k) = ager. (87)
The necessary normalization is done in step (4) below together with the valence
Bloch orbitals.
(3): In general the dimension of the valence space formed by the atomic SCF
orbitals is larger than the valence space for the actual system because of
fractional occupation (e.g., in diamond we have 8 atomic SCF valence orbitals
but only 4 occupied valence orbitals in the band structure ). Thus we need
to reduce the number of valence orbitals. As a criteria we choose the linear
combinations of atomic valence SCF orbitals having the largest overlap. This
is good first approximation at forming the best bonding orbitals.
gral
The overlap matrix between the valence atomic SCF orbitals, 6%", is evaluated
as follows:
Sam = > On Suv(k) vm: (88)
The matrix is then diagonalized ~
S=U\U'. (89)
The eigenvectors with the largest eigenvalue, \,,, are selected and saved
CI (k) =D) Cim(k)Umn- (90)
(4): The guess orbitals, {C9%°**}, are orthonormalized by Schimdt orthogonaliza-
tion. The orbitals are ordered with the core orbitals first (ordered by atomic
83
orbital energy) and then the valence orbitals (ordered by eigenvalue from di-
agonalizing the valence AO overlap matrix) and then Schmidt orthogonalized.
B. Preconditioned Conjugate-Gradient (PCG)
The preconditioned conjugate-gradient (PCG) method is used to directly min-
imize the total energy. Much progress has been made in the application of the CG
methods to self-consistent field calculations using plane wave basis sets.* The issues
encountered were: (1) orbitals normalization constraints, (2) the expense of line
minimization, (3) slow convergence due to the large number of degrees of freedom
and the large bandwidth of the energy spectra. For GTO basis sets, the same issues
are present but some solutions are different.
Recall the standard CG algorithm to minimize a function f(z) in the space
spanned by the multidimensional vector z:
1) start with initial position vector 2“),
2) evaluate the gradient vector, g(™ = —Vf(2‘™) for the current iteration (m),
3) evaluate the CG vector, A({™ = g(™ 4 y(™-Dal™—-1) | where
(m-1) _
ai gi™—1) x glm=4)?
with the initial condition that y© = 0,
4) doa line minimization along the CG direction h{™) and obtain the new position
vector 2(™+1)_
Steps (2) through (4) are repeated until the magnitude of the gradient is small
enough.
In ab initio calculations,* the function to minimize is the total energy Ezota
and the multidimensional position vectors are the orbitals $iz(r). For each orbital
84
\dik >, a gradient vector is formed
lgizn >= —(H — cx) |Oiz > (91)
where H is the Kohn-Sham Hamiltonian and where the presence of the orbital
energy €ik =< iz|H|¢i, > enforces the orbital normalization. The CG direction is
given by
\hik >= |giz > +n lhin > (92)
where
1 _
a 93
and the primed quantities, |hj, >, |gi, >, Yj,, refer to the previous iteration. At
this point the CG vector is made orthogonal to all the occupied orbitals to preserve
the orthogonality constraint between the different orbitals:
Jrik >= [hin > — D0 |b jk >< Gyal hin >. (94)
Jj
The new orbital, |¢3°" >, is obtained by doing a line minimization to find the best
mixing angle 6 between the old orbital |@;, > and the CG vector |hi, >:
bon >= bj > cos(@) + |hiz > sin(8). (95)
The expensive line minimization can be avoided in ab initio calculations by
evaluating the first and second derivative of the energy with respect to the mixing
angle:
OE
9 le=0 = 2Real{ < hizt|H|@i > } (96)
and
OE
567 le=o =< hiz|H|his > — < ikl bin > +4 ecx (97)
85
where ¢;, is small and related to derivative of the field:
Sei, = 2Real{ < hin | (A + eae ae >}. (98)
The angle that minimizes the energy is then given by:
_ OE 1@°E
The first derivative is obtained for free since the Hamiltonian is available. The same
is true for the first two terms of the second derivative. Only the small term 6¢,;
requires extra computation. For calculations with plane waves, the extra cost is
small and the second derivative (including the Sein term) is evaluated analytically.*
For calculations with GTO’s, the cost of building a new field and evaluating ¢e;, is
too expensive. Instead we use the following empirical expression:
8 unocc. | < hin |H|bjx > |?
ben = f y Shallllha > wea (100)
where the sum is over the unoccupied orbitals ~;;,(r) of the current Kohn-Sham
Hamiltonian and the constant f is introduced to account for the approximate nature
of be; x.
Beside avoiding the line minimization, we also used a preconditioning
technique.”” Here the idea is to precondition the gradient in such a way that it
becomes parallel to the direction obtained by a second-order perturbative solution
of the Hamiltonian. For calculations with plane waves this is difficult to achieve
and only the diagonal kinetic energy contribution is corrected. For calculations with
GTO’s, this can be easily obtained by using a Green function like operator:
occ.
j V (cie- < vel lbje >) +
86
where the sum is over the occupied orbitals 4;;,(r) of the current Kohn-Sham Hamil-
tonian and where w describes the energy scale over which the mixing between or-
bitals occurs. One would expect the scale w to be of the order of the gap €gap at the
beginning of the self-consistent loop and it should become smaller close to conver-
gence. Empirically, we have found the following expression for w to be satisfactory:
W = Egap/| < giklhiz > |?, (102)
which has the property that close to convergence
Ww = Egap(Eik— < bye|HlPjx >)’. (103)
VI. Results and Discussion
A. Grid Optimization
Molecular Hartree-Fock calculations were performed to optimize the grid pa-
rameters. The grid parameters were only optimized for first row and hydrogen
atoms. In these calculations the exchange and Coulomb operators were evaluated
over the grid and integrated numerically. The resulting exchange and Coulomb
matrices over atomic orbitals can then be compared to the exact matrices obtained
with standard HF theory. The parameters controlling the grid generation are:
(1) y which controls the spacing of the radial atomic shells (equation 85),
(2) Nang(ri) which controls the number of angular points for each radial shell,
(3) Reut which controls the point of separation between two atoms in a pair
projection function (equation 83),
(4) 6 which fine tunes the shape of the pair projection function (equation 84).
87
The parameter y was first optimized for a single atom. With y chosen to be 1.20,
all Gaussian functions with exponents in range 0.08 to 10000.0 were integrated to
better than 10—’. The first radial shell was chosen to be equal to 10.0 Bohr. All
the other shells were then obtained by dividing the previous radial shell position by
y and the series was terminated at 0.0001 Bohr.
For small molecules up to benzene (see Table 1) the angular parameters
Nang(7i) were adjusted. This was performed using past experience in construct-
ing grid for the pseudospectral method.?® Only a few angular points were required
near the core. The number of angular points was then increased up to a maximum
value of 194 in the bonding region, where large nonsymmetric fluctuations can oc-
cur. For larger distances the number of angular points is reduced again to smaller
values. The radial space was separated into five regions with boundaries at 0.0,
0.18, 0.4, 0.6, 3.3 and 12.0 Bohr and the optimized number of angular points in
each region was found to be 12, 26, 50, 194 and 116.
The pair separation parameter, Rcut was then optimized for larger molecules
(porphine and Cgq). Up to this point the Coulomb field was not screened, i.e., the
nuclear attraction potential was evaluated analytically. The largest errors occurred
for Coulomb matrix elements involving long range basis functions. Adding the
nuclear attraction field to the Coulomb field effectively made the resulting field
short range (for the covalent molecules used here) and the error was reduced by a
factor proportional to the number of atoms in the molecule. In order to integrate
the diverging part of the nuclear attraction potential to sufficient accuracy the value
of the smallest radial shell was reduced to 0.00001 Bohr.
At this point the total energy error for Cg was about 0.6 eV per atom. The
88
8 parameter was then introduced and was adjusted at the same time as Reut and
y. The final total energy error was brought down to about 0.1 eV per atom with
y = 1.18, recut = 5.6 Bohr and 8 = 0.07 for first row atoms and 0.03 for hydrogen
atoms. The final atomic grid has 3740 gridpoints per atom. The effect of the
parameter @ on the pair projection functions can be seen in Figure 1 where pair
projection functions, s3(w), 82() and s;() (defined similarly to equation(85)), are
plotted along the line in between the pair of atoms ab. The effect of 8 = 0.07 is
seen on the line labeled s3 + sg4aq which is markedly tighter than the original s3
functions.
As can be seen in Table 1 the final accuracy of the total energy is very good,
and we expect the same or better accuracy for calculations with the local density
approximation since the exchange-correlation field is shorter range than the HF
exchange operator. Beside the total energy, we are also interested in the energy of
the HF orbitals. For glycine and benzene the orbital energy error was less than 0.3
meV for all states within 30.0 eV of the vacuum level; for the Cgo molecule, the
error was less than 3.5 meV.
B. Calculations on Diamond Crystal
The diamond allotrope of carbon has a close packed fcc structure with two
atoms per unit cell. In diamond each carbon atom is bonded tetragonally to its
four nearest neighbors. This leads to a covalently bonded insulator with an indirect
band gap of 5.47 eV.®
The theoretical electronic properties of diamond have been extensively studied.
We compare our results to three studies, each based on a different methodology and
believed to be well converged in terms of basis set:
89
(1) LCAO, Linear Combination of Atomic Orbitals,
(2) LAPW, Linear Augmented Plane Waves and
(3) PS-PW, Pseudopotential Plane Waves.
These three studies have used the same exchange-correlation potential (Hedin-
Lundqvist?°), the same experimental lattice constant of 3.567 A, and the same
number of special k points*! for the integration over the Brillouin zone.
The LCAO calculation*? uses a real space method with Gaussian type orbitals.
The Coulomb field is evaluated on a physical space grid by using both a spline cutoff
algorithm for the core part of the density and a fast Fourier transform algorithm
followed by linear interpolation for the non-core part of the density. The total
field (Coulomb plus exchange-correlation plus nuclear attraction) is then fitted to S
Gaussian functions of the form r"e~®" where n = 0,—1. Most fitting functions are
located on atoms, with a few located on bond centers. The matrix elements of the
Hamiltonian are then evaluated analytically using the fitting functions. Their basis
set contained 27 Gaussian functions per atom (6s5pld), but they did not report the
exponents of the Gaussian functions and the coefficients used for contracting them.
The LAPW calculation**® used 175 PW’s per unit cell, an atomic radius of 1.42
Bohr and an expansion of up to / = 8 inside the sphere. The PS-PW calculation**
used 300 PW’s using a norm-conserving pseudopotential with angular component
up to 1 = 2.35
Table 3 contains the energy levels for the valence and conduction bands at the
I (000), X (100) and L (5, 5, $) points and the band structure along high symmetry
directions is shown in Figure 2. In both cases, the zero of energy is taken as the
top of the valence band, I'25. The first two columns of Table 3 give the results of
90
our calculations with both the Dunning/Huzinaga (DH) double zeta basis and the
Pople 6-31G basis set. These two basis sets are standard bases used for molecular
studies. The core 1s orbital is represented by a single contracted basis function
expanded in term of 6 or 7 Gaussian functions. The 2s orbital is described in term
of two basis functions, with the inner one contracted with 2 or 3 GTOs and the
other one uncontracted; and similarly for the 2p orbitals. The exact exponents and
contraction coefficients are given in Table 2. Both basis sets were complemented by
a single set of Gaussian d function with exponent of 0.75 and 0.8 respectively.
The agreement between all five calculations in Table 3 is good. The average
energy difference between any pair of calculations is about 0.04 to 0.06 eV except for
the LAPW calculation which has consistently a larger energy difference, 0.15 to 0.18
eV. The average difference between with DH and the 6-31G basis set calculations
is only 0.036 eV.
The largest differences occurs at the X points, 27(1,0,0), of the conduction
band. The lowest point, Xi, is very close to the energy minimum of the conduction
band which is along the [-X direction. At this point the LCAO and PS-PW disagree
by 0.15 eV.
We calculate a band gap of 4.04 eV in good agreement with PS-PW value
of 4.05 eV. This is about 30% smaller than the experimental value of 5.47 eV. It
is well known that LDA calculations consistently underestimate the band gap by
20%-40%. To improve on this may require ab initio methods.
When the d polarization functions are removed from the DH and the 6-31G
basis sets, the total energy increases (less strongly bonded) by 25 meV and 23
meV respectively and the energy levels change on average by 0.18 and 0.16 eV
91
respectively. We should note that we used Cartesian Gaussian functions. This
means that the d polarization set contains six functions, five spherical harmonics
for 1 = 2 angular momentum and one spherical harmonic for | = 0 (corresponding
to the 2? + y? + 2? linear combination). Looking at the Mulliken population for the
DH basis set, the population for the 2”, y”, z? orbitals is 0.0058 and the population
for the ry, xz, yz orbitals is 0.0133. Because of the local Tg symmetry around
2 2? orbitals only contribute through the A, irreducible
a carbon atom the 2”, y
representation, i.e., 2? + y? + z? linear combination, whereas the zy, xz, yz orbitals
all contribute in the triply degenerate T, irreducible representation. The z?+y?+2?
orbital should have a smaller effect on the energy levels than the zy, rz, yz orbitals
because most of the 0.0058 population is due to overlap with the already occupied
l = 0 space whereas the 0.0133 population reflects the addition of new degrees of
freedom for | = 2. Thus we conclude that the presence of d functions is necessary in
the calculations. This is in contrast to the conclusion from the LCAO calculations*?
where the d functions (1 = 2 only) were found unnecessary.
We also have performed an optimization of the lattice constant (see Figure
3) for the 6-31G basis set. The resulting lattice constant is 0.6% smaller than the
experimental value of 3.567 A. This is similar to other LDA calculations where
errors of up to a few percent are reported.*®
C. Calculations on Cgo Crystal
The second allotrope of carbon we studied is the recently discovered crystalline
form of Cgo.°’ This has sparked much interest because of the superconducting be-
havior of the related compounds A3_zB,Cgo with A,B= K,C,,Ry.9°-*! The molec-
ular form of Cgp also has attracted much interest because of its possible chemical
92
and biological properties.
Due to the high symmetry of Ceo (In symmetry) it is not surprising that the
crystalline form of Cgo is the fcc close packed structure.** At high temperature the
Ceo molecules rotate freely around their center and the structure is strictly fcc.
Below 249K, the molecules become rotationally ordered*? and the crystal is then a
simple-cubic lattice with four Cgo molecules at (000), (0, 5,5), (5,0, 5) and (3, 3,0)
each with a different orientation.
Ab initio calculations of the simple-cubic lattice structure, which has 240
atoms per unit cell, are too expensive and all ab initio studies have been performed
using an hypothetical fcc structure having a single Cgo molecule per cell*® which
are oriented in such a way that the resulting space group is of cubic symmetry.
The space group is T?(Fm3) and the point group symmetry around a Cgp cen-
ter is reduced from I, symmetry (with 120 operations) to T;, symmetry (with 24
operations). Because of this reduction in symmetry, the Cgo molecule is free to dis-
tort within the T, symmetry restriction. In practice the distortion is very small**
and can be neglected. Thus both the molecular and crystalline structure of Cgo
can be described by two parameters: R, is the short carbon-carbon bond distance,
R, ~ 1.41 4 , and Rg» is the long C-C bond distance, Rz ~ 1.45 A.
Loosely speaking the shorter bond corresponds to a double C-C bond (¢ and
m bonds) and the longer bond to a single C-C bond (o bond). Because of the
curvature of Cgo, neither the pure sp* hybridization of diamond or the pure sp”
hybridization of graphite are possible. Hence there are no pure o or x bonds in this
allotrope of carbon. The structure of Cg can be described as having 12 pentagons
of singly bonded carbon atoms where each pentagon is connected to its five nearest
93
neighbors with a double C-C bond.
For crystalline Cgo, the experimental values for Ri and Rz are 1.40 A and 1.45
A from NMR data*® and 1.391 A and 1.455 A from x-ray data.*® Theoretical values
are 1.38 to 1.39 A and 1.44 to 1.45 A (with the hypothetical fcc (Fm3) structure
with one Ceo per unit cell) using plane waves LDA.**47
For our calculations on Cgo we have used the values of R; = 1.414 A and R2 =
1.455 A from molecular simulations using a graphite force field.*® The calculations
were performed using both the Dunning/Huzinaga and 6-31G basis sets which have
9 functions per atoms (see Table 2) for a total of 540 basis functions per unit cell.
For the 6-31G basis set, the integration over the Brillouin zone was done with one
or two special k points.*! The resulting orbital energy differences were less than
0.01 eV for energy levels within 20 eV of the Fermi energy. Thus calculations with
two k points are necessary only for evaluating sensitive properties like the density
of states or structural properties.
The band structure of Cgo using the 6-31G basis set and two k points is given
in Figure 4 where the zero of energy is the top of the valence band. The band
structure reveals a direct band gap insulator with a energy gap of €gapx = 0.97 eV
at the X point, 27(1,0,0). This about 50% smaller than the gap of ~ 1.85eV found
by microwave conductivity experiments*® and much smaller than the values of 2.3 to
2.6 eV inferred from direct and inverse photoemission experiments.°°~>? In Figure
4, the largest band dispersion is always at the X point for the three bands shown.
The band width of the valence band, labeled H,, from its I, molecular symmetry, is
Wu. = 0.82 eV and the band width of the two conduction bands are Wri, = 0.61
eV and Wri, = 0.73 eV.
94
Also of interest is the energy gap at the [' point, 2*(0,0,0), which has a value
of €gapr = 1.69 eV. At the [ point the top three levels of the valence band are
degenerate forming a T,, irreducible representation (from the local T, symmetry)
with an angular character of | = 5 (with respect to the center of Cgo).** The first
conduction band also has the same irreducible representation and angular character
and one can expect those two bands to be affected in the same way by the crystal
field. Indeed comparing molecular Cgo with crystalline Cgo, the gap changes by 0.01
eV which is less than 1%. In contrast, the splitting T,-T, at the I point between
the two conduction bands decreases by 0.33 eV in going from the molecular form to
the crystalline form. The same observation also applies to K3Cgo where the band
GaP €gapr changes by 0.02 eV and the T,-T, splitting decreases by 0.29 eV when a
pressure of 3 G Pa is applied.**
The Cgo calculations were repeated with one k point for both the 6-31G and
the Dunning/Huzinaga basis sets. The resulting band gaps and band widths were
very similar to the 6-31G calculation with two k points. The changes were less than
0.01 eV and 0.02 eV respectively. Thus the two basis sets give almost identical
results, just as observed for diamond.
The band parameters from the Cgo calculations are summarized in Table 4,
along with theoretical results obtained by four other groups [all using the hypo-
thetical fec (Fm3) structure]. The two plane wave calculations, PS-PW(1)°* and
PS-PW(2)**, use similar numbers of plane waves, 27000 and 28000, and both use
norm conserving pseudopotentials with s and p components. Note that the size of
the plane wave basis set is about 45 times larger than the GTO basis sets we use.
The difference between the two pseudopotentials is in the size of the core radii over
95
which the norm conversation is applied. For the PS-PW(1) calculation, the core
radii, r, ~ rp ~ 0.4A, are about half the size of the core radii used in the PS-PW(2)
calculation, rs ~ rp ~ 0.8A. Larger core radii lead to a smoother, more well-behaved
pseudopotential,** but also to a larger approximation. The pseudopotentials were
generated for the ground state configuration of carbon, 2s?2p?, and the resulting
pseudopotential only has s and p components. These two calculations agree well
with each other although the geometry and the nature of the exchange-correlation
used in PS-PW(1) was not stated. The Ceperley-Alder exchange-correlation was
used in PS-PW(2).
The PS-LCAO calculation*? also uses a norm conserving pseudopotential to
represent the core 1s electrons. Two uncontracted Gaussian functions were used
to represent the valence 2s and 2p orbitals. The exponents are 0.299 and 2.908 for
the 2s orbital and 0.362 and 2.372 for the 2p orbitals. The smaller exponents are
substantially larger than the exponents in the 6-31G basis set (0.16 for s and p) and
the Dunning/Huzinaga basis set (0.15 for s and 0.11 for p). Small exponents are
known to be essential for calculations on finite molecules in order to describe the
outer part of the valence orbitals. Eliminating these small exponents should lead
to an artificially large band gap like (as observed in the PS-LCAO calculation). In
contrast, we obtained the same band gap for two basis sets where the smallest p
function exponents vary by 40%.
The LCAO calculation®® is an all electron calculation with the carbon 18,2s,2p
orbitals represented by 16 GTOs with exponents in the range 0.15 to 50000. The
GTOs are contracted to 5 or 9 basis functions per carbon atom. The matrix elements
of the Kohn-Sham Hamiltonian were evaluated by a fitting procedure where both
96
the density and field are fitted. However, this calculation uses the Wigner exchange-
correlation potential rather than the Ceperley- Alder exchange-correlation potential.
Just as for the PS-LCAO calculation, their gap €gapr is about 0.2 eV larger than our
value of 1.6 eV (which is close to the two PS-PW values). On the other hand, our
values for the different band widths are consistently larger by 20-30% as compared
to the other four calculations.
The band width is directly related to the interaction between nearest neighbor
Ceo molecules. Taking into account the values of Ri, Re and the lattice parameter,
the smallest distance between two carbon atoms located on different Cgo clusters
is 2.94 A for the PS-PW(2) and our calculations, and 3.09 A for the LCAO and
PS-LCAO calculations. Because of the exponential decay of the orbital the 5%
difference in distance should translate into a much larger effect on the band width.
Assuming a pp hopping integral between nearest neighbor Cgo the band width can
be parametrized as follows:°®
W = W.(d/d,)e~(4-40)/4
where reasonable values of L are in the range 0.54 A to 0.63 A. Thus the 5% increase
in distance translates into a decrease of 20-25% in band width. This may explain
the difference between our band widths and the LCAO band widths.
We have not included the d polarization functions in the Cgo calculations.
Because of the large value of their exponents, 0.75-0.8, the d functions should have
negligible effect on the band width. However, the d functions can affect the gap
for intramolecular interactions which would result in rigid shift of the bands. This
incompleteness of the basis set can be estimated by comparing calculations on molec-
ular Cg. Table 5 compares the energy levels obtained with using the 6-31G basis
97
with the energy levels obtained with a well converged (triple zeta) basis set.°’ The
comparison is very good. The energies of the highest occupied molecular orbital
(HOMO) and of the lowest unoccupied molecular orbitals (LUMO) are both shifted
upward by 0.02 eV in the calculation with the 6-31G basis set. A HOMO-LUMO
gap value of 1.68 eV is obtained in both calculations. For higher energy unoccupied
orbitals, the energy difference between the two calculations is as large as 0.21 eV
for the first 4 eV. For the lower occupied orbitals, the energy difference is at most
0.11 eV for the levels within 10 eV of the HOMO orbital energy. The absolute
value of the HOMO energy we obtained, 5.92 eV, can be used to estimate the first
ionization potential (IP) of Ceo. The experimental first IP of Ceo is 7.6 eV.58—*?
The comparison can be greatly improved by including orbital relaxation effects.>”
Turning our attention back to the crystal calculation, Table 6 compares energy
levels at the [ point with the energy levels for the PS-PW(2) calculations (setting
the top of the valence band at 0.0 eV). Over the whole valence band energy spec-
trum, the worst energy difference is 0.3 eV. Near the Fermi energy the comparison
is even better, with the band gaps within 0.02 eV, and the splitting of the molecular
H, HOMO into E,-T, within 0.03 eV. For the conduction band the difference are
larger, as expected. Not included in Table 6 are five energy levels, at A,4.2, A,6.50,
A,6.97, A,7.21, T,7.43 and T,8.30 eV, which are observed in the plane wave calcu-
lations. Those levels where associated with nonmolecular states, i.e., states where
the density is located at interstitial positions: the center of Ceo clusters, the inter-
stitial fec tetrahedral and octahedral positions. For the 6-31G basis set, no such
level is identified in the first 10 eV. For the Dunning/Huzinaga, two such levels are
observed, A, at 7.82 eV and A, at 8.24 eV. Clearly the smaller exponents in the
98
Dunning/Huzinaga basis are much better at describing such delocalized states.
VII. Conclusions
We have presented a method for doing ab initio electronic structure calcula-
tions on large systems using localized Gaussian type orbitals in the local density
approximation. This method avoids the density fitting procedure by evaluating an-
alytically the Coulomb field using a dual-space approach that scales as N?*. The
dual-space approach is made possible by screening perfectly each component of the
density matrix and by the efficient computations of different Gaussian integrals us-
ing a recursion relation scheme. Avoiding the density fitting procedure makes the
algorithm more compatible with current parallel computer architecture.
The matrix elements of the Kohn-Sham Hamiltonian are evaluated by breaking
up the integration in real space over an atom centered grid using a projection
technique. Test calculations on molecular systems as large as Cgo demonstrate the
accuracy of the grid generated. The total energy error was less than 0.2 meV per
atom using 3700 gridpoints per atom.
Band structure calculations on diamond were shown to be very accurate using
Dunning/Huzinaga or 6-31G double zeta basis sets with d polarization functions.
For both basis sets, the energy levels were converged to within 0.02 eV over a large
spectrum of energy. The optimized lattice parameter was found to be very close to
the experimental value (0.6% smaller) using only 30 basis functions per cell.
For crystalline Cgo, we showed that the energy levels near the Fermi level were
well described (to within 0.1 eV) using a double zeta basis sets (540 basis functions
per cell). The Dunning/Huzinaga and the 6-31G basis sets were found equivalent
to within 0.03 eV for all valence bands. For the conduction bands, the differences
99
were an order of magnitude larger. The smaller exponent in the Dunning/Huzinaga
basis set lead to an extra band at 7.8 eV which was associated with a nonmolecular
state. Adding less than 15 Gaussian functions at the center of the Cgo cluster and
at the fcc octahedral and tetrahedral interstitial positions should make possible the
correct description of the nonmolecular states which were better described in the
plane wave calculations at the price of a very large basis sets (28000 basis functions
per cell).
100
Appendix A. Exchange-Correlation Potential
We use the exchange-correlation potential based on the Quantum Monte Carlo
simulation of the interacting electron gas done by Ceperley and Alder?? which was
parametrized by Perdew and Zunger.”! The parametrization of Perdew and Zunger
gives the correct low-density and high-density limits. The final formula for the
exchange correlation energy is:
€zc(p) = f(p)€x(p) + €c(p) (A.1)
t?3 relativistic correction, and
where e€,(p) is exchange energy, f(p) is the Bachele
€-(p) is the correlation energy. The exchange energy €,(p) is approximated by the
average exchange energy in the HF free electron gas
0.4582
€z(p) = ; (A.2)
where r, is related to the density by
() =() (A3)
rs(p) = inp) .
The relativistic factor is:
3/(1+ 6B)? In(B+(1+6?)2)\’
with 8 = 0.0140/r,. Finally, the parametrization of the correlation energy €,(p) is
given by:
—0.1432 ,>1
cle) = 570500 yr, 4058887, == (4.5)
€-(p) = —0.048 + 0.0311/n(r, )
— 0.0116 r, + 0.002r, In(r,) rT. <1.
101
Appendix B. One-Electron Integrals
With Cartesian Gaussian functions, efficient recursion formulas can be used
to evaluate the different one-electron integrals: overlap integral S,,, kinetic energy
integral T,,, , Fourier transform integral puy»(G), and two-center potential integral
Auv(r). Recursion relations for S,,,Tuv, Auv(7) integrals have been derived using
three-center overlap integrals.’ Traditionally, the most expensive integrals are the
nuclear attraction integrals for which the evaluation has been shown to be very
efficient using recursion relations.
The purpose of this Appendix is to show how to obtain the recursion relations
for the Fourier transform integrals, pu,(G) and p®,(G), of the density pu»(r), and
of the screened density p%,(r), and the two-center potential integrals A®,(r) for
the screened density p®,(r). In the dual-space approach, the screening exponent is
chosen to balance the cost of the Fourier transform integrals with the cost of two-
center potential integrals. Thus we need to have efficient evaluation of the Fourier
transform integrals.
For Fourier transform integrals involving cubic lattices, it is very advanta-
geous to separate the x,y,z components of the integrals because for each pair
of basis functions, we want to evaluate pu.(G) for many G vectors of the form
G = (nz,ny,nz) AG. In this way, the one-dimensional integrals over the z,y,z
components can be reused to evaluate all the possible py,(G) integrals. For non-
cubic-lattice, the one-dimensional integrals are more numerous but the final product
of the three components, the expensive part, doesn’t change.
Surprisingly the recursion formulas for the puy(G) and A,,(r) integrals only
need to be slightly modified for the integrals involving the screened density p®,.
102
Section B.1 gives the basic definitions for Gaussian functions. Sections B.2
and B.3 respectively gives the recursion relations for the S,, and Ty, integrals.
Those relations are only given here for the sake of completeness. Sections B.4 and
B.5 respectively give the recursion relations for the Pur(G), Auv(r) integrals, and
their modifications for the screened density.
B.1 Basic Definitions
The normalized Cartesian Gaussian function is given by
Xur(r) = Nur (@ — 2u)!*(y — yu)! (2 — au) eee rhe) (B.1)
where , = (2u, Yu, Zu) is the origin of the Gaussian function, | = (I;,l,,1,) defines
its angular momentum, and a, its exponent. The normalization constant, Nui,
which makes the integration of |Xui(r)|? over space equal to one, is given by
= (2a, /m)9/? (4a, et ytl2 3
Nut = (ai — 1)N(21, — 1)!(21, — a) (B.2)
The product putm(r) of a pair of Gaussian functions, Xy:(r) and Xym(r), is also
a Gaussian function centered on the line joining the two centers €, and éy:
Pulwm(t) = NutNome ee bebe) Me — 24) (y — yu) (z — zu)”
x (@ — ay)™*(y — yy) ™Y(z — zy) ETOP), (B.3)
where Yu» = Qy + @y is the sum of the two basis function exponents and P is the
product center which is given by
Oubu + ayky
Yuv
P=
103
The density product can also be written in terms of derivatives as in equation (59):
& or oF —Yuv (r—P)?
where the explicit form of the coefficients c? 7c} cf ; will never be needed. The asso-
ciated screening density is given by:
aye O 8 A op pe, Qa
Putom(t) = NuNom >. eF Chey OP: aPi OPE ve CrP) (Tt (B.6)
ajk
where {2 is the screening exponent. Equations (B.5) and (B.6) will only be used to
derive the recursion relations for the integrals involving the screening density.
B.2 Overlap Integrals
The overlap integral between a pair of normalized Gaussian functions is given
by:
Sulum = jer Xui(r)Xum(r). (B.7)
The integral can be separated into its z,y,z components:
Sulum = NuiNom (l., mz|[ly, my|[lz, mz], (B.8)
where the one-dimensional integral over the z component is given by
to° 2 2
(lx, M2] = / dz (x — 2,)'e~%(F-Fe) (gy — gy) MB ee (B—B0)” (B.9)
and similarly for the y and z components. Two special identities can be derived
from the form of equation (B.9):
1) The first identity comes from the invariance of the overlap under translation
of both Gaussian function
Qou[le +1, mz] + 2ay[le,mz +1] =Ip[le —1, me] + me[lz,mz — 1). (B.10)
104
To prove the above equation start with the following identity
+00 2 2
0= / dz 2 ((e ~ ay )'Pe~ eu (tBu) (a — ty)? eee (t-te) ) (B.11)
—cO
which depends on the Gaussian functions being 0 at too. Applying the partial
derivative 2 to each term in (B.11) leads directly to (B.10).
2) The second identity,
(z, — 2,)[lz,mz] = [lz + 1,m,] — [lz,mz + 1], (B.12)
is related to the fact that the integrand in equation (B.9) is only a function
of (z — ,,) and (x — z,) which is easily obtained by placing (x, — 2,,) under
the integration and by writing it as (« — z,) — (x — 2,).
When equations (B.10) and (B.12) are combined to zero out the [l,,mz +1]
contribution, the recursion relation that describes how to increment the value of I,
is obtained:
l, m
I, —1, x
Dy. mz] +
[lz + 1, mz] = (Pe —2u)[lz,mz] + Daa!
Iz,mz—1]. (B.13)
Equations (B.10) and (B.12) can also be combined to zero out the [J, + 1,mz,]
contribution. This results in the recursion relation that describes how to increment
the value of m,:
[le -1,mz]+—[l,mz—1]. (B14)
In practice the above recursion relations are used as follows:
1) Evaluate [0,0] = (1/Yuv)'/? Eep(—ayay(ru — ty)? /Yuv)
2) Evaluate all [1,,0] using equation (B.13) (where the last term can be skipped
since it is zero ).
105
3) Evaluate all [1,,m,] using equation (B.14):
Loop over all /,.
Loop over all m, > 0.
Use equation (B.14).
4) Repeat steps 1-3 for the y,z components.
5) Get Sutum from equation (B.8).
B.3 Kinetic Energy Integrals
The kinetic energy matrix integral between two normalized Gaussian functions
is given by
Tulum = -5 jer dulV’ bum, (B.15)
in atomic units. After integration by part and separation of the z,y,z components,
equation (B.15) can be rewritten as
Nu Num
where
boo 2 2
I, = (le, mz] -| dx (x — 2,)*e~ (2-24) (g — 2, )™e—%(F-2) (B17)
—oo
which is the one-dimensional overlap integral obtained previously and
J, = [l,m] (B.18)
_ Fee 0 In —a,(2—2,)" 0 mz ,—a,(2—2,)"
= [. dz x — @,)'"e ) Pa — ty)" e ’
which is the one-dimensional kinetic energy integral. Similar expressions are defined
for the y,z components.
1)
2)
106
Two special identities can be derived from the form of equation (B.18):
The first identity comes from the invariance of the kinetic energy integral
under translation of both Gaussian function
2au[l2 + 1, mz] + 2a,[l.,mz + 1] = l2[l, —1,mz] + mz[l.,m_ — 1], (B.19)
which is derived in the same way as in the overlap integral.
The second identity is related to the fact that the integrand in equation (B.18)
is only a function of (x — z,) and (2 — 2,)
(lz, mz] _ Ollz, mz]
6]
(ay —2x)[lz, mz] = [lz +1, mz] —[l.,m2+1]+ On, On,” (B.20)
where the derivatives are given by:
Olle, tm) _ Qaulle + 1,mz] —le[le — 1, mz] (B.21)
Oz,
and
Ole, mal = 2ay[lz,mz +1] —mgll2,mz — 1). (B.22)
Equation (B.20) is obtained by placing (x, — zy.) under the integration and
by writing it as (x — 2) —(z—2,). The first term, 2 — zy, leads to
+00 Fe)
dz (x — 2) 5-((@ = ay)iterestenn))
~oO
) 2
a _ maz i—a,(z—-Zy)
x a ((e Ly) "e ): (B.23)
where an integration by part of the first derivative gives directly equation
(B.20).
107
Equations (B.19) and (B.20) can be combined to zero out the [l,, mz +1] and
[?2,mz +1] contributions. This gives the recursion relation that describes how to
increment the value of I,:
l m
z » Thy || = a Ly l,, z ——[l, — Ng — z) Ms, —
(?. + 1,mz] = (Pz — 2u)] m=] +57 I ime] + ol m, — 1]
4a,a, l,
+ 7 (i, +1,mz] —- (?, —1,mz] }. (B.24)
Equations (B.19) and (B.20) can also be combined to give the recursion relation
that describes how to increment the value of m,:
l, zx
[l,m + 1] = (Pz ~ ev) fle, m2] + = [le — 1m] + Flies -q
Aad, Mz
,, zr ~~ L,, xz -—1 . B.
a ( me +1) ~ Ula, ! (B.25)
In practice the above recursion relations are used as follows:
1) Evaluate [0, 0] using
[0, 0] = (0, 0/22" (1 — 24 e»)*)
UY Yur
2) Evaluate all [/,, 0] using equation (B.24).
3) Evaluate all [/,,m,] using equation (B.25):
Loop over all [,.
Loop over all m, > 0.
Use equation (B.25).
4) Repeat steps 1-3 for the y,z components.
5) Get Tyiom from equation (B.16).
B.4 Fourier Transform Integrals
108
The plane wave integral between a pair of normalized Gaussian functions is
defined as
putem(G) = / Br Xuilt)Xvm (ner
= utNom|[le|e*?** |ma] [ly |e*C"# my] (1, |e* 7 [mz], (B.26)
where
: +00 . 2 2
[1 |e*C** |m,] = / dx e'G#*(g—x,,)'Pe~ tut -F4) (g_ 9, Me“ Av(t—-Be) | (B97)
Two special identities can be derived from the form of equation (B.27):
1) The first identity comes from the invariance of the integral under translation
of both Gaussian functions (except for the e*©** term)
Qay[le + Lle*F**|m,] + 2ay [lp|e*C2*|mz +1] = (B.28)
iG, (I, |e*C2* |me] + le[le — 1\e*?**|mz] + mz[le|e*F*7|m, — 1).
To prove the above equation, start with the following identity
+00 6] . 2 2
0 -/ dz 5 (or (e-au)itereniere (2—a2y)"7%e— eee) ), (B.29)
which depends upon the Gaussian functions being 0 at too. Applying the
partial derivative a to each term in (B.29) leads directly to (B.28).
2) The second identity,
(xy — 2,)[lz|e*C*? |mz] = [Ip + 1le*F**|m,] — [l,|e*C*”|m, +1], (B30)
is related to the fact that the integrand in equation (B.27) is only a function
of (x — z,) and (z — z,) which is easily obtained by placing (2, — z,,) under
the integration and by writing it as (z — z,) —(# — zy).
109
When equations (B.28) and (B.30) are combined to zero out the [I,|e*©=*|m, + 1]
contribution, the recursion relation that describes how to increment the value of I,
is obtained:
7 em
[le + 1le**?| me] = (Ps — tu + > [te |e*°** |g] (B.31)
l, iGor Me iGzt
+ yan (1, — lle |mz| + Dan (1,|e lm, — 1).
Equations (B.28) and (B.30) can also be combined to zero out the [l, + 1le*@=*|m,]
contribution. This results in the recursion relation that describes how to increment
the value of m,:
. ie ;
[Ip|e@**|m, +1] =(P, —2y + - [le |e*@** |g] (B.32)
l, L x tG,r
4 5 [lz — 1le*C**|mz] + my [le|e#*?|m_ — 1).
In practice the above recursion relations are used as follows:
1) Evaluate [0|e*@2*|0] = [0, Oje?@=P2 e~Ge/4%mv,
2) Evaluate all [/,,0] using equation (B.31) .
3) Evaluate all [l,,m,] using equation (B.32):
Loop over all [,.
Loop over all m, > 0.
Use equation (B.32).
4) Repeat steps 1-3 for the y,z components.
5) Get put,»m(G) from equation (B.26).
B.4.1 Fourier Transform Integrals for Cubic Lattices
In dealing with crystalline systems, the plane wave integrals are used to eval-
uate the electron density in reciprocal space. The plane wave integrals are required
110
for a large number of G vectors. But for cubic lattices, the G vectors can be written
in the following form:
Gn = (nz, ny, nz)AG (B.33)
where the AG’s are the vector steps in each direction. In that case, the 1D plane
wave integrals
[lele"C**|mz], —[lyle*°#¥|m,], and [ls [e'** |r, ] (B.34)
can be evaluated in advance, i.e., before entering the loop over the 3D G vectors.
That keeps the operations count within the loop over the 3D vectors to a minimum.
The algorithm is as follows:
1) Evaluate range of nz,n, and n, needed for 1D integrals.
1) Evaluate all 1D integrals using the recursion relations.
2) Evaluate all 3D integrals:
Loop over all 3D G vectors.
Find (nz, ny, nz).
Multiply the three 1D integrals together.
B.4.2 Screened Fourier Transform Integrals
The recursion relation for the Fourier transform integrals involving the screen-
ing density are the same as for the un-screened density except that the recursion
relations are started with:
[Ole*@=* [0] = [0, O]e*@= Pe e~ Gz /49 (B.35)
instead of
[Ole*F=*|0] = [0, O32 P2 e~ Ge /4 700 (B.36)
111
To prove the above assertion, start with the screening density, pulum(r), writ-
ten as in equation (B.6) and look only at the z component of the integration:
Q 1 3 ,
I? = ( )2 Sof Dp: [as tOrreOe-Po
Yur
Qs iGez,-0 2? 2 Oo ia.P,
1 a 1
_ (= )8 eo G2/40 Soe apie (B.37)
For the normal density, pui,um(r), just replace 2. everywhere by yyy to obtain:
— ¢™_y\4 -G? /4yue z a iG, P,
I, = (—) € 7 d i apie (B.38)
The proof is completed by noting that the only difference between equations (B.37)
—G? /40
and (B.38) is in the prefactor term, e versus e~G2/4%u0
B.5 Two-Center Potential Integrals
The two-center potential integral between a pair of normalized Gaussian func-
tions is given by
Xui(r! )Xym(r'
Autm(") = / dep! Kult )Xom(r') (B.39)
: nd
The recursion relation is obtained by inserting the following identity:
1 2 °° 2 4\2
—_—_ = — du e% (T-7) BA
rr" al a (3-40)
in the above equation and by using the three-center overlap recursion relation similar
to equation (B.13). The end result is:?
n 1 nr
AP n(C) =< Ku i+6(0)| Gy lXom(t) >)
112
= (Pi — ui) < Xu46,| r= erate m >”) (B.41)
+ (P; - Ci) < Xue; a |Xym >oeth)
C|
+ li
uv u, Fi |r _ C| vym Ub oO; |r _ C| uUym
Ir —C|
(C) is the integral we are interested in, the C' is the point where the
i 1
+ ” (< Xut| — sy |Xo,m-6; >™ _ < Xy i———S |Xym— 6; sen),
Qyuv lr —C|
where Ae ym
potential is evaluated and the starting point of the recursion are the following:
1 uv
< Xa,t=o| 7G Mvm=o >= a(7e* )? < Xu l= o|Xu, m=0 > F. n(l) (B.42)
with T = yu»(P —C)? and
F,(T) = / dt #2" Tt (B.43)
Many methods have been used to evaluate the F,(T)’s based on either two formulas
* or a seven formulas scheme.® With the advent of
seven-term Taylor expansion
better computers, it becomes much cheaper to use a two formulas method based
on a simple interpolation scheme.®° This method involves fewer operations for each
value of F,(T) desired, but requires storing 10000 values of F,,(T') at preselected
values of T. In order to improve the accuracy, the interpolation is done using the
variable x = 1/(1 +) which covers the interval [0,1] and over which the 10000
values of z are spaced equally. In order to reproduce the asymptotic behavior of
F(T),
tole
_ (2n—1)! 1
the function F,(T(z)) x T"+1/? is stored for T > 1 instead of F,(T(z)). The final
absolute accuracy is better than 10~° for any value of T.
113
In addition the careful evaluation of F,(T), the evaluation of the two-center
potential. integrals can be sped up by computing the integrals in a rotated frame
where the atoms &, and , lie along one of the axes.*' In that frame, 2/3 of the
prefactors of the form (P; — é4,:) and (P; — C;) are now Zero.
B.5.1 Screened Two-Center Potential Integral
The recursion relations for the two-center potential integrals involving the
screening density are the same as for the un-screened density except that the func-
tion Fr(yu»(P — C)?) is replaced by (Q./7uv)'/? Fx(Q(P — C)?) in equation (B.42).
To prove the above assertation, start with the two-center potential integral
Ae, (c) = fate’ Prtum(*")
In— 0
and replace the screening density by equation (B.6):
(can a Sl
Q = ..
ijk
Q .3 / e7 2 (r—P)?
x(—)?2 | dr ——__—
oe lr—C|
where
— yz
Cizk = NutNome; C5 Cj.
Using the following relation:
~Q(r—P)? »)
3 é _ 40 _ 2
fea = 5 Fo(a(P - 0)?)
equation (B.46) becomes:
a a ak og
Q _ . $ _ ny
ijk
(B.45)
(B.46)
(B.47)
(B.48)
(B.49)
114
Applying one of the partial derivatives leads to:
op (Q(P —C)?)= [a ‘at Ont (B.50
oP: ° ~ OP: © 50)
[1/2] pi-2r
=il(-)) S° AG a mye Fi -2(0 (P —C)’).
In the above equation the function F,(0(P — C)*) is always multiplied by a factor
of the form 2”. For the un-screened density put,ym(r) the function Fi(yu» (p—C )?)
is always multiplied by a factor of the form y?,. The proof is completed by noting
that there is an extra factor of (- a — yi ? in equation (B.49).
As stated in equation (67), the screened potential A®,,(C) is equal to the
ul,vm
integral (ga(r—c)||Pul,vm(r')) [ defined in equation (55) ] where the exponent of the
screening Gaussian density ga(r — C) [ defined in equation (45) | is related to
by equation (69). For s basis functions, i.e., 1=0 and m=0, the identity can be
reduced to:
; st en 2 (7-C)? o~ yun (2° Py s ea (r'—P)?
St far [as 28 fate OO wat
—r'|
which can be proven by using identity (B.40) on both sides and performing the
integrations over r and r'. The proof for higher angular momentums is done by
applying the derivatives 6°/OP!, 07 / OPJ and 6* /OP# to equation (B.51) and using
equations (B.5) and (B.6) that defines the screened and un-screened density.
115
Appendix C. Symmetry
C.1 Convention for Symmetry Operations
The application of symmetry operation | on a 3D vector r is given by:
r, = U,(1) r+ ti, (C.1)
where U,(1) is the 3 x 3 rotation matrix for symmetry operation 1.
The application of a symmetry operation on atom a position &, is given by
Eat + Rat = Up (I) €a + ti, (C.2)
where 4; labels the rotated atom position translated into the first unit cell and Ra;
is the lattice vector used for doing the translation.
C.2 Rotation of Cartesian Basis Functions
In dealing with basis functions, we will make use of the above equation with
a basis function index instead of an atom index
where £, and &,; are atom positions of the original basis function u and of the
rotated basis function u;. In rotating Cartesian basis functions, the effect of the
rotation on the angular part of the basis function must also be taken care of. For S
basis functions, the rotation matrix U‘) is just a unit scalar. For P basis functions,
the rotation matrix UENe;) is identical to the 3 x 3 matrix U,(l) used to rotate 3-
dimensional vectors. For D basis functions, if the six Cartesian functions, x”, y’,
z*, ey, 2z, yz, are written as (ij) then the 6 x 6 rotation matrix is given by:
(D) (eo yy P) yp?)
Osgyet) = wo oe OLAENO) (C.4)
kl
116
where the sum is over all the permutations of the pair (kl) and N;;) is the normal-
ization constant of the Cartesian function represented by (17).
In a similar manner, the ten F basis functions z°, y*, z°, 2?y, 27z, y*x, y?z,
z*a2, 2*y, zyz, are written as (ijk) and the 10 x 10 rotation matrix is given by:
(F) _ | Namn) (P) y7(P) _77(P)
UCisky(tmn) = 4) Fy, Wey nU ym) U(eyn)° (C5)
(i7k) (Imn)
After all the angular rotation matrices U(5), U(P), U() and U‘*) are built,
the total rotation matrix, T, that transforms all the basis functions at once can be
generated.
C.3 Rotation of Matrix Elements
The effect of symmetry on the Hamiltonian matrix elements H,,R =<
u|H|vR > is given by
Hwy er =T Hur Ty; (C6)
where T; is the rotation matrix for the basis functions associated with symmetry
operation I and u', v' and R’ are the rotated indices of the original product (uvR).
The full expression for R’ is
R' = 0,(l) R+ Roi — Rut. (C.7)
The above equation is valid not only for the Fock operator but also for any other
operator including the identity (which leads to evaluating overlap matrix).
For density matrix elements, Dyyr, the effect of symmetry is given by
Duwp =TE Dur Tr, (C.8)
117
where Ty is the rotation matrix associated to the inverse of symmetry operation I.
The above can be easily derived from the following identity
Trace(Duuk Suk) = Trace( Duy R! Suy R’): (C.9)
Beside using the rotations described in equations (C.6) and (C.7), we will also make
use of their opposites
Ayr = Ty Ayy R Ty (C.10)
and
DuvR = Ty Duy R Th. (C.11)
C.4 Application of Symmetry to Grid Generation
For each atom in the first unit cell an atomic grid is generated. By using sym-
metry only the grid for the non-equivalent atoms is saved. Those atomic grids are
further reduced by using the symmetry operations that leave the atom unchanged.
The weights of the remaining gridpoints are changed to include the contribution of
all the other gridpoints that can be obtained by symmetry.
10.
11.
12.
13.
14.
15.
16.
17.
18.
118
References
. J.H. Weaver, J.L. Martins, T. Komeda, Y. Chen, T.R. Ohno, G.H. Kroll, N. Troul-
her, R.E. Haufler and R.E. Smalley, Phys. Rev. Lett. 66, 1741 (1991).
. R. Car and M. Parrinello, Phys. Rev. Lett. 55, 2471 (1985).
M.C. Payne, M.P. Teter, D.C. Allan, T.A. Arias and J.D. Joannopoulos, Rev. Mod.
Phys. 64, 1045 (1992).
I. Shavitt in Methods in Computational Physics, edited by B. Alder, S. Fernbach
and M. Rotenberg (Academic Press, New York, 1963), Vol2.
H. Taketa, $8. Huzinaga and K. O-ohata, J. Phys. Soc. Jap. 21, 2313 (1966).
L.E. McMurchie and E.R. Davidson, J. Comp. Phys. 26, 218 (1978).
S. Obara and A. Saika, J. Chem. Phys. 84, 3963 (1986).
A.D. Becke, J. Chem. Phys. 88, 2547 (1988).
. G. te Velde and E.J. Baerends, J. Comp. Phys. 99, 84 (1992)
M.R. Pederson and K. A. Jackson, Phys. Rev. B 41, 7453 (1990).
V.I. Lebedev, Sib. Mat. Zh. 18, 132 (1977); Zh. Vychisl. Mat. Mat. Fiz. 16,
293 (1976); Zh. Vychisl. Mat. Mat. Fiz, 15, 48 (1975).
P.P. Ewald, Ann. Phys. (Leipzig) 64, 253 (1921).
C. Pisani, R. Dovesi and C. Roetti, “Hartree-Fock Ab Initio Treatment of Crys-
talline Systems,” Lecture Notes in Chemistry Vol. 48 (1988).
H. Sambe and R.H. Felton, J. Chem. Phys. 61, 3862 (1974).
B.I. Dunlap, J. Andzelm and J.W. Mintmire, Phys. Rev. A 42, 6354 (1990).
G. te Velde and E.J. Baerends, Phys. Rev. B 44, 7888 (1991)
T.P. Hamilton and P. Pulay, J. Chem. Phys. 84, 5728 (1986)
M.P. Teter, M.C. Payne and D.C. Allan, Phys. Rev. B 40, 12255 (1989).
19
20
al.
22.
23.
24.
20.
26.
27.
28.
29.
30.
31.
32.
33.
34.
35.
36.
37.
38.
39.
119
P. Hohenberg and W. Kohn, Phys. Rev. B 136, 864 (1964).
W. Kohn and L.J. Sham, Phys. Rev. A 140, 1133 (1965).
J.P. Perdew and A. Zunger, Phys. Rev. B 23, 5048 (1981).
D.M. Ceperley and B.I. Alder, Phys. Rev. Lett. 45, 566 (1980).
G.B. Bachelet, D.R. Hamann and M. Schluter, Phys. Rev. B 26, 4199 (1982).
F.E. Harris in Theoretical Chemistry Advances and Perspective edited by H. Eyring
and D. Henderson (Academic Press, New York, 1975).
M.P. Tosi, Solid State Physics 16, 107 (1964).
R.N. Euwena and G.T. Surratt, J. Phys. Chem. 36, 67 (1975).
P.E. Gill, W. Murray and M.H. Wright, Practical Optimization, 1981.
R.A. Friesner, J. Phys. Chem. 92, 3091 (1988).
C.D. Clark, P.J. Dean and P.V. Harris, Proc. R. Soc. London, Ser. A 277, 312
(1964).
L. Hedin and B.I. Lundqvist, J. Phys. C 4, 2064 (1971).
D.J. Chadi and M.L. Cohen, Phys. Rev. B 8, 5747 (1973)
S.C. Erwin, M.R. Pederson and W.E. Pickett, Phys. Rev. B 41, 10347 (1990).
S.-H. Wei, H. Krakauer and M. Weinert, Phys. Rev. B 32, 7792 (1985).
W.E. Pickett and C.S. Wang, Phys. Rev. B 30, 4719 (1984).
G.B. Bachelet, H.S. Grenside, G.A. Baraff and M. Schluter, Phys. Rev. B 24, 4736
(1981); B 24, 4736 (1981).
N. Troullier and J.L. Martins, Phys. Rev. B 43, 1993 (1991).
K. Kratschmer, L.D. Lamb and K. Fostipopoules, Science 242, 1139 (1988).
A.F. Hebard et al., Nature (London) 350, 600 (1991).
M.J. Rosseinsky et al., Phys. Rev. Lett. 66, 2230 (1991).
40.
41.
42.
43.
44.
45.
46.
47.
48.
49.
50.
51.
52.
53.
54.
55.
56.
57.
58.
120
S.P. Kelty, C.-C. Chen and C.M. Lieber, Nature (London) 352, 233 (1991).
K. Tanigaki et al., Nature (London) 352, 220 (1991).
P.A. Heiney et al., Phys. Rev. Lett. 66, 2911 (1991).
5. Saito and A. Oshiyama, Phys. Rev. Lett. 66, 2637 (1991).
N. Troullier and J.L. Martins, Phys. Rev. B 46, 1754 (1992).
C.S. Yannoni et al., J. Am. Chem. Soc. 113, 3190 (1991).
W.LF. David et al., Nature (London) 353, 147 (1991).
Q.-M. Zhang, J.-Y. Yi and J. Bernholc, Phys. Rev. Lett. 66, 2633 (1991).
Y. Guo, N. Karasawa and W.A. Goddard, Nature (London) 351, 464 (1991).
T. Rabenau, A. Simon, R.K. Kremmer and E. Sohmen, Z. Phys. B 90, 69 (1993)
T. Takahashi et al., Phys. Rev. Lett. 68, 1232 (1992)
J.H. Weaver, P.J. Benning, F. Stepniak and D.M. Poirier, J. Phys. Chem. Phys.
53, 1707 (1992)
R.W. Lof, M.A. van Veenendaal, B. Koopmans, H.T. Jonkman and G.A. Sawatzky,
Phys. Rev. Lett. 68, 3924 (1992)
J.L. Martins and N. Troullier, Phys. Rev. B 46, 1766 (1992)
E.L. Shirley and $.G. Louie, Phys. Rev. Lett. 71, 133 (1993).
W.Y. Ching, M.-Z. Huang, Y.-N. Xu, W.G. Harter and F.T. Chan, Phys. Rev.
Lett. 67, 2045 (1991).
S. Satpathy, V.P. Antropov, O.K. Andersen, O. Jepsen, O. Gunnarsson and AI.
Liechtenstein, Phys. Rev. B 46, 1773 (1992).
B.I. Dunlap, D.W. Brenner, J.W. Mintmire, R.C. Mowrey and C.T. White, J. Phys.
Chem. 95, 8737 (1991).
D.L. Lichtenberger, K.W. Nebesny, C.D. Ray, D.R. Huffman and L.D. Lamb, Chem.
121
Phys. Lett. 176, 203 (1991).
59. J.A. Zimmerman, J.R. Eyler, S.B. Bach and S.W. McElvany, J. Chem. Phys. 94,
3556 (1991).
60. R.A. Friesner J. Chem. Phys. 86, 3522 (1987).
61. M.N. Ringnalda M. Belhadj and R.A. Friesner, J. Chem. Phys. 93, 3397 (1990).
122
Table 1. Accuracy of HF total energy molecular calculations using numerical in-
tegration.
Molecule Number of Numberof Total Energy Error/atom
atoms bas. func. error (mH) (mH)
Oxygen 2 30 0.004 0.002
Water 3 25 -0.007 -0.002
Carbon Dioxide 5 45 -0.018 -0.006
Methane 12 35 -0.016 -0.003
Benzene 12 120 0.022 0.002
Glycine 10 100 0.024 0.002
Glutamine 20 200 0.072 0.004
Porphine 38 430 0.160 0.004
Ceo (In) 60 540 0.206 0.003
Ceo (Ta) 60 540 0.255 0.004
123
Table 2. Exponents and contraction coefficients for carbon atom in Dunning/ Huzinaga
and Pople 6-31G double zeta basis sets.
Dunning /Huzinaga‘ Pople /6-31G“)
Exponent Coefficient Exponent Coefficient
first s orbital
Ss 4233. 0.001220 3047.5249 0.0018347371
S 634.9 0.009342 457.36952 0.014037323
S 146.1 0.045452 103.94868 0.068842622
S 42.5 0.154657 29.210155 0.23218444
S 14.19 0.358866 9.286663 0.46794135
S 5.148 0.438632 3.163927 0.36231199
5 1.967 0.145918
second s orbital
Ss 5.148 -0.168367 7.8682723 -0.11933242
NS) 0.4962 1.060091 1.8812885 -0.16085415
S 0.54424926 1.1434564
first p orbital
P 18.16 0.018539 7.8682723 0.068999067
P 3.986 0.115436 1.8812885 0.31642396
P 1.143 0.386188 0.54424926 0.74430829
P 0.3594 0.640114
third s orbital
S 0.1533 1.0000000 0.16871448 1.000000000
second p orbital
P 0.1146 1.0000000 0.16871448 1.000000000
(a) S. Huzinaga, Ed., Gaussian Basis Sets for Molecular Calculations, Elsevier:
Amsterdam, 1984.
(b) W.J. Hehre, R. Ditchfield and J.A. Pople, J. Chem. Phys. 56, 2257 (1972);
P.C. Hariharan and J.A. Pople Theor. Chim. Acta 28, 213 (1973); M.S. Gordon,
Chem. Phys. Lett. 76, 163 (1980); M.J. Frisch, J.A. Pople and J.S. Binkley, J.
Chem. Phys. 80, 3265 (1984).
124
Table 3. Diamond energy levels in the valence and conduction bands for several
different methods.
This Work
Root’) DH 6-316 LCAO@ LAPW) Pps-pwth
Ty 21.35 -2142 -2134 — -21.06 -21.38
Tos! 0.00 0.00 0.00 0.00 0.00
Tas) 5.53 «5.50 5.57 5.51 5.51
Ty 13.55 13.52 13.56 13.13 13.56
X, 12.64 -12.70 -12.66 -12.48 12.67
xX, 6.30 -6.33 6.34 6.18 6.31
X, 4.70 4.66 4.79 4.68 4.64
X, 16.60 16.62 16.71 16.41 16.81
Ly 15.51 -15.57 --15.52—s~-15.33 -15.53
Ly 13.38 -1344 -13.41 -13.14 13.47
Ly: 2.82 -2.83 -2.81 2.73 -2.81
Ls 8.36 8.38 8.45 8.33 8.37
Ly 9.01 9.01 9.07 8.75 8.97
(a) All energies in eV with I’, shifted to 0.0 eV. All calculations done with a = 3.567
A, Hedin-Lundqvist exchange-correlation potential and 10 special k points.
(b) Dunning/Huzinaga basis set (9s5p/3s2p) plus 0.75 d function.
(c) Pople 6-31G basis set (10s4p/3s2p) plus 0.8 d function.
(d) Linear combination of atomic orbitals, Ref. 32.
(e) Linear augmented-plane-wave, Ref. 33.
({) Pseudopotential plane wave, Ref. 34.
125
Table 4. Comparison of Cgo fcc (Fm3) crystalline calculations for several different
methods.
This Work’*) PS-PW(1)) PS-PW(2)') LCAO@) PS-LCAOC)
Geometry (A)
R, 1.414 - 1.382 1.40 1.40
R, (fA 1.455 - 1.444 1.46 1.46
Lattice (9) 14.00 - 13.88 14.20 14.20
Energies (eV)
Gap X (*) 0.96 /0.96 1.04 1.18 1.34 1.5
Gap r 1.69/1.69 1.6-1.7 ‘1.71 1.87 1.9
Wa, 0.84/0.83 0.6 0.58 0.55 0.42
Wrin 0.62/0.62 0.5 0.5 0.5 0.4
Wri, 0.74/0.73 0.6 0.5 - -
Eezch.—corr. CP - CP Ww cP
Pseudopotential No Yes Yes No Yes
(a) With 6-31G basis set and Dunning/Huzinaga basis set.
(b) Pseudopotential plane waves, small core radii ~ 0.4.4, Ref. 54.
(c) Pseudopotential plane waves, large core radii ~ 0.8A, Ref. 44.
(d) Linear combination of atomic orbitals, Ref. 55.
(e) Pseudopotential linear combination of atomic orbitals, Ref. 43.
(f) Carbon-carbon distances in short bond and long bond for Cgp.
(g) Lattice constant.
(h) Direct gaps at X and T points. Lowest gap is at the X point.
(i) Widths of the valence band, Wy, and first two conduction bands, W71, and
Wrig-
(j) Exchange-correlation potentials: CP for Ceperley-Alder and W for Wigner.
126
Table 5. Energy levels (eV) in molecular Cgo using Ceperley-Alder exchange cor-
relation potential.
Symmetry 6-31G LCGTO™ Aec(®)
Gg -0.12 -0.11 -0.01
Hu -0.61 -0.56 -0.05
Hg -1.94 -2.07 0.13
T2u -2.04 -2.25 0.21
Tlg -3.25 -3.15 -0.10
Tlu (©) -4.24 -4.26 0.02
Hu) = -5.92 -5.94 0.02
Gg -7.05 -7.12 0.07
Hg -7.17 -7.24 0.07
Hu -8.70 -8.78 0.08
Gu -8.87 -8.83 -0.04
Hg -9.17 -9.09 -0.08
T2u -9.31 -9.38 0.07
Gu _-10.18 -10.10 0.08
T2g = -10.57 -10.58 0.01
Hg ~~ -10.73 -10.70 -0.03
Gg -10.88 -10.85 -0.03
Hg = --11.32 -11.26 -0.06
Gu -11.67 -11.75 0.08
Tlu = -11.84 -11.84 0.00
Ag -12.24 -12.44 0.20
T2u = -12..42 -12.47 0.05
Tlu = -13.12 -13.05 0.07
Hu —_-13.36 -13.25 0.11
(a) Linear combination of Gaussian Type Orbitals basis set (11s7p/5s3p) with 0.6
d functions, R; = 1.39A and R» = 1.45A, Ref. 57.
(b) Orbital energy difference €g_31¢ — €nceTo
(c) Lowest unoccupied molecular orbital.
(d) Highest occupied molecular orbital.
127
Table 6. Energy levels (eV) at the I’ point in crystalline Cgo for Ceperley-Alder
exchange correlation potential.
This...Work This... Work
Label DH“) 6-31G) PS-PW) Label DH 631G PS-PW
Ag -18.74 ~—--18.73 -18.77 Tg -5.05 -5.08 -4,93
Tu = -18.15 = -18.13 -18.15 Eg -4.90 -4.92 -5.00
Tg -17.30 -17.29 -17.26 Tg -486 -4.88 -4.59
Eg -17.27 -17.25 -17.22 Tu -4.30 -4.32 -4.11
Tu -16.24 -16.22 -16.10 Au -4.30 9 -4.31 -4.09
Au -15.34 = -15.36 -15.38 Au -3.48 —-3.47 -3.60
Tu -15.30 = -15.29 -15.33 Fg -3.39 -3.41 -3.19
Eg -14.21 -14.21 -14.14 Tu -3.37 -3.38 -3.45
Tg -14.06 -14.07 -13.98 Tg -3.33 -3.36 -3.12
Ag -13.60 = -13.61 -13.58 Eu = -3.01 ~—_ -3.03 -2.77
Tg -13.56 -13.56 -13.54 Tu -3.01 -3.02 -2.76
Eu -12.10 = -12.12 -12.02 Tu --2.54 —--2.53 -2.63
Tu -12.08 -12.09 -11.99 Eg -1.83 -1.83 -1.85
Tu -12.00 -12.01 -11.87 Ag -1.56 -1.52 -1.62
Tu -10.92 -10.94 -10.92 Tg -1.04 = -1.03 -1.08
Tg -9.92 -9.94 -9.83 Tg -0.94 -0.95 -0.97
Eg -9.80 -9.82 -9.75 Eu -0.25—_ -0.26 -0.23
Tg -9.75 —_ -9.78 -9.56 Tu 0.00 0.00 0.00
Ag -9.44 -9.45 -9.42 Tu 1.68 1.69 1.71
Ag -8.36 -8.37 ~8.25 Tg 2.25 2.26 2.37
Tg -8.24 -8.27 -8.13 Tu 3.63 3.64 3.47
Eu -7.57 — -7.59 -7.37 Eg 4.11 4.08 4.05
Tu -7.47 — -7.49 -7.27 Tg 414 4.14 4.10
Tu -7.25 — -7.28 -7.11 Eu 483 4.81 4.97
Ag -6.94 -6.95 -7.11 Ag 5.21 5.24 5.15
Tu -6.55 -6.58 -6.53 Tu 5.37 5.39 5.52
Au -6.00 —_-6.03 -5.83 Au 5.80 5.75 5.88
Tu -5.97 -6.01 -5.80 Tg 6.15 6.15 6.24
Eg -5.51 -5.55 -5.37 Tu 6.33 6.22 6.42
Tu -5.46 -5.50 -5.56 Tg 8.23 7.17 7.45
Tg 5.40 -5.39 -5.39 Tu 7.90 7.90 7.82
Tg 5.23 -5.25 -5.26 Eg 8.84 8.65 8.32
Ag -5.06 -5.09 -4.95 Tu 8.58 8.98 8.75
(a) Dunning/Huzinaga double zeta basis set.
(b) Pople 6-31G double zeta basis set.
128
(c) Pseudopotential plane waves, Ref. 44.
(d) Highest valence band (shifted to 0.0 eV).
129
Figure Captions
Figure 1. The diatomic projection functions along the line between the two atoms
for different recursion order: s;, 82 and 83. The projection function labeled 53 +8aad
includes the adjustable sine-like function.
Figure 2. The band structure of fcc diamond near the Fermi energy. The top of
the valence is shifted to 0.0 eV.
Figure 3. The total energy of diamond lattice as function of lattice parameter.
The experimental lattice parameter is 3.567 A which compares well to the calculated
value of 3.546 A.
Figure 4. The band structure of Cgo for the fcc (Fm3) structure near the Fermi
energy. The top of the valence is shifted to 0.0 eV.
130
1.257 T qT qT | v qT “T qT ee | T T T d
rr aN 2 1
on re. TTS |
c 0.75 NO 7 S343. 7
AS) L \ 7
® 0.50 : :
2 L Ne 1
Ss : PRS. :
& 0.00F So —
Q i 4
. A
-0.25+ 1 1 1 ] n 1 ! 1 1 Jo 1 i 1
- 1 -0.5 0 0.5 1
atom a
Figure 1.
131
Diamond Band Structure
(A9) AdIOU
Energy (mH)
Diamond Lattice
132
Optimization
-0.50
-1.00
-1.50
-2.00
-2.50
Le a OS
-3.00
i ! J
i L Jt l 1 L i 4 | i I 1 I I L Ll 1 L I 1 1 1 1
3.48
3.50
3.52 3.54 3.56 3.58
Lattice Parameter (Angstroms)
Figure 3.
3.60
3.62
133
C60 Band Structure
(A9) ABseuz
134
Appendix 1
Electronic Structure and Valence-Bond Band Structure
of Cuprate Superconducting Materials
135
R . S . °
Electronic Structure and Valence-Bond Band
Structure of Cuprate Superconducting Materials
YueEyin Guo, JEAN-MARC LANGLOIS, AND WILLIAM A. GODDARD III
Copyright © 1988 by the American Association for the Advancement of Science
136
Electronic Structure and Valence-Bond Band
Structure of Cuprate Superconducting Materials
YuEJIN Guo, JEAN-MARC LANGLOIS, WILLIAM A. GODDARD III
From ab initio calculations on various clusters representing the La;-,Sr,Cu,O, and
Y)Ba,Cu;O, classes of high-temperature superconductors, it is shown that (i) all
copper sites have a Cul! (4°) oxidation state with one unpaired spin that is coupled
antiferromagnetically to the spins of adjacent Cu" sites; (ii) oxidation beyond the
cupric (Cu!) state Jeads not to Cull!
but rather to oxidized oxygen atoms, with an
oxygen pz hole bridging two Cu" sites; (iii) the oxygen pz hole at these oxidized sites
is ferromagnetically coupled to the adjacent Cu"! d electrons despite the fact that this is
opposed by the direct dd exchange; and (iv) the hopping of these oxygen pz holes (in
CuO sheets or chains) from site to site is responsible for the conductivity in these
systems (N-clectron band structures are reported for the migration of these localized
charges).
T: ELUCIDATE THE MECHANISM RE-
sponsible for high-temperature su-
perconductiviry in various cuprates,
we carried out first principles quantum
chemical calculations on models of the
Laz_,Ba,Cu,O, (denoted 2-1-4) and the
Y,Ba2CujO; (denoted 1-2-3) classes of sys-
tems (J, 2). The resulting wavefunctions
indicate that electrical conduction in these
systems is dominated by hopping of oxygen
pt holes from site to site in the CuO sheets
and chains, and we report the band struc-
tures based on these valence-bond localized
states. In addition, there are important mag-
netic couplings berween spins on adjacent
copper atoms and between the conduction
electrons (oxygen pr holes) and the copper
spins that are critical in the superconductiv-
ity.
Electronic structure of the reduced (Cu!)
system. The electronic wavefunctions were
calculated with the generalized valence bond
(GVB), Hartree-Fock (HF), or configura-
tion interaction (CI) methods (3), Calcula-
tions (4~7) were carried out on finite clus-
ters as indicated in Fig. 1. In each case we
included explicitly all electrons on the atoms
shown plus the point charge approximation
to all other ions within about 8 A. The
sphere size was adjusted slightly (up to
0.2 A) so that balanced sets of ions were
included, with the outer boundary always
being oxygen. The charge on the outer layer
was scaled so that the whole cluster is neu-
tral. The number of explicit electrons corre-
sponds to having O77 and Cu** or Cu** at
each atom shown. The wavefunctions were
calculated self-consistently for each state.
We find in all cases (four-, five-, and six-
coordinate) that the optimum wavefunc-
Arthur Amos Noyes Laboratory of Chemical Physics.
California Institute of Technology, Pasadena, CA 91125.
896
tions have nine electrons in @ orbitals on
each copper, with the singly occupied orbit-
al pointing at the four short bonds. For
example, the singly occupied a-like orbitals
for the 1-2-3 chain are shown in Fig. 2, a
and b (cluster Cu3Oj9). The toral popula-
tion in d orbitals on the three copper sites is
a?'5 49-7, and d"5, The electrons formally
considered on O*” are partially shared with
the copper so that the total charge on the
copper is +0.41, + 0.23, and +0.41 for this
cluster (rather than +2, +2, and +2). Simi-
lar results are obtained for the 1-2-3 stub
(Cu3Oj2), where the d populations are A905,
d*"7 and d* (sotal charges 0.44, 0.45,
and 0.44). For the 2-1-4 system (Cu;0)}),
the d populations are d°"! and d*"! (total
charges -0.23 and —0.23).
With one singly occupied orbital per cop-
per, the Cu, systems lead to nwo low-lying
spin states (S = 0 and S = 1), and the Cu;
systems lead to three low-lying states
(S = 1/2, S = 1/2, and S = 3/2). We solved
self-consistently for each of these states us-
ing the GVB wavefunction and fitted the
resulting energies to a Heisenberg spin
Hamiltonian,
H=-2> YySi-S qQ)
For the Cuz cases, the singlet is lower
because the orbitals overlap — slightly
(o = 0.05), leading to an “antiferromagne-
tic” exchange integral J, where J = ~205 K
for the Cu,O,; sheet and J = ~244 K for
the Cu2O, chain. From the Cu; systems we
find that IJyzi < 0.1 K for second nearest
neighbors (o = 0.0006). In one and nwo
dimensions, a system described by Eq. 1
with negative J leads to an ordering (Nécl)
temperature of Tx = 0 K (8). Hence, long-
range order is not expected at finite T.
Experimental evidence on La,>CuQ,-, sug-
gests that there are strong short-range cou-
plings; however, evidence for long-range
order is conflicting, with one report of
Tn = 220 K (9). For the Cu; srub
(Cu3O43), the day: orbitals of the center
(chain) copper are orthogonal to the d,:~y:
orbitals of the top and bottom (sheet) cop-
per sites, leading to weak ferromagnetic
coupling Jug = +11 K).
Although each system has an array of
singly occupied copper d orbitals, it does not
lead to electrical conduction. The reason is
that the charge transfer state (Cul! «++ Cul!
— Cul-++ Cull!) is very high in energy
(about 10 eV for the Cu,O,,; cluster). In
band language, the system has a large Hub-
bard U parameter (one center electron-elec-
tron repulsion), leading to an exceedingly
narrow band.
The oxidised (“Cull”) system, The cuprates
exhibiting superconductiviry all are oxidized
further than Cu°*. Assuming no oxygen
vacancies, the copper of Lay.gsSro,jsCujOg
is 15 percent oxidized, and the copper of
Y;BayCu;O7 is 33 percent oxidized. How-
ever, other systems (for example, La,Ba-
CusO,3 and LasSrCugO)5) have similar lev-
els of oxidation but do not exhibit supercon-
ductivity (down to 5 K) (10). For all sys-.
tems discussed here we find that oxidation
of the Cul! (d%) leads nor to Cu!!! (48)
O-Cul'_O-Cu!'-o =
O-Cu"!_O-Cu!!_o (2)
but rather to oxidation out of an oxygen ptr
singly occupied orbital (see Fig. 2c) located
between nvo Cul! sites,
O-Cu!!~O-Cu!!~O >
O-Cu!-O*-Cul_O (3)
Thus, in the oxidized systems, all coppers
have the Cu"! (4°) oxidation state, bur each
oxidation leads to a singly occupied oxygen
pm orbital that is spin-coupled to the various
singly occupied copper d orbitals. We will
show below that hopping of such oxygen pw
holes from site to site leads to electrical con-
duction if the hole is on the proper oxvgen.
The Laz_,Sr¢CujO, system has two-di-
mensional sheets of copper and oxygen as in
0, °
| |
o—cuo—ctuo
oO, Os
where each Cu-O bond is 1.89 A. In addi-
tion, there are apex oxygens 2.40 A above
and below each copper. For x = 0, we find
that oxidation of the in-plane (sheet) oxygen
SCIENCE, VOL. 239
° °
: 0 i ©
a La, Sr, CuO, O>Gu— Os tu—0
SHEET oOo; 8:
° fe}
Cu,0,,+216 ions
is} °
2 2 ~
b Y,BaCu,0, Os pou O57 Gu O55 Gu—0
CHAINS
Cu,0,+254 ions
c Y, Ba CuO, Osrgu— On
STUB H
O%
O— Gu 0
° -
10
ox cio
Cu,0, +252 ions
Fig. 1. Clusters used in GVB calculations. The
array of point charges extends out to ~8 A from
cach copper (adjusted so that the outer shells are
oxygens). The charges on the point charges are
the nominal values (O?~, Y°*, Ba?*) excepr for
Cu where Cu*?: was used for Lay gsSro,.1sCuO,
and Cu*?? was used for Y;Ba;Cu3Oy. In addi-
tion, for La,_,Sr,CuQ, an averaged charge
Z = 3 ~ Vax was used for the LaSr system. The
number of explicit electrons was based on the
nominal charges [for example, (CusOj9)!4~ with
three Cu"). In all calculations, all occupied orbi-
tals are strongly bound (lowest IP = 5.4 eV).
is favored by 0.39 eV. Each of these oxygens
has two pr orbitals perpendicular to the Cu-
O-Cu axis, bur the pr orbiral in the plane of
structure I is the preferred one for oxidation
(by 0.4 eV). This leads to good electrical
conduction as discussed below. On the oth-
er hand, such a subtle change as replacing
part of the La** with Sr* can shift the
relative ionization potentials (IP) of the
apex and sheet oxygens significantly. Thus,
with x = 0.3, we find thar the apex oxygen
is preferentially oxidized (by 0.45 eV). We
believe that it is subtle shifts in relative IP
that Icad to a loss in superconductivity
above x = 0.3 and that this is responsible
for the semiconductor character of systems
such as La,BaCus0;3 and LasSrCugO 1s
(which have sheets or chains similar to the
2-1-4 and 1-2-3 systems). Thus, subtle
changes in the cations (charges or place-
ment) or in the structure (for cxample,
owing to higher pressure) might change
these semiconductors into superconductors.
For the Y,Ba2Cu3O7-_, system, there are
three types of oxygens, Oc, Os, and Og for
chain, sheet, and bridge, as indicated in Fig.
lc, but we find a very strong preference for
oxidizing the chain oxygens. For y = 0,
there must be one oxygen hole per central
copper and hence an idealized structure
19 FEBRUARY 1988
137
would have every bridging Oc oxidized.
However, with sufficient positive charge
along the chain, it becomes favorable to
oxidize sheet oxygens. Holes in Oc can hop
easily to adjacent Oc, leading to high mobil-
ity if the chain is perfect. Holes in Os would
lead to conduction in the sheets much as in
La2..,$r,Cu,O4. These two-dimensional
sheets should be less sensitive to defects than
the chains. Holes in Og probably serve to
communicate berween the Oc and Os sites.
Thus, we expect an equilibrium population
of holes among Og, Oc, and Os, with Oc
and Os important in electrical conduction.
Magnetic coupling in oxidised states. With
three singly occupied orbitals, the spins in a
Cu!!_O*—Cu" triad can be coupled in three
ways (S = 3/2 and two S = 1/2). Since the
singly occupied oxygen pr orbital is orthog-
onal to the Cu" orbirals, there is ferromag-
netic coupling berween the oxygen pa and
adjacent copper d orbitals. The spin cou-
pling benween the Cu"! spins still prefers
singlet; however, the Cu-O exchange is
much stronger, leading to a ferromagnetic
(quartet) ground state. Thus the optimal
magnetism of a chain is as in
O-Cu"'t-O* t-Cu!! t-O-Cu"}-O (I)
We solved self-consistently for all possible
spin states of the various clusters and fitted
the results with Eq. 1 to obtain coupling
terms of Jocy = 383 K, Jag = ~205 K, and
Jéa = -139 K for sheers of Laz.,St,CuyOu,
and Jocu = 405 K, Jug = —244 K, and
Jéa = -164 K for chains of Y,BazCu;O3.
Here Jocy is for adjacent oxygen 2p and
copper 3d orbitals, Jaq is for two copper
atoms with the intervening oxygen oxidized
(the center and left Cu of structure 11), and
Jaa is for a copper pair where the intervening
oxygen is not oxidized (the central and right
Cu of structure H). Note that the Jyg with an
intervening O* is smaller than that for a
normal oxygen (164 K versus 244 K); this is
because superexchange decreases when the
central atom (oxygen) is less negative (12).
In the accompanying report (12) it is
shown that the intetplay between the O-Cu
and Cu-Cu magnetic couplings is responsi-
ble for the superconductiviry in these svs-
tems.
Electron correlation, It is important to em-
phasize that the many-body electron correla-
tions implicit in the GVB wavefunction are
essential to a proper description of these
clusters (3). For example, an ordinary HF
calculation on the Cu;O, cluster yields a
closed-shell singlet state 6.89 eV above the
triplet, while the GVB wavefunction puts
the singlet 0.04 eV below the triplet. The
problem with the HF description for these
systems is that separate singly occupied d
orbitals as in Fig. 2, a and b, are not allowed.
Thus, for Cu.O,, the GVB wavefunction
has the form
Vengiet = SA[P(d Lox + debz)(aB ~ Ba))
(4)
where o, and oz denote a,:~ like orbitals
centered mainly on the left and right copper
(as in Fig. 2 for Cu3), ® contains all other
orbitals and spins, and sf is the antisymmet-
rizer (determinant operator). All orbitals are
calculated self-consistently with no restric-
tion on shape, overlap, or character of the
orbitals. Similarly, the triplet state is de-
scribed as
WERE, = [O(droR — drb_)(aB + Ba)]
(5)
where all orbitals are solved for self-consis-
° ° °
| | |
O—Cu? —0—Cul—O—Cu?—0
{ | j
° ° °
ONE
* »
a w
*3.0 ae
ONE
Zz ™ *
i »
| b
-3.0
-7.0 Y 27.0
° ° °
O—Cut —0—Cyut—0*—Cut—O
° ° °
ONE
; - » mm
“3.0
“7.0
» cd
Y¥
Fig. 2. The GVB orbitals (amplitudes) of Cus;Oj9
(@) and (b) show two of the three singly occupied
(d-like) orbitals located on the copper centers.
Atoms in the plane are indicated with asterisks.
Positive contours are solid, whereas negative con-
tours are dotted; the increments are 0.10 atomic
units. (¢) The new singly occupied (oxygen 2pn—
like) orbital obtained by ionizing the CusOj0
cluster. No noticeable change occurs in the other
orbitals shown in (a) and (b). id) The corre-
sponding doubly occupied orbital at an adjacent
oxygen.
7.0
REPORTS 897
Fig. 3. The upper oxygen 2pm band for the Cu-O
sheets of LaxCu,O, (based on GVB energy band
calculations). The contour lines are labeled in
clectron volts. The Fermi energy for Lay 4s
Sto,1sCu,O, is at the boundary of the dotted and
undotted regions.
tently; however, the final orbitals are very
similar (indistinguishable in a plot such as
Fig. 2) from the optimum orbitals of the
singlet. As a result, the Heisenberg-type
description (Eq. 1) is suitable. In contrast,
the HF functions have the form
DH et = A[D(bpb,)(aB ~ Ba)}] (6)
DE et = A[D(dpdby ~ dudg)(aB + Ba)]
(7)
where all orbitals are calculated self-consis-
tently. For the optimum triplet state, the
final orbitals have the form
os = (by + dp)
by = (dp — or)
(8)
(9)
(ignoring normalization), leading to a wave-
function identical to woxe.. However, for
the singlet state, the HF wavefunction has
the form
gbg = (bLobR + deo.) + (SOL + dedp)
(10)
which includes equal amounts of covalent
and ionic character. These ionic terms corre-
spond to equal mixtures of Cu"!!.Cu! charae-
ter into the Cu!'-Cul! wavefunction, leading
to an artificially high energy. This HF wave-
function also leads to strong mixtures be-
tween the copper do and oxygen pa orbitals
(d2-2 and py in Fig. 2). To remove this
difficulty for HF wavefunctions, one can
relax the spin symmetry restriction and use a
wavefunction of the form
QUHF = st1d(b.0)(dRB)} Qy)
(the unrestricted HF or UHF wavefunc-
tion). This leads to much lower energy, but
the wavefunction is a mixture of singlet and
triplet character and hence one cannot di-
rectly obtain parameters for Eq. 1. (The
898
138
UHF wavefunction leads to singly occupied
orbitals that are essentially pure dy:_.: in
character.) The GVB wavefunction corre-
sponds to converting Eq. 11 into a singlet
state (leading to Eq. 4) and then reoptimiz-
ing the orbitals self-consistently. In terms of
band concepts, the HF description has a
half-filled band ($, occupied and $, empty),
whereas the GVB description has every
band orbital half-occupied (@, and oa). For
the infinite system, localized, singly occu-
pied orbitals such as in Fig. 2 lead to very
narrow energy bands. It is well known that
such systems are not well described with
normal band theory, and an empirical modi-
fication (the Hubbard Hamiltonian) is used
to obtain UHF-like wavefunctions. For the
GVB method, an N-body approach to band
calculations is required.
Valence bond band structure—localized ver-
sus delocalized states. Consider the Cu3;Oj9
cluster modeling the chains of Y;Ba>Cu;Oy.
Ionizing this cluster leads to two equivalent
localized wavefunctions, V_ and Vp,
. (a 0\
Y OSU R— Gur R—Cu—O i
IV
¥, O-—-Cu-- 0 —-Cu-- O-—Cu-—0
where structure IV is shown in Fig. 2, ¢ and
d. The wavefunction is localized because the
shape of a doubly occupied orbital needs to
be more expanded (electron repulsion) than
that of a singly occupied orbital and because
in W, the other bonds polarize to stabilize a
charge on the left, whereas in Wp they
polarize to stabilize a charge on the right.
These nvo wavefunctions overlap with a
sizeable matrix element between them,
Ar = (YLIAIVR)
(12)
(13)
In the resonating-GVB description of these
states, the rotal wavefunction is written
Wy = (VL + VRVVI 4S (14)
¥, = (¥_ + VeV/VI- 5S (15)
We calculate these states using the R-GVB
program (13), which allows every orbital of
W_ to overlap every orbital of Wp. This
many-electron resonating generalized va-
lence bond (R-GVB) hole description of the
ion states is equivalent to a polaron-hopping
description in which the hole is completely
dressed with polarization for cach site. The
R-GVB states in structures III and IV lead
to a resonance stabilization of 0.33 eV for
W, and destabilization of 0.43 eV for Wy.
This leads to a one-dimensional band with
bandwidth = 1.62 eV. As a result, holes in
the oxygen 2pm bonds of the Cu-O chains or
Cu-O sheets lead to high mobility charge
carriers for electrical conductivity.
N-Electron band theory, We report here a
band calculations based on such localized N-
electron valence bond wavefunctions rather
than the traditional one-electron molecular
orbitals (MO) or band orbitals. The N-
electron band states W, are described as a
linear combination of localized N-clectron
valence bond wavefunctions,
Wy = Cus + Cyd, t+ = LC;
(16)
where cach term 9; is an N-electron wave-
function describing a fully polarized descrip-
tion of the hole on a particular site 1. Here
the coefficients are calculated by solving
D Hi Cx =X SyCud (17)
J J
where the Hy are the N-electron matrix
elements (13) between oxidized sites i and j
and Sj; is the N-electron overlap for sites i
and j, as in Eqs. 12 and 13. This process is
analogous to a tight-binding band calcula-
tion except that the matrix elements arise
from N-electron valence bond calculations,
(YLIAIVR), where H is the total N-electron
Hamiltonian rather than from one-electron
matrix elements, (PL IH" Fidp:, where HHF
is an effective one-particle Hamiltonian. A
wavepacket constructed from the band
states, V, in Eq. 16, describes the motion of
a fully dressed polaron.
This contrasts with the tight-binding MO
description where the coefficients describe a
particular one-electron band orbital
he = Cudr + Cube to0+ = 2D Cadi
(18)
(in terms of localized atomic-like orbitals
;), which is occupied along with the other
band orbitals to construct a many-clectron
wavefunction,
WMO = slf(Ws)*(b2)? ++ *) (19)
For the CuO sheets of LaxCu,Os, we
calculated the valence band shown in Fig. 3.
This represents the Bloch N-clectron states
for a hole moving in the CuO sheets. The
most stable ion state is at the M_ point
(highest state in a standard band calcula-
tion). The least stable state is the X point.
The total bandwidth is 1.38 eV.
For Laz_,Sr,Cu,O4, the maximum transi-
tion temperature T, occurs for x = 0.15.
Assuming all these holes go into the oxygen
2p band leads to the Fermi energy at 0.16
eV below the top of the band, as indicated
in Fig. 3. The density of states at the Fermi
SCIENCE, VOL. 239
energy is N(O) = 1.14 eV7! per copper
atom. For Y,Ba,Cu;O; the band arising
from the chains would be half full if all holes
Were in this band. This would lead to
N(0) = 0.21 eV7! per chain copper. These
results are used in the accompanying report
(12) to derive the T, for superconductivity
in cuprates.
The Hubbard model. As pointed out
above, standard HF methods lead to a verv
bad description of systems such as these Cul!
systems having weakly overlapping orbitals.
The result is a strong mixture of the singly
occupied do orbitals (dy:_.: pointing from
Cu to O) with oxygen po orbitals, leading to
a partially filled band of mixed copper do
and oxygen po character. With GVB, the
clectron correlation effects Icad to singly
occupied d,:..: orbitals on each copper. In
terms of band concepts this GVB descrip-
tion corresponds to half occupation of every
orbital of the band constructed from aye
on cach center, whereas HF would start
with half occupation of the band. The Hub-
bard approximation (4) to UHF builds ina
similar improvement upon standard HF
band theory by introducing a repulsive one-
center term to split the HF band into up-
spin and down-spin bands on separate sub-
lattices (leading to spin waves). The GVB
approach treats the electron correlation
problem rigorously, leading to pure spin
states. However, this leads to the complica-
tion that the band states must be calculated
in terms of N-body wavefunctions (as pre-
sented above) rather than the usual one-
particle orbitals.
Several standard one-electron band calcu-
lations (based on local density functionals)
have been reported (15) on LazCu,Os.
These band calculations suggest an overall
population of d° on each copper, in agree-
ment with the GVB results but, as expected,
they all involve strong mixing of copper do
and oxygen po character, leading to a par-
tially occupied band of o character. A prop-
erly parameterized Hubbard Hamiltonian
might mimic the GVB results (with the
copper do orbitals forming a narrow band
and a highest band that is oxvgen pr-like).
REFERENCES AND NOTES
1. J. G. Bednorz and K. A. Muller. 2. Pins. B 64, 189
(1986). J. M. Tarascon, L. H. Greene, W. R.
McKinnon, G. W. Hull. T. HH. Geballe, Science 235,
1373 (1987).
. MOK. Wu, et al.. Phys. Rev. Lett, 58, 908 (1987).
~ WJ. Hunt. P. j. Hay, W. A. Goddard III, J. Chem.
Phys. 57,738 (1972), W. A. Goddard III and T. C.
McGill, J. Vac. Sei. Technol. 16, 1308 (1979).
4. The geometries of these clusters are based on neu-
tron diffraction data (5). The Dunning (9s5p‘312p)
double zeta contraction (6) of the Huzinaga Gauss-
ian basis set is used for the oxygen atoms. The Los
Alamos core effective potential (7) is used to replace
the 18 core electrons of the copper atom. This Icads
to a (3s.2p.2d) Gaussian basis for the 1] valence
electrons of the copper atom
wh
19 FEBRUARY 1988
139
5. P. Day et al., J. Pins. C 20. L429 (1987); WLI. F.
David et al.. Nature (London) 327, 310 (1987).
6. T. H. Dunning, Jr, and P. J. Hay, in Modern
Theoretical Chemistry (Plenum, New York, 1977),
vol. 3, p. 1.
7. P.J. Hay and W. R. Wadt, J. Chem. Phys. 82, 270
(1985).
8. N. D. Mermin and H. Wagner, Phys. Rev. Lett. 17,
1133 (1966).
D. Vaknin er al., ibid. 58, 2802 (1987).
J. B. Torrance, Y. Tokura, A. Nazzal. S. S. P. Parkin,
in preparation; Y. Tokura, J. B. Torrance. A. I.
Nazzal, T, C. Huang, C. Ortiz, in preparation.
. Superexchange involves mixing of charge transfer
states (structures VI and VII) into the dominant
wavefunction (top structure V diagram). This
charge transfer is much less favorable if the oxygen is
made more positive.
cat
fs-oo efo VI
cfs ooh vu
12. G. Chen and W. A. Goddard III, Science 239, 899
(1988).
13. A. F. Voter and W. A. Goddard ITI, J. Am. Chem.
Soc. 108, 2830 (1986). The matrix elements as
eo.
defined in Eq. 13 would depend on cluster size. We
use instead
= EN-1gNOE pen
Hy, = EX-'SS1 ~ HY! (20)
where E®~' is the energy of the cluster with one
hole, and H®-! and S*°" are for systems with the
hole localized on sites 1 and j, respectively. The sign
chosen in H, is so that EF, = —JP,, where IP, is the
ionization potential out of Bloch state &, leading to
band diagrams analogous to one-electron band dia-
grams.
. J. Hubbard, Proc. R. Soc. London Ser. A, 281, 401
(1964).
. LF. Martheiss, Pls. Rev. Lett. 58, 1028 (1987); K.
Takegahara, H. Harima, A. Yanase, Jpn. J. Appl.
Phys. 26, L352 (1987); T. Oguchi, ibid. p. L417.
. We thank the Office of Naval Research and the
donors of the Petroleum Research Fund (adminis-
tered by the American Chemical Socien’) for partial
support of this research, We thank C. Mi Kao and
G. Chen for assistance and uscful discussions. The
GVB and R-GVB calculations were carried out on
the Alliant FX8/8 and DEC VAX 8650 computers
in the Caltech Materials Simulation Facility [funded
by the National Science Foundation-Materials Re-
search Groups (grant DMR-84-21119); the Office
of Naval Research/Defense Advanced Research Pro-
jects Agency (contract N00014-86-K-0735); the
Department of Energy—Energy Conversion and Uti-
lization Technology (JPL code 49-242-E0403-0-
3550), the National Science Foundation—Chemistry
(grant CHE-8318041), and the Office of Naval
Research (contract N00014-84-K-0637)}.
7 December 1987; accepted 20 January 1988
REPORTS 899
140
Appendix 2
Superconducting Properties of Copper Oxide
High-Temperature Superconductors
141
Proc. Natl. Acad. Sci. USA
Vol. 86, pp. 3447-3451, May 1989
Chemistry
Superconducting properties of copper oxide
high-temperature superconductors |
(YBaCuO/LaSrCuO/magnon pairing/generalized val bond
/Heisenberg coupling term)
GUANHUA CHEN, JEAN-MARC LANGLOIS, YUEJIN GUO, AND WILLIAM A. GODDARD III
Arthur Amos Noyes Laboratory of Chemical Physics, California I
of Tech
d CA 91125
1 Py
bye F
Contributed by William A. Goddard Il, February 9, 1989
ABSTRACT The equations for the magnon pairing theory
of high-temperature copper-oxide-based superconductors are
solved and used to calculate several properties, leading to re-
sults for specific heat and critical magnetic fields consistent
with experimental results. In addition, the theory suggests an
explanation of why there are two sets of transition tempera-
tures (7, ~ 90 K and 7, ~ 55 K) for the Y;Ba,Cu30¢,, class of
superconductors. It also provides an explanation of why
La,_,Sr,CuO, is a superconductor for only a small range of x
(and suggests an experiment to independently test the theory).
These results provide support for the magnon pairing theory
of high-temperature superconductors. On the basis of the the-
ory, some suggestions are made for improving these materials.
We recently proposed the magnon pairing mechanism (1) to
explain the high-temperature superconductivity in ceramic
copper oxide superconductors (2—4). This model was de-
rived from the results of generalized valence bond (GVB)
calculations (5) and was used to predict some qualitative fea-
tures of these systems. With the magnon pairing mechanism
(1) we have now calculated several properties of the super-
conducting phase. The results on specific heat, critical mag-
netic fields, Hall effect, penetration depth, coherence length,
and dependence upon doping generally agree with experiment
and, in some cases, explain rather puzzling results. Several
predictions are made that could be tested with further experi-
ments.
REVIEW OF THE MODEL
The GVB calculations (5) indicate that the Laz_,Sr,Cu,O,,
Y,Ba,Cu30¢6+,, and T1,Ba7Ca,_;Cu,,02,44 classes of super-
conductor oxides (2-4) have the following essential features:
(i) Cu"(d%) sites lie in an essentially square Cu—O sheet
having linear Cu—-O—Cu bonds with the singly occupied
Cu d orbital localized in the plane of four short Cu—O bonds
(Royo * 1.9 A).
(ii) The spins of adjacent d orbitals are coupled antiferro-
magnetically (by means of the intervening oxygen) with a
Heisenberg coupling term, Jgq, ranging from —100 K to —250
K (depending on the system).
(iii) Oxidation of the system beyond cupric (Cu!) leads
not to Cu"! but rather to holes localized in the pa (nonbond-
ing) orbitals of oxygens (0? -» O~). These pr orbitals are
localized in the plane containing the short Cu--O bonds to
the adjacent copper.
(iv) The migration of an oxygen hole from one site to an-
other leads to energy bands with a reasonably high density of
states (No = 1.2 states per eV per sheet of Cu) and to high
electrical conductivity.
(v) The magnetic coupling of the singly occupied oxygen
orbital (the hole) with the two adjacent copper atoms is fer-
The publication costs of this article were defrayed in part by page charge
payment. This article must therefore be hereby marked “advertisement”
in accordance with 18 U.S.C. §1734 solely to indicate this fact.
3447
romagnetic, with a Heisenberg coupling term of Jog, =
+330 to +400 K (depending on the system). This leads to
ferromagnetic coupling of the spins of the two Cu atoms,
despite the antiferromagnetic Juz. Because of the more posi-
tive oxygen, the value of J 3, for the Cu atoms adjacent to the
hole is about 30% smaller than Jaq.
A qualitative view of the magnon pairing model of super-
conductivity is as follows. Adjacent Cu spins have a tenden-
cy to be opposite when separated by a normal O?7 but tend
to repolarize parallel (ferromagnetic) when separated by an
oxygen hole (O°). As the conduction electron (O~ hole)
moves from site to site, it tends to leave behind a wake
where adjacent copper spins are ferromagnetically paired.
As a second conduction electron is scattered, it is favorable
to be scattered into the wake of the first electron, since there
is already ferromagnetic polarization of the copper spins,
leading to a favorable interaction. The net result is an attrac-
tive interaction between conduction electrons which leads to
superconductivity. In the next section we outline the ap-
proach used to calculate the quantitative aspects of super-
conductivity.
THE SUPERCONDUCTING STATE
Interaction Potential. For a two-dimensional CuO sheet
with Cu-Cu distances of a and 5 in the x and y directions, the
lowest-order interaction between two Opr holes is given by
Vei-Kpet-ny = ~PQDW, (1}
where W = J24/AE,
PQ) =1+ 5 cos Qa + ; cos Ob
+2 c0:(02}c0:(8),
and Q = k — kK. In deriving Eq. 1, we use the random phase
approximation for the Cu spins and write the average excita-
tion energy for a Cu spin-flip as AE = 8rBVqq|. Here 7 is
related to the average spin correlation between adjacent sites
(1), and B = Jgg + Jag)/2Jaq ~ 0.85 accounts for the de-
crease in yl when the neighboring oxygen is O7 rather
than 077.
Coupling the (k, ~—k) pair into a triplet state leads to an
attractive net interaction potential of
Vix = —[P(Q) — P(Q‘)]W, (2]
where Q’ =k + kK.
Factorization of the Interaction Potential. To solve the gap
equation A, = —SeVigdg(1 — 2f%)/2Ex, where E = (éf +
Abbreviations: GVB, generalized valence bond; BCS, Bardeen-
Cooper-Schrieffer.
142
3448 Chemistry: Chen et al
\A, 7) and f, = 1/(e 8% + 1), we decompose the Q-depen-
dent factor in Eq. 2 to obtain
Vic = -W > Vk) ¥(K). [3]
This leads to the new gap equation
8 = 2 By Ej [4a]
where
= Wy. Q = 2fi)
Bi = Wh; Dy WiWHK) [4b]
and
A= > gi ¥i(k). [4c]
For each temperature we solve Eq. 4 iteratively until self-
consistency is achieved.
We find two sets of solutions to Eq. 4. Assuming the gap
to be real, we find a self-consistent solution of the form A® =
e(¥, + V2), leading to a gap that goes through zero (V3 and
, are small compared with VY, and ¥). On the other hand,
allowing a complex gap leads to a self-consistent solution of
the form Af = g(¥, + iV), where [A,°] depends on k but
never goes to zero. In each system, the complex gap solution
leads to an energy stabilization about 20% greater than for
the real gap, and we report properties based only on Af. It is
possible that near 7, and near surfaces or grain boundaries
the solutions would be more complicated.
Transition Temperature. As T approaches 7,, the gap and
hence the g; must go to zero. This leads to B;; = 5,;and hence
to
(5)
T, = 1.13 aVeal exo| ~ Sete)
ANoW pa)
where A = Ay = Ap >> As = Ay, No is the density of electron
States, and aJgq is the effective range of the K, integration
(perpendicular to the Fermi surface). The upper bound on a
is 4, and we estimate a = 2.
RESULTS
Using Eq. 4, we solved numerically to obtain the A(k) at var-
ious temperatures for three classes of systems,
Proc. Natl. Acad. Sci. USA 86 (1989)
La2_,Sr,Cu,0, (denoted 2-1-4),
Y,Ba,Cu3;0.4, (denoted 1-2-3),
and
(T10),Ba2Ca,-1Cu,02,42 (with n layers of Cu—O sheets,
denoted Tl-n layer),
and used these results to calculate specific heats, critical
fields, and other properties at various temperatures.
The 2-1-4 system involves six-coordinate Cu sites (with
apex oxygens above and below the Cu—O sheets), while the
1-2-3 and Tl-n systems (with n = 2) involve five-coordinate
sites with apex oxygen on only one side of the Cu—O sheet.
In all cases we used the experimental crystal structure and
carried out the GVB calculations as in ref. 5, with CuO,
clusters (p = 11, 9, and 9 for 2-1-4, 1-2-3, and TI-2, respec-
tively) using an array of additional ions (216, 200, and 299,
respectively) to represent the electrostatic field of the crys-
tal. These calculations were carried out for two charge states
to yield Jg¢ for the neutral and Jocy and JZ, for the positive
ion, as in Table 1. The electron transfer matrix elements
were calculated as in ref. 5 to obtain similar band structures
for all three systems, leading to the density of states No. We
used a = 2 (see above) for all systems. The A comes from
Eq. 3 but depends sensitively on the concentration of holes
in the Cu—O sheets (x,), as discussed in the next section.
The average spin correlation of adjacent Cu spins, 7, is
difficult to calculate because of the dynamic nature of the
coupling between the oxygen and copper spins. Given all
other parameters, the value of + needed for Eq. 5 to yield T;,
= 93 K for Y,;Ba2Cu;07 is 0.0167. Using + = 0.0167 for
Lay gsSto.1sCuO, would lead to 7 = 32.6 K, in reasonable
agreement with the observed 7, = 37 K. This close agree-
ment provides support for the overall magnon pairing mech-
anism; however, in the balance of this paper we will use the
value 7 = 0.0159 for La2_,Sr,CuO, (but will assume 7 to be
independent of x), adjusted to yield 7, = 37 K atx = 0.15 (x,
= 0.10, see below).
In general, for Bi-, Tl-, and Pb-containing materials (6),
(AO)mM2Ca,-1;Cu,O2n+2, the m = 1 case leads to six-coordi-
nate Cu—O sheets (as in 2-1-4) and the four known such
systems lead to J, = 0, 12, 50, and 90 K, whereas the n = 2
systems have two five-coordinate Cu—O sheets (as in 1-2-3,
with additional four-coordinate Cu—O sheets for m > 2) and
the five known n = 2 systems lead to 7, = 90, 90, 90, 90, and
110 K, while the four known n = 3 systems lead to JT, = 110,
110, 122, and 122 K, and the a = 4 system leads to 7, = 122
K. We calculated only the m = 2 case but presume the results
to be relevant for n = 2.
Using the calculated parameters (Jaa, Jocu, No, 8) for the
Table 1. Quantities used in calculating superconducting properties
Parameter LazasSto.1sCuyO, Y,BazCu307 Ti,Ba,Ca;Cu20g
Jaa K —204.7 ~94.0 -175.4
Jocu. K 383.2 330.1 396.0
B* 0.841 0.860 0.872
Xs 0.10 0.25 (0.25)t
A 0.558 1.17 1.17
Not 1.23 1.19 1.42
a (2.0)8 (2.0)8 (2.0)8
T, (+ = 0.0159), K (=37.0) ~=9? ~=167
T, (r = 0.0167), K ~33 (~93) ~1608
*B = (Sag + J54)/2Jed-
tAssumed, based on Y,Ba,Cu30y.
tUnits are eV~! per sheet Cu atom.
SAssumed.
SUsing A = 0.91 (for x, = 0.18) and all other quantities the same leads to 7, = 123 K.
143
Chemistry: Chen er al
Tl-2 layer system and assuming other parameters estimated
as for Y;Ba,Cu;07 (x, = 0.25, 7 = 0.0167) leads to T, = 160
K for Ti-2. This is in reasonable agreement with experiment,
providing additional support for the magnon pairing model.
We believe that the fluctuations in 7, for these Tl-n systems
are due to variations in x, and suggest that x, = 0.25 (the
same value as for 1-2-3) would yield 7. = 160 K for Tl-n
(with m = 2). The highest observed T, of 122 K suggests x, =
0.18 for the Tl-n systems (n 2 2).
Dependence on Doping. The number of holes in the Cu--O
sheets x, is critical to the superconductivity. The valence
band maximum is at the M point, (a/a, a/b), and thus as x,
— 0, the Fermi surface collapses around the M point. From
Eq. 1 this leads to Vinp ~> 0 and hence to A > 0 as x, + 0.
1-2-3 system. For Y,Ba2Cu30¢4,x, Hall measurements (7)
have been reported for various x and used to estimate x};, the
concentration (per cell) of carriers contributing to the con-
ductivity. We assume that hole in the chains (which are dis-
ordered and incomplete) cannot contribute significantly to
Xy, So that the Hall coefficient measures the concentration
of holes per sheet (x,), x = 2x. For x = 1, the experiments
lead to x, ~ 0.25 and 7, = 92 K, while for smaller x, the
experiments lead to smaller x, and decreased 7,. In Fig. 1 we
show the predicted 7, versus x, (using the same values for J,
No, 7, a, and B but allowing A to change with x,) and compare
with the experimental data. The close correspondence pro-
vides support for the magnon pairing model. It is important
to note that such a rapid change of 7, with x, would not be
expected for a simple Bardeen—Cooper-Schrieffer (BCS)
system (singlet pairs rather than triplet pairs).
The observation that the Y,Ba.Cu;0,., system tends to
have either a high T, around 93 K or a lower T, around 60 K
we associate with the special stabilization of x, = 0.25 for x
near 1 (because holes in the oxygen chain cannot get closer
than alternate sites) and x, = 0.125 for x near 0.5 (again relat-
ed to the capacity for chains to carry holes). Assuming all
calculated parameters except x, (and hence A) are un-
changed, the theory predicts 7, = 45 K for x, = 0.125, while
T, = 93 K for x, = 0.25.
2-1-4 systems. In Fig. 2 (solid line) we show how the pre-
dicted 7, varies with x, for the 2-1-4 system. These results
are in agreement with experiment for x < 0.1 but not for
higher values. However, as predicted earlier (5), the GVB
calculations lead to a relative ionization potential (IP) from
the sheet O versus the apex O that varies linearly with dop-
ing, IP, ~ IP, = 0.38 — (0.38/0.13)x, so that for x > 0.13, the
most stable location of the hole is predicted to be at the apex
oxygen, not at the sheet oxygen. (This occurs because La?*
— Sr’* leads to a less attractive potential near the apex oxy-
gen.) This suggests that for x > 0.1, an increasing fraction of
120
YBa 2Cu 306,
100}-
_ 80+
<=
r 60
4or i,
zoe +
0 a ! i 1 1
0.00 0.16 0.20 0.30
DOPING (x ,)
Fic. 1. Predicted dependence (solid line) of 7. on x, for 1-2-3.
Experimental results from Hall effect measurements (7) are shown
as crosses with magnitude representing experimental errors.
Proc. Natl Acad. Sci. USA 86 (1989) 3449
IF x #Xx
4 ‘.
/ $ ' ‘Y
52 0.30
DOPING (x)
Fic. 2. The solid line shows the predicted dependence of T, on
x, for 2-1-4, while the broken line shows the predicted T, versus x
using the relation between x and x, from the text. Experimental 7,
values are given versus x (x, has not been measured).
the holes are on the apex oxygens rather than the sheet oxy-
gens. At atmospheric pressure, oxygen vacancies are ob-
served for x > 0.15. We suggest that this is because a high
concentration of holes on the apex oxygens favors oxygen
vacancy formation. ;
In Fig. 2 (broken line) we show the 7, predicted from mag-
non pairing theory, assuming that x, = x for x = 0.06, x, = 0
for x = 0.34, with x,/x varying linearly between these limits
(assuming no vacancy formation). This approximates the
variation of x, with x expected if IP, = IP, at x = 0.17 (calcu-
lated crossing point at x = 0.13). As indicated in Fig. 2, the
predictions are in good agreement with experiment (8), ex-
plaining the experimental results that 7, =~ 0 K for x < 0.06
and for x > 0.3. These predictions could be tested indepen-
dently by measuring the number of holes (x,) contributing to
the Hall current as a function of x. This is a difficult experi-
ment because apex holes also contribute to the normal con-
ductivity for the 2-1-4 system.
Specific Heat. We calculated the electronic specific heat
for the 2-1-4 system and found a change at T, of (AC,/T.) =
8.4 mJ /mol-K? and an electronic specific heat for the normal
state of y, = 5.79. Combining these results leads to AC,/YeTe
= 1.45, which is close to the value for BCS theory, AC,/ 77;
= 1.43.
A direct measurement of AC/T, was made by Batlogg er
al, (9), using quasi-adiabatic methods. They found a specific
heat gap of AC/C = 1% at T, and (AC/T exp = 7.6 + 1.8
mJ/mol-K?, in excellent agreement with theory. However,
we should point out that the experimental situation is not
settled. Similar experiments by Maple er al. (10) revealed no
transition in AC, while similar experiments by Nieva er al.
(11) led to AC/C = 8%. This latter experiment leads to a best
estimate of y, = 18 + 2 mJ/mol-K?.
Critical Magnetic Fields. Above the critical magnetic field,
H,, the free energy of the normal state is lower than that of
the superconducting state (because of the Meissner effect).
Assuming a magnetic field in the c direction (perpendicular
to the CuO sheet), we calculated H, as a function of T.
For the 1-2-3 system these calculations lead to H,(0) = 0.74
T and (dH.a,a)7, = ~0.80 T/K (where T denotes tesla).
To compare with experiment, it is necessary to estimate
the magnetic penetration length A, and the coherence length
é. Using the London formula Ay = (m*c?/42n,e?)'”, the cal-
culated effective mass (m*/m, = 1.9 for 2-1-4) and assuming
n, to be the number of O holes in the sheets, we obtain A, =
1400 A from 1-2-3 (x, = 0.25) and Ay = 2200 A for 2-1-4 (x, =
0.10). To obtain the coherence length é, we used the calculat-
ed H,(0), the above value for. A,, and the Ginzburg-Landau
relation for H.(0) = bo/(297V2 A). The result is €= 23 A for
144
3450 Chemistry: Chen ef al
1-2-3 and € = 38 A for 2-1-4. The conversion to the experi-
mentally measured critical field H.2 is H.2 = V2 Ad/é.
The only measurement (on single crystals) with the field
perpendicular to the CuO plane is by Worthington et al. (12)
for 1-2-3. For a sample with 7, = 87 K, they found
dHex —
(#s )- 0.71 T/K for T<78K
and
d. >
(42+) = -0.46T/K for 78 < T < 87,
These values are in reasonable agreement with our predicted
value of —0.80, providing support for the current model. The
high-temperature knee in H. versus T at T ~ 78 K is not
understood. Jt might be an experimental artifact (the mea-
sured T, = 87 K is below the usual value of T, = 93 K). It
might result from a sensitivity to surface regions where the
properties may be different. For example, the calculated
slope depends on x,, which might be different near the sur-
face. [For the real gap solution, we calculate (dH..,/dT) =
—0.65 T/K.]
Magnitude of Jas. There are no directly determined values
of Jaa for the CuO systems. However, accurate experiments
have been reported on K2NiF,, which has the same layered
structure as La,;Cu,O,. The Neél temperature (137) for
K.NiF, is Ty = 97 K, and neutron scattering leads to an (in-
plane) Heisenberg coupling term of Jgg = ~52 K (13) and
~56 K (141) (the out-of-plane value is less than 0.2 K (14)}.
To determine the accuracy of our calculations for Jy, we
carried out the same GVB calculations on Ni2F,,; clusters,
leading to Jag = ~51 K. This is in good agreement with ex-
perimental results, suggesting that the exchange integrals
calculated with GVB are accurate (to about 10%).
Estimates have previously been made for Jgg of the CuO
systems. Lyons er al, (15+, 16), using Raman light scatter-
ing, found an inelastic peak at 0.37 eV for La,Cu,O, and 0.32
eV for Y,;Ba,Cu30¢ (both semiconductors, not superconduc-
tors). They found that this peak rapidly disappears as the
systems are doped (x > 0) and interpreted this inelastic tran-
sition as a double Cu spin-flip. Linear magnon theory leads
to AE = 5.4 Jag for this process, suggesting Jgg = —790 K for
2-1-4 and Jag = —680 K for 1-2-3. There is no direct evidence
showing that the observed process involves the Cu spins,
and we believe that the large discrepancy with the calculated
Jaq Values argues against this interpretation. We suggest that
the undoped system may have a small number of oxygen va-
cancies, leading to extra electrons in the system, which
would lead to local Cu! (d!°) sites. From GVB calculations
on the clusters used for 2-1-4 but with one O vacancy, we
calculate the electron transfer process Cu!—Cu! to Cu!—
Cu!" to have an excitation energy of 0.4 eV and suggest that
the transition observed in Raman light scattering is associat-
ed with such electron transfer excitations. Small amounts of
doping would remove the extra electron from these Cu!
sites, leading to disappearance of the peak with small x, as
observed. For the 2-1-4 system this might be testable directly
by experiments at high O2 pressure, which might decrease
the number of oxygen vacancy sites and by our suggestion
lead to the disappearance of the 0.4 eV peak near x = 0.
Maximum 7,. In Eq. 5, the maximum 7, would occur if the
term in the exponent were zero, leading to TP™ = 1.13 alg.
However, the derivation of this equation includes only the
lowest-order interactions and is not valid for the limit of very
large coupling. To estimate the form for T, at stronger cou-
tThe J in this paper is defined as twice our value.
Proc. Natl. Acad. Sci. USA 86 (1989)
pling, one can use (17) T, = [(w)/1.20Jexp[—1.04(1 + n)/y),
valid for 7 © 1, where in our case 9 = ANo(Jpa)?/87BVaal-
The upper limits are (w) < 4Jgq (for a = 4) and the exponen-
tial term < e7!. This leads to T. < 1.23 Jag (Eq. 5 would give
the same result if a — 1.09 as 47 — ©). For 7 >> 1, the
correct formula (17) is Te = 0.18((w?)y)'”, where (w*) = 2/n
J a’ F(w) w dw; estimating the integral in (w?) using various
forms for F(w) leads to 7. < Jag. Using T?™ = 1.23 Jag with
our calculated values of Jyy = 205 K leads to T7™ < 250 K.
(We consider this to be an upper bound; a more conservative
estimate of T&* < Jag = 200 K given in ref. 1 is probably
closer to the real limit.) This provides hope that the recent
enormous advances in increasing 7, may continue (currently
the highest confirmed 7, is ~125 K). On the other hand, it
suggests that the current class of systems based on CuO
sheets will not achieve room temperature 7;,.
Non-Cu—O High-Temperature Systems. The ceramic
Ba,.,K,BiO; shows superconductivity with 7, = 30 K for x
= 0.4 (18, 19). The magnon pairing mechanism has nothing
to say about this system since there are no localized magnet-
ic atoms. Systems such as BaBiO; with a formal Bi’ * oxida-
tion state tend to disproportionate to a mixed valence state
Ba,Bi!"Bi%O,, with clearly defined (20) Bi! sites [larger
cavity because of the (6s)? pair of valence electrons] and Bi¥
sites (small cavity). Upon doping with K, such an ordered
arrangement is not possible (the Bi atoms are equivalent).
However, it is plausible that conduction involves hopping of
the electrons between Bi"! and Bi!’ sites and/or between
Bi'’ to Bi’, either of which should couple strongly to lattice
vibrations. Thus we believe that this system involves lattice
coupling much as in BCS theory.
NEW HIGH 7, MATERIALS
Cu—O Type. Among compounds with CuO layers, some
are superconducting and some are not (21). We suggested (1,
5) that the major issue here is arranging the cations so that
the oxygen holes are in sheets rather than in apex or other
locations. Thus, systems that are not superconducting might
be converted into superconductors by rearranging the cat-
ions (varying the ionic radius and charge). Indeed, we be-
lieve that the critical issue in all current materials is arrang-
ing the environment of cations so as to maximize x,. For
example, the magnon pairing theory suggests that if the
La2..,Sr,CuO, system could be modified to have x, = 0.25
(leading to A = 1.17, but leaving all other parameters the
same), then TJ, would increase to 139 K!
A second approach would be to eliminate all oxygens not
in the sheets in such a way that the holes must remain in the
sheets. For example, if the apex O?~ were all replaced by
F7, the holes would be expected to lead to OT in the sheets
(rather than F radicals in the apex position), allowing x, = x.
(However, if some F ions end up in the sheet, the supercon-
ducting properties might be much worse.) Such a system
might be designed by finding a structure in which cations
near the apex positions would prefer F to O.
Cu—xX Type. To replace the O with other anions X, we
believe that it is important to retain the CuX, sheets with
linear Cu—-X—-Cu bonds. To obtain the highest possible T,,
the equation for 72 suggests maximizing Vgg|. Based on
the Neél temperatures for the MnO, MnS, and MnSe sys-
tems, we expect Vug| to be about 40% larger for S or Se in
place of O. Thus the limiting temperature might be 77% =
350 K for Cu—S sheets rather than T?™ = 250 K for Cu—O
sheets (bear in mind that these are upper limits). There are,
however, many other factors to consider. Most important,
the Cu-—S sheets must be two-dimensional (with weak cou-
pling between adjacent layers) and must yield S pm holes
upon doping. A three-dimensional system with strong cou-
pling between Cu spins in different layers would lead to long-
145
Chemistry: Chen er al.
range order in the Cu spins (7 large), making it unfavorable
for the conduction electrons (S pz holes) to flip the Cu spins.
Thus one would need layered systems analogous to the
2-1-4, 1-2-3, or Tl-n systems. We know of no such examples
involving S or Se. In addition, the structure must be stable
for large concentrations of holes (x,). The ferromagnetic cou-
pling Jscu is likely to be smaller than Joc, (because of the
longer bond length), so that it may be more difficult to
achieve the limiting temperature for S systems.
Other Metals. To find replacements for Cu, there are sev-
eral factors to consider. First, any such replacement will
probably lead to a longer M—O bond and hence a smaller
Jom. Other things being equal, a large Jz, is favored by both
small bond distance and more electropositive metals, sug-
gesting Sc-Cr. However, for metals with n unpaired d elec-
trons (e.g., Cu?* has 2 = 1, Ni?* has = 2, Mn?* hasan = 5),
the Jag decreases as n increases. Thus, these criteria favor d!
and d? systems, making Sc?*, Ti?*, V**, or Cr°* reasonable
possibilities. However, for a d? system in a pseudooctahe-
dral environment, the three f2,-like orbitals are of similar en-
ergy, making such systems less favorable (since d,, must be
stabilized). In addition to Cu**, potential d® systems would
include Ag?* and Au?*; however, considerations of size,
electronegativity, and spin-orbit coupling all seem to favor
Cu. Thus these considerations suggest that Cu?* is the opti-
mal choice for the metal.
SUMMARY
The agreement with experiment for various properties pre-
dicted by using the magnon pairing model of superconductiv-
ity provides strong support for the validity of this model for
the Cu—O systems. All quantities are related to the funda-
mental parameters of the system (Jag, Jocu, band structure).
Some approximations have been made in the solutions to
these equations. Nevertheless, the fundamental parameters
are well defined, and hence improved calculational approxi-
mations will eventually lead to precise predictions of all
properties. In this theory, the superconducting properties
are related to fundamental structural, chemical, and physical
properties, allowing one to use qualitative reasoning in con-
templating how to improve the properties.
Note Added in Proof. After submission of this paper, a news report
(22) appeared stating that Z. Kgkol, J. Spatek, and J. Honig (Purdue
University) have evidence suggesting that La,_,Sr,NiO, might be
superconducting, with a T, between 4 and 70 K. We have not calcu-
lated the properties for this system, but assuming that holes lead to
O?> —» OF (as in Cu) rather than to Ni?* —» Ni?* and assuming Jgani
= “%Jggcu = —~100 K (see above) with all other parameters (includ-
ing x, and r) as for Lay gsSto.4sCu,O,, we predict 7, = 66 K for
Lay.ssSto.isNiO.. Thus it is plausible that the Ni system leads to
superconductivity if the holes go on the NiO, sheets. Because of the
lower Jaq, the maximum T, for a NiO, system should be about half
of that for a CuO, system (i.¢e., ~100 K instead of ~200 K).
In addition, a second report (23) shows that Nd, gsCeo.15Cu,0, is a
superconductor (7, = 24 K) but that conduction is dominated by
electrons rather than holes. GVB calculations on this system lead to
Jag = —137 K for the undoped system. For the doped system (extra
electron), the GVB calculations lead to a resonating state involving
Cu! (d"°) and Cu" (d%), with a nearest-neighbor resonance stabiliza-
tion of 0.32 eV. This leads to a description of conduction involving a
Proc. Natl. Acad. Sci. USA 86 (1989) 3451
heavy magnon (one down-spin, several up-spin) whose motion is
impeded by the antiferromagnetic coupling. This leads to magnon-
mediated attractive coupling between the heavy magnons, much as
for oxygen holes. However, we have not yet succeeded in estimat-
ing the T,.
This research was supported by the Office of Naval Research
with assistance from the Donors of the Petroleum Research Fund,
administered by the American Chemical Society. The calculations
were carried out on the Alliant FX8/8 computer and also on a DEC
VAX 8650 computer. These computer facilities were provided by
the Defense Advanced Research Projects Agency/Office of Naval
Research, National Science Foundation (Division of Materials Re-
search, Materials Research Groups), Department of Energy/Energy
Conversion and Utilization Technologies, and the National Science
Foundation (Division of Chemistry). This is contribution no. 7881
from the Arthur Amos Noyes Laboratory of Chemical Physics.
1. Chen, G. & Goddard, W. A., III (1988) Science 239, 899-902.
2. Bednorz, J. G. & Miller, K. A. (1986) Z. Phys. B 64, 189-193.
3. Wu, M. K., Ashburm, J. R., Torng, C. J., Hor, P. H., Meng,
R. L., Gao, L., Huang, Z. J., Wang, Y. Q. & Chu, C.S.
(1987) Phys. Rev. Lett. 58, 908-910.
4. Sheng, Z. Z. & Hermann, A. M. (1988) Nature (London) 332,
55-58.
Guo, Y., Langlois, J.-M. & Goddard, W. A., ITI (1988) Science
239, 896-899.
Sleight, A. W. (1988) Science 242, 1519-1527.
Wang, Z. Z., Clayhold, J., Ong, N. P., Tarascon, J. M.,
Greene, L. H., McKinnon, W. R. & Hull, G. S. (1987) Phys.
Rev. B 36, 7222-7225.
8. Torrance, J. B., Tokura, Y., Nazzal, A.1., Bezinge, A.,
Huang, T. C. & Parkin, S. S. P. (1988) Phys. Rev. Lett. 61,
1127-1130.
9. Batlogg, B., Ramirez, A. P., Cava, R. J., van Dover, R. B. &
Rietman, E. A. (1987) Phys. Rev. B 35, 5340-5342.
10. Maple, M. B., Yang, K.N., Torikachvili, M. S., Ferreira,
J.M., Neumeier, J. J., Zhou, H., Dalichaouch, Y. & Lee,
B. W. (1987) Solid State Commun. 63, 635-639.
ll. Nieva, G., Martinez, E. N., dela Cruz, F., Esparza, D. A. &
D'Ovidio, C. A. (1987) Phys. Rev. B 36, 8780-8782.
12. Worthington, T. K., Gallagher, W. J. & Dinger, T. R. (1987)
Phys. Rev. Lett. 59, 1160-1163.
13. Birgeneau, R. J., Als-Nielsen, J. & Shirane, G. (1977) Phys.
Rev. B 16, 280-292.
14. Birgeneau, R.J., Skalyo, J. & Shirane, G. (1970) J. Appl.
Phys. 41, 1303-1310.
15. Lyons, K. B., Fleury, P. A., Remeika, J. P., Cooper, A. S. &
Negran, T. J. (1988) Pays. Rev. B 37, 2353-2356.
16. Lyons, K. B., Fleury, P. A., Schneemeyer, I. F. & Waszczak,
J. V. (1988) Phys. Rev. Lett. 60, 732-735.
17. Allen, P. B. & Dynes, R. C. (1975) Phys. Rev. B 12, 905-922.
18. Cava, R. J., Batlogg, B., Krajewski, J. J., Farrow, R., Rupp,
L. W., Jr., White, A. E., Short, K., Peck, W. R. & Kometani,
T. (1988) Nature (London) 332, 814-816.
19. Jin, S., Tiefel, T. H., Sherwood, R. C., Ramirez, A. P.,
Gyorgy, E. M., Kammlott, G. W. & Fastnacht, R. A. (1988)
Appl. Phys. Lett. 53, 1116-1118.
20. Cox, D. E. & Sleight, A. W. (1976) Solid State Commun. 19,
969-973.
21. Torrance, J. B., Tokura, Y., Nazzal, A. & Parkin, S. S. P.
(1988) Pays. Rev. Lett. 60, 542-545.
22. Pool, R. (1989) Science 243, 741.
23. Tokura, Y., Takagi, H. & Uchida, S. (1989) Nature (London)
337, 345-347.
so Uy