Mathematical Formulation of the Quantum Theory of Electromagnetic Interaction

R.P. Feynman

(Received June 8, 1950)

Abstract

The validity of the rules given in previous papers for the solution of problems in quantum electrodynamics is established. Starting with Fermi’s formulation of the field as a set of harmonic oscillators, the effect of the oscillators is integrated out in the Lagrangian form of quantum mechanics. There results an expression for the effect of all virtual photons valid to all orders in $e^2/\hbar c$. It is shown that evaluation of this expression as a power series in $e^2/\hbar c$ gives just the terms expected by the aforementioned rules.

In addition, a relation is established between the amplitude for a given process in an arbitrary unquantized potential and in a quantum electrodynamical field. This relation permits a simple general statement of the laws of quantum electrodynamics.

A description, in Lagrangian quantum-mechanical form, of particles satisfying the Klein-Gordon equation is given in an Appendix. It involves the use of an extra parameter analogous to proper time to describe the trajectory of the particle in four dimensions.

A second Appendix discusses, in the special case of photons, the problem of finding what real processes are implied by the formula for virtual processes.

Problems of the divergences of electrodynamics are not discussed.

1 Introduction

In two previous papers¹ rules were given for the calculation of the matrix element for any process in electrodynamics, to each order in $e^2/\hbar c$. No complete proof of the equivalence of these rules to the conventional electrodynamics was given in these papers. Secondly, no closed expression was given valid to all orders in $e^2/\hbar c$. In this paper these formal omissions will be remedied.²

In paper II it was pointed out that for many problems in electrodynamics the Hamiltonian method is not advantageous, and might be replaced by the over-all space-time point of view of a direct particle interaction. It was also mentioned that the Lagrangian form of quantum mechanics³ was useful in this connection. The rules given in paper II were, in fact, first deduced in this form of quantum mechanics. We shall give this derivation here.

The advantage of a Lagrangian form of quantum mechanics is that in a system with interacting parts it permits a separation of the problem such that the motion of any part can be analyzed or solved first, and the results of this solution may then be used in the solution of the motion of the other parts. This separation is especially useful in quantum electrodynamics which represents the interaction of matter with the electromagnetic field. The electromagnetic field is an especially simple system and its behavior can be analyzed completely. What we shall show is that the net effect of the field is a delayed interaction of the particles. It is possible to do this easily only if it is not necessary at the same time to analyze completely the motion of the particles. The only advantage in our problems of the form of quantum mechanics in C is to permit one to separate these aspects of the problem. There are a number of disadvantages, however, such as a lack of familiarity, the apparent (but not real) necessity for dealing with matter in non-relativistic approximation, and at times a cumbersome mathematical notation and method, as well as the fact that a great deal of useful information that is known about operators cannot be directly applied.

It is also possible to separate the field and particle aspects of a problem in a manner which uses operators and Hamiltonians in a way that is much more familiar. One abandons the notation that the order of action of operators depends on their written position on the paper and substitutes some other convention (such that the order of operators is that of the time to which they refer). The increase in manipulative facility which accompanies this change in notation makes it easier to represent and to analyze the formal problems in electrodynamics. The method requires some discussion, however, and will be described in a succeeding paper. In this paper we shall give the derivations of the formulas of II by means of the form of quantum mechanics given in C.

The problem of interaction of matter and field will be analyzed by first solving for the behavior of the field in terms of the coordinates of the matter, and finally discussing the behavior of the matter (by matter is actually meant the electrons and positrons). That is to say, we shall first eliminate the field variables from the equations of motion of the electrons and then discuss the behavior of the electrons. In this way all of the rules given in the paper II will be derived.

Actually, the straightforward elimination of the field variables will lead at first to an expression for the behavior of an arbitrary number of Dirac electrons. Since the number of electrons might be infinite, this can be used directly to find the behavior of the electrons according to hole theory by imagining that nearly all the negative energy states are occupied by electrons. But, at least in the case of motion in a fixed potential, it has been shown that this hole theory picture is equivalent to one in which a positron is represented as an electron whose space-time trajectory has had its time direction reversed. To show that this same picture may be used in quantum electrodynamics when the potentials are not fixed, a special argument is made based on a study of the relationship of quantum electrodynamics to motion in a fixed potential. Finally, it is pointed out that this relationship is quite general and might be used for a general statement of the laws of quantum electrodynamics.

Charges obeying the Klein-Gordon equation can be analyzed by a special formalism given in Appendix A. A fifth parameter is used to specify the four-dimensional trajectory so that the Lagrangian form of quantum mechanics can be used. Appendix B discusses in more detail the relation of real and virtual photon emission. An equation for the propagation of a self-interacting electron is given in Appendix C.

In the demonstration which follows we shall restrict ourselves temporarily to cases in which the particle’s motion is non-relativistic, but the transition of the final formulas to the relativistic case is direct, and the proof could have been kept relativistic throughout.

The transverse part of the electromagnetic field will be represented as an assemblage of independent harmonic oscillators each interacting with the particles, as suggested by Fermi.⁴ We use the notation of Heitler.⁵

2 Quantum electrodynamics in Lagrangian form

The Hamiltonian for a set of non-relativistic particles interacting with radiation is, classically, , where is the Hamiltonian of the particles of mass , charge , coordinate , and momentum , and their interaction with the transverse part of the electromagnetic field. This field can be expanded into plane waves

(1)

where and are two orthogonal polarization vectors at right angles to the propagation vector , magnitude . The sum over means, if normalized to unit volume, , and each , can be considered as the coordinate of a harmonic oscillator. (The factor arises for the mode corresponding to and to is the same.) The Hamiltonian of the transverse field represented as oscillators is

where is the momentum conjugate to . The longitudinal part of the field has been replaced by the Coulomb interaction⁶ ,

where . As is well known, when this Hamiltonian is quantized one arrives at the usual theory of quantum electrodynamics. To express these laws of quantum electrodynamics one can equally well use the Lagrangian form of quantum mechanics to describe this set of oscillators and particles. The classical Lagrangian equivalent to this Hamiltonian is where

(2)

When this Lagrangian is used in the Lagrangian forms of quantum mechanics of C, what it leads to is, of course, mathematically equivalent to the result of using the Hamiltonian in the ordinary way, and is therefore equivalent to the more usual forms of quantum electrodynamics (at least for non-relativistic particles). We may, therefore, proceed by using this Lagrangian form of quantum electrodynamics, with the assurance that the results obtained must agree with those obtained from the more usual Hamiltonian form.

The Lagrangian enters through the statement that the functional which carries the system from one state to another is where

(3)

The time integrals must be written as Riemann sums with some care; for example,

(4)

becomes according to C, Eq. (19)

(5)

so that the velocity which multiplies is

(6)

In the Lagrangian form it is possible to eliminate the transverse oscillators as is discussed in C, Section 13. One must specify, however, the initial and final state of all oscillators. We shall first choose the special, simple case that all oscillators are in their ground states initially and finally, so that all photons are virtual. Later we do the more general case in which real quanta are present initially or finally. We ask, then, for the amplitude for finding no quanta present and the particles in state , at time , if at time the particles were in state and no quanta were present.

The method of eliminating field oscillators is described in Section 13 of C. We shall simply carry out the elimination here using the notation and equations of C. To do this, for simplicity, we first consider in the next section the case of a particle or a system of particles interacting with a single oscillator, rather than the entire assemblage of the electromagnetic field.

3 Forced harmonic oscillator

We consider a harmonic oscillator, coordinate $q$, Lagrangian $L = \tfrac{1}{2}(\dot{q}^2 - \omega^2 q^2)$, interacting with a particle or system of particles, action $S_p$, through a term in the Lagrangian $q(t)\gamma(t)$ where $\gamma(t)$ is a function of the coordinates (symbolized as $x$) of the particle. The precise form of $\gamma(t)$ for each oscillator of the electromagnetic field is given in the next section. We ask for the amplitude that at some time $t''$ the particles are in state $\chi_{t''}$, and the oscillator is in, say, an eigenstate $m$ of energy $\omega(m + \tfrac{1}{2})$ (units are chosen such that $\hbar = 1$) when it is given that at a previous time $t'$ the particles were in state $\psi_{t'}$ and the oscillator in $n$. The amplitude for this is the transition amplitude [see C, Eq. (61)]

(7)

where represents the variables describing the particle, is the action calculated classically for the particles for a given path going from coordinate at to at , is the action for any path of the oscillator going from at to at , while

(8)

the action of interaction, is a functional of both and , the paths of oscillator and particles. The symbols and represent a summation over all possible paths of particles and oscillator which go between the given end points in the sense defined in C, Eq. (9). (That is, assuming time to proceed in infinitesimal steps, , an integral over all values of the coordinates and corresponding to each instant in time, suitably normalized.)

The problem may be broken in two. The result can be written as an integral over all paths of the particles only, of :

(9)

where is a functional of the path of the particles alone (since it depends on ) given by

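$G_{mn} = \langle \varphi_m | \exp i \int q(t)\gamma(t)\,dt\, | \varphi_n \rangle_{S_0} = \int \varphi_m^*(q_j) \exp\Big\{ i\epsilon \sum_{i=0}^{j-1} \big[ \tfrac{1}{2}\epsilon^{-2}(q_{i+1} - q_i)^2 - \tfrac{1}{2}\omega^2 q_i^2 + q_i\gamma_i \big] \Big\}\, \varphi_n(q_0)\, dq_0\, a^{-1} dq_1\, a^{-1} dq_2 \cdots a^{-1} dq_j$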

(10)

where we have written the out explicitly (and have set , , , ). The last form can be written as

(11)

where is the kernel [as in I, Eq. (2)] for a forced harmonic oscillator giving the amplitude for arrival at at time if at time it was known to be at . According to C it is given by

(12)

where is the action calculated along the classical path between the end points , and is given explicitly in C.⁷ It is

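$Q = \frac{\omega}{2\sin\omega(t''-t')} \Big[ (q_j^2 + q_0^2)\cos\omega(t''-t') - 2 q_j q_0 + \frac{2 q_0}{\omega}\int_{t'}^{t''} \gamma(t)\sin\omega(t''-t)\,dt + \frac{2 q_j}{\omega}\int_{t'}^{t''} \gamma(t)\sin\omega(t-t')\,dt - \frac{2}{\omega^2}\int_{t'}^{t''}\int_{t'}^{t} \gamma(t)\gamma(s)\sin\omega(t''-t)\sin\omega(s-t')\,ds\,dt \Big]$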

(13)

The solution of the motion of the oscillator can now be completed by substituting (12) and (13) into (11) and performing the integrals. The simplest case is for for which case⁸

so that the integrals on , are just Gaussian integrals. There results

a result of fundamental importance in the succeeding developments. By replacing $t - s$ by its absolute value $|t - s|$ we may integrate both variables over the entire range and divide by 2. We will henceforth make the results more general by extending the limits on the integrals from $-\infty$ to $+\infty$. Thus if one wishes to study the effect on a particle of interaction with an oscillator for just the period $t'$ to $t''$ one may use

(14)

imagining in this case that the interaction is zero outside these limits. We defer to a later section the discussion of other values of , .

Since is simply an exponential, we can write it as , consider that the complete “action” for the system of particles is and that one computes transition elements with this “action” instead of (see C, Sec. 12). The functional which is given by

(15)

is complex, however; we shall speak of it as the complex action. It describes the fact that the system at one time can affect itself at a different time by means of a temporary storage of energy in the oscillator. When there are several independent oscillators with different interactions, the effect, if they are all in the lowest state at $t'$ and $t''$, is the product of their separate contributions. Thus the complex action is additive, being the sum of contributions like (15) for each of the several oscillators.
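The ground-state result can be given a quick numerical sanity check. The sketch below assumes that Eq. (14) has the form $G_{00} = \exp[-(4\omega)^{-1}\iint \exp(-i\omega|t-s|)\gamma(t)\gamma(s)\,dt\,ds]$ (with $\hbar = 1$ and unit oscillator mass) and compares $|G_{00}|^2$ with the standard forced-oscillator probability $\exp(-|\beta|^2)$, $\beta = (2\omega)^{-1/2}\int\gamma(t)e^{i\omega t}dt$, of remaining in the ground state. The frequency, the time grid, and the Gaussian forcing are hypothetical choices made only for illustration.

```python
import numpy as np

def g00_double_integral(gamma, t, omega):
    """Assumed Eq. (14): exp[-(1/4w) sum_ij exp(-i w |t_i - t_j|) g_i g_j dt^2]."""
    dt = t[1] - t[0]
    kernel = np.exp(-1j * omega * np.abs(t[:, None] - t[None, :]))
    return np.exp(-(gamma @ kernel @ gamma) * dt * dt / (4.0 * omega))

def ground_state_probability(gamma, t, omega):
    """Standard forced-oscillator result exp(-|beta|^2), hbar = mass = 1."""
    dt = t[1] - t[0]
    beta = np.sum(gamma * np.exp(1j * omega * t)) * dt / np.sqrt(2.0 * omega)
    return float(np.exp(-abs(beta) ** 2))

omega = 1.3                        # hypothetical oscillator frequency
t = np.linspace(-8.0, 8.0, 801)    # grid standing in for the interval (t', t'')
gamma = 0.7 * np.exp(-t**2)        # hypothetical forcing, negligible at both ends

amplitude = g00_double_integral(gamma, t, omega)
prob_from_g00 = abs(amplitude) ** 2
prob_expected = ground_state_probability(gamma, t, omega)
print(prob_from_g00, prob_expected)
```

Only the modulus of $G_{00}$ is tested here; its phase is not constrained by this comparison.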

4 Virtual transitions in the electromagnetic field

We can now apply these results to eliminate the transverse field oscillators of the Lagrangian (2). At first we can limit ourselves to the case of purely virtual transitions in the electromagnetic field, so that there is no photon in the field at $t'$ and $t''$. That is, all of the field oscillators are making transitions from ground state to ground state.

The , corresponding to each oscillator , is found from the interaction term [Eq. (2b)], substituting the value of given in (1). There results, for example,

(16)

the corresponding results for , , replace , by .

The complex action resulting from oscillator of coordinate , is therefore

The term exchanges the cosines for sines, so in the sum , the product of the two cosines, is replaced by or . The terms , give the same result with replacing . The sum is since it is the sum of the products of vector components in two orthogonal directions, so that if we add the product in the third direction (that of K) we construct the complete scalar product. Summing over all K then, since we find for the total complex action of all of the transverse oscillators,

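$I_{tr} = i \sum_n \sum_m \int_{t'}^{t''} dt \int_{t'}^{t''} ds \int \exp(-ik|t-s|)\, e_n e_m \big[ \dot{\mathbf{x}}_n(t)\cdot\dot{\mathbf{x}}_m(s) - k^{-2}(\mathbf{K}\cdot\dot{\mathbf{x}}_n(t))(\mathbf{K}\cdot\dot{\mathbf{x}}_m(s)) \big] \cos\big(\mathbf{K}\cdot(\mathbf{x}_n(t) - \mathbf{x}_m(s))\big)\, \frac{d^3K}{8\pi^2 k}$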

(17)

This is to be added to to obtain the complete action of the system with the oscillators removed.
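The two algebraic steps used in summing over the four oscillators of each $\mathbf{K}$ can be checked numerically: the polarization sum over two unit vectors orthogonal to $\mathbf{K}$ completes the scalar product once the longitudinal part is subtracted, and the cosine and sine modes combine into a single cosine of the difference. In the sketch below the vectors `a` and `b` are random stand-ins for the two velocities, and the construction of `e1`, `e2` is a hypothetical choice of polarization basis.

```python
import numpy as np

rng = np.random.default_rng(0)

K = rng.normal(size=3)              # propagation vector
k = np.linalg.norm(K)

# Hypothetical polarization basis: two unit vectors orthogonal to K and to each other.
e1 = np.cross(K, rng.normal(size=3))
e1 /= np.linalg.norm(e1)
e2 = np.cross(K, e1) / k

a = rng.normal(size=3)              # stand-in for the velocity of particle n at time t
b = rng.normal(size=3)              # stand-in for the velocity of particle m at time s

# Polarization sum versus scalar product minus the component along K.
pol_sum = (e1 @ a) * (e1 @ b) + (e2 @ a) * (e2 @ b)
transverse = a @ b - (K @ a) * (K @ b) / k**2

# Cosine modes plus sine modes give the cosine of the phase difference.
u, v = rng.normal(size=2)           # stand-ins for K.x_n(t) and K.x_m(s)
mode_sum = np.cos(u) * np.cos(v) + np.sin(u) * np.sin(v)
mode_diff = np.cos(u - v)

print(pol_sum - transverse, mode_sum - mode_diff)
```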

The term in can be simplified by integration by parts with respect to and with respect to [note that has a discontinuous slope at , or break the integration up into two regions]. One finds

(18)

where

$R = -i \sum_n \sum_m \int_{t'}^{t''} dt \int_{t'}^{t''} ds \int e_n e_m \exp(-ik|t-s|)\, \big(1 - \dot{\mathbf{x}}_n(t)\cdot\dot{\mathbf{x}}_m(s)\big) \cos \mathbf{K}\cdot(\mathbf{x}_n(t) - \mathbf{x}_m(s))\, \frac{d^3K}{8\pi^2 k}$

(19)

and

(20)

comes from the discontinuity in slope of at . Since

this term just cancels the Coulomb interaction term . The term

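$I_{\text{transient}} = -\sum_n \sum_m e_n e_m \int \frac{d^3K}{4\pi^2 k^2} \Big\{ \int_{t'}^{t''} dt\, \big[ \exp(-ik(t''-t)) \cos \mathbf{K}\cdot(\mathbf{x}_n(t'') - \mathbf{x}_m(t)) + \exp(-ik(t-t')) \cos \mathbf{K}\cdot(\mathbf{x}_n(t) - \mathbf{x}_m(t')) \big] + \frac{i}{2k} \big[ \cos \mathbf{K}\cdot(\mathbf{x}_n(t'') - \mathbf{x}_m(t'')) + \cos \mathbf{K}\cdot(\mathbf{x}_n(t') - \mathbf{x}_m(t')) - 2 \exp(-ik(t''-t')) \cos \mathbf{K}\cdot(\mathbf{x}_n(t') - \mathbf{x}_m(t'')) \big] \Big\}$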

(21)

is one which comes from the limits of integration at and , and involves the coordinates of the particle at either one of these times or the other. If and are considered to be exceedingly far in the past and future, there is no correlation to be expected between these temporally distant coordinates and the present ones, so the effects of will cancel out quantum mechanically by interference. This transient was produced by the sudden turning on of the interaction of field and particles at and its sudden removal at . Alternatively we can imagine the charges to be turned on after adiabatically and turned off slowly before (in this case, in the term , the charges should also be considered as varying with time). In this case, in the limit, is zero.⁹ Hereafter we shall drop the transient term and consider the range of integration of to be from to , imagining, if one needs a definition, that the charges vary with time and vanish in the direction of either limit.
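The cancellation of the Coulomb term rests on the value of $\int \cos(\mathbf{K}\cdot\mathbf{R})\, d^3K/4\pi^2k^2$. Carrying out the angular integration reduces it to $(\pi r)^{-1}\int_0^\infty \sin(kr)\,dk/k$, which equals $(2r)^{-1}$. A minimal numerical sketch of this step follows; the convergence factor $\exp(-\eta k)$, the cutoff `k_max`, and the grid size are assumptions introduced only to make the oscillatory integral computable, and with them the radial integral becomes $\arctan(r/\eta)$.

```python
import numpy as np

def trapezoid(y, h):
    """Composite trapezoid rule on a uniform grid of spacing h."""
    return h * (np.sum(y) - 0.5 * (y[0] + y[-1]))

def coulomb_integral(r, damping, k_max=600.0, n=120001):
    """(1/(pi r)) * integral_0^k_max sin(k r)/k * exp(-damping k) dk."""
    k = np.linspace(0.0, k_max, n)
    h = k[1] - k[0]
    # np.sinc(x) = sin(pi x)/(pi x), so r*np.sinc(k r/pi) = sin(k r)/k, finite at k = 0.
    radial = r * np.sinc(k * r / np.pi) * np.exp(-damping * k)
    return trapezoid(radial, h) / (np.pi * r)

r = 1.7          # hypothetical separation |x_n - x_m|
damping = 0.05   # hypothetical convergence factor, to be taken toward zero

numeric = coulomb_integral(r, damping)
closed_form = np.arctan(r / damping) / (np.pi * r)   # exact value with damping
coulomb_limit = 1.0 / (2.0 * r)                      # limit of vanishing damping
print(numeric, closed_form, coulomb_limit)
```

As the damping is reduced the regulated value approaches $(2r)^{-1}$, the difference being of order $\eta/\pi r^2$.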

To simplify we need the integral

(22)

where is the length of the vector . Now

where the equation serves to define $\delta_+(x)$ [as in II, Eq. (3)]. Hence, expanding in exponentials, we find

$J = -(4\pi r)^{-1}\big((|t| - r)^{-1} - (|t| + r)^{-1}\big) + (4ir)^{-1}\big(\delta(|t| - r) - \delta(|t| + r)\big) = -(2\pi)^{-1}(t^2 - r^2)^{-1} + (2i)^{-1}\delta(t^2 - r^2) = -\tfrac{1}{2} i\, \delta_+(t^2 - r^2)$

(23)

where we have used the fact that

and that $\delta(|t| + r) = 0$ since both $|t|$ and $r$ are necessarily positive.
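Two ingredients of this reduction lend themselves to a numerical check. The sketch below assumes the definition $\pi\delta_+(x) = \int_0^\infty \exp(-ikx)\,dk$, regulated by a hypothetical factor $\exp(-\eta k)$ so that the integral equals $1/(\eta + ix)$, and it represents $\delta(t^2 - r^2)$ by a narrow Lorentzian to confirm that its weight on $t > 0$ is $(2r)^{-1}$. All numerical parameters are illustrative assumptions.

```python
import numpy as np

def trapezoid(y, h):
    """Composite trapezoid rule on a uniform grid of spacing h."""
    return h * (np.sum(y) - 0.5 * (y[0] + y[-1]))

# (a) Regulated defining integral: int_0^inf exp(-i k x - eta k) dk = 1/(eta + i x).
x, eta = 0.8, 0.05
k = np.linspace(0.0, 600.0, 120001)
numeric_delta_plus = trapezoid(np.exp(-1j * k * x - eta * k), k[1] - k[0])
closed_delta_plus = 1.0 / (eta + 1j * x)
# Real part -> pi*delta(x) as eta -> 0; imaginary part -> -1/x (principal value).

# (b) Weight of delta(t^2 - r^2) on t > 0, with delta modelled by a Lorentzian.
r, width = 1.5, 1e-3
t = np.linspace(0.0, 4.0, 200001)
lorentzian = width / (np.pi * ((t**2 - r**2) ** 2 + width**2))
weight = trapezoid(lorentzian, t[1] - t[0])
expected_weight = 1.0 / (2.0 * r)

# (c) Principal-value part: (|t|-r)^-1 - (|t|+r)^-1 = 2r/(t^2 - r^2).
tt = 2.3
pv_lhs = 1.0 / (abs(tt) - r) - 1.0 / (abs(tt) + r)
pv_rhs = 2.0 * r / (tt**2 - r**2)

print(numeric_delta_plus, closed_delta_plus, weight, expected_weight)
```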

Substitution of these results into (19) gives finally,

(24)

The total complex action of the system is then¹⁰ . Or, what amounts to the same thing; to obtain transition amplitudes including the effects of the field we must calculate the transition element of :

(25)

under the action of the particles, excluding interaction. Expression (24) for must be considered to be written in the usual manner as a Riemann sum and the expression (25) interpreted as defined in C [Eq. (39)]. Expression (6) must be used for at time .

Expression (25), with (24), then contains all the effects of virtual quanta on a (at least non-relativistic) system according to quantum electrodynamics. It contains the effects to all orders in in a single expression. If expanded in a power series in , the various terms give the expressions to the corresponding order obtained by the diagrams and methods of II. We illustrate this by an example in the next section.

5 Example of application of expression (25)

We shall not be much concerned with the non-relativistic case here, as the relativistic case given below is as simple and more interesting. It is, however, very similar and at this stage it is worth giving an example to show how expressions resulting from (25) are to be interpreted according to the rules of C. For example, consider the case of a single electron, coordinate , either free or in an external given potential (contained for simplicity in¹¹ , not in ). Its interaction with the field produces a reaction back on itself given by as in (24) but in which we keep only a single term corresponding to . Assume the effect of to be small and expand as . Let us find the amplitude at time of finding the electron in a state with no quanta emitted, if at time it was in the same state. It is

where if is the energy of the state, and

(26)

Here , etc. In (26) we shall limit the range of integrations by assuming , and double the result.

The expression within the brackets on the right-hand side of (26) can be evaluated by the methods described in C [Eq. (29)]. An expression such as (26) can also be evaluated directly in terms of the propagation kernel [see I, Eq. (2)] for an electron moving in the given potential.

The term in the non-relativistic case produces an interesting complication which does not have an analog for the relativistic case with the Dirac equation. We discuss it below, but for a moment consider, in further detail expression (26) but with the factor replaced simply by unity.

The kernel is defined and discussed in I. From its definition as the amplitude that the electron be found at at time , if at it was at , we have

(27)

that is, more simply is the sum of over all paths which go from space time point 1 to 2.

In the integrations over all paths implied by the symbol in (26) we can first integrate over all the variables corresponding to times from to , not inclusive, the result being a factor according to (27). Next we integrate on the variables between and not inclusive, giving a factor and finally on those between and giving . Hence the left-hand term in (26) excluding the factor is

(28)

which in improved notation and in the relativistic case is essentially the result given in II.

We have made use of a special case of a principle which may be stated more generally as

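$\langle \chi_{t''} | F(\mathbf{x}_1, t_1; \mathbf{x}_2, t_2; \cdots \mathbf{x}_k, t_k) | \psi_{t'} \rangle = \int \chi^*(\mathbf{x}_{t''})\, K(\mathbf{x}_{t''}, t''; \mathbf{x}_1, t_1)\, K(\mathbf{x}_1, t_1; \mathbf{x}_2, t_2) \cdots K(\mathbf{x}_{k-1}, t_{k-1}; \mathbf{x}_k, t_k)\, K(\mathbf{x}_k, t_k; \mathbf{x}_{t'}, t')\, F(\mathbf{x}_1, t_1; \mathbf{x}_2, t_2; \cdots \mathbf{x}_k, t_k)\, \psi(\mathbf{x}_{t'})\, d^3x_{t''}\, d^3x_1\, d^3x_2 \cdots d^3x_k\, d^3x_{t'}$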

(29)

where is any function of the coordinate , at time , at up to , , and, it is important to notice, we have assumed .

Expressions of higher order arising for example from are more complicated as there are quantities referring to several different times mixed up, but they all can be interpreted readily. One simply breaks up the ranges of integrations of the time variables into parts such that in each the order of time of each variable is definite. One then interprets each part by formula (29).

As a simple example we may refer to the problem of the transition element

arising, say, in the cross term in and in an ordinary second order perturbation problem (disregarding radiation) with perturbation potential . In the integration on and which should include the entire range of time for each, we can split the range of into two parts, and . In the first case, , the potential acts earlier than , and in the other range, vice versa, so that

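$\langle \chi_{t''} | \int U(\mathbf{x}_t, t)\,dt \int V(\mathbf{x}_s, s)\,ds\, | \psi_{t'} \rangle = \int_{t'}^{t''} dt \int_{t'}^{t} ds \int \chi^*(\mathbf{x}_{t''})\, K(\mathbf{x}_{t''}, t''; \mathbf{x}_t, t)\, U(\mathbf{x}_t, t)\, K(\mathbf{x}_t, t; \mathbf{x}_s, s)\, V(\mathbf{x}_s, s)\, K(\mathbf{x}_s, s; \mathbf{x}_{t'}, t')\, \psi(\mathbf{x}_{t'})\, d^3x_{t''}\, d^3x_t\, d^3x_s\, d^3x_{t'} + \int_{t'}^{t''} ds \int_{t'}^{s} dt \int \chi^*(\mathbf{x}_{t''})\, K(\mathbf{x}_{t''}, t''; \mathbf{x}_s, s)\, V(\mathbf{x}_s, s)\, K(\mathbf{x}_s, s; \mathbf{x}_t, t)\, U(\mathbf{x}_t, t)\, K(\mathbf{x}_t, t; \mathbf{x}_{t'}, t')\, \psi(\mathbf{x}_{t'})\, d^3x_{t''}\, d^3x_s\, d^3x_t\, d^3x_{t'}$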

(30)

so that the single expression on the left is represented by two terms analogous to the two terms required in analyzing the Compton effect. It is in this way that the several terms and their corresponding diagrams corresponding to each process arise when an attempt is made to represent the transition elements of single expressions involving time integrals in terms of the propagation kernels .
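The splitting into two time orderings can be illustrated with a finite-dimensional stand-in. In the sketch below a random Hermitian matrix `H0` plays the role of the unperturbed system, `K = exp(-i eps H0)` is the one-step kernel, and `U`, `V` are two perturbations; all of these, together with the coupling labels `a` and `b`, are hypothetical. The time-sliced product of factors `K (1 - i eps (a U + b V))` is expanded and its coefficient of `a*b` is compared with the explicit sum of the two orderings, U acting later than V and V acting later than U.

```python
import numpy as np

def step_kernel(H0, eps):
    """exp(-i eps H0) for Hermitian H0, via its eigendecomposition."""
    w, v = np.linalg.eigh(H0)
    return (v * np.exp(-1j * eps * w)) @ v.conj().T

def cross_term_bookkeeping(K, U, V, eps, N):
    """Coefficient of a*b in the product of N factors K (1 - i eps (a U + b V))."""
    d = K.shape[0]
    P0 = np.eye(d, dtype=complex)
    Pa = np.zeros((d, d), dtype=complex)
    Pb = np.zeros((d, d), dtype=complex)
    Pab = np.zeros((d, d), dtype=complex)
    for _ in range(N):
        # Left-multiply by the next (later) factor, keeping terms up to order a*b.
        Pab = K @ (Pab - 1j * eps * (U @ Pb + V @ Pa))
        Pa = K @ (Pa - 1j * eps * (U @ P0))
        Pb = K @ (Pb - 1j * eps * (V @ P0))
        P0 = K @ P0
    return Pab

def cross_term_two_orderings(K, U, V, eps, N):
    """Explicit sum over later step i and earlier step j of both operator orders."""
    Kp = [np.linalg.matrix_power(K, n) for n in range(N + 1)]
    d = K.shape[0]
    u_later = np.zeros((d, d), dtype=complex)
    v_later = np.zeros((d, d), dtype=complex)
    for i in range(N):
        for j in range(i):
            u_later += Kp[N - i] @ U @ Kp[i - j] @ V @ Kp[j]
            v_later += Kp[N - i] @ V @ Kp[i - j] @ U @ Kp[j]
    return (-1j * eps) ** 2 * (u_later + v_later)

def random_hermitian(rng, d):
    A = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    return 0.5 * (A + A.conj().T)

rng = np.random.default_rng(1)
d, N, eps = 3, 12, 0.05
H0, U, V = (random_hermitian(rng, d) for _ in range(3))
K = step_kernel(H0, eps)

bookkeeping = cross_term_bookkeeping(K, U, V, eps, N)
two_orderings = cross_term_two_orderings(K, U, V, eps, N)
print(np.max(np.abs(bookkeeping - two_orderings)))
```

Neither ordering alone reproduces the cross term when `U` and `V` fail to commute with the kernel; both are needed, as in the two terms of (30).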

It remains to study in more detail the term in (26) arising from in the interaction. The interpretation of such expressions is considered in detail in C, and we must refer to Eqs. (39) through (50) of that paper for a more thorough analysis. A similar type of term also arises in the Lagrangian formulation in simpler problems, for example the transition element

arising say, in the cross term in A and B in a second-order perturbation problem for a particle in a perturbing vector potential . The time integrals must first be written as Riemannian sums, the velocity (see (6)) being replaced by so that we ask for the transition element of

(31)

In C it is shown that when converted to operator notation the quantity is equivalent (nearly, see below) to an operator,

(32)

operating in order indicated by the time index (that is after ’s for and before all ’s for ). In nonrelativistic mechanics is the momentum operator divided by the mass . Thus in (31) the expression becomes . Here again we must split the sum into two regions and so the quantities in the usual notation will operate in the right order such that eventually (31) becomes identical with the right-hand side of Eq. (30) but with replaced by the operator

standing in the same place, and with the operator

standing in the place of . The sums and factors ϵ have now become .

This is nearly but not quite correct, however, as there is an additional term coming from the terms in the sum corresponding to the special values, , and . We have tacitly assumed from the appearance of the expression (31) that, for a given , the contribution from just three such special terms is of order . But this is not true. Although the expected contribution of a term like for is indeed of order , the expected contribution of is [C, Eq. (50)], that is, of order ϵ. In nonrelativistic mechanics the velocities are unlimited and in very short times ϵ the amplitude diffuses a distance proportional to the square root of the time. Making use of this equation then we see that the additional contribution from these terms is essentially

when summed on all . This has the same effect as a first-order perturbation due to a potential . Added to the term involving the momentum operators we therefore have an additional term¹²

(33)

In the usual Hamiltonian theory this term arises, of course, from the term in the expansion of the Hamiltonian

while the other term arises from the second-order action of . We shall not be interested in non-relativistic quantum electrodynamics in detail. The situation is simpler for Dirac electrons. For particles satisfying the Klein-Gordon equation (discussed in Appendix A) the situation is very similar to a four-dimensional analog of the non-relativistic case given here.
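The statement that squared displacements over a short step $\epsilon$ contribute at order $\epsilon$ rather than $\epsilon^2$ can be checked on the free-particle weight $\exp(i m \Delta^2/2\epsilon)$. The sketch below computes the mean of $\Delta^2$ under this weight. Because the Fresnel integral is only conditionally convergent, a small damping `eta` is introduced (an assumption made purely for numerical convergence), so that the weight becomes $\exp(-\alpha\Delta^2)$ with $\alpha = m(\eta - i)/2\epsilon$ and the mean is $1/2\alpha$, which tends to $i\epsilon/m$ as the damping is removed. The values of `m`, `eta`, and the grid are hypothetical.

```python
import numpy as np

def mean_square_step(eps, m=1.0, eta=0.2, half_width=4.0, n=400001):
    """Mean of dx^2 under the damped one-step weight exp(-alpha dx^2)."""
    alpha = m * (eta - 1j) / (2.0 * eps)
    dx = np.linspace(-half_width, half_width, n)
    weight = np.exp(-alpha * dx**2)
    # Uniform grid: the spacing cancels in the ratio of the two sums.
    return np.sum(dx**2 * weight) / np.sum(weight)

m, eta = 1.0, 0.2
value_eps = mean_square_step(0.01, m=m, eta=eta)
value_2eps = mean_square_step(0.02, m=m, eta=eta)
expected_eps = 0.01 / (m * (eta - 1j))     # 1/(2 alpha), linear in eps
print(value_eps, expected_eps, value_2eps / value_eps)
```

Doubling $\epsilon$ doubles the mean square displacement, which is the linear scaling responsible for the extra term discussed above.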

6 Extension to Dirac particles

Expressions (24) and (25) and their proof can be readily generalized to the relativistic case according to the one electron theory of Dirac. We shall discuss the hole theory later. In the non-relativistic case we began with the proposition that the amplitude for a particle to proceed from one point to another is the sum over paths of , that is, we have for example for a transition element

(34)

where for we have written , that is more precisely,

As discussed in C this form is related to the usual form of quantum mechanics through the observation that

(35)

where is the transformation matrix from a representation in which is diagonal at time to one in which is diagonal at time (so that it is identical to for the small time interval ). Hence the amplitude for a given path can also be written

(36)

for which form, of course, (34) is exact irrespective of whether can be expressed in the simple form (35).

For a Dirac electron the is a matrix (or if we deal with electrons) but the expression (34) with (36) is still correct (as it is in fact for any quantum-mechanical system with a sufficiently general definition of the coordinate ). The product (36) now involves operators, the order in which the factors are to be taken is the order in which the terms appear in time.

For a Dirac particle in a vector and scalar potential (times the electron charge ) , , the quantity is related to that of a free particle to the first order in as

(37)

This can be verified directly by substitution into the Dirac equation.¹³ It neglects the variation of and with time and space during the short interval . This produces errors only of order in the Dirac case for the expected square velocity during the interval ϵ is finite (equaling the square of the velocity of light) rather than being of order as in the non-relativistic case. [This makes the relativistic case somewhat simpler in that it is not necessary to define the velocity as carefully as in (6); is sufficiently exact, and no term analogous to (33) arises.]

Thus differs from that for a free particle, , by a factor which in the limit can be written as

$\exp\left\{ -i \int \left[ A_4(\mathbf{x}(t), t) - \dot{\mathbf{x}}(t)\cdot\mathbf{A}(\mathbf{x}(t), t) \right] dt \right\}$

(38)

exactly as in the non-relativistic case.

The case of a Dirac particle interacting with the quantum-mechanical oscillators representing the field may now be studied. Since the dependence of on , is through the same factor as in the non-relativistic case, when , are expressed in terms of the oscillator coordinates , the dependence of on the oscillator coordinates is unchanged. Hence the entire analysis of the preceding sections which concern the results of the integration over oscillator coordinates can be carried through unchanged and the results will be expression (25) with formula (24) for . Expression (25) is now interpreted as

(39)

where the amplitude for a particular path for particle is simply the expression (36) where is the kernel for a free electron according to the one electron Dirac theory, with the matrices which appear operating on the spinor indices corresponding to particle () and the order of all operations being determined by the time indices.

For calculational purposes we can, as before, expand as a power series and evaluate the various terms in the same manner as for the non-relativistic case. In such an expansion the quantity () is replaced, as we have seen in (32), by the operator , that is, in this case by operating at the corresponding time. There is no further complicated term analogous to (33) arising in this case, for the expected value of is now of order rather than .

For example, for self-energy one sees that expression (28) will be (with other terms coming from those with () replaced by and with the usual in back of each because of the definition of in relativity theory)

(40)

where and a sum on the repeated index μ is implied in the usual way; . One can change to and to In this manner all of the rules referring to virtual photons discussed in II are deduced; but with the difference that is used instead of and we have the Dirac one electron theory with negative energy states (although we may have any number of such electrons).

7 Extension to positron theory

Since in (39) we have an arbitrary number of electrons, we can deal with the hole theory in the usual manner by imagining that we have an infinite number of electrons in negative energy states.

On the other hand, in paper I on the theory of positrons, it was shown that the results of the hole theory in a system with a given external potential were equivalent to those of the Dirac one electron theory if one replaced the propagation kernel, , by a different one, , and multiplied the resultant amplitude by a factor $C_v$ involving . We must now see how this relation, derived in the case of external potentials, can also be carried over in electrodynamics to be useful in simplifying expressions involving the infinite sea of electrons.

To do this we study in greater detail the relation between a problem involving virtual photons and one involving purely external potentials. In using (25) we shall assume in accordance with the hole theory that the number of electrons is infinite, but that they all have the same charge, . Let the states represent the vacuum plus perhaps a number of real electrons in positive energy states and perhaps also some empty negative energy states. Let us call the amplitude for the transition in an external potential , but excluding virtual photons, , a functional of (1). We have seen (38)

(41)

where

by (38). We can write this as

where and , the other values of μ corresponding to space variables. The corresponding amplitude for the same process in the same potential, but including all the virtual photons we may call,

(42)

Now let us consider the effect on of changing the coupling of the virtual photons. Differentiating (42) with respect to which appears only¹⁴ in we find

(43)

We can also study the first-order effect of a change of :

(44)

where is the field point at which the derivative with respect to , is taken¹⁵ and the term (current density) is just . The function means that is, with . A second variation of gives, by differentiation of (44) with respect to ,

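$\frac{\delta^2 T_{e^2}[B]}{\delta B_\mu(1)\, \delta B_\nu(2)} = -\Big\langle \chi_{t''} \Big| \sum_n \sum_m \int dt \int ds\, \dot{x}_\mu^{(m)}(t)\, \dot{x}_\nu^{(n)}(s)\, \delta^4\big(x_\alpha^{(m)}(t) - x_{\alpha,1}\big)\, \delta^4\big(x_\beta^{(n)}(s) - x_{\beta,2}\big) \exp i(R + P) \Big| \psi_{t'} \Big\rangle .$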

Comparison of this with (43) shows that

(45)

where .

We now proceed to use this equation to prove the validity of the rules given in II for electrodynamics. This we do by the following argument. The equation can be looked upon as a differential equation for . It determines uniquely if is known. We have shown it is valid for the hole theory of positrons. But in I we have given formulas for calculating whose correctness relative to the hole theory we have there demonstrated. Hence we have shown that the obtained by solving (45) with the initial condition as given by the rules in I will be equal to that given for the same problem by the second quantization theory of the Dirac matter field coupled with the quantized electromagnetic field. But it is evident (the argument is given in the next paragraph) that the rules¹⁶ given in II constitute a solution in power series in of the Eq. (45) [which for reduce to the given in I]. Hence the rules in II must give, to each order in , the matrix element for any process that would be calculated by the usual theory of second quantization of the matter and electromagnetic fields. This is what we aimed to prove.

That the rules of II represent, in a power series expansion, a solution of (45) is clear. For the rules there given may be stated as follows: Suppose that we have a process to order in (i.e., having virtual photons) and order in the external potential . Then, the matrix element for the process with one more virtual photon and two fewer potentials is that obtained from the previous matrix by choosing from the potentials a pair, say acting at 1 and acting at 2, replacing them by , adding the results for each way of choosing the pair, and dividing by , the present number of photons. The matrix with no virtual photons being given to any by the rules of I, this permits terms to all orders in to be derived by recursion. It is evident that the rule in italics is that of II, and equally evident that it is a word expression of Eq. (45). [The factor in (45) arises since in integrating over all and we count each pair twice. The division by is required by the rules of II for, there, each diagram is to be taken only once, while in the rule given above we say what to do to add one extra virtual photon to others. But which one of the is to be identified as the last photon added is irrelevant. It agrees with (45) of course for it is canceled on differentiating with respect to the factor for the photons.]
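The bookkeeping in this rule can be checked by brute force. The sketch below is only an illustration, not part of the derivation: it treats the external potentials as labelled objects, applies the recursion (choose a pair, sum over every choice, divide by the number of photons then present) until no potentials remain, and records the accumulated weight of each resulting set of pairs. The function name `add_photons` and the integer labels are hypothetical. If the rule is read correctly, every distinct diagram should end up with weight exactly one.

```python
from fractions import Fraction
from itertools import combinations


def add_photons(potentials):
    """Apply the pairing recursion until no external potentials remain.

    Returns a dict mapping each final set of pairs (a "diagram") to the
    weight it accumulates. Labels are assumed distinct and sortable.
    """
    # key: (pairs chosen so far, potentials still unpaired) -> weight
    level = {(frozenset(), tuple(potentials)): Fraction(1)}
    photons = 0
    while any(rest for (_, rest) in level):
        photons += 1
        nxt = {}
        for (pairs, rest), weight in level.items():
            for a, b in combinations(rest, 2):
                new_rest = tuple(p for p in rest if p not in (a, b))
                key = (pairs | {(a, b)}, new_rest)
                # add the result for this choice of pair and divide by
                # the present number of photons
                nxt[key] = nxt.get(key, Fraction(0)) + weight / photons
        level = nxt
    return {pairs: weight for (pairs, _), weight in level.items()}


if __name__ == "__main__":
    for diagram, weight in add_photons([1, 2, 3, 4]).items():
        print(sorted(diagram), weight)
```

With four potentials there are three ways of joining them in pairs, and each is reached by two orders of adding the photons; the division by the photon number is what brings each weight back to one.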

8 Generalized formulation of quantum electrodynamics

The relation implied by (45) between the formal solution for the amplitude for a process in an arbitrary unquantized external potential to that in a quantized field appears to be of much wider generality. We shall discuss the relation from a more general point of view here (still limiting ourselves to the case of no photons in initial or final state).

In earlier sections we pointed out that as a consequence of the Lagrangian form of quantum mechanics the aspects of the particles’ motions and the behavior of the field could be analyzed separately. What we did was to integrate over the field oscillator coordinates first. We could, in principle, have integrated over the particle variables first. That is, we first solve the problem with the action of the particles and their interaction with the field and then multiply by the exponential of the action of the field and integrate over all the field oscillator coordinates. (For simplicity of discussion let us put aside from detailed special consideration the questions involving the separation of the longitudinal and transverse parts of the field.) Now the integral over the particle coordinates for a given process is precisely the integral required for the analysis of the motion of the particles in an unquantized potential. With this observation we may suggest a generalization to all types of systems.

Let us suppose the formal solution for the amplitude for some given process with matter in an external potential is some numerical quantity . We mean matter in a more general sense now, for the motion of the matter may be described by the Dirac equation, or by the Klein-Gordon equation, or may involve charged or neutral particles other than electrons and positrons in any manner whatsoever. The quantity depends of course on the potential function ; that is, it is a functional of this potential. We assume we have some expression for it in terms of (exact, or to some desired degree of approximation in the strength of the potential).

Then the answer to the corresponding problem in quantum electrodynamics is summed over all possible distributions of field , wherein is the action for the field the sum on carrying the usual minus sign for space components.

If is any functional of we shall represent by this superposition of over distributions of for the case in which there are no photons in initial or final state. That is, we have

(46)

The evaluation of directly from the definition of the operation is not necessary. We can give the result in another way. We first note that the operation is linear,

(47)

so that if is represented as a sum of terms each term can be analyzed separately. We have studied essentially the case in which is an exponential function. In fact, what we have done in Section 4 may be repeated with slight modification to show that

(48)

where is an arbitrary function of position and time for each value of .

Although this gives the evaluation of for only a particular functional of the appearance of the arbitrary function makes it sufficiently general to permit the evaluation for any other functional. For it is to be expected that any functional can be represented as a superposition of exponentials with different functions (by analogy with the principle of Fourier integrals for ordinary functions). Then, by (47), the result of the operation is the corresponding superposition of expressions equal to the right-hand side of (48) with the various ’s substituted for .

In many applications can be given as a power series in :

(49)

where are known numerical functions independent of . Then by (47)

(50)

where we set (from (48) with ). We can work out expressions for the successive powers of by differentiating both sides of (48) successively with respect to and setting in each derivative. For example, the first variation (derivative) of (48) with respect to gives

$${}_0\Big|\, {-i} A_\mu(3)\, \exp\Big(-i \int j_\nu(1)\, A_\nu(1)\, d\tau_1\Big) \Big|_0 \;\cdots\; \iint j_\nu(1)\, j_\nu(2)\, \delta_+(s_{12}^2)\, d\tau_1\, d\tau_2 \Big) .$$

(51)

Setting gives

Differentiating (51) again with respect to and setting shows

(52)

and so on for higher powers. These results may be substituted into (50). Clearly therefore when in (46) is expanded in a power series and the successive terms are computed in this way, we obtain the results given in II.
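The procedure of differentiating the exponential formula and then setting the source to zero can be tried on a one-variable analog. The sketch below is an assumption made for illustration only: a single number `A` stands in for the field, a single number `j` for the source, and the exponential formula is taken in the form "average of exp(-i j A) equals exp(-c j^2 / 2)", with `c` playing the role of the two-point result (52). The names `moment_from_series` and `count_pairings` are hypothetical. Comparing powers of `j` gives the moments, and they come out as the number of ways of joining the factors in pairs times a power of `c`.

```python
from fractions import Fraction
from math import factorial


def moment_from_series(n, c):
    """n-th moment <A^n> read off from
    sum_n (-i j)^n <A^n> / n!  =  exp(-c j^2 / 2)."""
    if n % 2:
        return Fraction(0)  # no odd powers of j on the right-hand side
    k = n // 2
    rhs_coeff = (-Fraction(c) / 2) ** k / factorial(k)  # coefficient of j^(2k)
    lhs_factor = Fraction((-1) ** k, factorial(n))      # (-i)^(2k) / (2k)!
    return rhs_coeff / lhs_factor


def count_pairings(n):
    """Number of ways to join n labelled factors into unordered pairs."""
    if n == 0:
        return 1
    if n % 2:
        return 0
    return (n - 1) * count_pairings(n - 2)


if __name__ == "__main__":
    c = Fraction(3)
    for n in range(0, 8, 2):
        print(n, moment_from_series(n, c), count_pairings(n) * c ** (n // 2))
```

In this toy version the statement "replace each pair of potentials by the two-point function and add the results for each way of choosing the pairs" is just the equality of the two printed columns.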

It is evident that (46), (47), (48) imply that satisfies the differential equation (45) and conversely (45) with the definition (46) implies (47) and (48). For if is an exponential

(53)

we have from (46), (48) that

(54)

Direct substitution of this into Eq. (45) shows it to be a solution satisfying the boundary condition (53). Since the differential equation (45) is linear, if is a superposition of exponentials, the corresponding superposition of solutions (54) is also a solution.

Many of the formal representations of the matter system (such as that of second quantization of Dirac electrons) represent the interaction with a fixed potential in a formal exponential form such as the left-hand side of (48), except that is an operator instead of a numerical function. Equation (48) may still be used if care is exercised in defining the order of the operators on the right-hand side. The succeeding paper will discuss this in more detail.

Equation (45) or its solution (46), (47), (48) constitutes a very general and convenient formulation of the laws of quantum electrodynamics for virtual processes. Its relativistic invariance is evident if it is assumed that the unquantized theory giving is invariant. It has been proved to be equivalent to the usual formulation for Dirac electrons and positrons (for Klein-Gordon particles see Appendix A). It is suggested that it is of wide generality. It is expressed in a form which has meaning even if it is impossible to express the matter system in Hamiltonian form; in fact, it only requires the existence of an amplitude for fixed potentials which obeys the principle of superposition of amplitudes. If is known in power series in , calculations of in a power series of can be made directly using the italicized rule of Sec. 7. The limitation to virtual quanta is removed in the next section.

On the other hand, the formulation is unsatisfactory because for situations of importance it gives divergent results, even if is finite. The modification proposed in II of replacing in (45), (48) by is not satisfactory owing to the loss of the theorems of conservation of energy or probability discussed in II at the end of Sec. 6. There is the additional difficulty in positron theory that even is infinite to begin with (vacuum polarization). Computational ways of avoiding these troubles are given in II and in the references of footnote 2.

9 Case of real photons

The case in which there are real photons in the initial or the final state can be worked out from the beginning in the same manner.¹⁷ We first consider the case of a system interacting with a single oscillator. From this result the generalization will be evident. This time we shall calculate the transition element between an initial state in which the particle is in state and the oscillator is in its th eigenstate (i.e., there are photons in the field) to a final state with particle in , oscillator in th level. As we have already discussed, when the coordinates of the oscillator are eliminated the result is the transition element where

where , are the wave functions [8] for the oscillator in state , and is given in (12). The can be evaluated most easily by calculating the generating function

(55)

for arbitrary , . If expression (11) is substituted in the left-hand side of (55), the expression can be simplified by use of the generating function relation for the eigenfunctions [8] of the harmonic oscillator

Using a similar expansion for the one is left with the exponential of a quadratic function of and . The integration on and is then easily performed to give

(56)

from which expansion in powers of and and comparison to (3.11) gives the final result

(57)

where is given in (14) and

(58)

and the sum on is to go from 0 to or to whichever is the smaller. (The sum can be expressed as a Laguerre polynomial but there is no advantage in this.)
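The generating function relation for the oscillator eigenfunctions used above can be checked numerically. The sketch below assumes the time-independent form of the relation for a unit-mass oscillator with hbar = 1, namely that the sum of phi_n(q) X^n / sqrt(n!) equals (omega/pi)^(1/4) exp(-omega q^2/2 + sqrt(2 omega) q X - X^2/2). The time-dependent phases carried in the text are left out, and the function names are hypothetical.

```python
import math


def oscillator_eigenfunction(n, q, omega=1.0):
    """phi_n(q) for a unit-mass oscillator, hbar = 1 (assumed normalization)."""
    x = math.sqrt(omega) * q
    h_prev, h = 0.0, 1.0  # placeholder for H_{-1}, and H_0 = 1
    for k in range(n):
        # Hermite recurrence: H_{k+1} = 2 x H_k - 2 k H_{k-1}
        h_prev, h = h, 2.0 * x * h - 2.0 * k * h_prev
    norm = (omega / math.pi) ** 0.25 / math.sqrt(2.0 ** n * math.factorial(n))
    return norm * h * math.exp(-0.5 * x * x)


def generating_sum(q, X, omega=1.0, terms=60):
    """Truncated sum over n of phi_n(q) X^n / sqrt(n!)."""
    return sum(
        oscillator_eigenfunction(n, q, omega) * X ** n / math.sqrt(math.factorial(n))
        for n in range(terms)
    )


def generating_closed_form(q, X, omega=1.0):
    """Assumed closed form of the same sum."""
    return (omega / math.pi) ** 0.25 * math.exp(
        -0.5 * omega * q * q + math.sqrt(2.0 * omega) * q * X - 0.5 * X * X
    )


if __name__ == "__main__":
    print(generating_sum(0.7, 0.5, omega=1.3))
    print(generating_closed_form(0.7, 0.5, omega=1.3))
```

Because the closed form is the exponential of a quadratic in q and X, substituting it for both oscillator wave functions leaves only Gaussian integrals, which is the simplification the text relies on.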

Formula (57) is readily understandable. Consider first a simple case of absorption of one photon. Initially we have one photon and finally none. The amplitude for this is the transition element of or . This is the same as would result if we asked for the transition element for a problem in which all photons are virtual but there was present a perturbing potential and we required the first-order effect of this potential. Hence photon absorption is like the first order action of a potential varying in time as that is with a positive frequency (i.e., the sign of the coefficient of in the exponential corresponds to positive energy). The amplitude for emission of one photon involves , which is the same result except that the potential has negative frequency. Thus we begin by interpreting as the amplitude for emission of one photon as the amplitude for absorption of one.

Next for the general case of photons initially and finally we may understand (57) as follows. We first neglect Bose statistics and imagine the photons as individual distinct particles. If we start with and end with this process may occur in several different ways. The particle may absorb in total of the photons and the final photons will represent of the photons which were present originally plus new photons emitted by the particle. In this case the which are to be absorbed may be chosen from among the original in different ways, and each contributes a factor , the amplitude for absorption of a photon. Which of the photons from among the are emitted can be chosen in different ways and each photon contributes a factor in amplitude. The initial photons which do not interact with the particle can be re-arranged among the final in ways. We must sum over the alternatives corresponding to different values of . Thus the form of can be understood. The remaining factor may be interpreted as saying that in computing probabilities (which therefore involves the square of ) the photons may be considered as independent but that if are actually equal the statistical weight of each of the states which can be made by rearranging the m equal photons is only . This is the content of Bose statistics; that equal particles in a given state represents just one state, i.e., has statistical weight unity, rather than the statistical weight which would result if it is imagined that the particles and states can be identified and rearranged in different ways. This holds for both the initial and final states of course. From this rule about the statistical weights of states the derivation of the blackbody distribution law follows.
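The counting of alternatives described here can be confirmed by direct enumeration. In the sketch below the photons are treated as distinct, as the argument supposes. The symbols `n`, `m` and `r` are our labels for the initial number, the final number, and the number that pass through without interacting, and the function names are hypothetical. A history is specified by saying which initial photon becomes which final photon; the unmatched initial photons are the absorbed ones and the unmatched final photons are the emitted ones.

```python
from itertools import combinations, permutations
from math import comb, factorial


def count_histories(n, m, r):
    """Enumerate histories in which r of n initial photons survive
    among m final photons, all photons treated as distinct."""
    histories = set()
    for kept_in in combinations(range(n), r):       # the other n - r are absorbed
        for kept_out in combinations(range(m), r):  # the other m - r are emitted
            for image in permutations(kept_out):    # rearrangements of the survivors
                histories.add(tuple(zip(kept_in, image)))
    return len(histories)


def closed_form(n, m, r):
    """Product of the three counting factors named in the text."""
    return comb(n, r) * comb(m, r) * factorial(r)


if __name__ == "__main__":
    n, m = 3, 2
    for r in range(min(n, m) + 1):
        print(r, count_histories(n, m, r), closed_form(n, m, r))
```

Each history counted this way carries one absorption factor per absorbed photon and one emission factor per emitted photon, so the sum over `r` reproduces the structure of the sum discussed in connection with (57).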

The actual electromagnetic field is represented as a host of oscillators each of which behaves independently and produces its own factor such as . Initial or final states may also be linear combinations of states in which one or another oscillator is excited. The results for this case are of course the corresponding linear combination of transition elements.

For photons of a given direction of polarization and for sin or cos waves the explicit expression for β can be obtained directly from (58) by substituting the formulas (16) for the γ’s for the corresponding oscillator. It is more convenient to use the linear combination corresponding to running waves. Thus we find the amplitude for absorption of a photon of momentum K, frequency polarized in direction e is given by including a factor times

(59)

in the transition element (25). The density of states in momentum space is now . The amplitude for emission is just times the complex conjugate of this expression, or what amounts to the same thing, the same expression with the sign of the four vector (?) reversed. Since the factor (59) is exactly the first-order effect of a vector potential

of the corresponding classical wave, we have derived the rules for handling real photons discussed in II.

We can express this directly in terms of the quantity , the amplitude for a given transition without emission of a photon. What we have said is that the amplitude for absorption of just one photon whose classical wave form is (time variation corresponding to positive energy ) is proportional to the first order (in ϵ) change produced in on changing to . That is, more exactly,

(60)

is the amplitude for absorption by the particle system of one photon, . (A superposition argument shows the expression to be valid not only for plane waves, but for spherical waves, etc., as given by the form of .) The amplitude for emission is the same expression but with the sign of the frequency reversed in . The amplitude that the system absorbs two photons with waves and is obtained from the next derivative,

the same expression holding for the absorption of one and emission of the other, or emission of both depending on the sign of the time dependence of and . Larger photon numbers correspond to higher derivatives, absorption of emission of requiring the derivatives. When two or more of the photons are exactly the same (e.g., ) the same expression holds for the amplitude that are absorbed by the system while are emitted. However, the statement that initially of a kind are present and of this kind are present finally, does not imply and . It is possible that only were absorbed by the system and emitted, and that remained from initial to final state without interaction. This term is weighted by the combinatorial coefficient and summed over the possibilities for as explained in connection with (57). Thus once the amplitude for virtual processes is known, that for real photon processes can be obtained by differentiation.

It is possible, of course, to deal with situations in which the electromagnetic field is not in a definite state after the interaction. For example, we might ask for the total probability of a given process, such as scattering, without regard for the number of photons emitted. This is done of course by squaring the amplitude for the emission of photons of a given kind and summing on all . Actually the sums and integrations over the oscillator momenta can usually easily be performed analytically. For example, the amplitude, starting from vacuum and ending with photons of a given kind, is by (56) just

(61)

The square of the amplitude summed on requires the product of two such expressions (the in the β of one and in the other will have to be kept separately) summed on :

In the resulting expression the sum over all oscillators is easily done. Such expressions can be of use in the analysis in a direct manner of problems of line width, of the Bloch-Nordsieck infra-red problem, and of statistical mechanical problems, but no such applications will be made here.

The author appreciates his opportunities to discuss these matters with Professor H. A. Bethe and Professor J. Ashkin, and the help of Mr. M. Baranger with the manuscript.

A The Klein-Gordon equation

In this Appendix we describe a formulation of the equations for a particle of spin zero which was first used to obtain the rules given in II for such particles. The complete physical significance of the equations has not been analyzed thoroughly so that it may be preferable to derive the rules directly from the second quantization formulation of Pauli and Weisskopf. This can be done in a manner analogous to the derivation of the rules for the Dirac equation given in I or from the Schwinger-Tomonaga formulation [2] in a manner described, for example, by Rohrlich.¹⁸ The formulation given here is therefore not necessary for a description of spin zero particles but is given only for its own interest as an alternative to the formulation of second quantization.

We start with the Klein-Gordon equation

(A1)

for the wave function ψ of a particle of mass in a given external potential . We shall try to represent this in a manner analogous to the formulation of quantum mechanics in C. That is, we try to represent the amplitude for a particle to get from one point to another as a sum over all trajectories of an amplitude where is the classical action for a given trajectory. To maintain the relativistic invariance in evidence the idea suggests itself of describing a trajectory in space-time by giving the four variables as functions of some fifth parameter (rather than expressing , , in terms of ). As we expect to represent paths which may reverse themselves in time (to represent pair production, etc., as in I) this is certainly a more convenient representation, for all four functions may be considered as functions of a parameter (somewhat analogous to proper time) which increase as we go along the trajectory, whether the trajectory is proceeding forward or backward in time.¹⁹ We shall then have a new type of wave function a function of five variables, standing for the four . It gives the amplitude for arrival at point with a certain value of the parameter . We shall suppose that this wave function satisfies the equation

(A2)

which is seen to be analogous to the time-dependent Schrödinger equation, replacing the time and the four coordinates of space-time replacing the usual three coordinates of space.

Since the potentials are functions only of coordinates , and are independent of , the equation is separable in and we can write a special solution in the form where , a function of the coordinates only, satisfies (A1) and the eigenvalue conjugate to is related to the mass of the particle. Equation (A2) is therefore equivalent to the Klein-Gordon Eq. (A1) provided we ask in the end only for the solution of (A1) corresponding to the eigenvalue for the quantity conjugate to .

We may now proceed to represent Eq. (A2) in Lagrangian form in general and without regard to this eigenvalue condition. Only in the final solutions need we apply the eigenvalue condition. That is, if we have some special solution of (A2) we can select that part corresponding to the eigenvalue by calculating

(A3)

and thereby obtain a solution ψ of Eq. (A1).

Since (A2) is so closely analogous to the Schrödinger equation, it is easily written in the Lagrangian form described in C, simply by working by analogy. For example if is known at one value of its value at a slightly larger is given by

(A4)

where means , and the sign of the normalizing factor is changed for the component since the component has the reversed sign in its quadratic coefficient in the exponential, in accordance with our summation convention . Equation (A4), as can be verified readily as described in C, Sec. 6, is equivalent to first order in ϵ to Eq. (A2). Hence, by repeated use of this equation the wave function at can be represented in terms of that at by:

$$\phi(x_{\nu,n}, u_0) = \int \exp -\frac{i\epsilon}{2} \sum_{i=1} \cdots\; {}_{\nu,0}, 0\big)\, \prod_{i=0} \Big(\frac{d^4\tau_i}{4\pi^2\epsilon^2 i}\Big) .$$

(A5)

That is, roughly, the amplitude for getting from one point to another with a given value of is the sum over all trajectories of where

(A6)

when sufficient care is taken to define the quantities, as in C. This completes the formulation for particles in a fixed potential but a few words of description may be in order.

In the first place in the special case of a free particle we can define a kernel for arrival from to at as the sum over all trajectories between these points of . Then for this case we have

(A7)

and it is easily verified that is given by

(A8)

for and by , by definition, for . The corresponding kernel of importance when we select the eigenvalue is²⁰

(A9)

(the last extends only from since is zero for negative ) which is identical to the defined in II.²¹ This may be seen readily by studying the Fourier transform, for the transform of the integrand on the right-hand side is

so that the integration gives for the transform of just with the pole defined exactly as in II. [21] Thus we are automatically representing the positrons as trajectories with the time sense reversed.
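The way the integration over the fifth parameter produces a propagator with a definite rule for the pole can be seen in a one-line integral. In the sketch below `a` stands for the combination of squared four-momentum and squared mass that multiplies the parameter in the exponent (half their difference under the sign conventions we assume), and `eps` is a small damping added by hand to make the integral converge. The normalization constants of (A9) are left out, and the function names are hypothetical. The integral of exp((i a - eps) u) from zero to infinity is i / (a + i eps), so the restriction to positive values of the parameter is what fixes the side on which the pole is passed.

```python
import cmath


def u_integral(a, eps, u_max=60.0, steps=100_000):
    """Trapezoid estimate of the integral of exp((i a - eps) u) for 0 <= u <= u_max."""
    h = u_max / steps
    rate = 1j * a - eps
    total = 0.5 * (1.0 + cmath.exp(rate * u_max))  # end points with half weight
    for k in range(1, steps):
        total += cmath.exp(rate * (k * h))
    return total * h


def pole_form(a, eps):
    """Closed form of the same integral extended to infinity."""
    return 1j / (a + 1j * eps)


if __name__ == "__main__":
    print(u_integral(1.5, 0.5))
    print(pole_form(1.5, 0.5))
```

The damping used here is large so that a short numerical range suffices; the statement about the pole concerns the limit in which it tends to zero.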

If is the amplitude for a given trajectory for a free particle, then the amplitude in a potential is

(A10)

If desired this may be studied by perturbation methods by expanding the exponential in powers of .

For interpretation, the integral in (A10) must be written as a Riemann sum, and if a perturbation expansion is made, care must be taken with the terms quadratic in the velocity, for the effect of is not of order but is . The “velocity” becomes the momentum operator operating half before and half after , just as in the non-relativistic Schrödinger equation discussed in Sec. 5. Furthermore, in exactly the same manner as in that case, but here in four dimensions, a term quadratic in arises in the second-order perturbation terms from the coincidence of two velocities for the same value of .

As an example, the kernel for proceeding from to in a potential differs from to first order in by a term

(A11)

the here meaning . The kernel of importance on selecting the eigenvalue is obtained by multiplying this by and integrating from 0 to . The kernel depends only on and in the integrals on and ; , can be written, on interchanging the order of integration and changing variables to and , . Now the integral on converts to by (A9), while that on converts to , so the result becomes

(A12)

as expected. The same principle works to any order so that the rules for a single Klein-Gordon particle in external potentials given in II, Section 9, are deduced.

The transition to quantum electrodynamics is simple for in (A6) we already have a transition amplitude represented as a sum (over trajectories, and eventually ) of terms, in each of which the potential appears in exponential form. We may make use of the general relation (54). Hence, for example, one finds for the case of no photons in the initial and final states, in the presence of an external potential , the amplitude that a particle proceeds from to is the sum over all trajectories of the quantity

$$\exp -i\Big[\frac{1}{2}\int_0^{u_0}\Big(\frac{dx_\mu}{du}\Big)^2 du + \int_0^{u_0}\frac{dx_\mu}{du}\,B \cdots\, \frac{dx_\nu(u')}{du'} \times \delta_+\big((x_\mu(u) - x_\mu(u'))^2\big)\,du\,du'\Big] .$$

(A13)

This result must be multiplied by and integrated on from zero to infinity to express the action of a Klein-Gordon particle acting on itself through virtual photons. The integrals are interpreted as Riemann sums, and if perturbation expansions are made, the necessary care is taken with the terms quadratic in velocity. When there are several particles (other than the virtual pairs already included) one uses a separate for each, and writes the amplitude for each set of trajectories as the exponential of times

$$\frac{1}{2}\sum_n \int_0^{u_0^{(n)}} \Big(\frac{dx_\mu^{(n)}}{du}\Big)^2 du + \sum \cdots\, \frac{dx_\nu^{(m)}(u')}{du'} \times \delta_+\big((x_\mu^{(n)}(u) - x_\mu^{(m)}(u'))^2\big)\,du\,du'$$

(A14)

where are the coordinates of the trajectory of the nth particle.²² The solution should depend on the as .

Actually, knowledge of the motion of a single charge implies a great deal about the behavior of several charges. For a pair which eventually may turn out to be a virtual pair may appear in the short run as two “other particles.” As a virtual pair, that is, as the reverse section of a very long and complicated single track we know its behavior by (A13). We can assume that such a section can be looked at equally well, for a limited duration at least, as being due to other unconnected particles. This then implies a definite law of interaction of particles if the self-action (A13) of a single particle is known. (This is similar to the relation of real and virtual photon processes discussed in detail in Appendix B.) It is possible that a detailed analysis of this could show that (A13) implied that (A14) was correct for many particles. There is even reason to believe that the law of Bose-Einstein statistics and the expression for contributions from closed loops could be deduced by following this argument. This has not yet been analyzed completely, however, so we must leave this formulation in an incomplete form. The expression for closed loops should come out to be where , the contribution from a single loop, is

(A15)

where is the sum over all trajectories which close on themselves of with given in (A6), and a final integration on is made. This is equivalent to putting

(A16)

The term is subtracted only to simplify convergence problems (as adding a constant independent of , to has no effect).

B The relation of real and virtual processes

If one has a general formula for all virtual processes he should be able to find the formulas and states involved in real processes. That is to say, we should be able to deduce the formulas of Section 9 directly from the formulation (24), (25) (or its generalized equivalent such as (46), (48)) without having to go all the way back to the more usual formulation. We discuss this problem here.

That this possibility exists can be seen from the consideration that what looks like a real process from one point of view may appear as a virtual process occurring over a more extended time.

For example, if we wish to study a given real process, such as the scattering of light, we can, if we wish, include in principle the source, scatterer, and eventual absorber of the scattered light in our analysis. We may imagine that no photon is present initially, and that the source then emits light (the energy coming say from kinetic energy in the source). The light is then scattered and eventually absorbed (becoming kinetic energy in the absorber). From this point of view the process is virtual; that is, we start with no photons and end with none. Thus we can analyze the process by means of our formula for virtual processes, and obtain the formulas for real processes by attempting to break the analysis into parts corresponding to emission, scattering, and absorption.²³

To put the problem in a more general way, consider the amplitude for some transition from a state empty of photons far in the past (time ) to a similar one far in the future (). Suppose the time interval to be split into three regions , , in some convenient manner, so that region is an interval around the present time that we wish to study. Region , (), precedes , and , (), follows . We want to see how it comes about that the phenomena during can be analyzed by a study of transitions between some initial state at time , (which no longer need be photon-free), to some other final state at time . The states and are members of a large class which we will have to find out how to specify. (The single index is used to represent a large number of quantum numbers, so that different values of will correspond to having various numbers of various kinds of photons in the field, etc.) Our problem is to represent the over-all transition amplitude, , as a sum over various values of , of a product of three amplitudes,

(B1)

first the amplitude that during the interval the vacuum state makes transition to some state , then the amplitude that during the transition to is made, and finally in the amplitude that the transition from to some photon-free state 0 is completed. The mathematical problem of splitting is made definite by the further condition that for given , must not involve the coordinates of the particles for times corresponding to regions or , must involve those only in region , and only in .

To become acquainted with what is involved, suppose first that we do not have a problem involving virtual photons, but just the transition of a one-dimensional Schrödinger particle going in a long time interval from, say, the origin to the origin , and ask what states we shall need for intermediary time intervals. We must solve the problem (B1) where is the sum over all trajectories going from at to at of where . The integral may be split into three parts corresponding to the three ranges of time. Then and the separation (B1) is accomplished by taking for the sum over all trajectories lying in from to some end point of , for the sum over trajectories in of between end points and and for the sum of over the section of the trajectory lying in and going from to . Then the sum on and can be taken to be the integrals on , , respectively. Hence the various states can be taken to correspond to particles being at various coordinates . (Of course any other representation of the states in the sense of Dirac’s transformation theory could be used equally well. Which one, whether coordinate, momentum, or energy level representation, is of course just a matter of convenience and we cannot determine that simply from (B1).)
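The splitting just described can be tried on a crude discretized model. The sketch below is an illustration under stated assumptions, not the continuum path integral: positions are restricted to a few integer grid points, each time step contributes an unnormalized factor exp(i (x - y)^2 / (2 eps)), and all function names are hypothetical. The amplitude to go from the origin back to the origin over the whole interval is then compared with the sum, over the positions at the two dividing times, of the product of the three amplitudes belonging to the separate regions.

```python
import cmath


def matmul(a, b):
    """Product of two square matrices given as nested lists."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def step_kernel(size, eps):
    """Hypothetical one-step amplitude exp(i (x - y)^2 / (2 eps)) on an integer grid."""
    return [[cmath.exp(0.5j * (x - y) ** 2 / eps) for y in range(size)]
            for x in range(size)]


def region_kernel(size, eps, steps):
    """Sum over all discretized trajectories lying in a region of `steps` >= 1 steps."""
    kernel = step_kernel(size, eps)
    for _ in range(steps - 1):
        kernel = matmul(step_kernel(size, eps), kernel)
    return kernel


def split_amplitude(size, eps, steps_a, steps_b, steps_c, origin=0):
    """Sum over intermediate positions i, j of (region c) x (region b) x (region a)."""
    a = region_kernel(size, eps, steps_a)
    b = region_kernel(size, eps, steps_b)
    c = region_kernel(size, eps, steps_c)
    return sum(c[origin][j] * b[j][i] * a[i][origin]
               for i in range(size) for j in range(size))


if __name__ == "__main__":
    print(region_kernel(5, 0.7, 7)[0][0])
    print(split_amplitude(5, 0.7, 2, 3, 2))
```

The intermediate states here are labelled by position only because the step amplitude was written in the coordinate representation; any other basis would serve equally well, as the text remarks.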

We can consider next the problem including virtual photons. That is, now contains an additional factor where involves a double integral over all time. Those parts of the index which correspond to the particle states can be taken in the same way as though were absent. We study now the extra complexities in the states produced by splitting the . Let us first (solely for simplicity of the argument) take the case that there are only two regions , separated by time , and try to expand

(B2)

The factor involves as a double integral which can be split into three parts for the first of which both , are in , for the second both are in , for the third one is in the other in . Writing as shows that the factors and produce no new problems for they can be taken bodily into and respectively. However, we must disentangle the variables which are mixed up in .

The expression for , is just twice (24) but with the integral on extending over the range and that for extending over . Thus contains the variables for times in and in in a quite complicated mixture. Our problem is to write as a sum over possibly a vast class of states of the product of two parts, like, each of which involves the coordinates in one interval alone.

This separation may be made in many different ways, corresponding to various possible representations of the state of the electromagnetic field. We choose a particular one. First we can expand the exponential, , in a power series, as . The states can therefore be subdivided into subclasses corresponding to an integer which we can interpret as the number of quanta in the field at time . The amplitude for the case clearly just involves and in the way that it should if we interpret these as the amplitudes for regions and , respectively, of making a transition between a state of zero photons and another state of zero photons.

Next consider the case . This implies an additional factor in the transitional element; the factor . The variables are still mixed up. But an easy way to perform the separation suggests itself. Namely, expand the in , as a Fourier integral as

(B3)

For the exponential can be written immediately as a product of , a function only of coordinates for times in (suppose ), and (a function only of coordinates during interval ). The integral on can be symbolized as a sum over states characterized by the value of K. Thus the state with must be further characterized by specifying a vector K, interpreted as the momentum of the photon. Finally the factor in is simply the sum of four parts each of which is already split (namely 1, and each of the three components in the vector scalar product). Hence each photon of momentum K must still be characterized by specifying it as one of four varieties; that is, there are four polarizations.²⁴ Thus in trying to represent the effect of the past on the future we are led to invent photons of four polarizations and characterized by a propagation vector K.
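The splitting described in this paragraph can be checked numerically. The following Python sketch is an illustration only and is not part of the original derivation. It assumes that, for one value of K and for t later than s, the mixed integrand has the form (1 − v_t·v_s) exp(−iK(t − s)) exp(iK·(x_t − x_s)); all function names, argument names, and sample numbers are hypothetical.

```python
import cmath
import math

# Signs with which the four already-split parts are added:
# the term 1, minus each of the three components of the scalar product.
METRIC = (1.0, -1.0, -1.0, -1.0)

def dot(a, b):
    """Ordinary three-dimensional scalar product."""
    return sum(p * q for p, q in zip(a, b))

def mixed_term(K, x_t, v_t, t, x_s, v_s, s):
    """Unsplit integrand for one K (assumed form, valid for t > s)."""
    k = math.sqrt(dot(K, K))
    dx = [a - b for a, b in zip(x_t, x_s)]
    phase = cmath.exp(-1j * k * (t - s)) * cmath.exp(1j * dot(K, dx))
    return (1.0 - dot(v_t, v_s)) * phase

def later_factor(K, x_t, v_t, t):
    """Four numbers depending only on the coordinates at the later time t."""
    k = math.sqrt(dot(K, K))
    wave = cmath.exp(-1j * k * t) * cmath.exp(1j * dot(K, x_t))
    return [wave * c for c in (1.0, *v_t)]

def earlier_factor(K, x_s, v_s, s):
    """Four numbers depending only on the coordinates at the earlier time s."""
    k = math.sqrt(dot(K, K))
    wave = cmath.exp(1j * k * s) * cmath.exp(-1j * dot(K, x_s))
    return [wave * c for c in (1.0, *v_s)]

def split_term(K, x_t, v_t, t, x_s, v_s, s):
    """Sum over the four polarizations of (later factor) x (earlier factor)."""
    late = later_factor(K, x_t, v_t, t)
    early = earlier_factor(K, x_s, v_s, s)
    return sum(g * a * b for g, a, b in zip(METRIC, late, early))

if __name__ == "__main__":
    args = ([0.3, -1.1, 0.7], [0.2, 0.5, -0.4], [0.1, 0.0, 0.3], 2.0,
            [-0.6, 0.1, 0.9], [0.0, -0.2, 0.25], 0.5)
    print(abs(mixed_term(*args) - split_term(*args)))
```

Each of the four products involves the later coordinates in one factor and the earlier coordinates in the other, which is the sense in which a photon of given K is further labeled by one of four polarizations.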

The term for a given polarization and value of K (for ) is clearly just where the is defined in (59) but with the time integral extending just over region , while is the same expression with the integration over region . Hence the amplitude for transition during interval a from a state with no quanta to a state with one in a given state of polarization and momentum is calculated by inclusion of an extra factor in the transition element. Absorption in region corresponds to a factor .

We next turn to the case . This requires analysis of . The can be expanded again as a Fourier integral, but for each of the two in we have a value of K which may be different. Thus we say we have two photons, one of momentum K and one of momentum , and we sum over all values of and . (Similarly each photon is characterized by its own independent polarization index.) The factor can be taken into account neatly by asserting that we count each possible pair of photons as constituting just one state at time . Then the arises, for the sum over all , (and polarizations) counts each pair twice. On the other hand, for the terms representing two identical photons () of like polarization, the cannot be so interpreted. Instead we invent the rule that a state of two like photons has statistical weight as great as that calculated as though the photons were different. This, generalized to identical photons, is the rule of Bose statistics.

The higher values of offer no problem. The is interpreted combinatorially for different photons, and as a statistical factor when some are identical. For example, for all identical one obtains a factor so that can be interpreted as the amplitude for emission (from no initial photons) of identical photons, in complete agreement with (61) for .
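The counting argument of the last two paragraphs can be verified by brute force. The Python sketch below is illustrative only. It assumes a finite list of photon modes, each carrying a single number that stands for the product of its emission and absorption factors; the names `ordered_sum`, `state_sum`, and `amps` are hypothetical. It compares the factorial-weighted sum over all ordered choices of modes with a sum over distinct photon states in which a state containing a given mode m times carries the statistical weight 1/m!.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations_with_replacement, product
from math import factorial

def ordered_sum(amps, n):
    """(1/n!) times the sum over all ordered n-tuples of modes."""
    total = Fraction(0)
    for modes in product(range(len(amps)), repeat=n):
        term = Fraction(1)
        for k in modes:
            term *= amps[k]
        total += term
    return total / factorial(n)

def state_sum(amps, n):
    """Sum over distinct n-photon states (unordered), with Bose weights.

    A state in which some mode occurs m times is weighted by 1/m!;
    states whose photons are all different are counted exactly once.
    """
    total = Fraction(0)
    for modes in combinations_with_replacement(range(len(amps)), n):
        term = Fraction(1)
        for k in modes:
            term *= amps[k]
        weight = Fraction(1)
        for m in Counter(modes).values():
            weight /= factorial(m)
        total += term * weight
    return total

if __name__ == "__main__":
    amps = [Fraction(1, 2), Fraction(-1, 3), Fraction(2, 5)]  # sample values
    for n in range(1, 5):
        print(n, ordered_sum(amps, n) == state_sum(amps, n))
```

The two sums agree because a state with multiplicities m_1, m_2, ... arises from n!/(m_1! m_2! ...) ordered choices, so dividing by n! leaves exactly the weight 1/(m_1! m_2! ...).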

To obtain the amplitude for transitions in which neither the initial nor the final state is empty of photons we must consider the more general case of the division into three time regions (B1). This time we see that the factor which involves the coordinates in an entangled manner is . It is to be expanded in the form . Again the expansion in power series and development in Fourier series with a polarization sum will solve the problem. Thus the exponential is . Now the are written as Fourier series, one of the terms containing variables K. Since involve , involve and involve , this term will give the amplitude that photons are emitted during the interval , of those are absorbed during but the remaining , along with new ones emitted during go on to be absorbed during the interval . We have therefore photons in the state at time when begins, and at when is over. They each are characterized by momentum vectors and polarizations. When these are different the factors are absorbed combinatorially. When some are equal we must invoke the rule of the statistical weights. For example, suppose all photons are identical. Then , , so that our sum is

(B4)

Putting , , this is the sum on and of

(B5)

The last factor we have seen is the amplitude for emission of photons during interval , while the first factor is the amplitude for absorption of during . The sum is therefore the factor for transition from to identical photons, in accordance with (57). We see the significance of the simple generating function (56).
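The role of a generating function in this kind of bookkeeping can be illustrated with a short Python sketch. It is not a reproduction of (56) or (57). It assumes, purely for illustration, a generating function of the form exp(XY + aX + bY), where a and b stand for single-photon absorption and emission factors, and it checks that the coefficient of X^m Y^n is a sum over the number r of photons that pass through the interval untouched. The helper names and the sample values of a and b are hypothetical.

```python
from fractions import Fraction
from math import factorial

def poly_mul(p, q, max_deg):
    """Multiply two polynomials in X, Y stored as {(i, j): coeff},
    discarding terms of degree above max_deg in either variable."""
    out = {}
    for (i, j), c in p.items():
        for (k, l), d in q.items():
            if i + k <= max_deg and j + l <= max_deg:
                key = (i + k, j + l)
                out[key] = out.get(key, 0) + c * d
    return out

def exp_series(p, max_deg):
    """Truncated series of exp(p) for a polynomial p with no constant term.

    Every term of p**k has total degree at least k, so powers beyond
    2 * max_deg cannot contribute to the retained coefficients.
    """
    result = {(0, 0): Fraction(1)}
    term = {(0, 0): Fraction(1)}
    for k in range(1, 2 * max_deg + 1):
        term = poly_mul(term, p, max_deg)
        for key, c in term.items():
            result[key] = result.get(key, 0) + c / factorial(k)
    return result

def closed_form(m, n, a, b):
    """Coefficient of X**m Y**n in exp(XY + aX + bY): r photons pass
    through, m - r carry a factor a, and n - r carry a factor b."""
    total = Fraction(0)
    for r in range(min(m, n) + 1):
        total += (a ** (m - r) * b ** (n - r)
                  / (factorial(r) * factorial(m - r) * factorial(n - r)))
    return total

if __name__ == "__main__":
    a, b = Fraction(1, 3), Fraction(-2, 5)  # sample values only
    series = exp_series({(1, 1): Fraction(1), (1, 0): a, (0, 1): b}, 3)
    print(all(series.get((m, n), 0) == closed_form(m, n, a, b)
              for m in range(4) for n in range(4)))
```

Exact rational arithmetic is used so that the comparison of coefficients is free of rounding error.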

We have therefore found rules for real photons in terms of those for virtual. The real photons are a way of representing and keeping track of those aspects of the past behavior which may influence the future.

If one starts from a theory involving an arbitrary modification of the direct interaction (or in more general situations) it is possible in this way to discover what kinds of states and physical entities will be involved if one tries to represent in the present all the information needed to predict the future. With the Hamiltonian method, which begins by assuming such a representation, it is difficult to suggest modifications of a general kind, for one cannot formulate the problem without having a complete representation of the characteristics of the intermediate states, the particles involved in interaction, etc. It is quite possible (in the author’s opinion, it is very likely) that we may discover that in nature the relation of past and future is so intimate for short durations that no simple representation of a present may exist. In such a case a theory could not find expression in Hamiltonian form.

An exactly similar analysis can be made just as easily starting with the general forms (46), (48). Also a coordinate representation of the photons could have been used instead of the familiar momentum one. One can deduce the rules (60), (61). Nothing essentially different is involved physically, however, so we shall not pursue the subject further here. Since they imply²³ all the rules for real photons, Eqs. (46), (47), (48) constitute a compact statement of all the laws of quantum electrodynamics. But they give divergent results. Can the result after charge and mass renormalization also be expressed to all orders in in a simple way?

C Differential equation for electron propagation

An attempt has been made to find a differential wave equation for the propagation of an electron interacting with itself, analogous to the Dirac equation, but containing terms representing the self-action. Neglecting all effects of closed loops, one such equation has been found, but not much has been done with it. It is reported here for whatever value it may have.

An electron acting upon itself is, from one point of view, a complex system of a particle and a field of an indefinite number of photons. To find a differential law of propagation of such a system we must ask first what quantities known at one instant will permit the calculation of these same quantities an instant later. Clearly, a knowledge of the position of the particle is not enough. We should need to specify: (1) the amplitude that the electron is at and there are no photons in the field, (2) the amplitude the electron is at and there is one photon of such and such a kind in the field, (3) the amplitude there are two photons, etc. That is, a series of functions of ever increasing numbers of variables. Following this view, we shall be led to the wave equation of the theory of second quantization.

We may also take a different view. Suppose we know a quantity , a spinor function of , and functional of , defined as the amplitude that an electron arrives at , with no photon in the field when it moves in an arbitrary external unquantized potential . We allow the electron also to interact with itself, but is the amplitude at a given instant that there happens to be no photons present. As we have seen, a complete knowledge of this functional will also tell us the amplitude that the electron arrives at and there is just one photon, of form present. It is, from (60), .

Higher numbers of photons correspond to higher functional derivatives of . Therefore, contains all the information requisite for describing the state of the electron-photon system, and we may expect to find a differential equation for it. Actually it satisfies (, ),

(C1)

as may be seen from a physical argument.²⁵ The operator operating on the coordinate of should equal, from Dirac’s equation, the changes in as we go from one position to a neighboring position due to the action of vector potentials. The term is the effect of the external potential. But may also change for at the first position we may have had a photon present (amplitude that it was emitted at another point 1 is ) which was absorbed at (amplitude photon released at 1 gets to is where is the squared invariant distance from 1 to ) acting as a vector potential there (factor ). Effects of vacuum polarization are left out.

Expansion of the solution of (C1) in a power series in and starting from a free particle solution for a single electron, produces a series of terms which agree with the rules of II for action of potentials and virtual photons to various orders. It is another matter to use such an equation for the practical solution of a problem to all orders in . It might be possible to represent the self-energy problem as the variational problem for , stemming from (C1). The will first have to be modified to obtain a convergent result.

We are not in need of the general solution of (C1). (In fact, we have it in (46), (48) in terms of the solution of the ordinary Dirac equation.) The general solution is too complicated, for complete knowledge of the motion of a self-acting electron in an arbitrary potential is essentially all of electrodynamics (because of the kind of relation of real and virtual processes discussed for photons in Appendix B, extended also to real and virtual pairs). Furthermore, it is easy to see that other quantities also satisfy (C1). Consider a system of many electrons, and single out some one for consideration, supposing all the others go from some definite initial state to some definite final state . Let be the amplitude that the special electron arrives at , there are no photons present, and the other electrons go from to when there is an external potential present (which also acts on the other electrons). Then also satisfies (C1). Likewise the amplitude with closed loops (all other electrons go vacuum to vacuum) also satisfies (C1) including all vacuum polarization effects. The various problems correspond to different assumptions as to the dependence of on in the limit of zero . The Eq. (C1) without further boundary conditions is probably too general to be useful.

Notes

¹	R. P. Feynman, Phys. Rev. 76, 749 (1949), hereafter called I, and Phys. Rev. 76, 769 (1949), hereafter called II.

²	See in this connection also the papers of S. Tomonaga, Phys. Rev. 74, 224 (1948); S. Kanesawa and S. Tomonaga, Prog. Theoret. Phys. 3, 101 (1948); J. Schwinger, Phys. Rev. 76, 790 (1949); F. Dyson, Phys. Rev. 75, 1736 (1949); W. Pauli and F. Villars, Rev. Mod. Phys. 21, 434 (1949). The papers cited give references to previous work.

³	R. P. Feynman, Rev. Mod. Phys. 20, 367 (1948), hereafter called C.

⁴	E. Fermi, Rev. Mod. Phys. 4, 87 (1932).

⁵	W. Heitler, The Quantum Theory of Radiation, second edition (Oxford University Press, London, 1944).

⁶	The term in the sum for is obviously infinite but must be included for relativistic invariance. Our problem here is to re-express the usual (and divergent) form of electrodynamics in the form given in II. Modifications for dealing with the divergences are discussed in II and we shall not discuss them further here.

⁷	That (12) is correct, at least insofar as it depends on , can be seen directly as follows. Let be the classical path which satisfies the boundary condition . Then in the integral defining replace each of the variables , ( ) that is, use the displacement from the classical path as the coordinate rather than the absolute position. With the substitution in the action the terms linear in drop out by integrations by parts using the equation of motion for the classical path, and the boundary conditions . That this should occur should occasion no surprise, for the action functional is an extremum at so that it will only depend to second order in the displacements from this extremal orbit . Further, since the action functional is quadratic to begin with, it cannot depend on more than quadratically. Hence so that since . The factor following the is the amplitude for a free oscillator to proceed from and does not therefore depend on , or , being a function only of . [That it is actually can be demonstrated either by direct integration of the variables or by using some normalizing property of the kernels , for example that for the case must equal unity.] The expression for given in C on page 386 is in error, the quantities and should be interchanged.

⁸	It is most convenient to define the state with the phase factor and the final state with the factor so that the results will not depend on the particular times , chosen.

⁹	One can obtain the final result, that the total interaction is just , in a formal manner starting from the Hamiltonian from which the longitudinal oscillators have not yet been eliminated. There are for each and , four oscillators corresponding to the three components of the vector potential ( ) and the scalar potential ( ). It must then be assumed that the wave function of the initial and final state of the oscillators is the function . The wave function suggested here has only formal significance, of course, because the dependence on is not square integrable, and cannot be normalized. If each oscillator were assumed actually in the ground state, the sign of the term would be changed to positive, and the sign of the frequency in the contribution of these oscillators would be reversed (they would have negative energy).

¹⁰	The classical action for this problem is just where is the real part of the expression (24). In view of the generalization of the Lagrangian formulation of quantum mechanics suggested in Section 12 of C, one might have anticipated that would have been simply . This corresponds, however, to boundary conditions other than no quanta present in past and future. It is harder to interpret physically. For a system enclosed in a light tight box, however, it appears likely that both and lead to the same results.

¹¹	One can show from (25) how the correlated effect of many atoms at a distance produces on a given system the effects of an external potential. Formula (24) yields the result that this potential is that obtained from Liénard and Wiechert by retarded waves arising from the charges and currents resulting from the distant atoms making transitions. Assume the wave functions χ and ψ can be split into products of wave functions for system and distant atoms and expand assuming the effect of any individual distant atom is small. Coulomb potentials arise even from nearby particles if they are moving slowly.

¹²	The term corresponding to this for the self-energy expression (26) would give an integral over which is evidently infinite and leads to the quadratically divergent self-energy. There is no such term for the Dirac electron, but there is for Klein-Gordon particles. We shall not discuss the infinities in this paper as they have already been discussed in II.

¹³	Alternatively, note that Eq. (37) is exact for arbitrarily large if the potential is constant. For if the potential in the Dirac equation is the gradient of a scalar function the potential may be removed by replacing the wave function by (gauge transformation). This alters the kernel by a factor owing to the change in the initial and final wave functions. A constant potential is the gradient of and can be completely removed by this gauge transformation, so that the kernel differs from that of a free particle by the factor as in (37).

¹⁴	In changing the charge we mean to vary only the degree to which virtual photons are important. We do not contemplate changes in the influence of the external potentials. If one wishes, as is raised the strength of the potential is decreased proportionally so that B_μ, the potential times the charge , is held constant.

¹⁵	The functional derivative is defined such that if [] is a number depending on the functions (1), the first order variation in produced by a change from to , is given by the integral extending over all four-space

¹⁶	That is, of course, those rules of II which apply to the unmodified electrodynamics of Dirac electrons. (The limitation excluding real photons in the initial and final states is removed in Sec. 8.) The same arguments clearly apply to nucleons interacting via neutral vector mesons, vector coupling. Other couplings require a minor extension of the argument. The modification to the , as in (37), produced by some couplings cannot very easily be written without using operators in the exponents. These operators can be treated as numbers if their order of operation is maintained to be always their order in time. This idea will be discussed and applied more generally in a succeeding paper.

¹⁷	For an alternative method starting directly from the formula (24) for virtual photons, see Appendix B.

¹⁸	F. Rohrlich (to be published).

¹⁹	The physical ideas involved in such a description are discussed in detail by Y. Nambu, Prog. Theor. Phys. 5, 82 (1950). An equation of type (A2) extended to the case of Dirac electrons has been studied by V. Fock, Physik Zeits. Sowjetunion 12, 404 (1937).

²⁰	The factor in front of is simply to make the definition of here agree with that in I and II. In II it operates with as a perturbation. But the perturbation coming from (A3) in a natural way by expansion of the exponential is .

²¹	Expression (A8) is closely related to Schwinger's parametric integral representation of these functions. For example, (A8) becomes formula (45) of F. Dyson, Phys. Rev. 75, 486 (1949) for if is substituted for .

²²	The form (A10) suggests another interesting possibility for avoiding the divergences of quantum electrodynamics in this case. The divergences arise from the function when . We might restrict the integration in the double integral such that where δ is some finite quantity, very small compared with . More generally, we could keep the region from contributing by including in the integrand a factor where for large compared to some δ, and (e.g., acts qualitatively like ). (Another way might be to replace by a discontinuous variable, that is, we do not use the limit in (A4) as but set .) The idea is that two interactions would contribute very little in amplitude if they followed one another too rapidly in . It is easily verified that this makes the otherwise divergent integrals finite. But whether the resulting formulas make good physical sense is hard to see. The action of a potential would now depend on the value of so that Eq. (A2), or its equivalent, would not be separable in so that would no longer be a strict eigenvalue for all disturbances. High energy potentials could excite states corresponding to other eigenvalues, possibly thereby corresponding to other masses. This note is meant only as a speculation, for not enough work has been done in this direction to make sure that a reasonable physical theory can be developed along these lines. (What little work has been done was not promising.) Analogous modifications can also be made for Dirac electrons.

²³	The formulas for real processes deduced in this way are strictly limited to the case in which the light comes from sources which are originally dark, and that eventually all light emitted is absorbed again. We can only extend it to the case for which these restrictions do not hold by hypothesis, namely, that the details of the scattering process are independent of these characteristics of the light source and of the eventual disposition of the scattered light. The argument of the text gives a method for discovering formulas for real processes when no more than the formula for virtual processes is at hand. But with this method belief in the general validity of the resulting formulas must rest on the physical reasonableness of the above-mentioned hypothesis.

²⁴	Usually only two polarizations transverse to the propagation vector are used. This can be accomplished by a further rearrangement of terms corresponding to the reverse of the steps leading from (17) to (19). We omit the details here as it is well known that either formulation gives the same results. See II, Section 8.

²⁵	Its general validity can also be demonstrated mathematically from (45). The amplitude for arriving at with no photons in the field with virtual photon coupling is a transition amplitude. It must, therefore, satisfy (45) with for any . Hence show that the quantity also satisfies Eq. (45) by substituting for in (45) and using the fact that satisfies (45). Hence if then for all . But means that satisfies (C1). Therefore, that solution of (45) which also satisfies (the propagation of a free electron without virtual photons) is a solution of (C1) as we wished to show. Equation (C1) may be more convenient than (45) for some purposes for it does not involve differentiation with respect to the coupling constant, and is more analogous to a wave equation.