Quantum field theory 1, lecture 01


Practical information

Practical information about the course, as well as a list of recommended literature, can be found here.

1 What is quantum field theory?

Historically, quantum field theory (QFT) was developed as quantum mechanics for many (in fact infinitely many) degrees of freedom. For example, the quantum mechanical description of electromagnetic fields (light) and their excitations, the photons, leads to a quantum field theory. The quantum mechanics of photons, electrons and positrons is quantum electrodynamics (QED), and one can continue in the same way for other fields and particles.

In contrast to the transition from classical mechanics to quantum mechanics, the step from quantum mechanics to quantum field theory does not lead to a conceptually entirely new theoretical framework. Still, it was historically not an easy development, and a lot of confusion was connected with notions like “second quantization”.

Many new phenomena arise in a field theory setting. These include collective effects of many degrees of freedom, e. g. spontaneous symmetry breaking. Particle number is not necessarily conserved, and one can have particle creation and annihilation.

Historically, quantum field theory was developed as a relativistic theory, which combines quantum mechanics with Lorentz symmetry. This was necessary for quantum electrodynamics. To this day, Lorentz symmetry remains a key ingredient for the quantum field theoretic description of elementary particle physics. It is not central for quantum field theory itself, however. Concepts of quantum field theory can also be used to describe the quantum theory of many atoms, for example ultra-cold quantum gases, or phonons in solids, or the spins composing magnets. These systems are treated by non-relativistic QFT.

Probabilistic fields. One may characterize much of the content of the following lectures by two main ingredients:

  • Fields (degrees of freedom at every point \(x\))
  • Probabilistic theory (as every quantum theory is one)

In this sense, one may speak of quantum field theory as a probabilistic theory of fields. The reader may note that “quantum” is missing in the above characterization. Indeed, in modern developments, all probabilistic field theories, be they “quantum” or “classical”, are described with the same concepts and methods based on the functional integral. The key element is the notion of fluctuating fields, which appear in many situations. Something as tangible as the surface of an ocean is already an example. The concepts are useful in many areas, ranging from statistical mechanics to particle physics, quantum gravity, cosmology, biology, economics and so on. The common view on all these subjects, based on the functional integral, will be the guideline of these lectures.

PFT, probabilistic field theory, would be a more appropriate name. We will nevertheless use the traditional, historic name, QFT. Neither “quantum” nor “relativistic” is crucial conceptually. From this perspective, relativistic quantum field theory is an important “special case”, to which we will pay much attention.

2 Functional integral

We start with a simple model, the one dimensional Ising model.

2.1 Ising model in one dimension

Ising spin. An Ising spin has two possible values, \begin{equation*} s=\pm 1. \end{equation*} More generally, one can start with any two-level variable with possible values \(A_1\) and \(A_2\) and relate them to the Ising spins via a map, \begin{equation*} A_1 \to s=+1, \quad \quad \quad A_2 \to s=-1. \end{equation*} For example, a state could be occupied, \(n=1\), or empty, \(n=0\). These states can be mapped to Ising spins via \(s=2n-1\). From an information theoretic point of view, each Ising spin carries one bit of information.

Ising chain. Let us consider a chain of discrete points \(x\) and take them to be equidistant, \begin{equation*} x \in \{ x_{\text{in}}, x_{\text{in}}+\varepsilon , x_{\text{in}} + 2 \varepsilon , \ldots , x_{\text{f}} - \varepsilon , x_{\text{f}} \}. \end{equation*} The Ising chain contains a spin \(s(x)\) at each point (or lattice site) \(x\).

Configuration. Now let us place one Ising spin at each point or lattice site \(x\). A set of such spin values at all the possible points \(x\) will be called a configuration and denoted by \(\{ s(x) \}\). (This should be seen as an abbreviation for \(\{ s(x_{\text{in}}), s(x_{\text{in}}+\varepsilon ), \ldots , s(x_{\text{f}}) \}\).) For example, the spin value \(s(x)\), the corresponding occupation number \(n(x)\) and the spin direction for a particular configuration of seven spins could be as follows.

\begin{equation*} \begin{array}{lrrrrrrr} \text{spin value } s(x) & 1 & 1 & -1 & -1 & -1 & 1 & -1 \\ \text{occupation number } n(x) & 1 & 1 & 0 & 0 & 0 & 1 & 0 \\ \text{spin direction} & \uparrow & \uparrow & \downarrow & \downarrow & \downarrow & \uparrow & \downarrow \end{array} \end{equation*}

In general, for \(P\) points, or lattice sites, there are \(N=2^P\) possible configurations since each spin can be either up or down. We can label them by an index \(\tau =1, \ldots , N\).
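The counting of configurations can be made concrete with a short Python sketch. The function name `configurations` and the choice \(P=3\) are illustrative assumptions, and the label \(\tau \) is simply the position in the enumeration.

```python
from itertools import product

def configurations(P):
    """Return all 2**P configurations of P Ising spins as tuples of +1/-1."""
    return list(product((+1, -1), repeat=P))

configs = configurations(3)
print(len(configs))  # number of configurations N = 2**P

# Label the configurations by tau = 1, ..., N and show the
# occupation numbers obtained from s = 2n - 1, i.e. n = (s + 1) / 2.
for tau, s in enumerate(configs, start=1):
    n = tuple((s_x + 1) // 2 for s_x in s)
    print(tau, s, n)
```

Since the number of configurations doubles with every additional site, such an explicit enumeration is only practical for short chains.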

Euclidean action. We now introduce the concept of a Euclidean action by assigning to each configuration a real number \(S\in \mathbb{R}\), \begin{equation*} \{ s(x) \} \to S[s] = S(\{ s(x) \}). \end{equation*} For example, one may have a nearest-neighbor interaction, for which the action is \begin{equation} S[s] = - \sum _x \beta s(x+\varepsilon ) s(x), \label{eq:IsingModelNextNeighborAction} \end{equation} where we use the following abbreviation for a sum over lattice sites, \begin{equation*} \sum _x = \sum _{x=x_{\text{in}}}^{x_{\text{f}}-\varepsilon }, \end{equation*} and \(\beta \) is a real parameter.

Partition function. One can define a partition function as a sum over all configurations, weighted by the exponential of minus the action, \begin{equation*} Z = \sum _{\{ s(x) \}} e^{-S[s]} = \sum _\tau e^{-S_\tau }. \end{equation*} Note that the partition function is here a real and positive number, \(Z>0\).

Probability distribution. Let us now assign to each configuration a probability, \(\{ s(x) \} \to p[s] = p(\{ s(x) \}) \), or in another notation, \(\tau \to p_\tau \). We set \begin{equation*} p[s] = \frac{1}{Z} e^{-S[s]}. \end{equation*} Note the following properties:

  • positivity \(p[s] \geq 0\) (and \(p[s]\to 0\) for \(S[s]\to \infty \)),
  • normalization \(\sum _{\{ s(x) \}} p[s] =\sum _\tau p_\tau = 1\).

These are the defining properties of probability distributions.
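For a short chain, the partition function and the probabilities can be evaluated by brute force. The following Python sketch assumes the nearest-neighbor action without boundary terms; the function names and the values \(P=4\), \(\beta =0.5\) are arbitrary illustrative choices.

```python
from itertools import product
from math import exp

def action(s, beta):
    """Nearest-neighbor action S[s] = -beta * sum_x s(x+eps) s(x), open ends."""
    return -beta * sum(s[i + 1] * s[i] for i in range(len(s) - 1))

def probabilities(P, beta):
    """Return all configurations, their probabilities p[s] and Z."""
    configs = list(product((+1, -1), repeat=P))
    weights = [exp(-action(s, beta)) for s in configs]
    Z = sum(weights)  # partition function: sum over all configurations
    return configs, [w / Z for w in weights], Z

configs, p, Z = probabilities(P=4, beta=0.5)
print("Z =", Z)
print("all p >= 0:", min(p) >= 0.0)
print("sum of p  :", sum(p))
```

Positivity holds because every weight \(e^{-S[s]}\) is positive, and normalization is enforced by dividing by \(Z\).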

Observables. We may construct an observable by assigning to every configuration \(\{ s(x) \}\) (also labeled by \(\tau \)) a value \(A[s] = A_\tau \), \begin{equation*} \{ s(x) \} \to A[s], \quad \quad \quad \tau \to A_\tau . \end{equation*} In other words, the observable \(A\) has the value \(A_\tau \) in the configuration \(\tau \).

Expectation value. The expectation value of an observable is defined by \begin{equation*} \langle A \rangle = \sum _\tau p_\tau A_\tau = \frac{1}{Z} \sum _{\{ s(x) \}} e^{-S[s]} A[s]. \end{equation*}

Two-point correlation. A correlation function of two observables is given by the expression \begin{equation*} \langle A B \rangle = \sum _\tau p_\tau A_\tau B_\tau = \frac{1}{Z} \sum _{\{ s(x) \}} e^{-S[s]} A[s] B[s]. \end{equation*}
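Expectation values and correlation functions can be computed in the same brute-force manner. This is a minimal Python sketch, assuming the nearest-neighbor action without boundary terms; the helper `expectation` and the chosen sites are illustrative, and observables are passed as functions of the configuration.

```python
from itertools import product
from math import exp

def action(s, beta):
    """Nearest-neighbor action S[s] = -beta * sum_x s(x+eps) s(x), open ends."""
    return -beta * sum(s[i + 1] * s[i] for i in range(len(s) - 1))

def expectation(A, P, beta):
    """Return <A> = (1/Z) * sum over configurations of exp(-S[s]) * A[s]."""
    configs = list(product((+1, -1), repeat=P))
    weights = [exp(-action(s, beta)) for s in configs]
    Z = sum(weights)
    return sum(w * A(s) for w, s in zip(weights, configs)) / Z

P, beta = 5, 0.7
# Observable A[s] = s(x) at the middle site (index 2).
mean_spin = expectation(lambda s: s[2], P, beta)
# Product of two observables, A[s] B[s] = s(x_1) s(x_2), sites 1 and 3.
corr = expectation(lambda s: s[1] * s[3], P, beta)
print("<s(x)>           =", mean_spin)
print("<s(x_1) s(x_2)>  =", corr)
```

Without boundary terms the action is invariant under flipping all spins, so the single-spin expectation value vanishes while the two-point correlation does not.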

Local action. Often one can write the action as a sum of the form \begin{equation*} S[s] = \sum _x \mathscr{L}(x), \end{equation*} with \(\mathscr{L}(x)\) depending only on the spins in some neighborhood of \(x\). For our example \eqref{eq:IsingModelNextNeighborAction} with nearest-neighbor interaction one has \begin{equation*} \mathscr{L}(x) = - \beta s(x+\varepsilon ) s(x). \end{equation*} The simplest version of the traditional Ising model has \(\beta =\frac{J}{k_{\text{B}} T}\) with interaction parameter \(J\), temperature \(T\) and Boltzmann constant \(k_{\text{B}}\). In this context, the Euclidean action corresponds to the ratio \(S=\frac{H}{k_{\text{B}} T}\) of the energy or Hamiltonian \(H\) and the thermal energy \(k_{\text{B}} T\), as it appears in the Boltzmann weight factor \(\exp (-\frac{H}{k_{\text{B}}T})\). The Hamiltonian is then \begin{equation*} H= - \sum _ x J s(x+\varepsilon ) s(x). \end{equation*}

Boundary terms. One must pay some attention to the boundaries of the Ising chain. Let us denote by \(\mathscr{L}_{\text{in}}\) a term that depends only on the initial spin \(s(x_{\text{in}})\), and similarly by \(\mathscr{L}_{\text{f}}\) a term that depends only on the final spin \(s(x_{\text{f}})\). We write the action as \begin{equation*} S = \sum _x \mathscr{L}(x) + \mathscr{L}_{\text{in}} + \mathscr{L}_{\text{f}}. \end{equation*} By choosing \(\mathscr{L}_{\text{in}}\) and \(\mathscr{L}_{\text{f}}\) appropriately one can impose different boundary conditions, which are in general probabilistic and become deterministic in an appropriate limit.

Typical problem. A typical problem one may encounter in the context of the Ising model in one dimension is: What is the expectation value \(\langle s(x) \rangle \) or the two-point correlation function \(\langle s(x_1) s(x_2) \rangle \) for given boundary conditions specified by \(\mathscr{L}_{\text{in}}\) and \(\mathscr{L}_{\text{f}}\)?
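For a short chain this typical problem can be answered by direct summation. The sketch below assumes, purely for illustration, boundary terms of the form \(\mathscr{L}_{\text{in}} = -h_{\text{in}} s(x_{\text{in}})\) and \(\mathscr{L}_{\text{f}} = -h_{\text{f}} s(x_{\text{f}})\); the parameters `h_in` and `h_f` are hypothetical and not fixed by the text.

```python
from itertools import product
from math import exp

def action(s, beta, h_in=0.0, h_f=0.0):
    """Nearest-neighbor action plus boundary terms.

    L_in = -h_in * s(x_in) and L_f = -h_f * s(x_f) are assumed,
    illustrative choices for the boundary terms.
    """
    bulk = -beta * sum(s[i + 1] * s[i] for i in range(len(s) - 1))
    return bulk - h_in * s[0] - h_f * s[-1]

def spin_profile(P, beta, h_in=0.0, h_f=0.0):
    """Return <s(x)> for every site x, summing over all 2**P configurations."""
    configs = list(product((+1, -1), repeat=P))
    weights = [exp(-action(s, beta, h_in, h_f)) for s in configs]
    Z = sum(weights)
    return [sum(w * s[i] for w, s in zip(weights, configs)) / Z
            for i in range(P)]

profile = spin_profile(P=6, beta=0.8, h_in=1.5)
for i, m in enumerate(profile):
    print(i, round(m, 4))
```

With this choice the initial spin is biased towards \(s=+1\), and the bias is passed along the chain with decreasing strength. Taking `h_in` very large pins the initial spin, which is one way to reach a deterministic boundary condition as a limit.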

Functional integral language. We now formulate the model in a language that is convenient for generalization. We write for expectation values \begin{equation*} \langle A \rangle = \frac{1}{Z} \int D s \; e^{-S[s]} A, \end{equation*} with the partition function \begin{equation*} Z = \int Ds \; e^{-S[s]}. \end{equation*} The functional measure is here defined by \begin{equation*} \int Ds = \sum _{\{ s(x) \}} = \sum _\tau = \prod _x \sum _{s(x)=\pm 1}. \end{equation*} For a finite Ising chain, the functional integral is simply a finite sum over configurations.
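As a minimal worked illustration, consider a chain of only two sites, \(x_{\text{f}} = x_{\text{in}} + \varepsilon \), with the nearest-neighbor action and no boundary terms. The functional integral is then a sum of four terms, \begin{equation*} Z = \int Ds \; e^{-S[s]} = \sum _{s(x_{\text{in}})=\pm 1} \; \sum _{s(x_{\text{f}})=\pm 1} e^{\beta s(x_{\text{f}}) s(x_{\text{in}})} = 2 e^{\beta } + 2 e^{-\beta } = 4 \cosh \beta , \end{equation*} and the correlation of the two spins follows as \begin{equation*} \langle s(x_{\text{f}}) s(x_{\text{in}}) \rangle = \frac{1}{Z} \int Ds \; e^{-S[s]} s(x_{\text{f}}) s(x_{\text{in}}) = \frac{2 e^{\beta } - 2 e^{-\beta }}{4 \cosh \beta } = \tanh \beta . \end{equation*}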

2.2 Continuum functional integral

Lattice functional integral. Let us now take a real, continuous variable \(\phi (x)\in \mathbb{R}\) instead of the discrete Ising spins \(s(x)\in \{ +1,-1 \}\). The position variable \(x\) is for the time being still labeling discrete points or lattice sites. We then define the functional measure \begin{equation*} \int D\phi = \prod _x \int _{-\infty }^\infty d\phi (x). \end{equation*} This is now the continuum version of a sum over configurations. Indeed it sums over all possible functions \(\phi (x)\) of the (discrete) position \(x\). To realize that indeed every function appears in \(\int D\phi \) one may go back to a discrete variable, \(\phi (x) \in \{ \phi _1, \ldots , \phi _M \} \) with \(M\) possible values and take \(M\to \infty \).

Configuration. For every lattice site \(x\) we now specify a real number \(\phi (x)\); together these numbers define one configuration. There are now infinitely many configurations, even if the number of lattice sites is finite.

Path integral. At this point one can make the transition to a probabilistic path integral. To this end one would replace \(x\to t\) and \(\phi (x)\to \vec x(t)\), such that the sum over functions \(\phi (x)\) becomes one over paths \(\vec x(t)\). The functional measure would be \(\int D \vec x\).

Action. The Euclidean action can be written as \begin{equation*} S = \sum _x \mathscr{L}(x) + \mathscr{L}_{\text{in}} + \mathscr{L}_{\text{f}}, \end{equation*} where \( \mathscr{L}(x)\) depends on \(\phi (x^\prime )\) with \(x^\prime \) in the vicinity of \(x\). Similarly, \( \mathscr{L}_{\text{in}}\) depends on \(\phi (x_{\text{in}})=\phi _{\text{in}}\) and \( \mathscr{L}_{\text{f}}\) depends on \(\phi (x_{\text{f}})=\phi _{\text{f}}\).

Lattice \(\phi ^4\) theory. Here we take a local action with \begin{equation*} \mathscr{L}(x) = \frac{K}{8 \varepsilon } \left [ \phi (x+\varepsilon ) - \phi (x-\varepsilon ) \right ]^2 + \varepsilon V(\phi (x)), \end{equation*} where the potential is given by \begin{equation*} V(\phi (x)) = \frac{m^2}{2} \phi (x)^2 + \frac{\lambda }{8} \phi (x)^4. \end{equation*} The partition function is \begin{equation*} Z=\int D\phi \; e^{-S[\phi ]}, \end{equation*} and a field expectation value is given by \begin{equation*} \langle \phi (x) \rangle = \frac{1}{Z} \int D\phi \; e^{-S[\phi ]} \phi (x). \end{equation*} The functional integral is here still a finite-dimensional integral, whose dimension is the number of lattice points \(P\). The action \(S[\phi ]\) is a function of \(P\) continuous variables \(\phi (x)\).
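The lattice action is an ordinary function of \(P\) real numbers, as the following Python sketch shows. It assumes that the sum over \(x\) runs over the interior sites only, where both neighbors \(x\pm \varepsilon \) exist, and that the boundary terms vanish; the text does not fix either choice. Function names and default parameter values are illustrative.

```python
def potential(phi_x, m2, lam):
    """V(phi) = m^2/2 * phi^2 + lambda/8 * phi^4."""
    return 0.5 * m2 * phi_x ** 2 + lam / 8.0 * phi_x ** 4

def lattice_action(phi, eps, K=1.0, m2=1.0, lam=1.0):
    """Lattice phi^4 action for phi = [phi(x_in), ..., phi(x_f)].

    Assumptions: sum over interior sites only, L_in = L_f = 0.
    """
    S = 0.0
    for i in range(1, len(phi) - 1):
        S += K / (8.0 * eps) * (phi[i + 1] - phi[i - 1]) ** 2  # kinetic term
        S += eps * potential(phi[i], m2, lam)                   # potential term
    return S

# One configuration on P = 5 lattice sites with spacing eps = 0.25.
phi = [0.0, 0.3, 0.5, 0.3, 0.0]
print(lattice_action(phi, eps=0.25))
```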

Continuum limit. Let us now take the limit \(\varepsilon \to 0\) for \(x_{\text{f}} - x_{\text{in}}\) fixed. This means that the number of lattice points \(P\) diverges. The “lattice derivative” \begin{equation*} \partial _x \phi (x) = \frac{1}{2\varepsilon } \left [ \phi (x+\varepsilon ) - \phi (x-\varepsilon ) \right ] \end{equation*} becomes a standard derivative, at least for sufficiently smooth configurations, where it exists. In terms of it, the lattice kinetic term reads \(\frac{K}{8\varepsilon } \left [ \phi (x+\varepsilon ) - \phi (x-\varepsilon ) \right ]^2 = \varepsilon \frac{K}{2} \left [ \partial _x \phi (x) \right ]^2\), so that every term of the lattice \(\mathscr{L}(x)\) carries a factor \(\varepsilon \). One also has \begin{equation*} \sum _x \varepsilon \to \int dx, \end{equation*} and the Euclidean action becomes \begin{equation*} S= \int dx \, \mathscr{L}(x) + \mathscr{L}_{\text{in}} + \mathscr{L}_{\text{f}}, \end{equation*} where now \begin{equation*} \mathscr{L}(x) = \frac{K}{2} \left [\partial _x \phi (x) \right ]^2 + V(\phi (x)). \end{equation*} The first term is called the kinetic term, the second the potential. In the limit \(\varepsilon \to 0\) the action is a functional of the functions \(\phi (x)\).
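For a smooth configuration the approach to the continuum action can be checked numerically. The sketch below evaluates the lattice \(\phi ^4\) action for the test configuration \(\phi (x) = \sin (\pi x)\) on \(0 \leq x \leq 1\) with \(K = m^2 = \lambda = 1\). These are all illustrative choices, as are the assumptions that the lattice sum runs over interior sites only and that the boundary terms vanish.

```python
from math import sin, pi

def lattice_action(phi, eps, K=1.0, m2=1.0, lam=1.0):
    """Lattice phi^4 action, interior sites only, no boundary terms (assumed)."""
    S = 0.0
    for i in range(1, len(phi) - 1):
        S += K / (8.0 * eps) * (phi[i + 1] - phi[i - 1]) ** 2
        S += eps * (0.5 * m2 * phi[i] ** 2 + lam / 8.0 * phi[i] ** 4)
    return S

def action_for_sine(P):
    """Lattice action of phi(x) = sin(pi x) sampled on P points in [0, 1]."""
    eps = 1.0 / (P - 1)
    phi = [sin(pi * i * eps) for i in range(P)]
    return lattice_action(phi, eps)

# Continuum action for the same configuration:
# int_0^1 dx [ (pi cos(pi x))^2 / 2 + sin(pi x)^2 / 2 + sin(pi x)^4 / 8 ]
S_continuum = pi ** 2 / 4 + 1.0 / 4 + 3.0 / 64

for P in (11, 101, 1001):
    S_lattice = action_for_sine(P)
    print(P, S_lattice, abs(S_lattice - S_continuum))
```

The difference between the lattice and the continuum value shrinks as the number of lattice points grows, as expected for a smooth configuration.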

Physical observables. As physical observables one takes those \(A[\phi ]\) for which \(\langle A \rangle \), \(\langle A B \rangle \) and so on have a well-defined limit for \(\varepsilon \to 0\). It will not always be easy to decide whether a given \(A[\phi ]\) is a physical observable, but the definition is simple. For \(\varepsilon \to 0\) the expression \(A[\phi ]\) is again a functional.

Functional integral. The functional integral in the continuum theory is now defined as the “continuum limit” of the lattice functional integral for \(\varepsilon \to 0\). By definition, this is well defined for “physical observables”. One may ask: what are such physical observables? The answer to this question is not simple, in general. One should note here that very rough functions \(\phi (x)\) are also included in the functional integral, although their contribution is suppressed. If the kinetic term in the Euclidean action \(S_{\text{kin}} = \sum _x \frac{K}{8\varepsilon } \left [ \phi (x+\varepsilon ) - \phi (x-\varepsilon ) \right ]^2\) diverges for \(\varepsilon \to 0\), i. e. \(S\to \infty \), then one has \(e^{-S} \to 0\) and the probability of such a configuration vanishes. The corresponding limits may not be trivial, however, because very many rough configurations exist.
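The suppression of rough configurations can be illustrated with a configuration that jumps from \(\phi =0\) to \(\phi =1\) in the middle of the interval. The following Python sketch uses \(K=1\), the interval \(0 \leq x \leq 1\) and a sum over interior sites only, all of which are illustrative assumptions.

```python
from math import exp

def kinetic_action(phi, eps, K=1.0):
    """S_kin = sum_x K/(8 eps) [phi(x+eps) - phi(x-eps)]^2, interior sites only."""
    return sum(K / (8.0 * eps) * (phi[i + 1] - phi[i - 1]) ** 2
               for i in range(1, len(phi) - 1))

def step_configuration(P):
    """phi = 0 on the first half of the lattice sites and phi = 1 on the rest."""
    return [0.0 if i < P // 2 else 1.0 for i in range(P)]

for P in (11, 101, 1001):
    eps = 1.0 / (P - 1)
    S_kin = kinetic_action(step_configuration(P), eps)
    print(P, eps, S_kin, exp(-S_kin))
```

Only the two sites next to the jump contribute, so here \(S_{\text{kin}} = \frac{K}{4\varepsilon }\), which grows without bound for \(\varepsilon \to 0\) while the weight \(e^{-S_{\text{kin}}}\) tends to zero.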

Additive shift of the action. Let us consider a change \(S\to S^\prime = S + C\) or \(\mathscr{L}(x) \to \mathscr{L}^\prime (x) = \mathscr{L}(x)+\tilde c\), where \(C = (x_{\text{f}}-x_{\text{in}})\tilde c\) is a constant that is independent of the fields. The partition function then changes as \(Z \to Z^\prime = e^{-C} Z\). Similarly, \begin{equation*} \int D\phi \, e^{-S} A[\phi ] \to e^{-C} \int D\phi \, e^{-S} A[\phi ]. \end{equation*} This means that \(C\) drops out when one considers expectation values like \(\langle A \rangle \), because the factor \(e^{-C}\) cancels between numerator and denominator. It can even happen that \(C\) diverges for \(\varepsilon \to 0\) such that formally \(Z\to 0\) or \(Z\to \infty \). This is not a problem because the absolute value of \(Z\) is irrelevant. The probability distribution \(p[\phi ] = \frac{1}{Z} e^{-S[\phi ]}\) is unchanged.
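The cancellation of the constant can be checked numerically. For simplicity, the sketch below uses the Ising chain with nearest-neighbor action instead of a continuous field, since the argument is the same for any functional measure; the function names and the values of \(P\), \(\beta \) and \(C\) are arbitrary illustrative choices.

```python
from itertools import product
from math import exp

def action(s, beta):
    """Nearest-neighbor action S[s] = -beta * sum_x s(x+eps) s(x), open ends."""
    return -beta * sum(s[i + 1] * s[i] for i in range(len(s) - 1))

def partition_and_correlation(P, beta, C=0.0):
    """Return Z and <s(x_in) s(x_in + eps)> for the shifted action S + C."""
    configs = list(product((+1, -1), repeat=P))
    weights = [exp(-(action(s, beta) + C)) for s in configs]
    Z = sum(weights)
    corr = sum(w * s[0] * s[1] for w, s in zip(weights, configs)) / Z
    return Z, corr

Z, corr = partition_and_correlation(P=5, beta=0.6)
Z_shift, corr_shift = partition_and_correlation(P=5, beta=0.6, C=3.0)
print("Z'/Z =", Z_shift / Z, " exp(-C) =", exp(-3.0))
print("correlation:", corr, corr_shift)
```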