Tuesday, April 1, 2014

Is Spin a Relativistic Effect? Levy-Leblond and First Order Wave Equations

One of the earliest concepts we cover in atomic structure is electron spin. We are taught to think of the electron's spin the same way we think of its charge: there's no particular reason a particle should have spin according to quantum mechanics; it merely happens to. Once the electron has spin, we include it (when necessary) as an additional parameter in the wave-function. Some texts state that spin is a relativistic effect that must be included as an ad hoc addition to nonrelativistic quantum mechanics. For example, in chapter 10 of Quantum Chemistry by I. Levine we find the claim:
In the nonrelativistic quantum mechanics to which we are confining ourselves, electron spin must be introduced as an additional hypothesis.

Other undergraduate and graduate texts on quantum mechanics I have examined appear to take more or less the same view. But is it really true? Is it really impossible to justify spin without invoking relativity? In this post I plan to demonstrate that it is not. But first, a little background.

Background: The Discovery of Spin

In retrospect, spin was first observed in 1922 by Otto Stern and Walther Gerlach, though they didn't realize it at the time. In the now famous Stern-Gerlach experiment, they found that an electron passing through an inhomogeneous magnetic field will be deflected in one of two possible directions. (Historically, they found this to be true for gaseous silver atoms; the deflection was later attributed to the spin of the single unpaired electron.)

In 1925, George Uhlenbeck and Samuel Goudsmit proposed the idea of "spin" to explain certain spectroscopic features. The initial conception of spin was that the electron is a charged sphere spinning about its axis. Like any moving charge, it will create a magnetic field. If the electron is only allowed to spin in one of two possible directions, it will have two different magnetic moments, hence the results of the Stern-Gerlach experiment.

It was immediately understood that taking this literally would cause problems. Most importantly, using the classical electron radius would result in an equatorial velocity substantially greater than the speed of light (see D. Griffiths Introduction to Quantum Mechanics problem 4.138). Furthermore, in the prosaically named Standard Model of particle physics the electron is generally taken to be a zero-dimensional point particle, and thus "spinning" is geometrically meaningless. Rather, spin is taken to imply "intrinsic angular momentum," a concept which does not exist in the macroscopic world (but note an entirely different interpretation by H.C. Ohanian here).

In 1928 Dirac published his eponymous equation. In the equation, the wave-function ψ is actually a 4-component vector, resulting in four solutions. The first two solutions were immediately understood to correspond to a spin-up and a spin-down electron, and they correctly predict its observed magnetic moment. The latter two, which at first seem to correspond to electrons with negative energy, were eventually reinterpreted as positrons with positive energy. Thus the Dirac equation predicted the existence of antimatter several years before it was discovered.

As physics progressed, it was discovered that not only electrons, but an entire class of particles known as (elementary) fermions have spin-½, and thus all obey the Dirac equation. Although no elementary fermions are believed to exist with a spin greater than ½, certain composite fermions such as baryons and some nuclei do have higher half-integer spins. Generalized relativistic equations exist to describe any particle of spin-(n+½), such as the Rarita–Schwinger equation for spin-3/2 particles.

So spin can be beautifully described by relativistic quantum mechanics. But can it only be described by relativistic quantum mechanics?

The Derivation of the Dirac Equation

In order to determine whether spin can be explained nonrelativistically, we first must understand why it arises in relativistic equations. Derivations of relativistic wave equations can easily be found in numerous online and print sources, so I will only replicate the features important to our discussion of spin. I will largely follow the discussion found in Introduction to Elementary Particles by D. Griffiths, chapter 7. I also rely on a more advanced treatment by Y. Bekenstein available for free here. (For the truly adventurous, a free online Quantum Field Theory textbook is available here.)

As we recall from elementary quantum mechanics, the Schrödinger equation can be "derived" by starting with the equation E = T + V and making the canonical substitutions:

E → iħ ∂/∂t,   p → −iħ∇

One may be tempted to derive a relativistic equation in the same way by taking the relativistically correct energy E² = c²p² + m²c⁴ and making the above substitutions. Schrödinger did exactly this, even before discovering his most famous equation. When this substitution is made, we get the following:

−ħ² ∂²ψ/∂t² = −ħ²c²∇²ψ + m²c⁴ψ

This equation is known today as the Klein-Gordon equation. As can be seen, the Klein-Gordon equation is second order in the time derivative. This ends up creating the seemingly preposterous result that its probability density can be negative (see Bekenstein page 168). This led early physicists to reject it on physical grounds, and to go hunting for an equation which was first order in the time derivative. (The paradox was eventually resolved, and the Klein-Gordon equation is now used to describe spin-0 particles.)
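To make the sign problem concrete, here is a sketch of the standard textbook argument. The normalization is one common convention and is my assumption, not taken from the sources cited above. The Klein-Gordon equation implies a continuity equation for the following density and current, and evaluating the density on a plane wave shows where the negative values come from:

```latex
\rho = \frac{i\hbar}{2mc^{2}}\left(\psi^{*}\,\partial_t\psi - \psi\,\partial_t\psi^{*}\right),
\qquad
\mathbf{j} = \frac{\hbar}{2mi}\left(\psi^{*}\nabla\psi - \psi\nabla\psi^{*}\right),
\qquad
\partial_t\rho + \nabla\cdot\mathbf{j} = 0 .

% Plane-wave solution of the Klein-Gordon equation
\psi = e^{\,i(\mathbf{p}\cdot\mathbf{x} - Et)/\hbar}
\;\Longrightarrow\;
\rho = \frac{E}{mc^{2}}\,|\psi|^{2},
\qquad
E = \pm\sqrt{c^{2}p^{2} + m^{2}c^{4}} .
```

Because the equation is second order in time, both signs of E are allowed, and the negative root gives a negative ρ. A first-order equation such as Schrödinger's has ρ = |ψ|², which cannot change sign.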

There were two basic approaches taken to create a wave equation with first order time dependence. Schrödinger proceeded by abandoning any attempt to keep his work relativistically correct, and developed his eponymous equation which is first order in time but second order in space. Dirac, on the other hand, decided to keep his equation relativistically correct, and tried to find an equation which was first order in both time and space.

In order for Dirac's equation to work, he realized that its coefficients would have to be non-commuting, or non-abelian (i.e. ab ≠ ba). This is obviously impossible for numerical coefficients, but is perfectly reasonable for matrix coefficients, hence the γ matrices and 4-component ψ of the Dirac equation. Interpretation of the solutions to the Dirac equation gives us spin and even the magnetic moment of the electron, as mentioned.
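The requirement on the coefficients can be checked numerically. The sketch below assumes the Dirac representation of the γ matrices and the metric signature (+, −, −, −), as in Griffiths; the helper names `anticommutator` and `satisfies_clifford` are mine. It verifies that the matrices satisfy γ^μγ^ν + γ^νγ^μ = 2η^μν, which is the condition that lets the first-order equation square to the Klein-Gordon equation, and that two of them fail to commute.

```python
import numpy as np

# Pauli matrices and 2x2 building blocks
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)

# Gamma matrices in the Dirac representation (an assumed convention)
gamma0 = np.block([[I2, Z2], [Z2, -I2]])
gammas = [gamma0] + [np.block([[Z2, s], [-s, Z2]]) for s in (sx, sy, sz)]

# Minkowski metric with signature (+, -, -, -)
eta = np.diag([1.0, -1.0, -1.0, -1.0])


def anticommutator(a, b):
    """Return ab + ba."""
    return a @ b + b @ a


def satisfies_clifford(gs, metric):
    """Check {g_mu, g_nu} = 2 * metric[mu, nu] * identity for all index pairs."""
    n = gs[0].shape[0]
    return all(
        np.allclose(anticommutator(gs[mu], gs[nu]), 2 * metric[mu, nu] * np.eye(n))
        for mu in range(4)
        for nu in range(4)
    )


clifford_ok = satisfies_clifford(gammas, eta)
# Ordinary numbers always commute; these matrices do not.
commute = np.allclose(gammas[1] @ gammas[2], gammas[2] @ gammas[1])
print("Clifford relations hold:", clifford_ok)
print("gamma1 and gamma2 commute:", commute)
```

No set of four ordinary numbers can pass the first check, since a nonzero number never anticommutes with another nonzero number; 4×4 matrices are the smallest that can.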

Physicists' later realization that the solutions to the Klein-Gordon equation did not behave the same way with respect to rotations as solutions to the Dirac equation was part of the hint that the Klein-Gordon equation should describe spin-0 particles. However, the important point is that spin did not emerge from relativity; it emerged from enforcing first order spatial dependence. If it is physically possible to enforce first order spatial dependence on a nonrelativistic equation, what might the results be?

Developing First Order Nonrelativistic Wave Equations

The matter stood as above for about 40 years. It was assumed that we could either invoke relativity and recover spin in the Dirac and other equations, or abandon relativity and settle on the Schrödinger equation, which apparently makes no reference to spin. However, in the 1960s the French physicist Jean-Marc Lévy-Leblond (b. 1940) began to question whether the Schrödinger equation really just neglected spin. What if, he reasoned, the Schrödinger equation is actually the nonrelativistic equivalent of the Klein-Gordon equation for spin-0 particles? Might there then exist a nonrelativistic equivalent of the Dirac equation as well?

Lévy-Leblond's seminal work on the topic (available without a paywall here) is, unsurprisingly, rather complex, but the basic idea can be explained without too much difficulty. I will attempt to present the essence of his arguments in the remainder of this section.

We begin with the Galilei group of matrix operations. This can be thought of as the group of rotations plus time and space translations (i.e. the Poincaré group minus Lorentz boosts), and as such would be the nonrelativistic subgroup of the Poincaré group¹. The most important fact about the Galilei group is that its irreducible representations can be interpreted as corresponding to different spin states. If we attempt to derive a quantum mechanical equation consistent with the spin-0 irreducible representation of the Galilei group, it turns out that we are forced to re-derive the Schrödinger equation. So the Schrödinger equation is the only possible spin-0 nonrelativistic equation. But is it the only possible nonrelativistic equation, period?

What we need is a quantum mechanical equation which is first order in all derivatives and which gives results consistent with the Schrödinger equation (which we know happens to give really good results for electrons, even if it misses spin). We will not, however, attempt to enforce any form of Galilean invariance at this point.

On a deeper level, these two requirements can actually be considered one and the same. The Dirac equation can be derived by factoring the Klein-Gordon equation (not the solutions). When we (appropriately) square the Dirac equation, the γ matrices become the positive and negative identity matrices, and we recover the Klein-Gordon equation. This is because both equations ultimately describe the same thing, the primary difference being that once we square the Dirac equation, the positive and negative vector spin quantities become the same scalar quantity. By analogy to the relativistic case, Lévy-Leblond sought to factor the Schrödinger equation to first order. By doing so, he would have an equation which described the same thing as the Schrödinger equation, but without squaring the vector spin to a scalar.
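The relativistic half of this analogy can be written out in a few lines. The sketch below assumes the form of the Dirac equation used by Griffiths, (iħγ^μ∂_μ − mc)ψ = 0, with metric signature (+, −, −, −); "appropriately squaring" here means applying the conjugate first-order operator:

```latex
\left(i\hbar\gamma^{\nu}\partial_{\nu} + mc\right)
\left(i\hbar\gamma^{\mu}\partial_{\mu} - mc\right)\psi
  = \left(-\hbar^{2}\gamma^{\nu}\gamma^{\mu}\partial_{\nu}\partial_{\mu} - m^{2}c^{2}\right)\psi = 0 ,

\gamma^{\nu}\gamma^{\mu}\partial_{\nu}\partial_{\mu}
  = \tfrac{1}{2}\{\gamma^{\nu},\gamma^{\mu}\}\,\partial_{\nu}\partial_{\mu}
  = \eta^{\nu\mu}\partial_{\nu}\partial_{\mu}
  = \frac{1}{c^{2}}\frac{\partial^{2}}{\partial t^{2}} - \nabla^{2} ,

\Longrightarrow\quad
-\hbar^{2}\frac{\partial^{2}\psi}{\partial t^{2}}
  = -\hbar^{2}c^{2}\nabla^{2}\psi + m^{2}c^{4}\psi .
```

The matrices drop out only through their anticommutator, which is proportional to the identity. That is the step at which the first-order equation loses its matrix structure and each component of ψ separately obeys the Klein-Gordon equation.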

In any event, we need an equation of the form:

(AE + B·p + C)Φ = 0

with the canonical substitutions for E and p, and with A, B, and C being coefficients to be determined. Here's where things get interesting. In order to fulfill the above requirements, the coefficients must be non-abelian! The equation derived is

Eφ + (σ·p)χ = 0
(σ·p)φ + 2mχ = 0

with σ being the Pauli matrices and φ and χ being the components of Φ. This equation is known today as the Lévy-Leblond equation. Note that the Pauli matrices are not the only choice for the coefficients, but any other choices would be functionally identical.

The Lévy-Leblond equation has several important attributes: (1) Both φ and χ are valid solutions to the Schrödinger equation, so the Lévy-Leblond equation is nonrelativistically correct (i.e. its solutions will describe an electron at least as well as the Schrödinger equation's solutions, but it won't square out the vector spin). (2) When the Dirac equation is taken to the nonrelativistic limit, where a particle's kinetic energy is negligible compared to its rest energy, it reduces to the Lévy-Leblond equation. But most importantly, (3) when we apply Galilean transformations to the Lévy-Leblond equation as it stands, we find that it belongs to the spin-½ irreducible representation of the Galilei group². Like the Dirac equation, the Lévy-Leblond equation predicts two spin states and can even be used to calculate the correct magnetic moment of the electron. Thus by requiring that our equation be first order in its spatial derivatives, without any relativistic input whatsoever, we discover the existence of spin!
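Attribute (1) is easy to check numerically for a free particle. The sketch below assumes the free Lévy-Leblond equation in the form Eφ + (σ·p)χ = 0, (σ·p)φ + 2mχ = 0, a plane wave so that E and p are plain numbers, and units with ħ = 1. The names `sigma_dot`, `residual_phi`, and `residual_chi` are mine. It confirms that (σ·p)² = p², and that once χ is eliminated the remaining condition is satisfied exactly when E = p²/2m, the Schrödinger dispersion relation.

```python
import numpy as np

# Pauli matrices
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]], dtype=complex)
sz = np.array([[1, 0], [0, -1]], dtype=complex)
sigma = [sx, sy, sz]


def sigma_dot(p):
    """Return the 2x2 matrix sigma . p for a 3-component momentum p."""
    return sum(p_i * s_i for p_i, s_i in zip(p, sigma))


rng = np.random.default_rng(0)
m = 1.0                      # mass (arbitrary units, hbar = 1)
p = rng.normal(size=3)       # arbitrary plane-wave momentum
sp = sigma_dot(p)

# Key identity: (sigma . p)^2 = |p|^2 times the identity
square_ok = np.allclose(sp @ sp, np.dot(p, p) * np.eye(2))

# Arbitrary two-component phi; chi is fixed by the second equation
phi = rng.normal(size=2) + 1j * rng.normal(size=2)
chi = -(sp @ phi) / (2 * m)

# With the Schroedinger energy, the first equation is satisfied ...
E = np.dot(p, p) / (2 * m)
residual_phi = E * phi + sp @ chi

# ... and chi obeys the same dispersion relation as phi
residual_chi = E * chi - (np.dot(p, p) / (2 * m)) * chi

# Any other energy leaves a nonzero residual
residual_wrong = (E + 0.5) * phi + sp @ chi

print("(sigma.p)^2 = p^2:", square_ok)
print("phi residual is zero:", np.allclose(residual_phi, 0))
print("wrong-energy residual is zero:", np.allclose(residual_wrong, 0))
```

Since φ was arbitrary, both of its components survive as independent solutions. That two-fold freedom is where the two spin states live, and it is exactly what is lost when only the scalar relation E = p²/2m is kept.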


It is often said that spin is a relativistic effect, and cannot be justified in the framework of nonrelativistic quantum mechanics. This is seemingly due to the fact that spin was first theoretically understood in the relativistic context of the Dirac equation. In this line, modern texts often describe spin as being a manifestation of the irreducible representations of the Poincaré group. Unquestionably, relativity provides an elegant framework for understanding spin, but relativity itself does not necessitate the existence of spin. Spin is a purely quantum mechanical effect that arises from enforcing first order spatial dependence on a quantum wave equation, and can equally well be described by the Galilei group. This can be understood in light of the fact that both the Schrödinger and Klein-Gordon equations are second order in momentum, and will square away the difference between the up and down spin vectors. The Dirac and Lévy-Leblond equations, by maintaining first order momentum, can differentiate between vectors in opposite directions. Although we may not be able to give an intuitive explanation of what spin is, we can say that it will emerge naturally from a quantum wave equation without any relativistic considerations whatsoever. It is a shame that Lévy-Leblond's contributions are so often overlooked in texts and classes.

1 This is something of an oversimplification. The Galilei group is the Poincaré group in the limit that the speed of light is taken to be infinite. See Lévy-Leblond here for more on the Galilei group, and also see here. In any event, the definition in the text is more than sufficient for our purposes.

2 Strictly, when we apply a general Galilean transformation to Φ, we find that it is invariant with respect to a transformation which ends up recreating the two-dimensional ray representation of the rotation group. Hence the solution corresponds to the spin-½ irreducible representation of the Galilei group.