I never understood why the Schrödinger's equation has an i...until now!
Mahesh Shenoy rebuilds the Schrödinger equation from scratch, from the 1853 hydrogen spectrum crisis through Bohr's orbits and de Broglie's matter waves, then derives the momentum and energy operators by hand. The momentum operator works cleanly from a sine wave, but the energy operator fails, which forces the question the whole video is built around: why does the equation need an imaginary number? The answer comes from Jean Robert Argand's geometric picture of complex exponentials as rotations, and Max Born's probability rule explains why physics itself demands it: only a rotating wave keeps total probability conserved.
Published Jun 30, 2026 · 44:37 video · 30 min read · Added Jul 1, 2026
At a glance
The Schrödinger equation is the single most consequential equation most people have never derived, and Mahesh Shenoy of FloatHeadPhysics sets out to build it from scratch, the same way Erwin Schrödinger supposedly did on a Christmas vacation in the Swiss Alps in 1925. The video has two goals. First, reconstruct the equation intuitively, starting from nothing more than energy conservation and a sine wave. Second, and this is the hook, answer the question that bothers every physics student the first time they see it: why does an equation that gives us computer chips, electron microscopes, atomic clocks, and GPS have an imaginary number sitting right in the middle of it?
The path runs through seventy years of failed classical physics, from a mysterious hydrogen spectrum in 1853 to Niels Bohr's "trust me" orbits to Louis de Broglie's matter waves, before the real derivation begins. Building the momentum operator from a plain sine wave works cleanly. Building the energy operator from the same sine wave breaks, badly, and that failure is the whole point. The fix requires a function that is its own derivative and periodic at the same time, which is mathematically impossible for real numbers and turns out to require a 90 degree rotation in the complex plane. That rotation is what the imaginary unit i actually does, and Max Born's probability interpretation, added as a casual footnote a year later, explains why the physics demands it: only a rotating wave keeps total probability constant over time.
The page below rebuilds it in the video's own order: the hydrogen spectrum crisis, the birth of quantum jumps, de Broglie's leap to matter waves, the modern derivation of the operators, the moment the real number approach fails, Jean Robert Argand's geometric answer, the assembled Schrödinger equation, its resemblance to the heat equation, Born's probability footnote, and the payoff in hydrogen orbitals and modern electronics.
The hydrogen spectrum and a crisis in classical physics
The story starts in 1853, when physicist Anders Jonas Ångström found that hot hydrogen gas gives off light only at very specific colors, not a smooth rainbow. Every hot element, it turned out, has its own signature spectrum, a discovery so useful it let scientists find brand new elements just by looking at their light. Helium was first identified this way, in the sun, before it was ever isolated on Earth. The unit for measuring these wavelengths, the angstrom, still carries his name.
Nobody knew why elements only emitted those specific colors. The first real clue came decades later from a Swiss math teacher, Johann Balmer, who was obsessed with numbers and patterns and found, by trial and error, a startlingly simple formula that predicted the hydrogen spectrum lines, including ones that hadn't been observed yet and were later confirmed. Nobody could say why the formula worked. It just did.
Answering that required a working theory of light, and by then physicists had one: James Clerk Maxwell had shown light is a ripple in the electromagnetic field, produced by accelerating charges. Wiggle a charge slowly and you get low frequency light, wiggle it fast and you get high frequency light, and that same wiggling could shake other charges at a distance, which is the entire basis of wireless communication. It was a triumph, except it could not explain the hydrogen spectrum. A hot gas has billions of randomly jiggling charges across every frequency, so Maxwell's theory predicted every color of light should come out. It didn't. Something was wrong.
It got worse. Once physicists established that atoms have a positive nuclear core, the obvious model was electrons orbiting it like tiny planets. But an orbiting, accelerating charge should constantly radiate energy according to Maxwell's own theory, which means every atom in the universe should collapse in a fraction of a second. Physics could not explain why matter is stable. That is the crisis this whole video exists to resolve.
The birth of quantum jumps
The break came from an unrelated puzzle: the photoelectric effect. Shine UV light on a piece of zinc and electrons pop out, which makes sense, they absorb energy from the light. But shine a brighter beam of visible light on the same zinc and nothing happens, no electrons at all, no matter how intense. Pumping in more energy should knock more electrons loose. It didn't. Color mattered, brightness didn't, and nobody could explain why.
Albert Einstein's answer was radical: light doesn't deliver energy as a continuous stream, it delivers it in discrete chunks called photons, and the energy of a photon depends only on its frequency, not on how many of them there are. A bright visible light throws a lot of photons per second, but each one is too weak to budge an electron, like pelting a bowling ball with ping pong balls. A dim UV light throws fewer photons, but each one hits like a cannonball. This won Einstein the Nobel Prize, and the constant that ties a photon's energy to its frequency is the one that carries Max Planck's name.
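The ping pong ball versus cannonball picture can be put into rough numbers with E = hf = hc/λ. This is only a sketch: the two wavelengths and the zinc work function of about 4.3 eV are illustrative assumptions, not values taken from the video.

```python
# Photon energy depends on frequency (colour), not on beam brightness.
PLANCK_H = 6.62607015e-34      # J*s (exact SI value)
LIGHT_C = 299_792_458.0        # m/s (exact SI value)
EV = 1.602176634e-19           # joules per electronvolt (exact SI value)

# Assumed, illustrative value: the work function of zinc is roughly 4.3 eV.
ZINC_WORK_FUNCTION_EV = 4.3


def photon_energy_ev(wavelength_nm: float) -> float:
    """Energy of one photon, E = h*f = h*c/wavelength, in electronvolts."""
    return PLANCK_H * LIGHT_C / (wavelength_nm * 1e-9) / EV


def ejects_electron(wavelength_nm: float) -> bool:
    """True if a single photon carries enough energy to free an electron."""
    return photon_energy_ev(wavelength_nm) > ZINC_WORK_FUNCTION_EV


uv_energy = photon_energy_ev(250.0)      # assumed UV wavelength
green_energy = photon_energy_ev(550.0)   # assumed visible wavelength
```

Under these assumptions a single UV photon clears the threshold while a single green photon does not, and adding more green photons per second never changes the energy of any one of them.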
Niels Bohr pushed the idea further. If light is absorbed in chunks, he reasoned, it must be emitted in chunks too, which means an electron has to jump from a higher energy level straight to a lower one, releasing the difference as a single photon. So he postulated that electrons can only orbit the nucleus at certain special energy levels and nowhere in between, and that while sitting in one of those special orbits, they simply do not radiate, defying Maxwell's prediction by decree. It sounded like he was making it up. But the postulate explained everything at once: heat a gas, electrons jump up, they fall back down releasing a photon of a specific frequency, and working through the math reproduces Balmer's formula exactly, while also explaining why atoms don't collapse. Bohr later admitted, in an interview, that the moment he saw Balmer's formula the whole picture became clear to him. What he could never explain was why those specific orbits were special, or why an accelerating electron parked in one of them wouldn't radiate. His answer amounted to "trust me."
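As a hedged illustration of how quantum jumps reproduce Balmer's formula, the sketch below uses the standard Bohr energy levels E_n = −13.6 eV / n² and Balmer's formula in the form λ = B n² / (n² − 4). Neither expression is written out in the video summary above, so treat the constants and function names as assumptions. Small corrections, such as the finite mass of the nucleus, are ignored.

```python
RYDBERG_EV = 13.605693     # hydrogen binding energy in eV (nuclear mass ignored)
HC_EV_NM = 1239.841984     # h*c expressed in eV*nm


def bohr_energy_ev(n: int) -> float:
    """Energy of the n-th allowed Bohr level."""
    return -RYDBERG_EV / n**2


def bohr_wavelength_nm(n_upper: int, n_lower: int = 2) -> float:
    """Wavelength of the photon released in a jump n_upper -> n_lower."""
    photon_energy = bohr_energy_ev(n_upper) - bohr_energy_ev(n_lower)
    return HC_EV_NM / photon_energy


def balmer_wavelength_nm(n: int) -> float:
    """Balmer's empirical formula, lambda = B * n^2 / (n^2 - 4)."""
    balmer_constant = 4.0 * HC_EV_NM / RYDBERG_EV   # about 364.5 nm
    return balmer_constant * n**2 / (n**2 - 4)


# Jumps down to level 2 give the visible hydrogen lines.
visible_lines = {n: bohr_wavelength_nm(n) for n in (3, 4, 5, 6)}
```

The red line (n = 3) comes out near 656 nm and the blue-green line (n = 4) near 486 nm, and the two functions agree for every n because the jump energy 13.6 eV × (1/4 − 1/n²) rearranges algebraically into Balmer's pattern.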
De Broglie's leap: matter as a wave
The justification came from a French PhD student, Louis de Broglie. Light has a dual nature, moving as a wave but interacting like a particle. De Broglie asked the mirror question: what if matter, which we normally treat as particles, also moves as a wave? Electrons circling a nucleus could then form standing waves, and just like a guitar string can only vibrate with a whole number of loops, three or four or five but nothing fractional, an electron wave could only fit around the nucleus at three or four or five loops and nothing in between.
That single idea does double duty. It explains why electrons only exist at Bohr's specific distances (only those distances let a standing wave close on itself cleanly), and it explains why they don't radiate while sitting there (they aren't accelerating orbiting particles anymore, they're stationary waves; a photon only appears when the electron jumps between two standing wave patterns). For his thesis, de Broglie went further and derived a formula for the wavelength of matter using special relativity. It was so strange that his examiners only signed off after checking with Einstein himself.
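The standing wave condition can be written in one line. This is a sketch for a circular orbit, where the radius r and the loop count n are symbols introduced here for illustration, and it uses the non-relativistic form of de Broglie's relation:

```latex
n\lambda = 2\pi r, \qquad \lambda = \frac{h}{p}
\quad\Longrightarrow\quad
p\,r = n\,\frac{h}{2\pi} = n\hbar, \qquad n = 1, 2, 3, \dots
```

Only whole numbers of wavelengths close on themselves, so only certain radii, and therefore certain energies, are allowed.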
The following year, a professor at the University of Zurich gave a seminar on de Broglie's matter waves. A colleague in the audience asked an obvious but pointed question: if matter is a wave, where is its wave equation? The professor was Erwin Schrödinger, and he took the question home. He spent Christmas vacationing in the Swiss Alps, reportedly with his mistress, and came back with the equation that now carries his name.
1853: Anders Jonas Ångström finds hot hydrogen gas gives off light at only a few specific colors, not a smooth rainbow. The unit for measuring these wavelengths, the angstrom, is later named for him.
1880s: Johann Balmer, a Swiss math teacher, finds a simple trial and error formula that predicts the hydrogen spectral lines, with no explanation for why it works.
1900s: James Clerk Maxwell's electromagnetic wave theory cannot explain the hydrogen spectrum, or why orbiting electrons don't radiate away and collapse into the nucleus.
1905: Albert Einstein explains the photoelectric effect: light delivers energy in discrete photon chunks, each carrying energy set by its frequency, not its brightness.
1913: Niels Bohr postulates electrons orbit only at special, non radiating energy levels. This reproduces Balmer's formula and explains atomic stability, with no reason given for why those levels are special.
1924: Louis de Broglie's PhD thesis proposes matter itself moves as a wave, explaining Bohr's special orbits as the only wavelengths that fit like a standing wave on a string.
1925: A colleague asks, if matter is a wave, where is its wave equation? Erwin Schrödinger takes a Christmas vacation in the Swiss Alps and returns with the equation.
1926: Max Born adds a footnote: the wave function squared is a probability. The Born rule gives the wave function, and the i inside it, a physical meaning.
1933: Schrödinger shares the Nobel Prize in Physics with Paul Dirac for the theory.
Figure 1. More than seventy years of clues, from a mysterious hydrogen spectrum in 1853 to a working wave equation in 1925 and its physical meaning in 1926. Every step in the derivation below traces back to one of these discoveries.
Building the first matter wave
Rather than replay Schrödinger's own derivation, which by his own admission became "unintelligible," or Feynman's, which the presenter calls "quite mathy," the video reconstructs a more modern and intuitive path. It starts from the most bedrock principle in physics, energy conservation: total energy equals kinetic energy plus potential energy. Kinetic energy is one half m v squared, and multiplying top and bottom by mass turns that into momentum squared over 2m, a cleaner form to work with.
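Written out, the starting point looks like this, with V standing for the potential energy and p = mv for the momentum:

```latex
E = \tfrac{1}{2} m v^{2} + V
  = \frac{m^{2} v^{2}}{2m} + V
  = \frac{p^{2}}{2m} + V
```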
The textbook next step is to "make it quantum" by replacing energy and kinetic energy with operators acting on a wave function called psi (ψ), and this is exactly where most treatments lose the intuition. Two questions need answering: why do we need operators at all, and how would you build one yourself?
The first answer is almost obvious once you see it. Einstein gave us E = hf (energy equals Planck's constant times frequency) and de Broglie gave us p = h over lambda (momentum equals Planck's constant over wavelength). Why not just plug those directly into the energy equation? Because that substitution only works for an infinitely long, perfectly pure sine wave, the only kind of wave with one single, definite wavelength and frequency. A real, general wave is a mixture of many wavelengths and frequencies at once. For an ordinary wave that is a minor technicality. For a matter wave it means something stranger: the particle doesn't have one single value of momentum or energy, it carries a whole range simultaneously. Nobody yet knows what that means physically (the video parks that question for later, and Born eventually answers it), but the practical consequence is clear: you can't substitute a single number for energy, you need a machine that extracts the entire mixture. That machine is an operator.
The second question, how to build one, gets solved with a classic problem solving move: attack the hardest possible version of the problem by first solving the simplest version. The simplest possible wave is a plain standing sine wave, and building the operator for that case turns out to reveal everything.
Schrödinger's own notation calls the wave function psi. Starting from psi equals sin(x), the derivation adds generality one control at a time: an amplitude A so the height isn't locked to plus or minus one, a factor of 2π over lambda multiplying x so the wavelength isn't locked to 2π, and, because a standing wave's amplitude itself has to swing periodically in time, a second sine function of time with its own period T, scaled by 2π over T. Cleaning up the constants (defining ω, omega, as 2π times frequency, and κ, kappa, as 2π over lambda) gives a compact standing wave: psi equals A sin(κx) sin(ωt). It is still just a generic wave, nothing quantum about it yet.
The quantum step is substituting Einstein's and de Broglie's relations, rewritten in terms of ω and κ using the reduced Planck constant ħ (h bar, h over 2π), so that E equals ħω and p equals ħκ. ω is the temporal (angular) frequency, 2π times the number of oscillations per second, and it turns out to encode the particle's total energy. κ is the spatial (angular) frequency, 2π times the number of oscillations per meter, and it encodes the particle's momentum. The video pauses here for one of its best asides: since special relativity already tells us that space and time are not separate things but shadows of one four dimensional spacetime, maybe energy and momentum aren't separate either, maybe they're two shadows of a single deeper object. They are: physicists call it four momentum, the same relativistic bundling that unifies space and time also unifies energy and momentum.
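The bookkeeping behind that substitution, using only the definitions already given (ω = 2πf, κ = 2π/λ, ħ = h/2π), can be laid out as:

```latex
\psi(x,t) = A \sin(\kappa x)\,\sin(\omega t), \qquad
E = h f = \frac{h}{2\pi}\,(2\pi f) = \hbar\omega, \qquad
p = \frac{h}{\lambda} = \frac{h}{2\pi}\cdot\frac{2\pi}{\lambda} = \hbar\kappa
```

so that ω = E/ħ and κ = p/ħ, which is why differentiating the wave brings down factors of energy and momentum.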
Building the kinetic energy operator (the part that works)
With the standing wave psi = A sin(κx) sin(ωt) in hand, the goal is to "pull" momentum out of it using calculus. Differentiate psi with respect to x (a partial derivative, since time is held constant) and momentum over ħ pops out front while the sine term turns into a cosine. That's momentum to the first power, but kinetic energy needs momentum squared, so differentiate again. The second derivative brings out another factor of momentum over ħ, and crucially the cosine flips back into negative sine, meaning the original function psi reappears on the other side of the equation. Rearranging gives the momentum squared operator, cleanly: apply this operator (a second spatial derivative, scaled by minus ħ squared) to psi and you get momentum squared times psi back out. Divide by 2m and you have the kinetic energy operator.
Compared against the textbook Schrödinger equation, it matches exactly. The physical reading is elegant: a second derivative is a measure of curvature, so a wave with sharper curvature carries more kinetic energy, which lines up with de Broglie's rule that a shorter wavelength means more momentum. But curvature is a more powerful idea than wavelength, because wavelength only makes sense for a pure sine wave, while curvature can be defined pointwise for any shape at all. Kinetic energy, in other words, is encoded in how sharply the wave bends.
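A quick numerical sanity check of that claim, as a minimal sketch: the second derivative is estimated with a finite difference rather than taken symbolically, and the units (ħ = 1) and the value of κ are arbitrary choices made here for illustration.

```python
import math

HBAR = 1.0     # natural units, assumed for readability
KAPPA = 3.0    # spatial frequency, so the momentum is p = HBAR * KAPPA


def psi(x: float) -> float:
    """Spatial part of the standing wave, sin(kappa * x)."""
    return math.sin(KAPPA * x)


def second_derivative(f, x: float, h: float = 1e-4) -> float:
    """Central finite-difference estimate of the curvature f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2


def momentum_squared_operator(f, x: float) -> float:
    """Apply -hbar^2 d^2/dx^2 to f at the point x."""
    return -HBAR**2 * second_derivative(f, x)


x0 = 0.4
# Dividing by psi(x0) isolates the number the operator pulled out.
extracted_p_squared = momentum_squared_operator(psi, x0) / psi(x0)
expected_p_squared = (HBAR * KAPPA) ** 2
```

Up to finite-difference error, the operator hands back p² times the original wave, and a larger κ (sharper curvature) gives a larger result.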
Figure 2. The momentum squared operator (left) survives a double derivative of a sine wave because sine, differentiated twice, returns to sine (with a minus sign). The energy operator (right) needs only a single derivative, and a single derivative turns sine into cosine, a different function, so the operator fails outright. The fix is not a different real function. It is dropping the sine entirely for a complex exponential, whose first derivative is always proportional to itself.
Why Fourier lets one sine wave stand for every wave
There's an obvious objection to all of this: the operators were only derived for one very special case, a pure sine wave, so why should they apply to any general, messy wave shape? The answer traces back to Jean Baptiste Joseph Fourier, the mathematician so devoted to the idea that heat had healing power that he reportedly kept his rooms sweltering and sat wrapped in blankets through summer. Whatever drove him, Fourier proved something extraordinary while modeling heat flow: any wave, any shape at all, can be written as a sum of sine and cosine waves. Take a square wave. A single sine wave looks nothing like it. Add a second sine wave with the right height and width and the sum gets closer. Add a third, a fourth, a fifth, and the sum converges toward a perfect square wave. Technically you need infinitely many terms for an exact match, but a handful gets remarkably close, for a square wave or any other shape you care to build. This is the Fourier series, and its generalization is the Fourier transform.
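The square wave build-up can be sketched in a few lines. The series used here, (4/π) Σ sin(kx)/k over odd k, is the standard Fourier series of a unit square wave with period 2π. The function name and the sample point are choices made for illustration.

```python
import math


def square_wave_partial_sum(x: float, n_terms: int) -> float:
    """Sum of the first n_terms odd harmonics of a unit square wave."""
    total = 0.0
    for j in range(n_terms):
        k = 2 * j + 1                  # odd harmonics only: 1, 3, 5, ...
        total += math.sin(k * x) / k
    return 4.0 / math.pi * total


# At the middle of the positive half-cycle the true square wave equals 1.
x_mid = math.pi / 2
approximations = {n: square_wave_partial_sum(x_mid, n) for n in (1, 3, 5, 200)}
errors = {n: abs(value - 1.0) for n, value in approximations.items()}
```

With one term the sum overshoots to 4/π, and each added harmonic pulls it closer to the flat top of the square wave.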
Because derivatives are linear (the derivative of A plus B is the derivative of A plus the derivative of B), an operator applied to a Fourier sum gets applied to every one of its sine wave components independently, and each one hands back its own kinetic energy multiplied by its own piece of the wave function. Add them all back up and the operator, run on any general wave, spits out the full mixture of kinetic energy as a weighted sum across every component. That is the missing proof: if the operator works for one sine wave, it works for every wave, because every wave is secretly built from sine waves.
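The linearity argument can be checked numerically on a small made-up mixture. This is a sketch only: the two weights and spatial frequencies are arbitrary, ħ is set to 1, and the derivative is a finite-difference estimate.

```python
import math

HBAR = 1.0  # natural units, assumed for readability

# A two-component "messy" wave: (weight c, spatial frequency kappa), made up.
COMPONENTS = [(0.8, 1.0), (0.3, 4.0)]


def second_derivative(f, x: float, h: float = 1e-4) -> float:
    """Central finite-difference estimate of f''(x)."""
    return (f(x + h) - 2.0 * f(x) + f(x - h)) / h**2


def momentum_squared_operator(f, x: float) -> float:
    """Apply -hbar^2 d^2/dx^2 to f at the point x."""
    return -HBAR**2 * second_derivative(f, x)


def mixed_wave(x: float) -> float:
    """Weighted sum of two sine waves with different wavelengths."""
    return sum(c * math.sin(kappa * x) for c, kappa in COMPONENTS)


def expected_mixture(x: float) -> float:
    """Each component handed back times its own p^2 = (hbar * kappa)^2."""
    return sum(c * (HBAR * kappa) ** 2 * math.sin(kappa * x)
               for c, kappa in COMPONENTS)


x0 = 0.7
operator_result = momentum_squared_operator(mixed_wave, x0)
componentwise_result = expected_mixture(x0)
```

Running the operator on the whole wave matches adding up the component-by-component answers, which is the property the Fourier argument relies on.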
The problem with real numbers: the energy operator fails
Momentum squared is done, so kinetic energy is done. Building the energy operator should be the same trick, one derivative with respect to time this time, since energy is linear rather than squared. Differentiate psi = A sin(ωt) with respect to time and, sure enough, E over ħ pops out front. But the sine has turned into a cosine, and a cosine is not the same function as sine. The operator needs psi to reappear cleanly on both sides so it can be isolated and read off as "the energy operator." Here it just doesn't. A single derivative of a sine wave never gives back the sine wave.
Thinking it through mathematically pins down exactly what function would work: something whose first derivative is proportional to itself. Sine doesn't have that property. Cosine doesn't either. No real valued periodic function does. There is exactly one family of functions in all of mathematics with that property: the exponential. Try psi as an exponential function of time and the derivative comes back looking exactly like psi again, times a constant, precisely what's needed to isolate energy cleanly. Except now there's a new problem: exponentials aren't periodic. A real exponential just blows up forever, or with a negative exponent, decays away to nothing forever. Neither one oscillates the way a wave has to. So the math demands an exponential, the physics demands something periodic, and ordinary exponential functions can never be both.
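One way to see the difference is to compute the ratio of derivative to function at several times: if the derivative is proportional to the function, that ratio is the same everywhere. The sketch below does this with a finite-difference derivative. The sample times and the growth rate of 2 are arbitrary choices.

```python
import math


def derivative(f, t: float, h: float = 1e-6) -> float:
    """Central finite-difference estimate of f'(t)."""
    return (f(t + h) - f(t - h)) / (2.0 * h)


def derivative_over_function(f, t: float) -> float:
    """The ratio f'(t) / f(t); constant only if f' is proportional to f."""
    return derivative(f, t) / f(t)


def growing_exponential(t: float) -> float:
    return math.exp(2.0 * t)


sample_times = [0.2, 0.5, 0.9]
sine_ratios = [derivative_over_function(math.sin, t) for t in sample_times]
exp_ratios = [derivative_over_function(growing_exponential, t)
              for t in sample_times]

sine_spread = max(sine_ratios) - min(sine_ratios)
exp_spread = max(exp_ratios) - min(exp_ratios)
```

For the exponential the ratio stays pinned at the growth rate, while for the sine it changes from moment to moment, so no single constant can be read off as "the energy".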
Where does i come from?
The resolution comes from Jean Robert Argand, an amateur mathematician (a bookkeeper by trade) who reframed the exponential's blow up or decay in purely physical terms. The derivative of an exponential is proportional to itself, so think of that derivative as a velocity. If the velocity points in the same direction as the position, position and velocity reinforce each other over and over, a runaway feedback loop, and the function blows up. Multiply the exponent by a negative number and the velocity flips to point back toward the origin: position shrinks, velocity shrinks with it, and you get decay instead. Both are just the two faces of the same real exponential.
Argand's question was what happens if the velocity is always perpendicular to the position instead of parallel or antiparallel to it. Then the position never grows and never shrinks. It just gets pushed sideways, forever, at constant speed. That is uniform circular motion, a genuinely periodic process built from an exponential relationship. To make the velocity rotate 90 degrees relative to the position, you need to multiply the exponent by some number that itself represents a 90 degree rotation. Multiplying by 1 is a 0 degree rotation. Multiplying by minus 1 is a 180 degree rotation (two 90 degree turns stacked). So whatever number represents one 90 degree turn, squaring it, applying it twice, has to equal minus 1. The number that squares to minus 1 is, by definition, the square root of minus 1.
The imaginary unit i is exactly that 90 degree rotation operator. In the exponent of a time dependent function, i doesn't scale the function up or down, it spins it, at constant magnitude, forever, in a plane that has nothing to do with ordinary physical space. That plane, with a real axis and an imaginary axis, is the Argand diagram (also called the complex plane), named for exactly this geometric insight.
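Python's built-in complex numbers make the rotation picture easy to poke at. In this sketch the starting position and the sample times are arbitrary, and `1j` is Python's spelling of i.

```python
import cmath
import math

I = 1j  # Python's notation for the imaginary unit

# Multiplying by i turns a "position" into a perpendicular "velocity".
position = 2 + 0j
velocity = I * position
dot_product = (position.real * velocity.real
               + position.imag * velocity.imag)   # zero means perpendicular

# Two quarter turns make a half turn: i * i equals -1.
two_quarter_turns = I * I


def spinning_exponential(t: float, omega: float = 1.0) -> complex:
    """e^(i*omega*t): an exponential that rotates instead of growing."""
    return cmath.exp(I * omega * t)


sample_times = [0.0, 0.7, 3.0, 10.0]
magnitudes = [abs(spinning_exponential(t)) for t in sample_times]
after_full_turn = spinning_exponential(2.0 * math.pi)
```

The magnitude stays at 1 for every sample time, and after a time of 2π the value comes back to where it started, which is the periodicity a real exponential can never have.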
Figure 3. Argand's insight, treating the derivative of an exponential as a velocity. Aligned velocity feeds runaway growth, opposed velocity feeds decay, and only a velocity kept perpendicular to position produces a constant-magnitude, genuinely periodic rotation. The number that rotates by exactly 90 degrees, and whose square is therefore minus 1, is i.
The energy operator and the assembled Schrödinger equation
Swapping the real sine for a complex exponential, psi = A e^(iωt) sin(κx), changes nothing about the spatial part of the derivation (the momentum and kinetic energy operators still work exactly as before, since the time part is held constant during an x derivative). But now the time derivative behaves. Differentiate e^(iωt) with respect to time and it returns iω times itself, meaning psi reappears cleanly and the energy operator can finally be isolated: with ω equal to E over ħ, E times psi equals minus iħ times the time derivative of psi.
There's a small wrinkle worth noting: this derivation naturally produces minus iħ, while the textbook Schrödinger equation is written with plus iħ in the corresponding spot. That's not a physics disagreement, just a sign convention: e^(iωt) spins counterclockwise in the complex plane, while physicists conventionally write e^(−iωt), which spins clockwise, and flipping that choice flips the sign to match the standard form. Either convention describes the same physics.
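Spelled out for both rotation directions, using E = ħω, the single time derivative works as follows (upper signs are the textbook choice, lower signs the opposite rotation):

```latex
\psi(x,t) = A\,e^{\mp i\omega t}\sin(\kappa x)
\quad\Longrightarrow\quad
\frac{\partial\psi}{\partial t} = \mp i\omega\,\psi
\quad\Longrightarrow\quad
\pm i\hbar\,\frac{\partial\psi}{\partial t} = \hbar\omega\,\psi = E\,\psi
```

Either way psi reappears on both sides, which is all the operator construction needs. Only the sign in front of iħ depends on which way the wave is taken to spin.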
With both operators built and proven, by Fourier, to generalize beyond the single sine wave case, assembling the full equation is just substitution into energy conservation: total energy operator equals kinetic energy operator plus potential energy, all acting on psi. That is the Schrödinger equation, in full:
iħ ∂ψ/∂t = −ħ²/2m ∂²ψ/∂x² + V(x)ψ
And the original question has an answer. The i is there because psi has to be complex. Psi has to be complex because it needs to behave like an exponential (so the energy operator can be built by taking a single derivative and getting psi back) while also being periodic (so it behaves like an actual wave). A real number can give you one property or the other, blow up and decay, or oscillate, but never both at once. Only a complex exponential, spinning in the Argand plane under the imaginary unit, can be both at the same time.
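As a hedged numerical check, the sketch below plugs a standing wave with a rotating time part, sin(κx) e^(−iωt), into the free-particle version of the equation (V = 0) and compares the two sides. The units (ħ = m = 1), the value of κ, and the sample point are arbitrary, and the derivatives are finite-difference estimates rather than exact.

```python
import cmath
import math

HBAR = 1.0    # natural units, assumed for readability
MASS = 1.0
KAPPA = 2.0
# Energy conservation with V = 0: hbar*omega = (hbar*kappa)^2 / (2m).
OMEGA = HBAR * KAPPA**2 / (2.0 * MASS)


def psi(x: float, t: float) -> complex:
    """Standing wave in space with a rotating complex time part."""
    return math.sin(KAPPA * x) * cmath.exp(-1j * OMEGA * t)


def energy_side(x: float, t: float, h: float = 1e-5) -> complex:
    """Left side: i*hbar times the time derivative of psi."""
    return 1j * HBAR * (psi(x, t + h) - psi(x, t - h)) / (2.0 * h)


def kinetic_side(x: float, t: float, h: float = 1e-4) -> complex:
    """Right side: -(hbar^2 / 2m) times the second x derivative of psi."""
    curvature = (psi(x + h, t) - 2.0 * psi(x, t) + psi(x - h, t)) / h**2
    return -HBAR**2 / (2.0 * MASS) * curvature


x0, t0 = 0.3, 0.8
residual = abs(energy_side(x0, t0) - kinetic_side(x0, t0))
energy_times_psi = HBAR * OMEGA * psi(x0, t0)
```

Both sides come out equal to E times psi up to finite-difference error, so the residual is tiny compared with the size of either side.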
Why it looks like the heat equation, except for the i
Strip the i back out of the Schrödinger equation and ask what's left. The time part becomes a real exponential again, which no longer rotates, it can only grow or decay. Decay, specifically, matches a very familiar physical picture: plot temperature along a rod, and as time passes hot spots cool and cool spots warm, the whole profile relaxing toward uniformity, always slowing down as differences shrink. That relaxation curve is exactly a decaying exponential in time, and the equation describing it, with the right constants and no potential energy term, is the classical heat equation, the same one Fourier built his entire mathematical career analyzing.
So the Schrödinger equation and the heat equation share almost the same skeleton: a first time derivative on one side, a second space derivative on the other. The single difference between "a matter wave in quantum mechanics" and "how heat diffuses through a rod" is the i sitting in front of the time derivative. Drop it and you get decay toward equilibrium. Keep it and you get rotation that never stops. That one letter is the entire difference between a wave and a diffusion process.
| | Heat equation (no i) | Schrödinger equation (with i) |
| --- | --- | --- |
| Time part of the solution | real exponential, e^(−kt) | complex exponential, e^(−iωt) |
| What it does over time | decays toward equilibrium | rotates forever in the complex plane |
| What the variable represents | temperature along a rod | probability amplitude of an electron |
| Squared magnitude over time | shrinks toward zero | stays constant, total probability = 100% |
Figure 4. Drop the i from the Schrödinger equation and it becomes structurally identical to the heat equation, decaying instead of oscillating. The i is the only thing that turns diffusion into a wave, and, as Born's rule shows next, the only thing that keeps total probability from leaking away.
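The contrast in the time parts can be seen directly by evaluating both at a few times. A minimal sketch, with the decay rate k, the frequency ω, and the sample times chosen arbitrarily:

```python
import cmath
import math


def heat_time_factor(t: float, k: float = 1.0) -> float:
    """Time part without the i: a real, decaying exponential."""
    return math.exp(-k * t)


def schrodinger_time_factor(t: float, omega: float = 1.0) -> complex:
    """Time part with the i: a complex exponential that only rotates."""
    return cmath.exp(-1j * omega * t)


times = [0.0, 1.0, 2.0, 5.0]
heat_squared = [heat_time_factor(t) ** 2 for t in times]
quantum_squared = [abs(schrodinger_time_factor(t)) ** 2 for t in times]
```

The first list shrinks steadily toward zero, while the second stays at 1 no matter how much time passes.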
Schrödinger's own discomfort, and Max Born's footnote
Even after building the equation, Schrödinger wasn't happy with it. He spent months trying to find a mathematical trick that would let him eliminate the imaginary number, and when he couldn't, he wrote to physicist Hendrik Lorentz that "what is unpleasant here, and indeed directly to be objected to, is the use of complex numbers. Psi is surely fundamentally a real function." The equation worked. He still didn't like why.
The physical justification, rather than the merely mathematical one, arrived as a footnote in a paper by Max Born in 1926. Born went back to the double slit experiment: fire a single photon's worth of light through two slits and it lands at one specific spot on the screen, unpredictably, but not randomly in every sense. Dark fringes, where no photons have landed, are places a photon is very unlikely to land next. The bright central fringe, where many photons have already landed, is a place a photon is very likely to land. The probability of landing anywhere is proportional to the brightness, or intensity, of the interference pattern there, which is itself proportional to the wave's amplitude squared. That reinterprets Einstein's photon idea probabilistically.
Born's move was to apply the exact same logic to matter waves: replace the photon source with an electron source, and the probability of finding the electron at a given point becomes the squared magnitude of the wave function's value there. This is the Born rule, and it finally answers a question the video parked at the beginning, what does it physically mean for a matter wave to be a mixture of many energy and momentum values? It means the wave function encodes a probability distribution: measuring the electron's energy or momentum returns one particular value, but which one you get is governed by probabilities hidden inside psi.
The Born rule is also exactly why the wave function has to rotate rather than oscillate. Imagine, hypothetically, that some mathematical trick let the matter wave oscillate back and forth on the plain real axis, the way an ordinary standing wave does. Its square, the probability density, would then fluctuate too, rising and falling, and at some instants could dip all the way to zero. But the total probability of finding the electron somewhere in the entire universe has to always equal 100 percent. It can shift from place to place, never vanish and reappear. A real oscillating wave can't guarantee that; its total probability would wobble along with the oscillation. A wave rotating in the complex plane has no such problem: at every instant its magnitude, and therefore its squared magnitude, stays exactly the same, because rotation doesn't change a vector's length, only its direction. The local probability density can still shift around in space over time, but the total, integrated across all space, stays locked at 100 percent. The imaginary number isn't a mathematical convenience bolted onto the equation. It's the only mechanism that keeps quantum probability conserved.
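That argument can be illustrated with a toy case. The sketch below assumes an electron confined to a box of length 1 with the spatial shape √2 sin(πx), an example chosen here and not taken from the video, and compares a wave whose time part rotates with one that swings on the real axis.

```python
import cmath
import math

OMEGA = 1.0       # arbitrary angular frequency
N_STEPS = 1000    # slices used for the numerical integral over the box


def spatial_part(x: float) -> float:
    """Assumed shape: lowest standing wave in a box of length 1, normalised."""
    return math.sqrt(2.0) * math.sin(math.pi * x)


def rotating_wave(x: float, t: float) -> complex:
    """Time part spins in the complex plane."""
    return spatial_part(x) * cmath.exp(-1j * OMEGA * t)


def swinging_wave(x: float, t: float) -> float:
    """Hypothetical real wave whose amplitude swings up and down."""
    return spatial_part(x) * math.cos(OMEGA * t)


def total_probability(wave, t: float) -> float:
    """Midpoint-rule integral of |wave|^2 across the whole box."""
    dx = 1.0 / N_STEPS
    return sum(abs(wave((j + 0.5) * dx, t)) ** 2 for j in range(N_STEPS)) * dx


times = [0.0, math.pi / 4, math.pi / 2]
rotating_totals = [total_probability(rotating_wave, t) for t in times]
swinging_totals = [total_probability(swinging_wave, t) for t in times]
```

The rotating wave integrates to 1 at every sampled time, while the swinging wave's total drops to one half and then to zero, which is exactly the disappearing probability the paragraph above rules out.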
The payoff: hydrogen orbitals and the modern world
Schrödinger solved his own equation for the hydrogen atom using a 1/r potential (the electric attraction between the electron and the proton) and got back far more than the spectrum Balmer had guessed at decades earlier. The solution predicted the relative brightness of each spectral line, correctly showed how the lines split apart under electric or magnetic fields, and, most importantly, produced the full three dimensional probability clouds of where an electron is likely to be found around the nucleus: the orbitals that define chemistry itself. The equation became, in the video's words, the F = ma of quantum mechanics, and Schrödinger shared the 1933 Nobel Prize in Physics with Paul Dirac for it.
Everything downstream of that one equation, with its one stubborn imaginary unit, runs the modern world. Focusing electron waves the way you'd focus light gives you electron microscopes powerful enough to resolve individual atoms. Calculating the precise energy levels inside a heavy atom like cesium gives you atomic clocks accurate enough to define the second itself. And when many atoms are packed close together, their individual energy levels merge into continuous bands, and controlling the gaps between those bands is exactly how transistors, and every microchip built from them, work. The real, physical, tangible modern world, brought to you by the imaginary number.
Key takeaways
The Schrödinger equation can be rebuilt from energy conservation (E = KE + PE) plus two operators, an energy operator and a kinetic energy operator, that "pull" the full mixture of possible values out of a matter wave rather than substituting a single number.
Building the momentum squared operator from a plain sine wave works because a second derivative of sine returns sine (with a minus sign). Building the energy operator from the same sine wave fails, because a single derivative turns sine into cosine, a different function entirely.
Fourier's discovery that any wave shape can be built from sums of sine and cosine waves, combined with the fact that derivatives are linear, is what proves an operator derived for one simple sine wave works for any general wave.
The only function whose derivative returns itself is the exponential, but real exponentials aren't periodic, they only grow or decay. Jean Robert Argand's geometric insight: making the "velocity" perpendicular to the "position" produces circular rotation, genuinely periodic, and the number that represents a 90 degree rotation, whose square is minus 1, is the imaginary unit i.
The wave function psi must be complex because it needs to be both exponential (so the energy operator works with a single derivative) and periodic (so it behaves like a wave). A real number can be one or the other but never both.
Dropping the i from the Schrödinger equation turns it into the classical heat equation: a real, decaying exponential in time instead of a rotating complex one. The i is the single difference between diffusion and a wave.
The Born rule (probability equals the squared magnitude of the wave function) explains why the physics demands rotation rather than oscillation: only a wave with constant magnitude, spinning rather than swinging up and down, can keep the total probability of finding a particle fixed at 100 percent for all time.
The equation's payoff is enormous: it correctly predicts the hydrogen spectrum, spectral line splitting, and the three dimensional electron orbitals that underlie chemistry, and its consequences (electron microscopes, atomic clocks, transistor band gaps) run the modern world.
Chapters
These are the creator's own chapter marks.
0:00 Where does the Schrödinger Equation come from?
0:51 The hydrogen spectrum & crisis in classical physics
3:36 Birth of quantum mechanics
8:08 Where is the wave equation?
12:14 Building our first quantum wave
17:07 Building the kinetic energy operator
20:57 The heat obsessed mathematical genius
24:40 The problem with real numbers!
29:30 Where does i come from?
34:00 The energy operator & the Schrödinger equation
36:47 Why does it resemble the heat equation (except the i)
38:41 A Nobel prize winning footnote (by Max Born)
43:12 The real world brought to you by the imaginary number (The end)
Notable quotes
It comes from nowhere. Out of man's imagination, struggles with the details of experiment, and all kinds of mysteries.
Richard Feynman, quoted at 0:12
Why is an equation with such real impact built using an imaginary number? What's i doing there?
Mahesh Shenoy, 0:33
As soon as I saw Balmer's formula, the whole thing was immediately clear to me.
Niels Bohr, quoted around 5:40
I went through Schrödinger's original derivation and I couldn't understand a thing. In fact, he himself called it unintelligible later on.
Mahesh Shenoy, 8:35
Kinetic energy is encoded in the curvature.
Mahesh Shenoy, 19:10
It will be a great idea to pause the video over here and see if you can try this yourself. Please do that, moment of truth.
Mahesh Shenoy, 20:50
The i has entered the room. Oh, that's, that's where it comes from.
Mahesh Shenoy, 32:05
What is unpleasant here, and indeed directly to be objected to, is the use of complex numbers. Psi is surely fundamentally a real function.
Erwin Schrödinger, letter to Hendrik Lorentz, quoted around 37:50
It's the i that turns an exponential into a spinning exponential, periodic.
Mahesh Shenoy, 37:30
It's the square root of minus 1, the imaginary number, that makes the physics real.
Mahesh Shenoy, 42:50
Resources mentioned
Richard Feynman, quoted on where the Schrödinger equation "really" comes from, and whose own derivation the video consulted and set aside as too mathematical.
Erwin Schrödinger, who derived the Schrödinger equation in 1925 and later wrote to Hendrik Lorentz objecting to his own equation's use of complex numbers.
Johann Balmer, whose trial and error formula predicted the hydrogen spectral lines decades before anyone could explain why it worked.
James Clerk Maxwell, whose electromagnetic wave theory of light explained wireless communication but could not explain the hydrogen spectrum or atomic stability.
Niels Bohr, whose postulate of special, non radiating electron orbits explained the hydrogen spectrum and atomic stability without explaining why those orbits were special.
Louis de Broglie, who proposed that matter itself behaves as a wave, explaining Bohr's orbits as electron standing waves.
Jean Robert Argand, whose geometric insight into complex exponentials as rotations explains where the imaginary unit i physically comes from; the Argand diagram (complex plane) is named after him.
Hendrik Lorentz, the recipient of Schrödinger's letter objecting to complex numbers in his own equation.
Max Born, whose 1926 footnote proposing the Born rule (probability density equals the squared magnitude of the wave function) gave the wave function, and the imaginary number inside it, a physical meaning.
Paul Dirac, who shared the 1933 Nobel Prize in Physics with Schrödinger.
The double slit experiment, used twice in the video, first to introduce wave particle duality and again to motivate Born's probabilistic interpretation.
Brilliant, the video's sponsor, an interactive platform for math, science, programming, data, and AI.
This single equation helped unlock the
modern world. It gave us computer chips,
electron microscopes, atomic clocks,
GPS, high-speed internet, the list goes
on. But where did this equation come
from? Well, according to Richard
Feynman, it comes from
>> nowhere. Out of man's imagination,
struggles with the details of experiment
and all kinds of mysteries.
>> That man was Erwin Schrödinger. He
derived it in 1925 while vacationing in
the Swiss Alps with his mistress.
So I have two questions. First, how can
we intuitively build this equation
ourselves from scratch? But secondly,
why is an equation with such real impact
built using an imaginary number? What's
i doing there? If you're ready, let's
find out.
It all starts in 1853 when the physicist
Anders Ångström finds that hot hydrogen
gas gives out very specific colors of
light. We soon figured out it's not just
hydrogen. Every hot element gives out
its own signature spectrum. And suddenly
this became a powerful tool to discover
brand new elements just by looking at
their light. This is how we discovered
helium for the very first time in the
sun. So in Ångström's honor, the unit for
measuring these wavelengths was named
the angstrom.
But nobody knew why these elements gave
out those specific colors of light. The
first clue actually came a few decades
later from a Swiss math teacher named
Johann Balmer. Balmer was obsessed with
numbers and patterns. And he finds a
surprisingly simple formula for the
hydrogen spectrum just by trial and
error. And the formula predicts there
should be more lines. And we actually
confirm it. But nobody had any clue what
this formula meant. Why did it work?
To answer this, we needed a good theory
of light. By now, we knew light is a
wave confirmed by the interference
pattern. And Maxwell showed that light
is basically a ripple in the
electromagnetic field produced by
accelerating charges. So wiggling
charges produce light. Wiggling it
slowly gives us low frequency light and
wiggling it faster gets us high
frequency light. And these EM waves or
light could wiggle other charges which
meant wireless communication.
It was a breakthrough in communication
technology, but it couldn't explain the
hydrogen spectrum.
See, according to Maxwell, a hot glowing
gas has billions of randomly jiggling
charges from very low frequencies to
very high frequencies, which meant they
should be giving out every color of
light,
but they didn't. So something was
horribly wrong with Maxwell's theory.
But it got worse. Pretty soon we
discovered that atoms had a positive
nuclear core. So we thought the negative
electrons must be orbiting this nucleus,
making the atoms stable. But if they did
that, they would be constantly
accelerating and accelerating charges
radiate EM waves. So they would lose
energy and collapse. So now we couldn't
even explain why atoms were stable.
Physics seemed to be in crisis. But a
few years later, everything changed.
If you shine UV light on zinc, for
example, electrons come out. That makes
sense. Electrons receive energy from the
EM waves. But if you shine a brighter
visible light, no electrons came out.
That didn't make any sense. I mean,
you're pumping in more energy. So, we
would expect electrons to come out with
more energy. So, why didn't they? Why did
the color matter and not the brightness?
To explain this, Albert Einstein
proposed something radical. What if
light doesn't deliver energy
continuously but in discrete chunks
called photons? And what if the energy
of each photon depends only on its
frequency? Then a bright visible light
will deliver a lot of photons per
second, but each one's too weak to budge
an electron. It's like throwing ping
pong balls at a bowling ball. But on the
other hand, a dim UV light will deliver
fewer photons per second, but each one
is strong enough to knock it off like a
single cannonball.
So this explained the photoelectric
mystery beautifully and Einstein won the
Nobel Prize for it. And this constant is,
as you probably know, Planck's constant.
But a few years later, a Danish
physicist named Niels Bohr pushed this
idea even further.
Bohr wondered: if light is absorbed in
chunks, it must also be emitted in
chunks, right? Which meant electrons had
to transition from a higher energy level
straight to a lower energy one to
release these energy chunks. So he
postulated that electrons orbit the
nucleus only at some special energy
levels, nowhere in between. And in these
special orbits, he said they don't
radiate energy. It seemed like he was
just making stuff up. But look what
happens now. When you heat up a gas,
electrons jump to a higher allowed
level. But when they fall back down,
they release the energy difference as a
photon of a specific frequency. This
explained the hydrogen spectrum. And the
photon's energy is basically the energy
lost by the electron. From this he
derived an expression for the frequency
and got Balmer's formula. So his
postulate explained the hydrogen
spectrum, Balmer's formula, and the
atomic stability all at once. This was
huge.
Later in an interview, he says, "As soon
as I saw Balmer's formula, the whole
thing was immediately clear to me." He
was probably showing off, but there were
many unanswered questions here. I mean,
why were electrons restricted to those
specific orbits? And why don't they
radiate while they are there? Bohr
basically said, "Trust me." But a French
PhD student named Louis de Broglie came up
with an answer. He pushed Bohr's idea to
the limit. Light moves as a wave but
interacts with stuff like a particle,
right? Dual nature. So he wondered, what
if matter behaves the same way? What if
matter, too, interacts like a particle but
moves as a wave? Then electrons inside
an atom could form standing waves. And
just like a guitar string can only
vibrate with say three loops or four
loops or five but nothing in between,
electron waves can only vibrate with
three or four or five loops but nothing
in between.
This explained why electrons can only
exist at those specific distances from
the nucleus, as Bohr said. And it also
explains why they don't radiate while
sitting at those energy levels because
they're not orbiting particles. They're
not accelerating. They're stationary
waves now. They only radiate a photon
when they transition from a higher to a
lower level.
This was so radical. Matter behaving as
waves. And that's not all. For his PhD
thesis, he derived an expression for its
wavelength using special relativity. He
had found the wavelength of matter. It seemed
so bizarre. His examiners only approved
it after confirming it with Einstein
himself. Finally, in the following year, a
professor at the University of Zurich
gave a seminar on this very topic, and
after the talk a colleague in the
audience asked: if matter is a wave, where
is the wave equation? The professor was
Erwin Schrödinger, and he took that
question seriously. So he went on a
vacation to the Swiss Alps over
Christmas and came back with the wave
equation. But how did he do it?
I went through Schrödinger's original
derivation and I couldn't understand a
thing. In fact, he himself called it
unintelligible later on. So, I looked up
Feynman's derivation and that was also
quite mathy. But after a lot of
searching, I found a modern version
which seemed pretty intuitive. It starts
by asking what rule should matter waves
obey in general. The most fundamental
one we know is energy conservation,
right? Total energy equals kinetic
energy plus potential energy. Okay, that
makes sense. Um, kinetic energy is half
mv squared. So if you multiply the top
and the bottom by m, we can write it as
momentum squared over 2m. And I'm like,
perfect. This is all starting to make
sense. And in the final step, it says to
make it quantum, replace energy with the
energy operator. Wait, what? Kinetic
energy with the kinetic energy operator.
What? What's going on? And they act on
the wave function psi, giving us
Schrödinger's equation. What just
happened?
After calming down a bit, I realized I
just had to answer two questions. First
of all, why do we need these operators?
And second of all, how do I build them
myself intuitively? So, let's start with
the first question. We know E equals hf.
And we also know p = h / lambda, de Broglie. Why
not just substitute them directly into
this equation, right? That was what I
was thinking. Well, the answer is
actually right in front of us. You see,
that would only work for an infinitely
long pure sine wave because look, only
then it would have one single definite
wavelength and frequency.
But a general wave doesn't have a single
wavelength or frequency. It can have a
whole range all mixed together.
Now for ordinary waves, that's not a big
deal. But for matter waves, think about
what it means. This means the particle
doesn't even have a single value of
momentum or energy. At this point, we
don't even know how to interpret that. I
mean, what does it even mean for an
electron to not have a definite energy?
Well, let's keep that question aside.
We'll come back to it later. But for
now, what's important is that because matter
waves don't have a single value of
energy or momentum, we can't substitute
directly. Instead, we need something
that extracts the entire mixture.
That's what these operators do. The
energy operator over here pulls the
entire range of energy hiding in the
wave. Similarly, the kinetic energy
operator pulls the entire range of
kinetic energy. That's why in quantum
mechanics we always talk about
operators, right? Because matter waves
don't have single values for energy or
momentum or whatever. They have a whole
range. Okay, so first question answered.
On to the second one and the most
important one. How do I build these
operators myself? Here's a powerful
problem solving principle. When you're
trying to solve a hard problem, first
see if you can create a simpler version
of that problem and try to solve that.
So in our case, the hard problem is to
build operators that work in general.
The simpler version would be to try and
build these operators for the simplest
wave possible,
a sine wave.
So maybe if I can build an operator for
a sine wave, I can then use that
intuition to generalize it. First of
all, this could be a traveling wave or
you know it could be a standing wave
like, you know, de Broglie described. I like
standing waves. The math feels slightly
more intuitive. So let's go with that.
Let's draw a couple of axes. You have
the x-axis and you have psi, which is, you know,
Schrödinger's own notation. So the first
question is: what is the equation for psi?
Well, let's pause the animation. We can
write psi = sin x. I mean, we could also
write psi = cos x. It's just a matter of
where we put the origin. But let's stick
to sin x. That's the simplest equation,
right? But guess what? Sine's height swings
between +1 and -1. I want our height
to be slightly more general. Let's call
it a. So how do we do that? Well, we
scale this by a. That's the amplitude.
Now I can control the amplitude.
Perfect. But sine always resets after 2
pi. Which means right now our wavelength
is locked exactly at 2 pi. I don't want
that. I want to be able to have any
wavelength. I want lambda. So what do we
do? Well, we multiply x by 2 pi over
lambda. I mean, think about it. Now when
x equals lambda the lambda cancels out
and the argument becomes 2 pi and the
wave resets. So now lambda has become
our wavelength. But this isn't a wave
yet. It's a frozen picture.
A standing wave means the amplitude
itself changes over time periodically.
So a itself needs to be some periodic
function of time. Again we'll choose the
simplest function, sine, but just like before
sine has a period of 2 pi. So right now our,
you know, time period is locked at 2 pi.
So we want the period to be let's say
capital t. So we use the same trick as
before. We multiply this by 2 pi over
capital t. And we are done. We just need
a bit of cleaning up over here. 1 / t is
frequency. And physicists hate writing 2
pi over and over again. So we will
define 2 pi f as a new variable omega
and similarly 2 pi over lambda as a new
variable kappa. We're only doing this so
that we don't have to write 2 pi
over and over again. Okay. If we
substitute it, boom, we have built the
equation for our standing wave. But this
is still a generic wave. There's nothing
quantum about it.
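Written out, the construction so far looks roughly like this. It is a sketch of the standing wave as described in the narration; the symbol A_0 for the fixed peak height is a label introduced here, and the exact on-screen form may differ.

```latex
\psi(x,t) = A_0 \,\sin\!\left(\frac{2\pi}{\lambda}\,x\right)\sin\!\left(\frac{2\pi}{T}\,t\right)
          = A_0 \,\sin(\kappa x)\,\sin(\omega t),
\qquad
\omega = 2\pi f = \frac{2\pi}{T},
\quad
\kappa = \frac{2\pi}{\lambda}.
```

Setting x equal to lambda makes the spatial argument 2 pi, and setting t equal to T does the same for the time argument, which is the "reset" described above.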
To make it quantum, we bring in the Einstein
and de Broglie equations. Now, since this is
a pure sine wave of one specific
frequency and wavelength, we can
directly substitute over here. But
before we do that, we have to write this
in terms of omega and kappa as well. So,
let's quickly do that. To do that, we'll
just multiply and divide by 2 pi
everywhere. And now look at what we get.
h over 2 pi, we'll call that h-bar.
We call this the reduced Planck constant.
And 2 pi f is omega. And 2 pi over
lambda. Well, we have kappa. And this
cleaning up actually makes things much
more beautiful. I mean, think about it.
What exactly is omega over here? Omega
basically tells us number of waves per
second, right? So we can call it the
temporal frequency. What about kappa?
Well, kappa tells us the number of waves
per meter. Look at this: per meter. So
that is the spatial frequency
which means for matter waves the
temporal frequency encodes the total
energy. It lives in the time domain and
the spatial frequency encodes the
momentum. It lives in the space domain.
When I saw this, a light bulb went off.
I mean, think about it. You probably
know that in special relativity, space
and time are not two separate things.
They're just shadows of the underlying
four-dimensional spacetime, right? So, we
could guess that energy and momentum
aren't separate things. Maybe they are
just two components,
shadows of something much deeper, a
four-dimensional object. And that's
exactly what we have in special
relativity. It's called four-momentum.
So, in relativity, we don't think of
energy and momentum separately. We just
think of them as two components of the
underlying
object called four-momentum. I know this
is a tangent, but oh my
god, that connection is beautiful.
Anyways we substitute now for omega and
kappa and boom we have built our very
first matter wave equation. But remember,
our actual goal here was to build
energy and kinetic energy operators. So
let's start with kinetic energy. Since
it has momentum squared in it, our
question would be how do we pull
momentum out of this equation? We can
differentiate it with respect to x,
right? Well, actually we need to do
partial derivative. Then this part
becomes a constant. And now look, p over h-bar
pops out in front and sine turns to cos.
But we don't want just momentum. We want
momentum squared because our goal is to
get kinetic energy. So what do we do?
Well, we differentiate again. On the
left hand side, we get a second
derivative. And over here, p over h-bar pops
out one more time. And cos becomes
negative sine. And if you look
carefully, look, this is our original
function psi. So we have got psi back.
And if we rearrange,
we have built our very first operator,
the momentum squared operator.
Look, when you do this operation on psi,
you get momentum squared times psi. So
momentum squared has been extracted. So
this is the momentum squared operator.
For the last step, I need kinetic
energy, right? So kinetic energy is just
momentum squared over 2m. So let me just
divide by 2 m.
And this is the kinetic energy. So I
have found how to extract kinetic
energy. So this must be the kinetic
energy operator.
Wow. We've built it all by ourselves.
And if we compare it to the actual
Schrödinger's equation, oh my god, it's
exactly the same thing.
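A quick numerical check of that claim can be sketched as follows. This is only an illustration under stated assumptions: the values of HBAR, MASS and P are arbitrary made-up numbers, and the second derivative is approximated with a finite difference rather than taken exactly.

```python
import math

HBAR = 1.0   # reduced Planck constant in arbitrary units (assumption)
MASS = 2.0   # hypothetical particle mass
P = 3.0      # hypothetical momentum of the pure sine wave

def psi(x):
    """Spatial part of the matter wave: sin(p x / hbar)."""
    return math.sin(P * x / HBAR)

def kinetic_operator(f, x, h=1e-4):
    """Apply -(hbar^2 / 2m) d^2/dx^2 using a central finite difference."""
    second_derivative = (f(x + h) - 2.0 * f(x) + f(x - h)) / (h * h)
    return -(HBAR ** 2) / (2.0 * MASS) * second_derivative

x0 = 0.4
# Operator applied to psi, divided by psi: should be close to p^2 / 2m.
extracted = kinetic_operator(psi, x0) / psi(x0)
expected = P ** 2 / (2.0 * MASS)
print(extracted, expected)
```

The two printed numbers should agree closely, which is the sense in which the operator "extracts" the kinetic energy from a pure sine wave.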
Whoa.
All right, let's calm down. But what
does it actually say?
Well, second derivative is basically
curvature, right? So this says if your
wave has more curvature, then it will
have more kinetic energy. And that makes
sense because we already saw, you
know, from de Broglie's equation, that shorter
wavelength means more momentum means
more kinetic energy. Shorter wavelength
means more curvature. But our
understanding has upgraded because the
idea of wavelength only works for pure
sine waves right but the idea of
curvature can be defined
at every single point. So it is a
general way of thinking about it.
Kinetic energy is encoded in the
curvature.
Oh man that is awesome. But this brings
up an annoying problem.
See, we derived this operator for a
special case, a pure sine wave, right?
But it turns out the operator works in
general for any shape. Now, that is
awesome. I'm not complaining because
this means we can do the same thing for
building the energy operator and then we
can finish the Schrödinger equation and
fulfill our destiny. But I won't be able
to sleep at night because although we
have some intuition for why this should
work in general, I can't actually
convince myself mathematically why
something we derived for one specific
special case perfectly works in general.
I was stuck here for a while until I met
a man who was so obsessed with heat that
apparently he kept his room at blazing
temperatures, and he sat inside it
wrapped in a blanket during hot summers.
His name: Jean-Baptiste Joseph Fourier.
Fourier believed heat had magical healing
powers. But historians think it's
probably because, you know, he developed
extreme cold sensitivity during his time
in the Egyptian desert. But whatever it
is, what's important for us is that he
was a mathematical genius. And so
obviously he wanted to
mathematically model how heat flows. And
he did that. And while doing so he
developed an incredible principle. He
found that any wave or any shape at all
can be written as a sum of sine and cosine
waves. Here's what I mean. Take this
square wave. According to Fourier, you
can write this as just sums of sines and
cosines. If you just use one sine wave,
that doesn't look like much. But if you
add a second one with a slightly
different height and width, look, it
gets closer. Add a third and a fourth
and a fifth and you keep going and look
look the shape converges towards a
perfect square wave. Now of course
technically we need infinitely many but
look I mean even with a few we can get
remarkably close, right?
Here's another example. Again, by adding
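The convergence can be checked with a few lines of code. This is a minimal sketch, assuming the standard textbook series for a unit square wave (odd sine harmonics with heights 4 / (pi k)); those specific coefficients are not quoted in the video.

```python
import math

def square_wave_partial_sum(x, n_terms):
    """Sum of the first n_terms odd sine harmonics of a unit square wave."""
    total = 0.0
    for j in range(n_terms):
        k = 2 * j + 1                      # odd harmonics only: 1, 3, 5, ...
        total += math.sin(k * x) / k
    return 4.0 / math.pi * total

# Middle of the "high" half of the cycle, where the square wave equals 1.
x0 = math.pi / 2
errors = [abs(square_wave_partial_sum(x0, n) - 1.0) for n in (1, 5, 50)]
print(errors)
```

The printed errors shrink as more sine waves are added, which matches the picture of the sum closing in on the square wave.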
multiple sine waves of just the right
width and height we can construct this
shape too. We can construct any shape,
and Fourier showed that this works in
general. The idea is called the Fourier
series, or more generally, the
Fourier transform. This idea was so
radical that even the top mathematicians
back then just couldn't believe this
would be true. But today we have a very
elegant proof for it. Let me know if you
want me to make a video on that. But for
now we'll just accept that. So according
to Fourier, our matter wave can be written
as the sum of lots and lots of pure sine
waves. So now what will happen if we
apply that operator we built to this
general wave? Well derivatives are
linear meaning derivative of a plus b
equals derivative of a plus derivative
of b. That means this operator gets
applied to every single component.
For each component, it spits out the
component's kinetic energy multiplied by
the component's wave function. And then
it adds up all of those and spits this
entire sum back. Which means look the
operator when working on a general wave
spits out the full mixture of the
kinetic energy as a weighted sum. And so
Fourier helps us understand why, if an
operator works for a sine wave, it should
work for any wave in general. Oh man, Fourier,
you beauty. Imagine if Fourier knew that
his obsession with heat would one day
unlock the framework for, you know,
modeling the quantum world. Oh my god, how
would he be feeling? Oh my god. Anyways,
this means all we have to do is repeat
the same process for extracting energy
and we are done. It will be a great idea
to pause the video over here and see if
you can try this yourself. Please do
that. Moment of truth.
All right. So to extract energy look I
have to differentiate with respect to
time. This time this term is a constant.
So nothing happens to it. So when I
differentiate sine, well, again E over h-bar
pops out and sine turns into cos. We
have our energy. So let's just
rearrange.
And
wait
there's a problem. I'm not getting my
function back because sine turned into
cos. So I can't write this as psi. Wait,
why did it work last time? Oh, last time
it worked because we took a double
derivative, right? Because I wanted
momentum squared. And so when I took a
second derivative, well, sine turned to
cos and cos turned back to sine. And I
was able to cleanly write this as psi and
build my operator. Oh, but this time I
just need a single derivative because I
already found energy. I don't want
energy squared. So, I can't take a
second derivative. But I can't write
this as psi because the sine has turned
into cos. Oh man, we were so close.
What? What do we do?
Let's just think mathematically. For
this to work, to get our psi back, this
function of time has to be such that its
first derivative must be itself, up to a constant factor. But a sine
or a cosine or any periodic function for
that matter, none of them give us that.
There's only one function in this entire
multiverse with that property. It's the
exponential function. I mean, think
about it for a second. Imagine that this
function was exponential in time. Now,
if we differentiate it with respect to
time, again, e over h bar pops out. And
this time because the derivative of
exponential is itself. Look, we get our
function psi back. We can now rearrange
and build our operator just like before.
And the math would work out. But the
problem is this is not a wave. For a
wave, the amplitude needs to be some
kind of a periodic function of time. But
exponentials are not periodic functions.
Exponentials will just keep blowing up
forever,
right?
Or if you put a negative sign, for
example then yeah then the exponential
would just keep decaying forever.
Whatever it is this is not a wave.
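The mismatch can be seen numerically. The sketch below is an illustration only: RATE is a made-up number standing in for E over h-bar, and the derivative is a finite-difference estimate. It compares the ratio "derivative divided by function" at a few times for a sine and for a real exponential.

```python
import math

def derivative(f, t, h=1e-6):
    """Central finite-difference estimate of df/dt."""
    return (f(t + h) - f(t - h)) / (2.0 * h)

RATE = 0.5  # hypothetical stand-in for E / hbar

def sine_wave(t):
    return math.sin(RATE * t)

def real_exponential(t):
    return math.exp(RATE * t)

times = [0.3, 1.0, 2.0]
sine_ratios = [derivative(sine_wave, t) / sine_wave(t) for t in times]
exp_ratios = [derivative(real_exponential, t) / real_exponential(t) for t in times]

print(sine_ratios)  # changes with t: the derivative is not a multiple of the function
print(exp_ratios)   # stays near RATE: the derivative is RATE times the function
```

For the exponential the ratio is the same at every time, so a constant can be pulled out cleanly; for the sine it is not, which is why no operator could be built from it. The exponential, on the other hand, never repeats.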
So we have a problem. I mean for the
physics to work we need this function to
be periodic. But when you use a periodic
function the math breaks down because I
cannot extract energy and build an
operator. For that to happen, I need
that function to be exponential. But
then the math works, but the physics
doesn't work because it's not a wave. So
to make both of them work, we somehow
need an exponential function of time
that's also periodic.
But that's impossible. By definition,
exponential functions are exponential.
So, how in the world can we build a
function that's both exponential and
periodic? Well, it's kind of like how we
can make our learning both exponential
and periodic using Brilliant, the
sponsor of this video. I am pretty
anxious about AI's impact on my son's
future learning because it can write for
you. It can do your homework and it can
even think for you. But it can also be a
great personal tutor and that's where
Brilliant comes in. So, just check this
out. Koji, I don't know exactly what to
do over here.
>> Let's pick one corner of the shape and
see where it lands.
>> What happened to that point?
>> That's Koji, Brilliant's personalized AI
tutor. And look, it doesn't just give
you the right answer directly. It
instead helps you think through it. It's
a zero, so I can't really tell.
>> Zero is tricky. Let's look at this point
instead.
>> Isn't this exciting? This is exactly
what great tutors do. They work with
you. They model learning itself. And now
Brilliant is leveraging AI to make them
highly affordable.
>> The rule on the board.
>> That's right.
>> Brilliant has several courses in math,
programming, science, and tech, and at
various grade levels. And what's more
awesome is you can get started with
Brilliant Tutor for free. Just go to
brilliant.org/FloatHeadPhysics
or scan this QR code. The link
is also in the description and if you do
decide to make a purchase, you'll get
20% off your annual premium
subscription. So, happy learning. Back
to the video. Okay, so coming back, how
do we create a function that's both
exponential and periodic? The answer
actually comes from a bookkeeper and an
accountant from Paris. His name: Jean
Robert Argand. Argand asks Mahesh, why do
exponential functions in general blow up
or decay? Well, I say that's because
that's literally what exponential is.
There's no deeper explanation, right?
Well, he says, think about it
physically. The derivative of an
exponential is proportional to itself,
right? So, think of this as velocity.
The velocity in this particular case is
proportional to the position and it's in
the same direction as the position. So a
little time later the position grows but
the velocity grows as well. So now the
position grows even faster. The velocity
grows even faster and that's how we get
a runaway effect. The whole thing blows up.
If you had a negative exponent then you
would have the same effect except now
the velocity would be in the opposite
direction because of the negative sign.
So now, a little time later, the position
reduces; because of that, the velocity
also reduces. So the position reduces
slower, the velocity reduces even slower,
and you get a decay. But Argand asks: what
if somehow we could make the velocity
perpendicular
to the position.
This time a little later, the position
neither grows nor shrinks, but it just
shifts sideways, which means the
velocity magnitude stays the same as
well. And it continues to stay
perpendicular, meaning it keeps pushing
it sidewards forever.
That is a uniform circular motion.
That will make our exponential periodic.
I mean, sure, it's not oscillating, but
that's okay. It's periodic. That's
what I want. But Argand, how do we
control the direction of this
velocity? And how do we make it
perpendicular? Well, Argand says, think
about this. When we multiply the
exponent by one, the velocity points in
the same direction as position. That's a
0° rotation of the velocity. When we
multiply by -1, it points in the
opposite direction. So this rotates
velocity by 180°
which means we need to multiply the
exponent with some number that rotates
the velocity by 90°. How do we find
that? Well, he says it's simple. Whatever
number this is if you multiply it by
itself one more time well it would
rotate again by 90° meaning 180°
but we already know that is negative
one. So in other words, whatever this
number is, it multiplied by itself
should give me -1 or the square of that
number should give me -1 or that number
which rotates by 90° is the square root
of -1.
The i has entered the room. Oh,
that's, that's where it... ah, that's where
it comes from.
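Python's built-in complex numbers make the rotation argument easy to try out. This is a small sketch of the idea, not anything shown in the video; the sample angles are arbitrary.

```python
import cmath
import math

position = complex(1.0, 0.0)          # a point on the real axis

# Multiplying by i turns the point by 90 degrees; doing it twice gives -1.
quarter_turn = 1j * position
half_turn = 1j * quarter_turn
print(quarter_turn, half_turn)

# exp(i * theta) keeps magnitude 1 and returns to its start after 2*pi.
samples = [cmath.exp(1j * theta) for theta in (0.0, 1.0, 2.5, 2.0 * math.pi)]
magnitudes = [abs(z) for z in samples]
print(magnitudes)
```

Two quarter turns land on minus 1, which is the "square is minus 1" property, and the complex exponential stays on the unit circle instead of growing or decaying.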
I'm sorry. Let's calm down. But what is
the i doing over here? The i in the
exponent produces a 90° rotation, making
our exponential function spin, making it
periodic. Now, of course, this rotation
is not in real space. This is a complex
plane. So, this is the real axis and
this is the imaginary axis. Now,
because of Argand's beautiful geometric
insight into these complex exponentials,
we call this complex plane the Argand
diagram. So just to recap, we started
with matter waves oscillating up and down
and we found that the math didn't work.
We couldn't cleanly extract the energy.
Now to make the math work, we needed
exponentials, but they aren't periodic.
Argand gave us a way out: complex
exponentials. Now, the math works
because I can cleanly extract energy
because it's an exponential function.
But the physics works too because the
function has become periodic.
This means now we have to accept that
our matter wave is rotating in some kind
of an abstract complex plane. But that's
okay. At least we found a way out. So,
let's run with it now. So what we're
going to do is let's use this complex
exponential as our basic wave. Okay. Now
the first question is does that change
anything that we did so far? Well
remember we are taking a partial
derivative with respect to x and when we
do that the time part is treated as a
constant. So whether we have a sine here
or, you know, an exponential over here, it
doesn't matter. So everything stays the
same and so nothing changes over here.
So that's awesome. I don't have to do
any more work over here. But now it's
time to build the energy operator. So if
you differentiate with respect to time,
we get energy extracted and we get our
function back. So I get my psi cleanly. So
it's time to rearrange and isolate
energy over here. And if I multiply
numerator and denominator by i, we have
done it. We have cleanly extracted
energy. This is the energy operator.
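The algebra of that step can be written out as below. The sign of the exponent is an assumption inferred from the negative sign that appears in the result; the transcript does not state the exponent explicitly. The second line shows what the same steps give with the opposite sign in the exponent.

```latex
\begin{aligned}
\psi \propto e^{+iEt/\hbar}
  &\;\Rightarrow\; \frac{\partial \psi}{\partial t} = \frac{iE}{\hbar}\,\psi
  \;\Rightarrow\; E\,\psi = \frac{\hbar}{i}\,\frac{\partial \psi}{\partial t}
  = -\,i\hbar\,\frac{\partial \psi}{\partial t}, \\
\psi \propto e^{-iEt/\hbar}
  &\;\Rightarrow\; \frac{\partial \psi}{\partial t} = -\frac{iE}{\hbar}\,\psi
  \;\Rightarrow\; E\,\psi = i\hbar\,\frac{\partial \psi}{\partial t}.
\end{aligned}
```

Multiplying the numerator and denominator of h-bar over i by i is what turns it into minus i h-bar, since i times i is minus 1.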
But wait, when I saw this, I was like,
wait a second, why is it a negative
sign? I mean, I know that the original
Schrödinger's equation, as we will see,
doesn't have a negative sign. Well, it
turns out it's a small convention thing.
We chose our matter wave spinning in the
clockwise direction as seen from here.
Well, it turns out physicists like to
choose anticlockwise or counterclockwise
as the convention. So physicists love to
choose the negative sign over here for
their exponentials over here. So this
would also be negative and so you'll
have a negative popping out that cancels
with this one. And so our energy
operator wouldn't have a negative sign
over here. That's just a matter of
convention. Anyway, thanks to Fourier, I
know that this operator works in general,
which means we can now build our
Schrödinger equation. So we start with
energy conservation. Total energy equals
kinetic plus potential. And then for
total energy we substitute the energy
operator. For the kinetic energy we
substitute the kinetic energy operator
and they're operating
on the wave function ψ, and we have our
Schrödinger equation in its full glory.
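
Written out, the assembly looks like the block below. It is a sketch in the standard textbook convention (counterclockwise time dependence, one spatial dimension), and the symbols m for the mass and V(x) for the potential are assumptions, since the on-screen formulas are not in the transcript. The kinetic term comes from p²/2m with the momentum operator applied twice, so its sign does not depend on the rotation convention.

```latex
% Energy conservation: total = kinetic + potential
E = \frac{p^{2}}{2m} + V(x)

% Operator substitutions (standard convention)
E \;\to\; i\hbar\,\frac{\partial}{\partial t},
\qquad
\frac{p^{2}}{2m} \;\to\; -\frac{\hbar^{2}}{2m}\,\frac{\partial^{2}}{\partial x^{2}}

% Everything acts on the wave function: the time-dependent Schrodinger equation
i\hbar\,\frac{\partial \psi}{\partial t}
  = -\frac{\hbar^{2}}{2m}\,\frac{\partial^{2} \psi}{\partial x^{2}} + V(x)\,\psi
```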
So can we now answer our original
question intuitively? Why is there an i
over here? Well because our wave
function itself is complex. Why should
it be complex? because it needs both an
exponential and a periodic function. Why
does it need to be an exponential
function? Well, because that's the only
way I can extract energy cleanly. I can
build an energy operator because I need
the first derivative of my function to
be itself, right?
Doesn't that make sense? Okay. Now,
here's another question. What would
happen if I were to consider the same
equation without the i? Now, the
solutions would be real exponentials in
time. They will no longer rotate which
means they would just blow up or decay
forever. Would it represent anything
physical?
Yes.
Say the vertical axis was temperature
and the horizontal axis represented a
rod which means we now have a
temperature distribution.
And as time passes the hot regions cool
down and the cool regions heat up. So
the graph shrinks. But as the temperatures
get closer to each other, the change gets
slower over time, meaning we get exactly
a decay.
In other words, this now represents the
heat equation. How temperature changes
over time. But of course, it can have
different constants and we would expect
it to not have any potential energy
term. So we can now intuitively see why
the Schrödinger equation looks so
similar to the heat equation except for
the i. It's the i that turns an
exponential into a spinning exponential,
making it periodic.
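
The contrast between the two equations can be seen in the time factor of a single spatial mode. The following is a minimal sketch, assuming no potential, a wavenumber `K` picked arbitrarily, and units chosen so that all physical constants equal 1 (heat: du/dt = d²u/dx², Schrödinger: i dψ/dt = -d²ψ/dx²). None of these names or values come from the video.

```python
import cmath
import math

K = 1.5  # assumed wavenumber of one spatial mode, arbitrary units


def heat_factor(t):
    """Time factor of a heat-equation mode: a real exponential that only decays."""
    return math.exp(-K**2 * t)


def schrodinger_factor(t):
    """Time factor of a free Schrodinger mode: same exponent times i, so it rotates."""
    return cmath.exp(-1j * K**2 * t)


times = [0.0, 0.5, 1.0, 2.0, 4.0]
heat_sizes = [heat_factor(t) for t in times]
spin_sizes = [abs(schrodinger_factor(t)) for t in times]

# The rotating factor comes back to its starting value after one period.
period = 2 * math.pi / K**2
return_error = abs(schrodinger_factor(period) - schrodinger_factor(0.0))

for t, h, s in zip(times, heat_sizes, spin_sizes):
    print(f"t={t:3.1f}  heat size={h:.6f}  spinning size={s:.6f}")
print("return error after one period:", return_error)
```

The heat column shrinks toward zero while the spinning column stays at 1, which is the only difference the extra i makes in this simplified setting.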
At this point I'm really satisfied with
the Schrödinger equation, where it comes
from, and why there is an i. And I think
I have a really good intuition behind
it. But a part of me still feels that
it's all still mathematical. It kind of
feels like, you know, there might be
some kind of a mathematical trick that
we just haven't thought of yet, using
which we can get rid of that i. And
that's exactly what Schrödinger was trying
to do for months after publishing his
equation. And when he couldn't, he wrote
a letter to Hendrik Lorentz saying, "What is
unpleasant here, and indeed directly to
be objected to, is the use of complex
numbers. ψ is surely fundamentally a
real function." So mathematically it
makes sense why the i must be there.
But what is the physical reason behind
it? Well, the breakthrough came as a
footnote actually in a paper published
by Max Born.
Born says, let's go back to the double
slit. We are now coming full circle to
where we started. Awesome. This time if
we send only one photon's worth of
energy through those slits, it has to
land somewhere at one specific spot on
the screen. Right? Mahesh, can you tell me
where? And I say, I have no idea. And
Born says, neither do I, but I can tell
where it's more likely to land. The
chance of landing in these dark regions
is almost zero because no photons have
landed there so far. And at the brightest spot in
the center, the chance is very high
because lots of photons have
already landed there before. So the
dimmer the regions, the lower the
chances. So even though I can't say
exactly where one photon lands, I can
talk about the probability and that
probability is proportional to the
brightness or intensity which is
proportional to the amplitude squared.
This is how Einstein's idea can be
interpreted probabilistically. And Born
says, well, I just thought that we can try
and apply the same thing to matter waves.
If you replace the light with a source of
electrons, they too travel as waves, and
then we should get the exact same
result. Which means the probability of
finding the electron at any spot is now
given by the square of the wave function's
magnitude at that point. This is today called the Born
rule. This is what he wrote down as a
casual footnote, and it gave us a way to
interpret the matter wave. It's a
probability wave. And remember at the
beginning we asked the question of how
matter waves carry mixtures of energy and
momentum values. How do you make sense
of that? Well, that mixture is really
a probability distribution. So, for
example, when you measure its energy or
momentum, there's some probability of
getting each particular value. And that
probability distribution is hidden in
that wave function.
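
To make the "mixture as a probability distribution" idea concrete, here is a minimal sketch using the standard textbook form of the Born rule for measurement outcomes, which the video only states qualitatively. The two energy labels and their complex coefficients are hypothetical values invented for illustration.

```python
# Hypothetical wave function written as a mixture of two energy states.
# The labels and complex coefficients below are made up for this example.
coefficients = {"E1": 0.6, "E2": 0.8j}


def born_probabilities(coeffs):
    """Probability of each outcome: squared magnitude of its coefficient, normalized."""
    total = sum(abs(c) ** 2 for c in coeffs.values())
    return {label: abs(c) ** 2 / total for label, c in coeffs.items()}


probabilities = born_probabilities(coefficients)
for label, p in probabilities.items():
    print(f"chance of measuring {label}: {p:.2f}")
```

Note that the second coefficient is purely imaginary, yet it still contributes an ordinary real probability, because only the squared magnitude enters.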
So coming back to our original question,
how does the Born rule give a
physical meaning to i? Suppose our
matter wave just oscillated back and
forth on the real axis. Let's say we
found some mathematical trick to make
this work. Okay? Then the ψ squared
over here would also fluctuate, right?
And since ψ squared is the probability
of finding the particle somewhere the
total area under this curve should
represent the total probability of
finding this electron anywhere in the
universe. But in this particular case
that total probability fluctuates. I
mean at some moment it can even go to
zero for example. That makes no sense
right? I mean the total probability of
finding the electron somewhere in the
universe has to be 100%. Right? But how
can it be zero? It doesn't make any
sense. The probability, for example, can
shift from place to place, but the total
should never change. So you can clearly
see a simple up and down standing wave
can't give us a probability wave, a
matter wave. There's no way to make it
work.
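
The argument above can be checked with a small numerical sketch. It assumes a particle confined to a box of length 1, a purely real standing wave sqrt(2)·sin(πx)·cos(ωt), and an arbitrary frequency `OMEGA`. These choices are illustrative and are not taken from the video.

```python
import math

OMEGA = 1.0  # assumed oscillation frequency, arbitrary units


def real_standing_wave(x, t):
    """Purely real wave in a box of length 1: it only moves up and down."""
    return math.sqrt(2) * math.sin(math.pi * x) * math.cos(OMEGA * t)


def total_probability(wave, t, n=1000):
    """Area under |wave|^2 across the box, using the midpoint rule."""
    dx = 1.0 / n
    return sum(abs(wave((j + 0.5) * dx, t)) ** 2 for j in range(n)) * dx


at_start = total_probability(real_standing_wave, 0.0)
at_quarter_cycle = total_probability(real_standing_wave, math.pi / (2 * OMEGA))

print("total probability at t = 0:", at_start)
print("total probability a quarter cycle later:", at_quarter_cycle)
```

The total starts at 1 and drops to essentially zero a quarter cycle later, which is the nonsensical result described above: for an instant, the particle would be nowhere.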
But a wave that is rotating in the complex
plane has no such problem.
Look here at every point in the complex
plane the height of this ψ stays exactly
the same because it's just spinning. So
what happens to ψ squared? The ψ squared
stays fixed, which means the total
probability now stays fixed. Now of
course for more complicated matter waves
the probability will change with time
but the total probability will still
stay the same. However, if you were to
model matter waves using just up and
down oscillatory motion on just the real axis,
there's no way for that to happen. Even
the simplest wave, you can't model it.
So, it's the i that makes the wave
function spin and conserves the total
probability. It's the square root of -1,
the imaginary number that makes the
physics real.
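
Here is a minimal sketch of the "more complicated matter wave" case. It assumes an equal mix of the two lowest states of a particle in a box of length 1, with energies set to n² in units where ħ = 1. Each state carries its own spinning factor. The setup and all names are illustrative assumptions, not taken from the video.

```python
import cmath
import math


def box_state(n, x):
    """n-th standing-wave shape for a box of length 1."""
    return math.sqrt(2) * math.sin(n * math.pi * x)


def psi(x, t):
    """Equal mix of states 1 and 2, each spinning at its own rate (E_n = n^2 assumed)."""
    first = box_state(1, x) * cmath.exp(-1j * 1 * t)
    second = box_state(2, x) * cmath.exp(-1j * 4 * t)
    return (first + second) / math.sqrt(2)


def density(x, t):
    """Born rule: probability density is the squared magnitude of psi."""
    return abs(psi(x, t)) ** 2


def total_probability(t, n=2000):
    """Area under the density across the box, using the midpoint rule."""
    dx = 1.0 / n
    return sum(density((j + 0.5) * dx, t) for j in range(n)) * dx


totals = [total_probability(t) for t in (0.0, 0.4, 1.0, 2.5)]
left_now = density(0.25, 0.0)
left_later = density(0.25, math.pi / 3)

print("total probability at several times:", totals)
print("density at x = 0.25, early vs later:", left_now, left_later)
```

The density at x = 0.25 changes a lot between the two times, so probability really does slosh from place to place, yet every entry in `totals` stays at 1.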
Schrödinger solved his equation for the
hydrogen atom using the 1/r potential
function. And not only did he get the
hydrogen spectrum, he also predicted the
relative brightness of each line and
showed how adding electric or magnetic
fields splits these lines. But more
importantly, the equation predicted the
three-dimensional probability
distributions of the electrons in the
hydrogen atom, the orbitals.
In short, the equation was a complete,
radical breakthrough. It became the F =
ma of quantum mechanics. As a result,
Schrödinger shared the 1933 physics Nobel
Prize with Paul Dirac. And today,
scientists and engineers have done
extraordinary things with it. We've
found ways to focus electron waves and
build electron microscopes powerful
enough to see individual atoms. We've
calculated approximate energy levels
inside heavier atoms like cesium for
example to build atomic clocks. And when
multiple atoms come close together, we
found that those energy levels turn into
bands. And by controlling the band gaps,
we've built ultra tiny transistor
switches that make up every single
microchip.
Our real modern world brought to you by
the imaginary number.