We must first set the scene with the conventional wisdom of 1900. Maxwell had incorporated some of Faraday’s insights on electromagnetism in his differential equations. These equations lead to a simple quantitative theory of light that explains its velocity and polarization. There was one puzzle: the photodetectors that people knew how to build went pop-pop intermittently. They did not report the steady stream of energy predicted by Maxwell’s waves. This was ‘explained’ by the discrete nature of electrons in the photodetectors, but that did not explain how an individual electron got the energy to pop out of its well. Leave that puzzle aside for now.
Imagine a source of light in a small volume. It is a diffuse gas with electricity passing thru. We use sodium vapor with its nearly pure yellow color. Occasionally an electron will fall into an atom and give off energy in the form of two light pulses leaving in opposite directions. The pulses leave in random directions and with random polarizations. If we place two photodetectors on opposite sides of the source and connect them to headphones, we hear these pops, which we explain as events at the light source. Since we already expect the source to be discrete as the electrons fall into the atoms, the individual pops do not surprise us; they are merely little pulses of Maxwell’s waves. We have postulated that the light pulses leave in opposite directions to conserve momentum, as Maxwell’s light demands. We notice a substantial correlation between our two detectors, which we take as confirmation of this prediction. The correlation is imperfect, for our alignment is imperfect and our detectors are imperfect. We adjust and compensate for these imperfections in our subsequent observations.
Now we buy two polarizing light filters. Each is a disk that transmits about 1/2 the light that tries to pass thru (1/2 the energy flux). If we put these in tandem adjusted one way, the pair still transmits 1/2 the light rather than the 1/4 that we might expect. Turning one filter by 90° blocks all of the light. This effect was long known and Maxwell’s equations explained it well. They also predicted that at 45° the pair transmits only 1/4 of the light, and that also agrees with observation. No photons need apply to explain anything so far.
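The filter arithmetic above can be checked with a few lines of Python. This is only a sketch under idealized assumptions: the function name `pair_transmission` is ours, each filter is taken to pass exactly 1/2 of unpolarized light, and the second filter is taken to pass the fraction cos² of the angle between the two filters (the rule usually called Malus’s law, which follows from projecting the wave amplitude onto the filter axis).

```python
import math

def pair_transmission(angle_deg: float) -> float:
    """Fraction of unpolarized light energy passed by two ideal filters in tandem.

    angle_deg is the angle between the two filter axes (hypothetical helper,
    idealized filters assumed).
    """
    first = 0.5  # an ideal filter passes half the energy of unpolarized light
    # The amplitude is projected by cos(angle), so the energy flux by cos^2(angle).
    second = math.cos(math.radians(angle_deg)) ** 2
    return first * second

for angle in (0, 45, 90):
    print(f"{angle:>2} degrees: {pair_transmission(angle):.3f}")
```

Under these assumptions the three cases come out as 1/2, 1/4 and 0 of the original light, matching the three observations described above.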
The only sensible notion is that, as the electron falls into the atom at the source, the light destined to reach our detectors takes on some random polarization that is matched in the two directions. We have already noticed that one filter always attenuates the light by 1/2, in that the pops happen about 1/2 as often. This was itself a small puzzle: why should photodetectors pop off only half as often just because the light waves had only 1/√2 the normal amplitude? We explain that as a weakness in our photodetector; the probability of an electron being knocked out and causing a pop is evidently proportional to the light energy flux.
Now we put our two filters each in front of one of our two detectors. We align them in parallel, as when they let thru 1/2 of the light in tandem. Now consider what happens when the light pulses are emitted with polarization parallel to our filters; they should both get thru with full amplitude. When they are emitted at 90° to our filters, neither should get thru. When they are at 45°, their amplitudes should both be attenuated by 1/√2, which means that the energy flux should be 1/2 and the probability of a pop should be 1/2 that of the full signal. All of this is observed as predicted, given that we know only the distribution of the polarizations as the pulses leave the source. For that population of pulses that leaves the source near 45° polarization, the two photodetectors should each see 1/2 the energy flux and independently decide whether to report it. The observed results clearly imply, however, that either both detectors report the pulse or neither detector reports it. The observed correlation is perfect modulo the previously measured imperfections in our apparatus. The perfect correlation is the paradox.
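To see how far the wave picture falls short of a perfect correlation, here is a small Monte Carlo sketch in Python. It rests on assumptions of ours that are not part of the account above: the shared polarization angle is drawn uniformly, the apparatus is ideal, and each detector pops independently with probability cos² of the angle between the pulse polarization and the filter axis. The name `simulate` and its parameters are hypothetical.

```python
import math
import random

def simulate(n_pulses: int, seed: int = 0):
    """Wave-picture model: parallel filters, independent detectors.

    Returns (rate at detector A, rate at detector B,
             fraction of A's pops that B also reports).
    """
    rng = random.Random(seed)
    a_count = b_count = both = 0
    for _ in range(n_pulses):
        # Assumption: both pulses share one polarization, uniform over half a turn.
        theta = rng.uniform(0.0, math.pi)
        # Pop probability proportional to transmitted energy flux.
        p = math.cos(theta) ** 2
        a = rng.random() < p  # detector A decides on its own
        b = rng.random() < p  # detector B decides on its own
        a_count += a
        b_count += b
        both += a and b
    return a_count / n_pulses, b_count / n_pulses, both / a_count

rate_a, rate_b, b_given_a = simulate(200_000)
print(f"A pops on {rate_a:.3f} of pulses, B on {rate_b:.3f}")
print(f"B also pops on {b_given_a:.3f} of A's pops")
```

In this model each detector pops on about 1/2 of the pulses, as observed, but B accompanies only about 3/4 of A's pops (the average of cos⁴ divided by the average of cos²). The observation described above is that B accompanies essentially all of them.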
Photons in place of light pulses do not solve this central problem. Aside from the perfect correlation there are these worries: