Particle Physics Planet

December 18, 2017

Emily Lakdawalla - The Planetary Society Blog

#AGU17: JunoCam science
JunoCam may be an outreach instrument, but its superb photos of storms on Jupiter are providing plenty of data for scientists to talk about.

December 18, 2017 08:41 PM

Peter Coles - In the Dark

Frank Kelly’s Christmas Countdown

It’s time for the famous Cardiff Physics & Astronomy Christmas Lunch, so I’ll get into the Christmas spirit and just leave this here:

by telescoper at December 18, 2017 12:13 PM

December 17, 2017

Christian P. Robert - xi'an's og

controlled SMC

At the end of [last] August, Jeremy Heng, Adrian Bishop, George Deligiannidis and Arnaud Doucet arXived a paper on controlled sequential Monte Carlo (SMC), which we read today at the BiPS reading group in Paris-Saclay, where I took these notes. The setting is classical SMC, but with a twist in that the proposals at each time iteration are modified by an importance function. (I was quite surprised to discover that this was completely new, as I was under the false impression that it had been tried ages ago!) This importance sampling setting can be interpreted as a change of measures on both the hidden Markov chain and on its observed version, so that the overall normalising constant remains the same. And, being in an importance sampling setting, there exists an optimal choice for the importance functions, which results in a zero-variance estimate of the normalising constant, unsurprisingly. And the optimal solution is actually the backward filter familiar to SMC users.
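As a toy illustration of that zero-variance phenomenon (this is plain importance sampling on a finite state space, not the paper's algorithm, and all names are made up): choosing the proposal proportional to the unnormalised target makes every importance weight equal to the normalising constant, so the estimate of Z has zero variance.

```python
import random

target = [0.5, 1.5, 3.0, 1.0]       # an arbitrary unnormalised target on four states
Z = sum(target)                     # true normalising constant, here 6.0

def estimate_Z(proposal, n=1000):
    """Importance-sampling estimate of Z = sum_x target(x), weights = target/proposal."""
    total = sum(proposal)
    draws = random.choices(range(len(target)), weights=proposal, k=n)
    weights = [target[x] / (proposal[x] / total) for x in draws]
    return sum(weights) / n

random.seed(0)
optimal = list(target)              # proposal proportional to the target itself
# Every weight then equals Z exactly, so the estimate has zero variance.
assert abs(estimate_Z(optimal) - Z) < 1e-9
```

With any other proposal the estimate is only unbiased, not exact; the optimal twisted proposals in the SMC setting play the same role time step by time step.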

A large part of the paper actually concentrates on figuring out an implementable version of this optimal solution. Using dynamic programming. And projection of each local generator over a simple linear space with Gaussian kernels (aka Gaussian mixtures). Which becomes feasible through the particle systems generated at earlier iterations of said dynamic programming.

The paper is massive, both in terms of theoretical results and of the range of simulations, and we could not get through it within the 90 minutes Sylvain Le Corff spent on presenting it. I can only wonder at this stage how much Rao-Blackwellisation or AMIS could improve the performance of the algorithm. (A point I find quite amazing in Proposition 1 is that the normalising constant Z of the filtering distribution does not change along observations when using the optimal importance function, which translates into the estimates being nearly constant after a few iterations.)

Filed under: Books, pictures, Statistics, University life Tagged: BiPS, dynamic programming, hidden Markov models, importance sampling, normalising constant, sequential Monte Carlo

by xi'an at December 17, 2017 11:17 PM

The n-Category Cafe

Entropy Modulo a Prime (Continued)

In the comments last time, a conversation got going about $p$-adic entropy. But here I’ll return to the original subject: entropy modulo $p$. I’ll answer the question:

Given a “probability distribution” mod $p$, that is, a tuple $\pi = (\pi_1, \ldots, \pi_n) \in (\mathbb{Z}/p\mathbb{Z})^n$ summing to $1$, what is the right definition of its entropy $H_p(\pi) \in \mathbb{Z}/p\mathbb{Z}$?

How will we know when we’ve got the right definition? As I explained last time, the acid test is whether it satisfies the chain rule

$$H_p(\gamma \circ (\pi^1, \ldots, \pi^n)) = H_p(\gamma) + \sum_{i = 1}^n \gamma_i H_p(\pi^i).$$

This is supposed to hold for all $\gamma = (\gamma_1, \ldots, \gamma_n) \in \Pi_n$ and $\pi^i = (\pi^i_1, \ldots, \pi^i_{k_i}) \in \Pi_{k_i}$, where $\Pi_n$ is the hyperplane

$$\Pi_n = \{ (\pi_1, \ldots, \pi_n) \in (\mathbb{Z}/p\mathbb{Z})^n : \pi_1 + \cdots + \pi_n = 1 \},$$

whose elements we’re calling “probability distributions” mod $p$. And if God is smiling on us, $H_p$ will be essentially the only quantity that satisfies the chain rule. Then we’ll know we’ve got the right definition.

Black belts in functional equations will be able to use the chain rule and nothing else to work out what $H_p$ must be. But the rest of us might like an extra clue, and we have one in the definition of real Shannon entropy:

$$H_\mathbb{R}(\pi) = - \sum_{i \colon \pi_i \neq 0} \pi_i \log \pi_i.$$

Now, we saw last time that there is no logarithm mod $p$; that is, there is no nontrivial group homomorphism

$$(\mathbb{Z}/p\mathbb{Z})^\times \to \mathbb{Z}/p\mathbb{Z}.$$

But there is a next-best thing: a homomorphism

$$(\mathbb{Z}/p^2\mathbb{Z})^\times \to \mathbb{Z}/p\mathbb{Z}.$$

This is called the Fermat quotient $q_p$, and it’s given by

$$q_p(n) = \frac{n^{p - 1} - 1}{p} \in \mathbb{Z}/p\mathbb{Z}.$$

Let’s go through why this works.

The elements of $(\mathbb{Z}/p^2\mathbb{Z})^\times$ are the congruence classes mod $p^2$ of the integers not divisible by $p$. Fermat’s little theorem says that whenever $n$ is not divisible by $p$,

$$\frac{n^{p - 1} - 1}{p}$$

is an integer. This, or rather its congruence class mod $p$, is the Fermat quotient. The congruence class of $n$ mod $p^2$ determines the congruence class of $n^{p - 1} - 1$ mod $p^2$, and it therefore determines the congruence class of $(n^{p - 1} - 1)/p$ mod $p$. So, $q_p$ defines a function $(\mathbb{Z}/p^2\mathbb{Z})^\times \to \mathbb{Z}/p\mathbb{Z}$. It’s a pleasant exercise to show that it’s a homomorphism. In other words, $q_p$ has the log-like property

$$q_p(m n) = q_p(m) + q_p(n)$$

for all integers $m, n$ not divisible by $p$.
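A quick numerical sanity check (my code, not from the post; the helper name is made up) that the Fermat quotient is well defined on classes mod $p^2$ and has the log-like property:

```python
def fermat_quotient(n, p):
    """q_p(n) = (n^(p-1) - 1)/p as an element of Z/pZ, for n not divisible by p."""
    assert n % p != 0
    return ((pow(n, p - 1) - 1) // p) % p   # exact division, by Fermat's little theorem

p = 7
for m in range(1, 40):
    if m % p == 0:
        continue
    # q_p depends only on the class of m mod p^2 ...
    assert fermat_quotient(m, p) == fermat_quotient(m + p * p, p)
    for n in range(1, 40):
        if n % p == 0:
            continue
        # ... and is a homomorphism: q_p(mn) = q_p(m) + q_p(n) in Z/pZ.
        assert fermat_quotient(m * n, p) == (fermat_quotient(m, p) + fermat_quotient(n, p)) % p
```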

In fact, it’s essentially unique as such: any other homomorphism $(\mathbb{Z}/p^2\mathbb{Z})^\times \to \mathbb{Z}/p\mathbb{Z}$ is a scalar multiple of $q_p$. (This follows from the classical theorem that the group $(\mathbb{Z}/p^2\mathbb{Z})^\times$ is cyclic.) It’s just like the fact that, up to a scalar multiple, the real logarithm is the unique measurable function $\log \colon (0, \infty) \to \mathbb{R}$ such that $\log(x y) = \log x + \log y$, but here there’s nothing like measurability complicating things.

So: $q_p$ functions as a kind of logarithm. Given a mod $p$ probability distribution $\pi = (\pi_1, \ldots, \pi_n) \in \Pi_n$, we might therefore guess that the right definition of its entropy is

$$- \sum_{i \colon \pi_i \neq 0} \pi_i q_p(a_i),$$

where $a_i$ is an integer representing $\pi_i \in \mathbb{Z}/p\mathbb{Z}$.

However, this doesn’t work: it depends on the choice of representatives $a_i$.

To get the right answer, we’ll look at real entropy in a slightly different way. Define $\partial_\mathbb{R} \colon [0, 1] \to \mathbb{R}$ by

$$\partial_\mathbb{R}(x) = \begin{cases} - x \log x & \text{if } x \neq 0, \\ 0 & \text{if } x = 0. \end{cases}$$

Then $\partial_\mathbb{R}$ has the derivative-like property

$$\partial_\mathbb{R}(x y) = x \partial_\mathbb{R}(y) + \partial_\mathbb{R}(x) y.$$

A linear map with this property is called a derivation, so it’s reasonable to call $\partial_\mathbb{R}$ a nonlinear derivation.

The observation that $\partial_\mathbb{R}$ is a nonlinear derivation turns out to be quite useful. For instance, real entropy is given by

$$H_\mathbb{R}(\pi) = \sum_{i = 1}^n \partial_\mathbb{R}(\pi_i)$$

($\pi \in \Pi_n$), and verifying the chain rule for $H_\mathbb{R}$ is done most neatly using the derivation property of $\partial_\mathbb{R}$.

An equivalent formula for real entropy is

$$H_\mathbb{R}(\pi) = \sum_{i = 1}^n \partial_\mathbb{R}(\pi_i) - \partial_\mathbb{R}\biggl( \sum_{i = 1}^n \pi_i \biggr).$$

This is a triviality: $\sum \pi_i = 1$, so $\partial_\mathbb{R}\bigl( \sum \pi_i \bigr) = 0$, so this is the same as the previous formula. But it’s also quite suggestive: $H_\mathbb{R}(\pi)$ measures the extent to which the nonlinear derivation $\partial_\mathbb{R}$ fails to preserve the sum $\sum \pi_i$.

Now let’s try to imitate this in $\mathbb{Z}/p\mathbb{Z}$. Since $q_p$ plays a similar role to $\log$, it’s natural to define

$$\partial_p(n) = -n q_p(n) = \frac{n - n^p}{p}$$

for integers $n$ not divisible by $p$. But the last expression makes sense even if $n$ is divisible by $p$. So, we can define a function

$$\partial_p \colon \mathbb{Z}/p^2\mathbb{Z} \to \mathbb{Z}/p\mathbb{Z}$$

by $\partial_p(n) = (n - n^p)/p$. (This is called a $p$-derivation.) It’s easy to check that $\partial_p$ has the derivative-like property

$$\partial_p(m n) = m \partial_p(n) + \partial_p(m) n.$$
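Again a small check (the helper name is mine): the integer formula makes sense for every $n$, depends only on $n$ mod $p^2$, and satisfies the Leibniz-like rule.

```python
def d_p(n, p):
    """The p-derivation (n - n^p)/p in Z/pZ; n - n^p is divisible by p for every integer n."""
    return ((n - pow(n, p)) // p) % p   # exact integer division

p = 5
for m in range(-15, 16):
    assert d_p(m, p) == d_p(m + p * p, p)        # well defined on Z/p^2Z
    for n in range(-15, 16):
        # Leibniz-like rule: d_p(mn) = m*d_p(n) + d_p(m)*n  (mod p)
        assert d_p(m * n, p) == (m * d_p(n, p) + d_p(m, p) * n) % p
```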

And now we arrive at the long-awaited definition. The entropy mod $p$ of $\pi = (\pi_1, \ldots, \pi_n)$ is

$$H_p(\pi) = \sum_{i = 1}^n \partial_p(a_i) - \partial_p\biggl( \sum_{i = 1}^n a_i \biggr),$$

where $a_i \in \mathbb{Z}$ represents $\pi_i \in \mathbb{Z}/p\mathbb{Z}$. This is independent of the choice of representatives $a_i$. And when you work it out explicitly, it gives

$$H_p(\pi) = \frac{1}{p} \biggl( 1 - \sum_{i = 1}^n a_i^p \biggr).$$

Just as in the real case, $H_p$ satisfies the chain rule, which is most easily shown using the derivation property of $\partial_p$.
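The explicit formula is easy to put to the test. Here is a sketch (helper names mine) that computes entropy mod $p$ from integer representatives, then checks independence of the representatives and one instance of the chain rule:

```python
def H_p(reps, p):
    """Entropy mod p from integer representatives a_i of pi_i (their sum must be 1 mod p).
    1 - sum(a_i^p) is divisible by p because a^p = a mod p."""
    assert sum(reps) % p == 1
    return ((1 - sum(pow(a, p) for a in reps)) // p) % p

p = 7

# Independence of representatives: shift entries by arbitrary multiples of p.
assert H_p([2, 6], p) == H_p([2 + 7 * 11, 6 - 7 * 4], p)

# Chain rule: H_p(gamma o (pi1, pi2)) = H_p(gamma) + gamma_1*H_p(pi1) + gamma_2*H_p(pi2).
gamma, pi1, pi2 = [3, 5], [2, 6], [4, 4]          # each tuple sums to 1 mod 7
composite = [(g * x) % p for g, sub in zip(gamma, [pi1, pi2]) for x in sub]
lhs = H_p(composite, p)
rhs = (H_p(gamma, p) + gamma[0] * H_p(pi1, p) + gamma[1] * H_p(pi2, p)) % p
assert lhs == rhs
```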

Before I say any more, let’s have some examples.

  • In the real case, the uniform distribution $u_n = (1/n, \ldots, 1/n)$ has entropy $\log n$. Mod $p$, this distribution only makes sense if $p$ does not divide $n$ (otherwise $1/n$ is undefined); but assuming that, we do indeed have $H_p(u_n) = q_p(n)$, as we’d expect.

  • When we take our prime $p$ to be $2$, a probability distribution $\pi$ is just a sequence of bits like $(0, 0, 1, 0, 1, 1, 1, 0, 1)$ with an odd number of $1$s. Its entropy $H_2(\pi) \in \mathbb{Z}/2\mathbb{Z}$ turns out to be $0$ if the number of $1$s is congruent to $1$ mod $4$, and $1$ if the number of $1$s is congruent to $3$ mod $4$.

  • What about distributions on two elements? In other words, let $\alpha \in \mathbb{Z}/p\mathbb{Z}$ and put $\pi = (\alpha, 1 - \alpha)$. What is $H_p(\pi)$?

    It takes a bit of algebra to figure this out, but it’s not too hard, and the outcome is that for $p \neq 2$, $$H_p(\alpha, 1 - \alpha) = \sum_{r = 1}^{p - 1} \frac{\alpha^r}{r}.$$ This function was, in fact, the starting point of Kontsevich’s note, and it’s what he called the $1\tfrac{1}{2}$-logarithm.
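The last two bullet points are easy to check numerically with the explicit formula (a sketch; helper names are mine, and $1/r$ below means the inverse of $r$ mod $p$):

```python
def H_p(reps, p):
    """Entropy mod p from integer representatives (their sum must be 1 mod p)."""
    assert sum(reps) % p == 1
    return ((1 - sum(pow(a, p) for a in reps)) // p) % p

# p = 2: the parity rule on the number of 1s.
assert H_p([0, 0, 1, 0, 1, 1, 1, 0, 1], 2) == 0   # five 1s, and 5 = 1 mod 4
assert H_p([1, 1, 1], 2) == 1                     # three 1s, and 3 = 3 mod 4

# p != 2: H_p(alpha, 1 - alpha) equals Kontsevich's 1 1/2-logarithm.
def half_log(alpha, p):
    return sum(pow(alpha, r, p) * pow(r, -1, p) for r in range(1, p)) % p

p = 11
for alpha in range(p):
    assert H_p([alpha, (1 - alpha) % p], p) == half_log(alpha, p)
```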

We’ve now succeeded in finding a definition of entropy mod $p$ that satisfies the chain rule. That’s not quite enough, though. In principle, there could be loads of things satisfying the chain rule, in which case, what special status would ours have?

But in fact, up to the inevitable constant factor, our definition of entropy mod $p$ is the one and only definition satisfying the chain rule:

Theorem   Let $(I \colon \Pi_n \to \mathbb{Z}/p\mathbb{Z})_{n \geq 1}$ be a sequence of functions. Then $I$ satisfies the chain rule if and only if $I = c H_p$ for some $c \in \mathbb{Z}/p\mathbb{Z}$.

This is precisely analogous to the characterization theorem for real entropy, except that in the real case some analytic condition on $I$ has to be imposed (continuity in Faddeev’s theorem, and measurability in the stronger theorem of Lee). So, this is excellent justification for calling $H_p$ the entropy mod $p$.

I’ll say nothing about the proof except the following. In Faddeev’s theorem over $\mathbb{R}$, the tricky part of the proof involves the fact that the sequence $(\log n)_{n \geq 1}$ is not uniquely characterized up to a constant factor by the equation $\log(m n) = \log m + \log n$; to make that work, you have to introduce some analytic condition. Over $\mathbb{Z}/p\mathbb{Z}$, the tricky part involves the fact that the domain of the “logarithm” (Fermat quotient) is not $\mathbb{Z}/p\mathbb{Z}$, but $\mathbb{Z}/p^2\mathbb{Z}$. So, analytic difficulties are replaced by number-theoretic difficulties.

Kontsevich didn’t actually write down a definition of entropy mod $p$ in his two-and-a-half-page note. He did exactly enough to show that there must be a unique sensible such definition… and left it there! Of course he could have worked it out if he’d wanted to, and maybe he even did, but he didn’t write it up here.

Anyway, let’s return to the quotation from Kontsevich that I began my first post with:

Conclusion: If we have a random variable $\xi$ which takes finitely many values with all probabilities in $\mathbb{Q}$ then we can define not only the transcendental number $H(\xi)$ but also its “residues modulo $p$” for almost all primes $p$!

In the notation of these posts, he’s saying the following. Let

$$\pi = (\pi_1, \ldots, \pi_n)$$

be a real probability distribution in which each $\pi_i$ is rational. There are only finitely many primes that divide one or more of the denominators of $\pi_1, \ldots, \pi_n$. For primes $p$ not belonging to this exceptional set, we can interpret $\pi$ as a probability distribution in $\mathbb{Z}/p\mathbb{Z}$. We can therefore take its mod $p$ entropy, $H_p(\pi)$.
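Concretely (a sketch with a made-up example distribution): take $\pi = (1/2, 1/3, 1/6)$ and $p = 7$. None of the denominators is divisible by $7$, so each $\pi_i$ can be read as an element of $\mathbb{Z}/7\mathbb{Z}$ by inverting its denominator, and the resulting tuple still sums to $1$:

```python
from fractions import Fraction

def H_p(reps, p):
    assert sum(reps) % p == 1
    return ((1 - sum(pow(a, p) for a in reps)) // p) % p

p = 7
pi = [Fraction(1, 2), Fraction(1, 3), Fraction(1, 6)]

# Reduce each rational mod 7 by inverting its denominator.
reps = [f.numerator * pow(f.denominator, -1, p) % p for f in pi]
assert reps == [4, 5, 6]        # 1/2 = 4, 1/3 = 5, 1/6 = 6 in Z/7Z
assert sum(reps) % p == 1       # still a "probability distribution" mod 7

print(H_p(reps, p))             # the mod-7 "residue" of H_R(1/2, 1/3, 1/6)
```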

Kontsevich is playfully suggesting that we view $H_p(\pi) \in \mathbb{Z}/p\mathbb{Z}$ as the residue class mod $p$ of $H_\mathbb{R}(\pi) \in \mathbb{R}$.

There is more to this than meets the eye! Different real probability distributions can have the same real entropy, so there’s a question of consistency. Kontsevich’s suggestion only makes sense if

$$H_\mathbb{R}(\pi) = H_\mathbb{R}(\pi') \implies H_p(\pi) = H_p(\pi').$$

And this is true! I have a proof, though I’m not convinced it’s optimal. Does anyone see an easy argument for this?

Let’s write $\mathcal{H}^{(p)}$ for the set of real numbers of the form $H_\mathbb{R}(\pi)$, where $\pi$ is a real probability distribution whose probabilities $\pi_i$ can all be expressed as fractions with denominator not divisible by $p$. We’ve just seen that there’s a well-defined map

$$[.] \colon \mathcal{H}^{(p)} \to \mathbb{Z}/p\mathbb{Z}$$

defined by

$$[H_\mathbb{R}(\pi)] = H_p(\pi).$$

For $x \in \mathcal{H}^{(p)} \subseteq \mathbb{R}$, we view $[x]$ as the congruence class mod $p$ of $x$. This notion of “congruence class” even behaves something like the ordinary notion, in the sense that $[.]$ preserves addition.
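One place to see that additivity in action (my sketch, not the post's notation): real entropy is additive on product distributions, and so is entropy mod $p$, since the chain rule with all the $\pi^i$ equal gives $H_p(\pi \otimes \rho) = H_p(\pi) + H_p(\rho)$.

```python
def H_p(reps, p):
    assert sum(reps) % p == 1
    return ((1 - sum(pow(a, p) for a in reps)) // p) % p

p = 7
pi, rho = [2, 6], [3, 4, 1]                        # each sums to 1 mod 7
prod = [(x * y) % p for x in pi for y in rho]      # the product distribution pi (x) rho

# Additivity mod p, mirroring H_R(pi (x) rho) = H_R(pi) + H_R(rho) over the reals.
assert H_p(prod, p) == (H_p(pi, p) + H_p(rho, p)) % p
```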

(We can even go a bit further. Accompanying the characterization theorem for entropy mod $p$, there is a characterization theorem for information loss mod $p$, strictly analogous to the theorem that John Baez, Tobias Fritz and I proved over $\mathbb{R}$. I won’t review that stuff here, but the point is that an information loss is a difference of entropies, and this enables us to define the congruence class mod $p$ of the difference of two elements of $\mathcal{H}^{(p)}$. The same additivity holds.)

There’s just one more thing. In a way, the definition of entropy mod $p$ is unsatisfactory. In order to define it, we had to step outside the world of $\mathbb{Z}/p\mathbb{Z}$ by making arbitrary choices of representing integers, and then we had to show that the definition was independent of those choices. Can’t we do it directly?

In fact, we can. It’s a well-known miracle about finite fields $K$ that any function $K \to K$ is a polynomial. It’s a slightly less well-known miracle that any function $K^n \to K$, for any $n \geq 0$, is also a polynomial.

Of course, multiple polynomials can induce the same function. For instance, the polynomials $x^p$ and $x$ induce the same function $\mathbb{Z}/p\mathbb{Z} \to \mathbb{Z}/p\mathbb{Z}$. But it’s still possible to make a uniqueness statement. Given a function $F \colon K^n \to K$, there’s a unique polynomial $f \in K[x_1, \ldots, x_n]$ that induces $F$ and is of degree less than the order of $K$ in each variable separately.

So, there must be a polynomial representing entropy, of degree less than $p$ in each variable. And as it turns out, it’s this one:

$$H_p(\pi_1, \ldots, \pi_n) = - \sum_{\substack{0 \leq r_1, \ldots, r_n < p: \\ r_1 + \cdots + r_n = p}} \frac{\pi_1^{r_1} \cdots \pi_n^{r_n}}{r_1! \cdots r_n!}.$$

You can check that when $n = 2$, this is in fact the same polynomial $\sum_{r = 1}^{p - 1} \pi_1^r / r$ as we met before: Kontsevich’s $1\tfrac{1}{2}$-logarithm.
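The polynomial formula can also be verified against the representative formula by brute force for a small prime (a sketch; function names are mine, and each factorial $r_i!$ with $r_i < p$ is invertible mod $p$):

```python
from itertools import product
from math import factorial

def H_p_reps(reps, p):
    """Entropy mod p via representatives: (1 - sum a_i^p)/p mod p."""
    assert sum(reps) % p == 1
    return ((1 - sum(pow(a, p) for a in reps)) // p) % p

def H_p_poly(pi, p):
    """The degree-< p polynomial formula: minus the sum over 0 <= r_i < p with
    r_1 + ... + r_n = p of pi_1^r_1 ... pi_n^r_n / (r_1! ... r_n!), all mod p."""
    total = 0
    for rs in product(range(p), repeat=len(pi)):
        if sum(rs) != p:
            continue
        term = 1
        for x, r in zip(pi, rs):
            term = term * pow(x, r, p) * pow(factorial(r), -1, p) % p
        total = (total + term) % p
    return -total % p

# The two formulas agree on every "distribution" mod 5 with three entries.
p = 5
for pi in product(range(p), repeat=3):
    if sum(pi) % p == 1:
        assert H_p_poly(pi, p) == H_p_reps(list(pi), p)
```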

It’s striking that this direct formula for entropy modulo a prime looks quite unlike the formula for real entropy,

$$H_\mathbb{R}(\pi) = - \sum_{i \colon \pi_i \neq 0} \pi_i \log \pi_i.$$

It’s also striking that in the case $n = 2$, the formula for real entropy is

$$H_\mathbb{R}(\alpha, 1 - \alpha) = - \alpha \log \alpha - (1 - \alpha) \log(1 - \alpha),$$

whereas mod $p$, we get

$$H_p(\alpha, 1 - \alpha) = \sum_{r = 1}^{p - 1} \frac{\alpha^r}{r},$$

which is a truncation of the Taylor series of $-\log(1 - \alpha)$. And yet, the characterization theorems for entropy over $\mathbb{R}$ and over $\mathbb{Z}/p\mathbb{Z}$ are strictly analogous.

As I see it, there are two or three big open questions:

  • Entropy over <semantics><annotation encoding="application/x-tex">\mathbb{R}</annotation></semantics> can be understood, interpreted and applied in many ways. How can we understand, interpret or apply entropy mod <semantics>p<annotation encoding="application/x-tex">p</annotation></semantics>?

  • Entropy over <semantics><annotation encoding="application/x-tex">\mathbb{R}</annotation></semantics> and entropy mod <semantics>p<annotation encoding="application/x-tex">p</annotation></semantics> are defined in roughly analogous ways, and uniquely characterized by strictly analogous theorems. Is there a common generalization? That is, can we unify the two definitions and characterization theorems, perhaps proving a theorem about entropy over suitable fields?

by leinster at December 17, 2017 09:10 PM

Peter Coles - In the Dark

The Taste and Tincture of Another Education

This is what a University education meant to the poet and theologian Thomas Traherne (1636-1674), according to his Centuries of Meditations. In this astonishing book describing his own voyage of spiritual discovery, Traherne celebrates, among many other things, the beauty and complexity of creation as a manifestation of the power of God. Even a non-religious person like myself can find much to appreciate in his words about the wonder of the natural world and the joy of learning for learning’s sake:

Having been at the University, and received there the taste and tincture of another education, I saw that there were things in this world of which I never dreamed; glorious secrets, and glorious persons past imagination. 

There I saw that Logic, Ethics, Physics, Metaphysics, Geometry, Astronomy, Poesy, Medicine, Grammar, Music, Rhetoric all kinds of Arts, Trades, and Mechanisms that adorned the world pertained to felicity; at least there I saw those things, which afterwards I knew to pertain unto it: and was delighted in it. 

There I saw into the nature of the Sea, the Heavens, the Sun, the Moon and Stars, the Elements, Minerals, and Vegetables. All which appeared like the King’s Daughter, all glorious within; and those things which my nurses, and parents, should have talked of there were taught unto me.

by telescoper at December 17, 2017 06:37 PM

December 16, 2017

Christian P. Robert - xi'an's og

Réquiem por un campesino español [book review]

Thanks to Victor Elvira, I read this fantastic novel by Ramón Sender, a requiem for a Spanish peasant: it tells the story of Pablo, a bright and progressive peasant from Aragon who was shot by the fascists during the Spanish Civil War. The story is short and brilliant, told through the eyes of the parish priest who denounced Pablo to Franco’s Falangists, who eventually executed him. The style is brilliant as well, as the priest keeps returning to his long-term connection with Pablo, from his years as an altar boy, discovering poverty and injustice when visiting dying parishioners with the priest, to launching rural reform actions against the local landowners. And uselessly, if understandably, trying to justify his own responsibility in the death of the young man, celebrating a mass in his memory that no one from the village attends, except for the landowners themselves. A truly moving depiction of the Spanish Civil War and of the Catholic Church’s massive support for Franco.

Filed under: Books, pictures, Travel Tagged: Aragon, book reviews, Catholic Church, Franco, Ramón Sender, Spain, Spanish Civil War, Spanish history

by xi'an at December 16, 2017 11:17 PM

Christian P. Robert - xi'an's og

how to make ISBA conference safe for all?

As Kristian Lum’s courageous posting of her harrowing experience at ISBA 2010 and of her resulting decision to leave academia, if thankfully not research (as demonstrated by her recent work on the biases in policing software), is hitting the Bayesian community and beyond as a salutary tsunami, I am seeking concrete actions to change ISBA meetings towards preventing sexual harassment to the largest extent possible and helping victims both formally and informally, as Dan Simpson put it on his blog post. Having discussed the matter intensely with colleagues and friends over the past days, and joined a Task Force set up immediately, on Dec 14, by Kerrie Mengersen in her capacity as President of ISBA, I see many avenues in the medium and long terms to approach these goals. But I feel the most urgent action is to introduce contact referents (for lack of a better name outside the military or the religious…) who could be reached at all times during each conference in case of need or to report inappropriate conduct of any kind. This may prove difficult to set up, not because of a lack of volunteers but because of the difficulty of assembling a group representative enough of all attendees that everyone trusts at least one member well enough to reach out and confide in. One section of ISBA, j-ISBA, can and definitely does help in this regard, including through its involvement in the Task Force, but we need to reach further. As Kerrie put it in her statement, your input is valued.


Filed under: University life Tagged: ISBA, Kerrie Mengersen, sexual harassment, Spain, Valencia conferences

by xi'an at December 16, 2017 11:16 AM

The n-Category Cafe

The Icosahedron and E8

Here’s a draft of a little thing I’m writing for the Newsletter of the London Mathematical Society. The regular icosahedron is connected to many ‘exceptional objects’ in mathematics, and here I describe two ways of using it to construct <semantics>E 8<annotation encoding="application/x-tex"> \mathrm{E}_8</annotation></semantics>. One uses a subring of the quaternions called the ‘icosians’, while the other uses Du Val’s work on the resolution of Kleinian singularities. I leave it as a challenge to find the connection between these two constructions!

(Dedicated readers of this blog may recall that I was struggling with the second construction in July. David Speyer helped me a lot, but I got distracted by other work and the discussion fizzled. Now I’ve made more progress… but I’ve realized that the details would never fit in the Newsletter, so I’m afraid anyone interested will have to wait a bit longer.)

You can get a PDF version here:

From the icosahedron to E8.

But blogs are more fun.

From the Icosahedron to E8

In mathematics, every sufficiently beautiful object is connected to all others. Many exciting adventures, of various levels of difficulty, can be had by following these connections. Take, for example, the icosahedron — that is, the regular icosahedron, one of the five Platonic solids. Starting from this it is just a hop, skip and a jump to the <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> lattice, a wonderful pattern of points in 8 dimensions! As we explore this connection we shall see that it also ties together many other remarkable entities: the golden ratio, the quaternions, the quintic equation, a highly symmetrical 4-dimensional shape called the 600-cell, and a manifold called the Poincaré homology 3-sphere.

Indeed, the main problem with these adventures is knowing where to stop. The story we shall tell is just a snippet of a longer one involving the McKay correspondence and quiver representations. It would be easy to bring in the octonions, exceptional Lie groups, and more. But it can be enjoyed without these digressions, so let us introduce the protagonists without further ado.

The icosahedron has a long history. According to a comment in Euclid’s Elements it was discovered by Plato’s friend Theaetetus, a geometer who lived from roughly 415 to 369 BC. Since Theaetetus is believed to have classified the Platonic solids, he may have found the icosahedron as part of this project. If so, it is one of the earliest mathematical objects discovered as part of a classification theorem. In any event, it was known to Plato: in his Timaeus, he argued that water comes in atoms of this shape.

The icosahedron has 20 triangular faces, 30 edges, and 12 vertices. We can take the vertices to be the four points

<semantics>(0,±1,±Φ)<annotation encoding="application/x-tex"> (0 , \pm 1 , \pm \Phi) </annotation></semantics>

and all those obtained from these by cyclic permutations of the coordinates, where

<semantics>Φ=5+12<annotation encoding="application/x-tex"> \displaystyle{ \Phi = \frac{\sqrt{5} + 1}{2} } </annotation></semantics>

is the golden ratio. Thus, we can group the vertices into three orthogonal golden rectangles: rectangles whose proportions are <semantics>Φ<annotation encoding="application/x-tex">\Phi</annotation></semantics> to 1.

In fact, there are five ways to do this. The rotational symmetries of the icosahedron permute these five ways, and any nontrivial rotation gives a nontrivial permutation. The rotational symmetry group of the icosahedron is thus a subgroup of <semantics>S 5<annotation encoding="application/x-tex"> \mathrm{S}_5</annotation></semantics>. Moreover, this subgroup has 60 elements: any rotation is determined by what it does to a chosen face of the icosahedron, and it can map this face to any of the 20 faces in any of 3 ways. Since the only 60-element subgroup of <semantics>S 5<annotation encoding="application/x-tex"> \mathrm{S}_5</annotation></semantics> is the alternating group <semantics>A 5<annotation encoding="application/x-tex"> \mathrm{A}_5</annotation></semantics>, the rotational symmetry group of the icosahedron must be <semantics>A 5<annotation encoding="application/x-tex"> \mathrm{A}_5</annotation></semantics>.
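Before moving on, this description of the vertices can be checked numerically. A quick Python sketch (not in the article; the names are mine) builds the 12 vertices from <semantics>(0,±1,±Φ)<annotation encoding="application/x-tex">(0, \pm 1, \pm \Phi)</annotation></semantics> and cyclic permutations, then confirms that all lie at a common distance from the origin and that exactly 30 pairs sit at the minimal inter-vertex distance, matching the edge count:

```python
import itertools
import math

PHI = (math.sqrt(5) + 1) / 2  # the golden ratio

# The 12 vertices: (0, +-1, +-PHI) and all cyclic permutations of coordinates.
vertices = []
for s1 in (1, -1):
    for s2 in (1, -1):
        x, y, z = 0.0, s1 * 1.0, s2 * PHI
        vertices += [(x, y, z), (z, x, y), (y, z, x)]

# All pairwise distances; the edges are the pairs at minimal distance.
pair_dists = sorted(math.dist(u, v)
                    for u, v in itertools.combinations(vertices, 2))
edge_length = pair_dists[0]
num_edges = sum(1 for d in pair_dists if abs(d - edge_length) < 1e-9)
```

With this normalization the edge length comes out to 2, e.g. between <semantics>(0,1,Φ)<annotation encoding="application/x-tex">(0, 1, \Phi)</annotation></semantics> and <semantics>(0,1,Φ)<annotation encoding="application/x-tex">(0, -1, \Phi)</annotation></semantics>.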

The <semantics>E 8<annotation encoding="application/x-tex"> \mathrm{E}_8</annotation></semantics> lattice is harder to visualize than the icosahedron, but still easy to characterize. Take a bunch of equal-sized spheres in 8 dimensions. Get as many of these spheres to touch a single sphere as you possibly can. Then, get as many to touch those spheres as you possibly can, and so on. Unlike in 3 dimensions, where there is “wiggle room”, you have no choice about how to proceed, except for an overall rotation and translation. The balls will inevitably be centered at points of the <semantics>E 8<annotation encoding="application/x-tex"> \mathrm{E}_8</annotation></semantics> lattice!

We can also characterize the <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> lattice as the one giving the densest packing of spheres among all lattices in 8 dimensions. This packing was long suspected to be optimal even among those that do not arise from lattices — but this fact was proved only in 2016, by the young mathematician Maryna Viazovska [V].

We can also describe the <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> lattice more explicitly. In suitable coordinates, it consists of vectors for which:

• the components are either all integers or all integers plus <semantics>12<annotation encoding="application/x-tex">\textstyle{\frac{1}{2}}</annotation></semantics>, and

• the components sum to an even number.

This lattice consists of all integral linear combinations of the 8 rows of this matrix:

<semantics>(1 1 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 0 1 1 0 0 0 0 0 0 1 1 0 12 12 12 12 12 12 12 12)<annotation encoding="application/x-tex"> \left( \begin{array}{rrrrrrrr} 1&amp;-1&amp;0&amp;0&amp;0&amp;0&amp;0&amp;0 \\ 0&amp;1&amp;-1&amp;0&amp;0&amp;0&amp;0&amp;0 \\ 0&amp;0&amp;1&amp;-1&amp;0&amp;0&amp;0&amp;0 \\ 0&amp;0&amp;0&amp;1&amp;-1&amp;0&amp;0&amp;0 \\ 0&amp;0&amp;0&amp;0&amp;1&amp;-1&amp;0&amp;0 \\ 0&amp;0&amp;0&amp;0&amp;0&amp;1&amp;-1&amp;0 \\ 0&amp;0&amp;0&amp;0&amp;0&amp;1&amp;1&amp;0 \\ -\frac{1}{2}&amp;-\frac{1}{2}&amp;-\frac{1}{2}&amp;-\frac{1}{2}&amp;-\frac{1}{2}&amp;-\frac{1}{2}&amp;-\frac{1}{2}&amp;-\frac{1}{2} \end{array} \right) </annotation></semantics>

The inner product of any row vector with itself is 2, while the inner product of distinct row vectors is either 0 or -1. Thus, any two of these vectors lie at an angle of either 90° or 120° from each other. If we draw a dot for each vector, and connect two dots by an edge when the angle between their vectors is 120°, we get this pattern:

This is called the <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> Dynkin diagram. In the first part of our story we shall find the <semantics>E 8<annotation encoding="application/x-tex"> \mathrm{E}_8</annotation></semantics> lattice hiding in the icosahedron; in the second part, we shall find this diagram. The two parts of this story must be related — but the relation remains mysterious, at least to me.
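These inner products can be verified directly. The following sketch (exact rational arithmetic via the standard library; not part of the article) builds the matrix above, computes its Gram matrix, and checks each row against the two membership conditions for the lattice:

```python
from fractions import Fraction

half = Fraction(1, 2)

# The 8 rows of the generator matrix above.
rows = [
    [1, -1, 0, 0, 0, 0, 0, 0],
    [0, 1, -1, 0, 0, 0, 0, 0],
    [0, 0, 1, -1, 0, 0, 0, 0],
    [0, 0, 0, 1, -1, 0, 0, 0],
    [0, 0, 0, 0, 1, -1, 0, 0],
    [0, 0, 0, 0, 0, 1, -1, 0],
    [0, 0, 0, 0, 0, 1, 1, 0],
    [-half] * 8,
]

def dot(u, v):
    return sum(Fraction(a) * Fraction(b) for a, b in zip(u, v))

# Gram matrix: diagonal entries 2, off-diagonal entries 0 or -1.
gram = [[dot(u, v) for v in rows] for u in rows]
```

Each row also satisfies the two conditions above: its components are all integers or all integers plus 1/2, and they sum to an even number.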

The Icosians

The quickest route from the icosahedron to <semantics>E 8<annotation encoding="application/x-tex"> \mathrm{E}_8</annotation></semantics> goes through the fourth dimension. The symmetries of the icosahedron can be described using certain quaternions; the integer linear combinations of these form a subring of the quaternions called the ‘icosians’, but the icosians can be reinterpreted as a lattice in 8 dimensions, and this is the <semantics>E 8<annotation encoding="application/x-tex"> \mathrm{E}_8</annotation></semantics> lattice [CS]. Let us see how this works. The quaternions, discovered by Hamilton, are a 4-dimensional algebra

<semantics>={a+bi+cj+dk:a,b,c,d}<annotation encoding="application/x-tex"> \displaystyle{ \mathbb{H} = \{a + b i + c j + d k \colon \; a,b,c,d\in \mathbb{R}\} } </annotation></semantics>

with multiplication given as follows:

<semantics>i 2=j 2=k 2=1,<annotation encoding="application/x-tex"> \displaystyle{i^2 = j^2 = k^2 = -1, } </annotation></semantics> <semantics>ij=k=jiandcyclicpermutations<annotation encoding="application/x-tex"> \displaystyle{i j = k = - j i \; and \; cyclic \; permutations } </annotation></semantics>

It is a normed division algebra, meaning that the norm

<semantics>|a+bi+cj+dk|=a 2+b 2+c 2+d 2<annotation encoding="application/x-tex"> \displaystyle{ |a + b i + c j + d k| = \sqrt{a^2 + b^2 + c^2 + d^2} } </annotation></semantics>

obeys

<semantics>|qq|=|q||q|<annotation encoding="application/x-tex"> |q q'| = |q| |q'| </annotation></semantics>

for all <semantics>q,q<annotation encoding="application/x-tex">q,q' \in \mathbb{H}</annotation></semantics>. The unit sphere in <semantics><annotation encoding="application/x-tex"> \mathbb{H}</annotation></semantics> is thus a group, often called <semantics>SU(2)<annotation encoding="application/x-tex"> \mathrm{SU}(2)</annotation></semantics> because its elements can be identified with <semantics>2×2<annotation encoding="application/x-tex"> 2 \times 2</annotation></semantics> unitary matrices with determinant 1. This group acts as rotations of 3-dimensional Euclidean space, since we can see any point in <semantics> 3<annotation encoding="application/x-tex"> \mathbb{R}^3</annotation></semantics> as a purely imaginary quaternion <semantics>x=bi+cj+dk<annotation encoding="application/x-tex"> x = b i + c j + d k</annotation></semantics>, and the quaternion <semantics>qxq 1<annotation encoding="application/x-tex"> qxq^{-1}</annotation></semantics> is then purely imaginary for any <semantics>qSU(2)<annotation encoding="application/x-tex"> q \in \mathrm{SU}(2)</annotation></semantics>. Indeed, this action gives a double cover

<semantics>α:SU(2)SO(3)<annotation encoding="application/x-tex"> \displaystyle{ \alpha \colon \mathrm{SU}(2) \to \mathrm{SO}(3) } </annotation></semantics>

where <semantics>SO(3)<annotation encoding="application/x-tex"> \mathrm{SO}(3)</annotation></semantics> is the group of rotations of <semantics> 3<annotation encoding="application/x-tex"> \mathbb{R}^3</annotation></semantics>.
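This action is easy to check numerically. Below is a minimal quaternion sketch in Python (the names are mine): the Hamilton product, the multiplicative norm, and conjugation <semantics>xqxq 1<annotation encoding="application/x-tex">x \mapsto q x q^{-1}</annotation></semantics> of a purely imaginary <semantics>x<annotation encoding="application/x-tex">x</annotation></semantics> by a unit quaternion <semantics>q<annotation encoding="application/x-tex">q</annotation></semantics>, which stays purely imaginary and preserves length, i.e. acts as a rotation of <semantics> 3<annotation encoding="application/x-tex">\mathbb{R}^3</annotation></semantics>:

```python
import math

def qmul(p, q):
    """Hamilton product; quaternions are tuples (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qnorm(q):
    return math.sqrt(sum(x * x for x in q))

def qconj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)

def rotate(q, x):
    """q x q^{-1} for a unit quaternion q; then q^{-1} is just the conjugate."""
    return qmul(qmul(q, x), qconj(q))
```

One can verify, for example, that `qmul` satisfies <semantics>ij=k=ji<annotation encoding="application/x-tex">i j = k = -j i</annotation></semantics> and that `qnorm(qmul(p, q))` equals `qnorm(p) * qnorm(q)` up to rounding.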

We can thus take any Platonic solid, look at its group of rotational symmetries, get a subgroup of <semantics>SO(3)<annotation encoding="application/x-tex"> \mathrm{SO}(3)</annotation></semantics>, and take its double cover in <semantics>SU(2)<annotation encoding="application/x-tex"> \mathrm{SU}(2)</annotation></semantics>. If we do this starting with the icosahedron, we see that the <semantics>60<annotation encoding="application/x-tex"> 60</annotation></semantics>-element group <semantics>A 5SO(3)<annotation encoding="application/x-tex"> \mathrm{A}_5 \subset \mathrm{SO}(3)</annotation></semantics> is covered by a 120-element group <semantics>ΓSU(2)<annotation encoding="application/x-tex"> \Gamma \subset \mathrm{SU}(2)</annotation></semantics>, called the binary icosahedral group.

The elements of <semantics>Γ<annotation encoding="application/x-tex"> \Gamma</annotation></semantics> are quaternions of norm one, and it turns out that they are the vertices of a 4-dimensional regular polytope: a 4-dimensional cousin of the Platonic solids. It deserves to be called the ‘hypericosahedron’, but it is usually called the 600-cell, since it has 600 tetrahedral faces. Here is the 600-cell projected down to 3 dimensions, drawn using Robert Webb’s Stella software:

Explicitly, if we identify <semantics><annotation encoding="application/x-tex"> \mathbb{H}</annotation></semantics> with <semantics> 4<annotation encoding="application/x-tex"> \mathbb{R}^4</annotation></semantics>, the elements of <semantics>Γ<annotation encoding="application/x-tex"> \Gamma</annotation></semantics> are the points

<semantics>(±12,±12,±12,±12)<annotation encoding="application/x-tex"> \displaystyle{ (\pm \textstyle{\frac{1}{2}}, \pm \textstyle{\frac{1}{2}},\pm \textstyle{\frac{1}{2}},\pm \textstyle{\frac{1}{2}}) } </annotation></semantics>

<semantics>(±1,0,0,0)<annotation encoding="application/x-tex"> \displaystyle{ (\pm 1, 0, 0, 0) }</annotation></semantics>

<semantics>12(±Φ,±1,±1/Φ,0),<annotation encoding="application/x-tex"> \displaystyle{ \textstyle{\frac{1}{2}} (\pm \Phi, \pm 1 , \pm 1/\Phi, 0 ),} </annotation></semantics>

and those obtained from these by even permutations of the coordinates. Since these points are closed under multiplication, if we take integral linear combinations of them we get a subring of the quaternions:

<semantics>𝕀={ qΓa qq:a q}.<annotation encoding="application/x-tex"> \displaystyle{ \mathbb{I} = \{ \sum_{q \in \Gamma} a_q q : \; a_q \in \mathbb{Z} \} \subset \mathbb{H} .} </annotation></semantics>

Conway and Sloane [CS] call this the ring of icosians. The icosians are not a lattice in the quaternions: they are dense. However, any icosian is of the form <semantics>a+bi+cj+dk<annotation encoding="application/x-tex"> a + bi + cj + dk</annotation></semantics> where <semantics>a,b,c<annotation encoding="application/x-tex"> a,b,c</annotation></semantics>, and <semantics>d<annotation encoding="application/x-tex"> d</annotation></semantics> live in the golden field

<semantics>(5)={x+5y:x,y}<annotation encoding="application/x-tex"> \displaystyle{ \mathbb{Q}(\sqrt{5}) = \{ x + \sqrt{5} y : \; x,y \in \mathbb{Q}\} } </annotation></semantics>

Thus we can think of an icosian as an 8-tuple of rational numbers. Such 8-tuples form a lattice in 8 dimensions.

In fact we can put a norm on the icosians as follows. For <semantics>q𝕀<annotation encoding="application/x-tex"> q \in \mathbb{I}</annotation></semantics> the usual quaternionic norm has

<semantics>|q| 2=x+5y<annotation encoding="application/x-tex"> \displaystyle{ |q|^2 = x + \sqrt{5} y } </annotation></semantics>

for some rational numbers <semantics>x<annotation encoding="application/x-tex"> x</annotation></semantics> and <semantics>y<annotation encoding="application/x-tex"> y</annotation></semantics>, but we can define a new norm on <semantics>𝕀<annotation encoding="application/x-tex"> \mathbb{I}</annotation></semantics> by setting

<semantics>q 2=x+y<annotation encoding="application/x-tex"> \displaystyle{ \|q\|^2 = x + y } </annotation></semantics>

With respect to this new norm, the icosians form a lattice that fits isometrically in 8-dimensional Euclidean space. And this is none other than <semantics>E 8<annotation encoding="application/x-tex"> \mathrm{E}_8</annotation></semantics>!

Klein’s Icosahedral Function

Not only is the <semantics>E 8<annotation encoding="application/x-tex"> \mathrm{E}_8</annotation></semantics> lattice hiding in the icosahedron; so is the <semantics>E 8<annotation encoding="application/x-tex"> \mathrm{E}_8</annotation></semantics> Dynkin diagram. The space of all regular icosahedra of arbitrary size centered at the origin has a singularity, which corresponds to a degenerate special case: the icosahedron of zero size. If we resolve this singularity in a minimal way we get eight Riemann spheres, intersecting in a pattern described by the <semantics>E 8<annotation encoding="application/x-tex"> \mathrm{E}_8</annotation></semantics> Dynkin diagram!

This remarkable story starts around 1884 with Felix Klein’s Lectures on the Icosahedron [Kl]. In this work he inscribed an icosahedron in the Riemann sphere, <semantics>P 1<annotation encoding="application/x-tex"> \mathbb{C}\mathrm{P}^1</annotation></semantics>. He thus got the icosahedron’s symmetry group, <semantics>A 5<annotation encoding="application/x-tex"> \mathrm{A}_5</annotation></semantics>, to act as conformal transformations of <semantics>P 1<annotation encoding="application/x-tex"> \mathbb{C}\mathrm{P}^1</annotation></semantics> — indeed, rotations. He then found a rational function of one complex variable that is invariant under all these transformations. This function equals <semantics>0<annotation encoding="application/x-tex"> 0</annotation></semantics> at the centers of the icosahedron’s faces, 1 at the midpoints of its edges, and <semantics><annotation encoding="application/x-tex"> \infty</annotation></semantics> at its vertices.

Here is Klein’s icosahedral function as drawn by Abdelaziz Nait Merzouk. The color shows its phase, while the contour lines show its magnitude:

We can think of Klein’s icosahedral function as a branched cover of the Riemann sphere by itself with 60 sheets:

<semantics>:P 1P 1.<annotation encoding="application/x-tex"> \displaystyle{ \mathcal{I} \colon \mathbb{C}\mathrm{P}^1 \to \mathbb{C}\mathrm{P}^1 .} </annotation></semantics>

Indeed, <semantics>A 5<annotation encoding="application/x-tex"> \mathrm{A}_5</annotation></semantics> acts on <semantics>P 1<annotation encoding="application/x-tex"> \mathbb{C}\mathrm{P}^1</annotation></semantics>, and the quotient space <semantics>P 1/A 5<annotation encoding="application/x-tex"> \mathbb{C}\mathrm{P}^1/\mathrm{A}_5</annotation></semantics> is isomorphic to <semantics>P 1<annotation encoding="application/x-tex"> \mathbb{C}\mathrm{P}^1</annotation></semantics> again. The function <semantics><annotation encoding="application/x-tex"> \mathcal{I}</annotation></semantics> gives an explicit formula for the quotient map <semantics>P 1P 1/A 5P 1<annotation encoding="application/x-tex"> \mathbb{C}\mathrm{P}^1 \to \mathbb{C}\mathrm{P}^1/\mathrm{A}_5 \cong \mathbb{C}\mathrm{P}^1</annotation></semantics>.

Klein managed to reduce solving the quintic to the problem of solving the equation <semantics>(z)=w<annotation encoding="application/x-tex"> \mathcal{I}(z) = w</annotation></semantics> for <semantics>z<annotation encoding="application/x-tex"> z</annotation></semantics>. A modern exposition of this result is Shurman’s Geometry of the Quintic [Sh]. For a more high-powered approach, see the paper by Nash [N]. Unfortunately, neither of these treatments avoids complicated calculations. But our interest in Klein’s icosahedral function here does not come from its connection to the quintic: instead, we want to see its connection to <semantics>E 8<annotation encoding="application/x-tex"> \mathrm{E}_8</annotation></semantics>.

For this we should actually construct Klein’s icosahedral function. To do this, recall that the Riemann sphere <semantics>P 1<annotation encoding="application/x-tex"> \mathbb{C}\mathrm{P}^1</annotation></semantics> is the space of 1-dimensional linear subspaces of <semantics> 2<annotation encoding="application/x-tex"> \mathbb{C}^2</annotation></semantics>. Let us work directly with <semantics> 2<annotation encoding="application/x-tex"> \mathbb{C}^2</annotation></semantics>. While <semantics>SO(3)<annotation encoding="application/x-tex"> \mathrm{SO}(3)</annotation></semantics> acts on <semantics>P 1<annotation encoding="application/x-tex"> \mathbb{C}\mathrm{P}^1</annotation></semantics>, this comes from an action of this group’s double cover <semantics>SU(2)<annotation encoding="application/x-tex"> \mathrm{SU}(2)</annotation></semantics> on <semantics> 2<annotation encoding="application/x-tex"> \mathbb{C}^2</annotation></semantics>. As we have seen, the rotational symmetry group of the icosahedron, <semantics>A 5SO(3)<annotation encoding="application/x-tex"> \mathrm{A}_5 \subset \mathrm{SO}(3)</annotation></semantics>, is double covered by the binary icosahedral group <semantics>ΓSU(2)<annotation encoding="application/x-tex"> \Gamma \subset \mathrm{SU}(2)</annotation></semantics>. To build an <semantics>A 5<annotation encoding="application/x-tex"> \mathrm{A}_5</annotation></semantics>-invariant rational function on <semantics>P 1<annotation encoding="application/x-tex"> \mathbb{C}\mathrm{P}^1</annotation></semantics>, we should thus look for <semantics>Γ<annotation encoding="application/x-tex"> \Gamma</annotation></semantics>-invariant homogeneous polynomials on <semantics> 2<annotation encoding="application/x-tex"> \mathbb{C}^2</annotation></semantics>.

It is easy to construct three such polynomials:

<semantics>V<annotation encoding="application/x-tex"> V</annotation></semantics>, of degree <semantics>12<annotation encoding="application/x-tex"> 12</annotation></semantics>, vanishing on the 1d subspaces corresponding to icosahedron vertices.

<semantics>E<annotation encoding="application/x-tex"> E</annotation></semantics>, of degree <semantics>30<annotation encoding="application/x-tex"> 30</annotation></semantics>, vanishing on the 1d subspaces corresponding to icosahedron edge midpoints.

<semantics>F<annotation encoding="application/x-tex"> F</annotation></semantics>, of degree <semantics>20<annotation encoding="application/x-tex"> 20</annotation></semantics>, vanishing on the 1d subspaces corresponding to icosahedron face centers.

Remember, we have embedded the icosahedron in <semantics>P 1<annotation encoding="application/x-tex"> \mathbb{C}\mathrm{P}^1</annotation></semantics>, and each point in <semantics>P 1<annotation encoding="application/x-tex"> \mathbb{C}\mathrm{P}^1</annotation></semantics> is a 1-dimensional subspace of <semantics> 2<annotation encoding="application/x-tex"> \mathbb{C}^2</annotation></semantics>, so each icosahedron vertex determines such a subspace, and there is a linear function on <semantics> 2<annotation encoding="application/x-tex"> \mathbb{C}^2</annotation></semantics>, unique up to a constant factor, that vanishes on this subspace. The icosahedron has <semantics>12<annotation encoding="application/x-tex"> 12</annotation></semantics> vertices, so we get <semantics>12<annotation encoding="application/x-tex"> 12</annotation></semantics> linear functions this way. Multiplying them gives <semantics>V<annotation encoding="application/x-tex"> V</annotation></semantics>, a homogeneous polynomial of degree <semantics>12<annotation encoding="application/x-tex"> 12</annotation></semantics> on <semantics> 2<annotation encoding="application/x-tex"> \mathbb{C}^2</annotation></semantics> that vanishes on all the subspaces corresponding to icosahedron vertices! The same trick gives <semantics>E<annotation encoding="application/x-tex"> E</annotation></semantics>, which has degree <semantics>30<annotation encoding="application/x-tex"> 30</annotation></semantics> because the icosahedron has <semantics>30<annotation encoding="application/x-tex"> 30</annotation></semantics> edges, and <semantics>F<annotation encoding="application/x-tex"> F</annotation></semantics>, which has degree <semantics>20<annotation encoding="application/x-tex"> 20</annotation></semantics> because the icosahedron has <semantics>20<annotation encoding="application/x-tex"> 20</annotation></semantics> faces.

A bit of work is required to check that <semantics>V,E<annotation encoding="application/x-tex"> V,E</annotation></semantics> and <semantics>F<annotation encoding="application/x-tex"> F</annotation></semantics> are invariant under <semantics>Γ<annotation encoding="application/x-tex"> \Gamma</annotation></semantics>, instead of changing by constant factors under group transformations. Indeed, if we had copied this construction using a tetrahedron or octahedron, this would not be the case. For details, see Shurman’s book [Sh], which is free online, or van Hoboken’s nice thesis [VH].

Since both <semantics>F 3<annotation encoding="application/x-tex"> F^3</annotation></semantics> and <semantics>V 5<annotation encoding="application/x-tex"> V^5</annotation></semantics> have degree <semantics>60<annotation encoding="application/x-tex"> 60</annotation></semantics>, <semantics>F 3/V 5<annotation encoding="application/x-tex"> F^3/V^5</annotation></semantics> is homogeneous of degree zero, so it defines a rational function <semantics>:P 1P 1<annotation encoding="application/x-tex"> \mathcal{I} \colon \mathbb{C}\mathrm{P}^1 \to \mathbb{C}\mathrm{P}^1</annotation></semantics>. This function is invariant under <semantics>A 5<annotation encoding="application/x-tex"> \mathrm{A}_5</annotation></semantics> because <semantics>F<annotation encoding="application/x-tex"> F</annotation></semantics> and <semantics>V<annotation encoding="application/x-tex"> V</annotation></semantics> are invariant under <semantics>Γ<annotation encoding="application/x-tex"> \Gamma</annotation></semantics>. Since <semantics>F<annotation encoding="application/x-tex"> F</annotation></semantics> vanishes at face centers of the icosahedron while <semantics>V<annotation encoding="application/x-tex"> V</annotation></semantics> vanishes at vertices, <semantics>=F 3/V 5<annotation encoding="application/x-tex"> \mathcal{I} = F^3/V^5</annotation></semantics> equals <semantics>0<annotation encoding="application/x-tex"> 0</annotation></semantics> at face centers and <semantics><annotation encoding="application/x-tex"> \infty</annotation></semantics> at vertices. Finally, thanks to its invariance property, <semantics><annotation encoding="application/x-tex"> \mathcal{I}</annotation></semantics> takes the same value at every edge center, so we can normalize <semantics>V<annotation encoding="application/x-tex"> V</annotation></semantics> or <semantics>F<annotation encoding="application/x-tex"> F</annotation></semantics> to make this value 1. 
Thus, <semantics><annotation encoding="application/x-tex"> \mathcal{I}</annotation></semantics> has precisely the properties required of Klein’s icosahedral function!

The Appearance of E8

Now comes the really interesting part. Three polynomials on a 2-dimensional space must obey a relation, and <semantics>V,E<annotation encoding="application/x-tex"> V,E</annotation></semantics>, and <semantics>F<annotation encoding="application/x-tex"> F</annotation></semantics> obey a very pretty one, at least after we normalize them correctly:

<semantics>V 5+E 2+F 3=0.<annotation encoding="application/x-tex"> \displaystyle{ V^5 + E^2 + F^3 = 0. } </annotation></semantics>
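This relation can be checked concretely using Klein's classical normalizations of the three invariants: a degree-12 vertex form <semantics>f<annotation encoding="application/x-tex">f</annotation></semantics>, a degree-20 face form <semantics>H<annotation encoding="application/x-tex">H</annotation></semantics>, and a degree-30 edge form <semantics>T<annotation encoding="application/x-tex">T</annotation></semantics> satisfying the syzygy <semantics>T 2+H 3=1728f 5<annotation encoding="application/x-tex">T^2 + H^3 = 1728 f^5</annotation></semantics>; rescaling <semantics>V,E,F<annotation encoding="application/x-tex">V, E, F</annotation></semantics> by suitable constants then yields the relation above. A sketch in exact integer arithmetic (coefficients as in Klein's classical normalization, see [Kl] or [Sh]):

```python
# Klein's invariant forms for the binary icosahedral group, in homogeneous
# coordinates (z, w) on C^2, in the classical normalization:
def f(z, w):   # degree 12, vanishes on the vertex subspaces
    return z * w * (z**10 + 11 * z**5 * w**5 - w**10)

def H(z, w):   # degree 20, vanishes on the face-center subspaces
    return (-(z**20 + w**20) + 228 * (z**15 * w**5 - z**5 * w**15)
            - 494 * z**10 * w**10)

def T(z, w):   # degree 30, vanishes on the edge-midpoint subspaces
    return ((z**30 + w**30) + 522 * (z**25 * w**5 - z**5 * w**25)
            - 10005 * (z**20 * w**10 + z**10 * w**20))

def syzygy(z, w):
    # T^2 + H^3 - 1728 f^5 vanishes identically
    return T(z, w)**2 + H(z, w)**3 - 1728 * f(z, w)**5
```

Evaluating `syzygy` at integer points gives 0 exactly, e.g. at <semantics>(z,w)=(1,1)<annotation encoding="application/x-tex">(z, w) = (1, 1)</annotation></semantics> one finds <semantics>20008 2496 3=172811 5<annotation encoding="application/x-tex">20008^2 - 496^3 = 1728 \cdot 11^5</annotation></semantics>.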

We could guess this relation simply by noting that each term must have the same degree. Every <semantics>Γ<annotation encoding="application/x-tex"> \Gamma</annotation></semantics>-invariant polynomial on <semantics> 2<annotation encoding="application/x-tex"> \mathbb{C}^2</annotation></semantics> is a polynomial in <semantics>V,E<annotation encoding="application/x-tex"> V, E</annotation></semantics> and <semantics>F<annotation encoding="application/x-tex"> F</annotation></semantics>, and indeed

<semantics> 2/Γ{(V,E,F) 3:V 5+E 2+F 3=0}.<annotation encoding="application/x-tex"> \displaystyle{ \mathbb{C}^2 / \Gamma \cong \{ (V,E,F) \in \mathbb{C}^3 \colon \; V^5 + E^2 + F^3 = 0 \} . } </annotation></semantics>

This complex surface is smooth except at $V = E = F = 0$, where it has a singularity. And hiding in this singularity is $\mathrm{E}_8$!
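The relation can be checked by direct computation using Klein’s classical invariant forms. Here is a sketch in Python; it assumes Klein’s traditional normalization, in which the identity appears as $E^2 + F^3 = 1728\,V^5$ (over $\mathbb{C}$, rescaling the forms converts this into the sum-equals-zero version above):

```python
# Check Klein's icosahedral identity for the classical invariant forms:
# V (degree 12, vanishes at vertices), F (degree 20, at face centers),
# E (degree 30, at edge centers).  In Klein's traditional normalization
# the relation reads E^2 + F^3 = 1728 V^5; over the complex numbers a
# rescaling of the three forms turns it into V^5 + E^2 + F^3 = 0.

def V(x, y):  # degree 12
    return x*y*(x**10 + 11*x**5*y**5 - y**10)

def F(x, y):  # degree 20
    return -(x**20 + y**20) + 228*(x**15*y**5 - x**5*y**15) - 494*x**10*y**10

def E(x, y):  # degree 30
    return (x**30 + y**30) + 522*(x**25*y**5 - x**5*y**25) \
           - 10005*(x**20*y**10 + x**10*y**20)

# The difference is a homogeneous polynomial of degree 60, so showing it
# vanishes at the 61 points (t, 1), t = 0..60, proves it is identically zero.
assert all(E(t, 1)**2 + F(t, 1)**3 - 1728*V(t, 1)**5 == 0 for t in range(61))
print("Klein's identity E^2 + F^3 = 1728 V^5 holds")
```

The evaluation is exact integer arithmetic, so the 61-point check is a genuine proof of the polynomial identity, not a numerical approximation.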

To see this, we need to ‘resolve’ the singularity. Roughly, this means that we find a smooth complex surface $S$ and an onto map

$$ \pi \colon S \to \mathbb{C}^2/\Gamma $$

that is one-to-one away from the singularity. (More precisely, if $X$ is an algebraic variety with singular points $X_{\mathrm{sing}} \subset X$, then $\pi \colon S \to X$ is a resolution of $X$ if $S$ is smooth, $\pi$ is proper, $\pi^{-1}(X - X_{\mathrm{sing}})$ is dense in $S$, and $\pi$ is an isomorphism between $\pi^{-1}(X - X_{\mathrm{sing}})$ and $X - X_{\mathrm{sing}}$. For more details see Lamotke’s book [L].)

There are many such resolutions, but there is one minimal resolution, through which all the others factor uniquely:

What sits above the singularity in this minimal resolution? Eight copies of the Riemann sphere $\mathbb{C}\mathrm{P}^1$, one for each dot here:

Two of these $\mathbb{C}\mathrm{P}^1$s intersect in a point if their dots are connected by an edge; otherwise they are disjoint.

This amazing fact was discovered by Patrick du Val in 1934 [DV]. Why is it true? Alas, there is not enough room in the margin, or even in the entire blog article, to explain this. The books by Kirillov [Ki] and Lamotke [L] fill in the details. But here is a clue. The $\mathrm{E}_8$ Dynkin diagram has ‘legs’ of lengths $5$, $2$ and $3$:

On the other hand,

$$ \mathrm{A}_5 \cong \langle v, e, f \mid v^5 = e^2 = f^3 = v e f = 1 \rangle $$

where in terms of the rotational symmetries of the icosahedron:

• $v$ is a $1/5$ turn around some vertex of the icosahedron,

• $e$ is a $1/2$ turn around the center of an edge touching that vertex,

• $f$ is a $1/3$ turn around the center of a face touching that vertex,

and we must choose the sense of these rotations correctly to obtain $v e f = 1$. To get a presentation of the binary icosahedral group we drop one relation:

$$ \Gamma \cong \langle v, e, f \mid v^5 = e^2 = f^3 = v e f \rangle $$

The dots in the $\mathrm{E}_8$ Dynkin diagram correspond naturally to conjugacy classes in $\Gamma$, not counting the conjugacy class of the central element $-1$. Each of these conjugacy classes, in turn, gives a copy of $\mathbb{C}\mathrm{P}^1$ in the minimal resolution of $\mathbb{C}^2/\Gamma$.

Not only the $\mathrm{E}_8$ Dynkin diagram, but also the $\mathrm{E}_8$ lattice, can be found in the minimal resolution of $\mathbb{C}^2/\Gamma$. Topologically, this space is a 4-dimensional manifold. Its real second homology group is an 8-dimensional vector space with an inner product given by the intersection pairing. The integral second homology is a lattice in this vector space spanned by the 8 copies of $\mathbb{C}\mathrm{P}^1$ we have just seen, and it is a copy of the $\mathrm{E}_8$ lattice [KS].
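One easy consistency check on this story: the intersection form is, up to sign, the $\mathrm{E}_8$ Cartan matrix, whose determinant is 1, reflecting the fact that the $\mathrm{E}_8$ lattice is unimodular. A sketch that builds the Cartan matrix from the Dynkin diagram (the node numbering here is my own choice, a chain with one extra node attached so that the three legs through the trivalent node have lengths 5, 2 and 3; any labelling of the same diagram gives the same determinant):

```python
# Build the E8 Cartan matrix from its Dynkin diagram and check that its
# determinant is 1 -- the fingerprint of the E8 lattice's unimodularity.
# Nodes 1-7 form a chain; node 8 is attached to node 5, giving legs of
# lengths 5, 2 and 3 through the trivalent node 5.
from fractions import Fraction

edges = [(1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (5, 8)]
n = 8
C = [[Fraction(2) if i == j else Fraction(0) for j in range(n)] for i in range(n)]
for a, b in edges:
    C[a - 1][b - 1] = C[b - 1][a - 1] = Fraction(-1)

def det(m):
    """Determinant by exact Gaussian elimination over the rationals."""
    m = [row[:] for row in m]
    d = Fraction(1)
    for col in range(len(m)):
        pivot = next(r for r in range(col, len(m)) if m[r][col] != 0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            d = -d
        d *= m[col][col]
        for r in range(col + 1, len(m)):
            factor = m[r][col] / m[col][col]
            for c in range(col, len(m)):
                m[r][c] -= factor * m[col][c]
    return d

assert det(C) == 1
print("det(Cartan matrix of E8) =", det(C))
```

For comparison, the analogous determinants are $n+1$ for $\mathrm{A}_n$, $4$ for $\mathrm{D}_n$, and $3, 2, 1$ for $\mathrm{E}_6, \mathrm{E}_7, \mathrm{E}_8$: only $\mathrm{E}_8$ gives a unimodular lattice.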

But let us turn to a more basic question: what is $\mathbb{C}^2/\Gamma$ like as a topological space? To tackle this, first note that we can identify a pair of complex numbers with a single quaternion, and this gives a homeomorphism

$$ \mathbb{C}^2/\Gamma \cong \mathbb{H}/\Gamma $$

where we let $\Gamma$ act by right multiplication on $\mathbb{H}$. So, it suffices to understand $\mathbb{H}/\Gamma$.

Next, note that sitting inside $\mathbb{H}/\Gamma$ are the points coming from the unit sphere in $\mathbb{H}$. These points form the 3-dimensional manifold $\mathrm{SU}(2)/\Gamma$, which is called the Poincaré homology 3-sphere [KS]. This is a wonderful thing in its own right: Poincaré discovered it as a counterexample to his guess that any compact 3-manifold with the same homology as a 3-sphere is actually diffeomorphic to the 3-sphere, and it is deeply connected to $\mathrm{E}_8$. But for our purposes, what matters is that we can think of this manifold in another way, since we have a diffeomorphism

$$ \mathrm{SU}(2)/\Gamma \cong \mathrm{SO}(3)/\mathrm{A}_5. $$

The latter is just the space of all icosahedra inscribed in the unit sphere in 3d space, where we count two as the same if they differ by a rotational symmetry.

This is a nice description of the points of $\mathbb{H}/\Gamma$ coming from points in the unit sphere of $\mathbb{H}$. But every quaternion lies in some sphere centered at the origin of $\mathbb{H}$, of possibly zero radius. It follows that $\mathbb{C}^2/\Gamma \cong \mathbb{H}/\Gamma$ is the space of all icosahedra centered at the origin of 3d space, of arbitrary size, including a degenerate icosahedron of zero size. This degenerate icosahedron is the singular point in $\mathbb{C}^2/\Gamma$. This is where $\mathrm{E}_8$ is hiding.

Clearly much has been left unexplained in this brief account. Most of the missing details can be found in the references. But it remains unknown (at least to me) how the two constructions of $\mathrm{E}_8$ from the icosahedron fit together in a unified picture.

Recall what we did. First we took the binary icosahedral group $\Gamma \subset \mathbb{H}$, took integer linear combinations of its elements, thought of these as forming a lattice in an 8-dimensional rational vector space with a natural norm, and discovered that this lattice is a copy of the $\mathrm{E}_8$ lattice. Then we took $\mathbb{C}^2/\Gamma \cong \mathbb{H}/\Gamma$, took its minimal resolution, and found that the integral 2nd homology of this space, equipped with its natural inner product, is a copy of the $\mathrm{E}_8$ lattice. From the same ingredients we built the same lattice in two very different ways! How are these constructions connected? This puzzle deserves a nice solution.


I thank Tong Yang for inviting me to speak on this topic at the Annual General Meeting of the Hong Kong Mathematical Society on May 20, 2017, and Guowu Meng for hosting me at the HKUST while I prepared that talk. I also thank the many people, too numerous to accurately list, who have helped me understand these topics over the years.


[CS] J. H. Conway and N. J. A. Sloane, Sphere Packings, Lattices and Groups, Springer, Berlin, 2013.

[DV] P. du Val, On isolated singularities of surfaces which do not affect the conditions of adjunction, I, II and III, Proc. Camb. Phil. Soc. 30 (1934), 453–459, 460–465, 483–491.

[KS] R. Kirby and M. Scharlemann, Eight faces of the Poincaré homology 3-sphere, Usp. Mat. Nauk. 37 (1982), 139–159. Available at

[Ki] A. Kirillov, Quiver Representations and Quiver Varieties, AMS, Providence, Rhode Island, 2016.

[Kl] F. Klein, Lectures on the Ikosahedron and the Solution of Equations of the Fifth Degree, Trübner & Co., London, 1888. Available at

[L] K. Lamotke, Regular Solids and Isolated Singularities, Vieweg & Sohn, Braunschweig, 1986.

[N] O. Nash, On Klein’s icosahedral solution of the quintic. Available as arXiv:1308.0955.

[Sh] J. Shurman, Geometry of the Quintic, Wiley, New York, 1997. Available at

[Sl] P. Slodowy, Platonic solids, Kleinian singularities, and Lie groups, in Algebraic Geometry, Lecture Notes in Mathematics 1008, Springer, Berlin, 1983, pp. 102–138.

[VH] J. van Hoboken, Platonic Solids, Binary Polyhedral Groups, Kleinian Singularities and Lie Algebras of Type A, D, E, Master’s Thesis, University of Amsterdam, 2002. Available at

[V] M. Viazovska, The sphere packing problem in dimension 8, Ann. Math. 185 (2017), 991–1015. Available at

by john at December 16, 2017 03:00 AM

John Baez - Azimuth

The 600-Cell

I can’t stop thinking about the 600-cell:

It’s a ‘Platonic solid in 4 dimensions’ with 600 tetrahedral cells and 120 vertices. One reason I like it is that you can think of these vertices as forming a group: a double cover of the rotational symmetry group of the icosahedron. Another reason is that it’s a halfway house between the icosahedron and the \mathrm{E}_8 lattice. I explained all this in my last post here:

From the icosahedron to E8.

I wrote that post as a spinoff of an article I was writing for the Newsletter of the London Mathematical Society, which had a deadline attached to it. Now I should be writing something else, for another deadline. But somehow deadlines strongly demotivate me—they make me want to do anything else. So I’ve been continuing to think about the 600-cell. I posed some puzzles about it in the comments to my last post, and they led me to some interesting thoughts, which I feel like explaining. But they’re not quite solidified, so right now I just want to give a fairly concrete picture of the 600-cell, or at least its vertices.

This will be a much less demanding post than the last one—and correspondingly less rewarding. Remember the basic idea:

Points in the 3-sphere can be seen as quaternions of norm 1, and these form a group \mathrm{SU}(2) that double covers \mathrm{SO}(3). The vertices of the 600-cell are the points of a subgroup \Gamma \subset \mathrm{SU}(2) that double covers the rotational symmetry group of the icosahedron. This group \Gamma is the famous binary icosahedral group.

Thus, we can name the vertices of the 600-cell by rotations of the icosahedron—as long as we remember to distinguish between a rotation by \theta and a rotation by \theta + 2\pi. Let’s do it!

• 0° (1 of these). We can take the identity rotation as our chosen ‘favorite’ vertex of the 600-cell.

• 72° (12 of these). The nearest neighbors of our chosen vertex correspond to the rotations by the smallest angles that are symmetries of the icosahedron; these correspond to taking any of its 12 vertices and giving it a 1/5 turn clockwise.

• 120° (20 of these). The next nearest neighbors correspond to taking one of the 20 faces of the icosahedron and giving it a 1/3 turn clockwise.

• 144° (12 of these). These correspond to taking one of the vertices of the icosahedron and giving it a 2/5 turn clockwise.

• 180° (30 of these). These correspond to taking one of the edges and giving it a 1/2 turn clockwise. (Note that since we’re working in the double cover \mathrm{SU}(2) rather than \mathrm{SO}(3), giving one edge a half turn clockwise counts as different than giving the opposite edge a half turn clockwise.)

• 216° (12 of these). These correspond to taking one of the vertices of the icosahedron and giving it a 3/5 turn clockwise. (Again, this counts as different than rotating the opposite vertex by a 2/5 turn clockwise.)

• 240° (20 of these). These correspond to taking one of the faces of the icosahedron and giving it a 2/3 turn clockwise. (Again, this counts as different than rotating the opposite face by a 1/3 turn clockwise.)

• 288° (12 of these). These correspond to taking any of the vertices and giving it a 4/5 turn clockwise.

• 360° (1 of these). This corresponds to a full turn in any direction.

Let’s check:

1 + 12 + 20 + 12 + 30 + 12 + 20 + 12 + 1 = 120

Good! We need a total of 120 vertices.
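These counts can be double-checked by brute force: realizing the vertices as the 120 icosians (in the standard coordinate description, which I assume here) and grouping them by real part, since a unit quaternion at rotation angle θ from the identity has real part cos(θ/2). A sketch:

```python
# Count the vertices of the 600-cell (the 120 icosians) by quaternionic
# real part.  A vertex at angle θ from the identity has real part
# cos(θ/2), so the nine slices should have sizes 1,12,20,12,30,12,20,12,1.
from itertools import permutations, product
from math import sqrt
from collections import Counter

phi = (1 + sqrt(5)) / 2
vertices = set()

def key(q):  # round so floating-point tuples compare reliably
    return tuple(round(c, 9) for c in q)

for i in range(4):                             # ±1, ±i, ±j, ±k
    for s in (1.0, -1.0):
        q = [0.0] * 4; q[i] = s
        vertices.add(key(q))
for signs in product((0.5, -0.5), repeat=4):   # (±1 ± i ± j ± k)/2
    vertices.add(key(signs))
for signs in product((1, -1), repeat=3):       # even perms of (0,±1,±φ,±1/φ)/2
    base = (0.0, signs[0] * 0.5, signs[1] * phi / 2, signs[2] / (2 * phi))
    for p in permutations(range(4)):
        inv = sum(p[i] > p[j] for i in range(4) for j in range(i + 1, 4))
        if inv % 2 == 0:
            vertices.add(key(tuple(base[i] for i in p)))

assert len(vertices) == 120
counts = Counter(q[0] for q in vertices)       # group by real part
slices = [counts[r] for r in sorted(counts, reverse=True)]
print(slices)   # real parts 1, φ/2, 1/2, 1/(2φ), 0, -1/(2φ), -1/2, -φ/2, -1
assert slices == [1, 12, 20, 12, 30, 12, 20, 12, 1]
```

The nine distinct real parts are exactly cos(θ/2) for θ = 0°, 72°, 120°, 144°, 180°, 216°, 240°, 288°, 360°, matching the list above.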

This calculation also shows that if we move a hyperplane through the 3-sphere, which hits our favorite vertex the moment it touches the 3-sphere, it will give the following slices of the 600-cell:

• Slice 1: a point (our favorite vertex),

• Slice 2: an icosahedron (its 12 nearest neighbors),

• Slice 3: a dodecahedron (the 20 next-nearest neighbors),

• Slice 4: an icosahedron (the 12 third-nearest neighbors),

• Slice 5: an icosidodecahedron (the 30 fourth-nearest neighbors),

• Slice 6: an icosahedron (the 12 fifth-nearest neighbors),

• Slice 7: a dodecahedron (the 20 sixth-nearest neighbors),

• Slice 8: an icosahedron (the 12 seventh-nearest neighbors),

• Slice 9: a point (the vertex opposite our favorite).

Here’s a picture drawn by J. Gregory Moxness, illustrating this:

Note that there are 9 slices. Each corresponds to a different conjugacy class in the group \Gamma. These in turn correspond to the dots in the extended Dynkin diagram of \mathrm{E}_8, which has the usual 8 dots and one more.

The usual \mathrm{E}_8 Dynkin diagram has ‘legs’ of lengths 5, 2 and 3:

The three legs correspond to conjugacy classes in \Gamma that map to rotational symmetries of an icosahedron that preserve a vertex (5 conjugacy classes), an edge (2 conjugacy classes), and a face (3 conjugacy classes)… not counting the element -1 \in \Gamma. That last element gives the extra dot in the extended Dynkin diagram.

by John Baez at December 16, 2017 01:44 AM

December 15, 2017

Christian P. Robert - xi'an's og

the decline of the French [maths] empire

In Le Monde edition of Nov 5, an article on the difficulty maths departments have in attracting students, especially into master programs and into the training of secondary school maths teachers (Agrégation & CAPES), where the number of candidates usually does not reach the number of potential positions… And also on the deep changes in the training of secondary school pupils, who over the past five years have lost a considerable part of their grounding in maths and hence are found wanting when entering university. (Or, put otherwise, have a lower level in maths that implies a strong modification of our own programs and possibly the addition of an extra year, or at least an extra semester, to the bachelor degree…) For instance, a few weeks ago I realised that my third year class had little idea of a conditional density, and teaching measure theory at this level becomes more and more of a challenge!


by xi'an at December 15, 2017 11:17 PM

Tommaso Dorigo - Scientificblogging

Diboson Resonances: A Whole New World - Part 1
After one quite frantic November, I emerged victorious two weeks ago from the delivery of a 78-page, 49-thousand-word review titled "Hadron Collider Searches for Diboson Resonances". The article, which will be published in the prestigious "Progress in Particle and Nuclear Physics", an Elsevier journal with an impact factor above 11 (compare with Physics Letters B, IF=4.8, or Physical Review Letters, IF=8.5, to see why that is relevant), is currently in peer review, but that does not mean that I cannot make a short summary of its contents here.


by Tommaso Dorigo at December 15, 2017 05:01 PM

Peter Coles - In the Dark

Trees, Graphs and the Leaving Certificate

I’m starting to get the hang of some of the differences between things here in Ireland and the United Kingdom, both domestically and in the world of work.

One of the most important points of variation that concerns academic life is the school system students go through before going to university. In the system operating in England and Wales the standard qualification for entry is the GCE A-level. Most students take A-levels in three subjects, which gives them a relatively narrow focus, although the range of subjects to choose from is rather large. In Ireland the standard qualification is the Leaving Certificate, which comprises a minimum of six subjects, giving students a broader range of knowledge at the sacrifice (perhaps) of a certain amount of depth; for admissions purposes here it has been decreed that an Irish Leaving Certificate subject counts as about 2/3 of an A-level, so Irish students do the equivalent of at least four A-levels, and many do more than this.

There’s a lot to be said for the increased breadth of subjects undertaken for the leaving certificate, but I have no direct experience of teaching first-year university students here yet so I can’t comment on their level of preparedness.

Coincidentally, though, one of the first emails I received this week referred to a consultation about proposed changes to the Leaving Certificate in Applied Mathematics. Not knowing much about the old syllabus, I didn’t feel there was much I could add, but I had a look at the new one and was surprised to see a whole ‘Strand’ on mathematical modelling with networks and graphs.

The introductory blurb reads:

In this strand students learn about networks or graphs as mathematical models which can be used to investigate a wide range of real-world problems. They learn about graphs and adjacency matrices and how useful these are in solving problems. They are given further opportunity to consolidate their understanding that mathematical ideas can be represented in multiple ways. They are introduced to dynamic programming as a quantitative analysis technique used to solve large, complex problems that involve the need to make a sequence of decisions. As they progress in their understanding they will explore and appreciate the use of algorithms in problem solving as well as considering some of the wider issues involved with the use of such techniques.


Among the specific topics listed you will find:

  • Minimal spanning trees applied to problems involving optimising networks, and algorithms associated with finding these (Kruskal, Prim);
  • Bellman’s optimality principle to find the shortest paths in a weighted directed network, and to be able to formulate the process algebraically;
  • Dijkstra’s algorithm to find shortest paths in a weighted directed network; etc.
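For readers who have not met it, Dijkstra’s algorithm is a pleasantly small amount of code. A minimal sketch on a toy weighted directed network (the graph below is invented purely for illustration):

```python
# A minimal sketch of Dijkstra's algorithm for shortest paths in a
# weighted directed network, one of the syllabus topics listed above.
import heapq

def dijkstra(graph, source):
    """Return shortest-path distances from source to every reachable node.

    graph: dict mapping node -> list of (neighbour, edge_weight) pairs.
    """
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float('inf')):
            continue                 # stale queue entry, already improved
        for v, w in graph.get(u, []):
            if d + w < dist.get(v, float('inf')):
                dist[v] = d + w
                heapq.heappush(heap, (d + w, v))
    return dist

network = {
    'A': [('B', 4), ('C', 2)],
    'C': [('B', 1), ('D', 7)],
    'B': [('D', 3)],
    'D': [],
}
print(dijkstra(network, 'A'))   # {'A': 0, 'B': 3, 'C': 2, 'D': 6}
```

Note how the direct edge A→B of weight 4 loses to the detour A→C→B of weight 3; handling exactly this kind of case correctly is what the priority queue buys you.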


For the record I should say that I’ve actually used minimal spanning trees in a research context (see, e.g., this paper) and have read (and still have) a number of books on graph theory, which I find a truly fascinating subject. The topics listed above are all interesting and useful in a range of contexts, but they seem rather advanced for a pre-university student and will be unfamiliar to a great many potential teachers of Applied Mathematics too. It may turn out, therefore, that students will end up with only a very superficial knowledge of this very trendy subject, when they would actually be better off getting a more solid grounding in traditional mathematical methods, so I wonder what the reaction to this proposal will be!




by telescoper at December 15, 2017 02:44 PM

December 14, 2017

The n-Category Cafe

Entropy Modulo a Prime

In 1995, the German geometer Friedrich Hirzebruch retired, and a private booklet was put together to mark the occasion. That booklet included a short note by Maxim Kontsevich entitled “The $1\tfrac{1}{2}$-logarithm”.

Kontsevich’s note didn’t become publicly available until five years later, when it was included as an appendix to a paper on polylogarithms by Philippe Elbaz-Vincent and Herbert Gangl. Towards the end, it contains the following provocative words:

Conclusion: If we have a random variable $\xi$ which takes finitely many values with all probabilities in $\mathbb{Q}$ then we can define not only the transcendental number $H(\xi)$ but also its “residues modulo $p$” for almost all primes $p$!

Kontsevich’s note was very short and omitted many details. I’ll put some flesh on those bones, showing how to make sense of the sentence above, and much more.

The “$H$” that Kontsevich uses here is the symbol for entropy, or more exactly, Shannon entropy. So, I’ll begin by recalling what that is. That will pave the way for what I really want to talk about, which is a kind of entropy for probability distributions where the “probabilities” are not real numbers, but elements of the field $\mathbb{Z}/p\mathbb{Z}$ of integers modulo a prime $p$.

Let $\pi = (\pi_1, \ldots, \pi_n)$ be a finite probability distribution. (It would be more usual to write a probability distribution as $p$, but I want to reserve that letter for prime numbers.) The entropy of $\pi$ is

$$ H_\mathbb{R}(\pi) = - \sum_{i \colon \pi_i \neq 0} \pi_i \log \pi_i. $$

Usually this is just written as $H$, but I want to emphasize the role of the real numbers here: both the probabilities $\pi_i$ and the entropy $H_\mathbb{R}(\pi)$ belong to $\mathbb{R}$.
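In code, real entropy is nearly a one-liner. A minimal sketch (the function name is my own, and natural logarithms are used, matching the formula above):

```python
# Shannon entropy H_R of a finite probability distribution, with the
# usual convention that terms with p_i = 0 are skipped (0 log 0 = 0).
from math import log

def entropy(pi):
    assert abs(sum(pi) - 1) < 1e-9, "probabilities must sum to 1"
    return -sum(p * log(p) for p in pi if p != 0)

print(entropy([0.5, 0.5]))        # the fair coin: log 2 ≈ 0.6931
assert entropy([1.0, 0.0]) == 0   # a certain outcome carries no surprise
```

The `if p != 0` filter is exactly the restriction to $i$ with $\pi_i \neq 0$ in the sum above.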

There are applications of entropy in dozens of branches of science… but none will be relevant here! This is purely a mathematical story, though if anyone can think of any possible application or interpretation of entropy modulo a prime, I’d love to hear it.

The challenge now is to find the correct analogue of entropy when the field $\mathbb{R}$ is replaced by the field $\mathbb{Z}/p\mathbb{Z}$ of integers mod $p$, for any prime $p$. So, we want to define a kind of entropy

$$ H_p(\pi_1, \ldots, \pi_n) \in \mathbb{Z}/p\mathbb{Z} $$

when $\pi_i \in \mathbb{Z}/p\mathbb{Z}$.

We immediately run into an obstacle. Over $\mathbb{R}$, probabilities are required to be nonnegative. Indeed, the logarithm in the definition of entropy doesn’t make sense otherwise. But in $\mathbb{Z}/p\mathbb{Z}$, there is no notion of positive or negative. So, what are we even going to define the entropy of?

We take the simplest way out: ignore the problem. So, writing

$$ \Pi_n = \{ (\pi_1, \ldots, \pi_n) \in (\mathbb{Z}/p\mathbb{Z})^n : \pi_1 + \cdots + \pi_n = 1 \}, $$

we’re going to try to define

$$ H_p(\pi) \in \mathbb{Z}/p\mathbb{Z} $$

for each $\pi = (\pi_1, \ldots, \pi_n) \in \Pi_n$.

Let’s try the most direct approach to doing this. That is, let’s stare at the formula defining real entropy…

$$ H_\mathbb{R}(\pi) = - \sum_{i \colon \pi_i \neq 0} \pi_i \log \pi_i $$

… and try to write down the analogous formula over $\mathbb{Z}/p\mathbb{Z}$.

The immediate question is: what should play the role of the logarithm mod $p$?

The crucial property of the ordinary logarithm is that it converts multiplication into addition. Specifically, we’re concerned here with logarithms of nonzero probabilities, and $\log$ defines a homomorphism from the multiplicative group $(0, 1]$ of nonzero probabilities to the additive group $\mathbb{R}$.

Mod $p$, then, we want a homomorphism from the multiplicative group $(\mathbb{Z}/p\mathbb{Z})^\times$ of nonzero probabilities to the additive group $\mathbb{Z}/p\mathbb{Z}$. And here we hit another obstacle: a simple argument using Lagrange’s theorem shows that apart from the zero map, no such homomorphism exists. (The domain has order $p - 1$ and the codomain has order $p$, and these are coprime, so every element of the image must be trivial.)
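The obstruction can be made completely concrete: a homomorphism out of the cyclic group $(\mathbb{Z}/p\mathbb{Z})^\times$ is determined by the image $c$ of a generator $g$, and $g^{p-1} = 1$ forces $(p-1)c \equiv 0 \pmod p$. A brute-force sketch for small primes:

```python
# A homomorphism h from the cyclic group (Z/pZ)^x (order p-1) to the
# additive group Z/pZ (order p) is determined by c = h(g) for a
# generator g, and g^(p-1) = 1 forces (p-1)*c = 0 (mod p).  Since
# gcd(p-1, p) = 1, only c = 0 survives: there is no nonzero "logarithm".
for p in [2, 3, 5, 7, 11, 13]:
    valid = [c for c in range(p) if ((p - 1) * c) % p == 0]
    assert valid == [0], p
print("only the zero homomorphism exists for each prime tested")
```

Of course the one-line gcd argument already settles this for all primes at once; the loop is just a sanity check.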

So, we seem to be stuck. Actually, we’re stuck in a way that often happens when you try to construct something new, working by analogy with something old: slavishly imitating the old situation, symbol for symbol, often doesn’t work. In the most interesting analogies, there are wrinkles.

To make some progress, instead of looking at the formula for entropy, let’s look at the properties of entropy.

The most important property is a kind of recursivity. In the language spoken by many patrons of the Café, finite probability distributions form an operad. Explicitly, this means the following.

Suppose I flip a coin. If it’s heads, I roll a die, and if it’s tails, I draw from a pack of cards. This is a two-stage process with 58 possible final outcomes: either the face of a die or a playing card. Assuming that the coin toss, die roll and card draw are all fair, the probability distribution on the 58 outcomes is

$$ (1/12, \ldots, 1/12, 1/104, \ldots, 1/104), $$

with $6$ copies of $1/12$ and $52$ copies of $1/104$. Generally, given a probability distribution $\gamma = (\gamma_1, \ldots, \gamma_n)$ on $n$ elements and, for each $i \in \{1, \ldots, n\}$, a probability distribution $\pi^i = (\pi^i_1, \ldots, \pi^i_{k_i})$ on $k_i$ elements, we get a composite distribution

$$ \gamma \circ (\pi^1, \ldots, \pi^n) = (\gamma_1 \pi^1_1, \ldots, \gamma_1 \pi^1_{k_1}, \ldots, \gamma_n \pi^n_1, \ldots, \gamma_n \pi^n_{k_n}) $$

on $k_1 + \cdots + k_n$ elements.

For example, take the coin-die-card process above. Writing $u_n$ for the uniform distribution on $n$ elements, the final distribution on $58$ elements is $u_2 \circ (u_6, u_{52})$, which I wrote out explicitly above.

The important recursivity property of entropy is called the chain rule, and it states that

H_\mathbb{R}(\gamma \circ (\pi^1, \ldots, \pi^n)) = H_\mathbb{R}(\gamma) + \sum_{i = 1}^n \gamma_i H_\mathbb{R}(\pi^i).

It’s easy to check that this is true. (It’s also nice to understand it in terms of information… but if I follow every tempting explanatory byway, I’ll run out of energy too soon.) And in fact, it characterizes entropy almost uniquely:

Theorem   Let I be a function assigning a real number I(\pi) to each finite probability distribution \pi. The following are equivalent:

  • I is continuous in \pi and satisfies the chain rule;

  • I = c H_\mathbb{R} for some constant c \in \mathbb{R}.

The theorem as stated is due to Faddeev, and I blogged about it earlier this year. In fact, you can weaken “continuous” to “measurable” (a theorem of Lee), but that refinement won’t be important here.
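The chain rule is also easy to check numerically. Here is a small Python sketch of the coin-die-card example above (the function names are mine, not from the post):

```python
import math

def H(p):
    """Shannon entropy (natural log) of a finite probability distribution."""
    return -sum(x * math.log(x) for x in p if x > 0)

def compose(gamma, pis):
    """The composite distribution gamma o (pi^1, ..., pi^n)."""
    return [g * x for g, pi in zip(gamma, pis) for x in pi]

def uniform(n):
    return [1.0 / n] * n

# Coin-die-card example: u_2 o (u_6, u_52), a distribution on 58 elements
# with 6 copies of 1/12 and 52 copies of 1/104.
composite = compose(uniform(2), [uniform(6), uniform(52)])
assert len(composite) == 58
assert abs(sum(composite) - 1) < 1e-12

# Chain rule: H(gamma o (pi^1,...,pi^n)) = H(gamma) + sum_i gamma_i H(pi^i)
lhs = H(composite)
rhs = H(uniform(2)) + 0.5 * H(uniform(6)) + 0.5 * H(uniform(52))
assert abs(lhs - rhs) < 1e-12
```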

What is important is this. In our quest to imitate real entropy in <semantics>/p<annotation encoding="application/x-tex">\mathbb{Z}/p\mathbb{Z}</annotation></semantics>, we now have something to aim for. Namely: we want a sequence of functions <semantics>H p:Π n/p<annotation encoding="application/x-tex">H_p : \Pi_n \to \mathbb{Z}/p\mathbb{Z}</annotation></semantics> satisfying the obvious analogue of the chain rule. And if we’re really lucky, there will be essentially only one such sequence.

We’ll discover that this is indeed the case. Once we’ve found the right definition of <semantics>H p<annotation encoding="application/x-tex">H_p</annotation></semantics> and proved this, we can very legitimately baptize <semantics>H p<annotation encoding="application/x-tex">H_p</annotation></semantics> as “entropy mod <semantics>p<annotation encoding="application/x-tex">p</annotation></semantics>” — no matter what weird and wonderful formula might be used to define it — because it has the same characteristic properties as entropy over <semantics><annotation encoding="application/x-tex">\mathbb{R}</annotation></semantics>.

I think I’ll leave you on that cliff-hanger. If you’d like to guess what the definition of entropy mod <semantics>p<annotation encoding="application/x-tex">p</annotation></semantics> is, go ahead! Otherwise, I’ll tell you next time.

by leinster at December 14, 2017 11:18 PM

Emily Lakdawalla - The Planetary Society Blog

Congress rejects graduate student tax
The Planetary Society was proud to join dozens of other scientific organizations in standing against this unnecessary and detrimental tax increase on the future scientific workforce of the United States.

December 14, 2017 09:22 PM

Peter Coles - In the Dark

A Python Toolkit for Cosmology

The programming language Python has established itself as the industry standard for researchers in physics and astronomy (as well as the many other fields, including most of those covered by the Data Innovation Research Institute which employs me part-time). It has also become the standard vehicle for teaching coding skills to undergraduates in many disciplines. In fact it looks like the first module I will be teaching in Maynooth next term is in Computational Physics, and that will be delivered using Python too. It’s been a while since I last did any significant hands-on programming, so this will provide me with a good refresher. The best way to learn something well is to have to teach it to others!

But I digress. This morning I noticed a paper by Benedikt Diemer on the arXiv with the title COLOSSUS: A python toolkit for cosmology, large-scale structure, and dark matter halos. Here is the abstract:

This paper introduces Colossus, a public, open-source python package for calculations related to cosmology, the large-scale structure of matter in the universe, and the properties of dark matter halos. The code is designed to be fast and easy to use, with a coherent, well-documented user interface. The cosmology module implements FLRW cosmologies including curvature, relativistic species, and different dark energy equations of state, and provides fast computations of the linear matter power spectrum, variance, and correlation function. The large-scale structure module is concerned with the properties of peaks in Gaussian random fields and halos in a statistical sense, including their peak height, peak curvature, halo bias, and mass function. The halo module deals with spherical overdensity radii and masses, density profiles, concentration, and the splashback radius. To facilitate the rapid exploration of these quantities, Colossus implements about 40 different fitting functions from the literature. I discuss the core routines in detail, with a particular emphasis on their accuracy. Colossus is available at

The software can be downloaded here. It looks a very useful package that includes code to calculate many of the bits and pieces used by cosmologists working on the theory of large-scale structure and galaxy evolution. It is also, I hope, an example of a trend towards greater use of open-source software, for which I congratulate the author! I think this is an important part of the campaign to create truly open science, as I blogged about here.

An important aspect of the way science works is that when a given individual or group publishes a result, it should be possible for others to reproduce it (or not, as the case may be). At present, this can’t always be done. In my own field of astrophysics/cosmology, for example, results in traditional scientific papers are often based on very complicated analyses of large data sets. This is increasingly the case in other fields too. A basic problem obviously arises when data are not made public. Fortunately in astrophysics these days researchers are pretty good at sharing their data, although this hasn’t always been the case.

However, even allowing open access to data doesn’t always solve the reproducibility problem. Often extensive numerical codes are needed to process the measurements and extract meaningful output. Without access to these pipeline codes it is impossible for a third party to check the path from input to output without writing their own version assuming that there is sufficient information to do that in the first place. That researchers should publish their software as well as their results is quite a controversial suggestion, but I think it’s the best practice for science. There isn’t a uniform policy in astrophysics and cosmology, but I sense that quite a few people out there agree with me. Cosmological numerical simulations, for example, can be performed by anyone with a sufficiently big computer using GADGET the source codes of which are freely available. Likewise, for CMB analysis, there is the excellent CAMB code, which can be downloaded at will; this is in a long tradition of openly available numerical codes, including CMBFAST and HealPix.

I suspect some researchers might be reluctant to share the codes they have written because they feel they won’t get sufficient credit for work done using them. I don’t think this is true, as researchers are generally very appreciative of such openness and publications describing the corresponding codes are generously cited. In any case I don’t think it’s appropriate to withhold such programs from the wider community, which prevents them being either scrutinized or extended as well as being used to further scientific research. In other words excessively proprietorial attitudes to data analysis software are detrimental to the spirit of open science.

Anyway, my views aren’t guaranteed to be representative of the community, so I’d like to ask for a quick show of hands via a poll…


…and you are of course welcome to comment via the usual box.

by telescoper at December 14, 2017 11:21 AM

Robert Helling - atdotde

What are the odds?
It's the time of year, you give out special problems in your classes. So this is mine for the blog. It is motivated by this picture of the home secretaries of the German federal states after their annual meeting as well as some recent discussions on Facebook:
I would like to call it Summers' problem:

Let's have two real random variables $M$ and $F$ drawn according to two probability distributions $\rho_{M/F}(x)$ (for starters you may assume both to be Gaussians, possibly with different means and variances). Take $N$ draws from each and order the $2N$ results. What is the probability that the $k$ largest ones are all from $M$ rather than $F$? Express your results in terms of the $\rho_{M/F}(x)$. We are also interested in asymptotic results for $N$ large and $k$ fixed, as well as for $N$ and $k$ large but $k/N$ fixed.
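The closed-form answer is the exercise, but the probability is easy to estimate by simulation. A minimal Monte Carlo sketch in Python (the function name and interface are mine):

```python
import random

def prob_top_k_all_from_M(N, k, draw_M, draw_F, trials=100_000):
    """Monte Carlo estimate of the probability that the k largest of the
    2N draws (N from M, N from F) all come from M."""
    hits = 0
    for _ in range(trials):
        draws = [(draw_M(), "M") for _ in range(N)] + \
                [(draw_F(), "F") for _ in range(N)]
        draws.sort(key=lambda t: t[0], reverse=True)
        hits += all(label == "M" for _, label in draws[:k])
    return hits / trials

# Sanity check: if M and F have identical distributions, exchangeability
# gives the exact answer N(N-1)...(N-k+1) / (2N(2N-1)...(2N-k+1));
# for N = 8, k = 2 this is 56/240 = 7/30, about 0.233.
random.seed(1)
est = prob_top_k_all_from_M(8, 2,
                            lambda: random.gauss(0, 1),
                            lambda: random.gauss(0, 1))
```

Shifting the mean of `draw_M` upward shows how quickly the probability departs from the symmetric baseline.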

Last bonus question: How many of the people who say they hire only on merit, and end up with an all-male board, realise that they are thereby saying that women are not as good by quite a margin?

by Robert Helling at December 14, 2017 08:58 AM

December 13, 2017

Emily Lakdawalla - The Planetary Society Blog

Brief note from #AGU17: Juno observes volcanism on Io
At the American Geophysical Union meeting, members of the Juno team showed observations of active volcanism on Jupiter's moon Io.

December 13, 2017 05:07 PM

Peter Coles - In the Dark

Problems with two-year degrees

I see that the Minister responsible for UK universities, Jo Johnson, has decided that universities should offer two-year degrees, claiming that this will somehow attract more students into higher education.

The idea seems to be that students will get the same `amount’ of teaching, but concentrated in two full calendar years rather than spread over three academic years. This fast-track degree will be offered at a lower level of fee than a normal three-year Bachelors programme.

I can just about accept that this will work in some disciplines and at some universities. The (private) University of Buckingham, for example, already offers such programmes. On the other hand, the University of Buckingham did not participate in the latest Research Excellence Framework, no doubt for the reason that teaching all year round leaves its academic staff no time to do research or even attend conferences, which these days is only possible during the summer recess.

Call me old-fashioned, but I think an institution that does not combine teaching and research – and indeed one in which the teaching is not led by research – does not merit the name of `University’. The old polytechnics offered a range of valuable opportunities that complemented the traditional honours degree, but that capacity was basically eliminated in 1992 when all such institutions became universities.

Though my main objection to two-year degrees is their impact on research, there are problems from the teaching side too. One is that keeping up the intensity of full-time study throughout a whole year will, in my opinion, exacerbate the difficulty many students have managing their workload without stress or other mental health difficulties. Moreover, many students currently use the long summer vacation either to work to earn money that helps offset the cost of study, or to participate in placements, internships or other activities that make them more employable after graduation.

It would be particularly difficult to manage two-year degrees in STEM disciplines, as the teaching laboratories need maintenance and installation of new equipment, for which the proposed system allows no time. And how would project work fit into the fast-track system? On top of all that there’s the fact that the current fee level does not cover the cost of teaching in STEM disciplines, so having to do it faster and for less money is not going to be possible. Incidentally, many STEM students currently pursue undergraduate programmes that last four years, not three…

These points have no doubt been made before, but there is another point that is less widely understood. The fact is that a two-year Bachelors degree may not be a recognised qualification outside the UK. This is, in fact, already a problem with the four-year undergraduate programmes we call, e.g., MPhys, and regard as Masters level in this country: these are not regarded as Masters qualifications in many European countries. Perhaps this is part of some cunning plan to stop graduates leaving the UK after Brexit?

In the light of these difficulties it is no surprise to me that not a single undergraduate I’ve spoken to thinks that a two-year degree is a sensible option. If the government wants to make studying cheaper, said one Physics student I was chatting to, why don’t they just cut the fees for normal degree programmes?

The impression one gets from all this `thinking’ is that the Government increasingly regards universities as businesses that trade in a commodity called `education’, where the word ‘education’ is narrowly construed as `training’ in the skills needed for future employment. I believe a University education is (or should be) far more about developing critical thinking, problem-solving ability and intellectual curiosity than it is about teaching students, e.g., programming skills. Skills are important, of course, but we also need to educate students in what to use them for.

by telescoper at December 13, 2017 04:06 PM

December 12, 2017

Clifford V. Johnson - Asymptotia

Noted…
Last week the always-interesting Maria Popova of Brain Pickings wrote a piece about the book. I was pleased to see what she wrote because it was clear that she really understood many of the several things I was trying to do in making the book. (I say this because my expectation is usually that people aren't going to click with it because it does not fit narrow presuppositions for either a non-fiction science book or for a graphic novel.) So this was a very pleasant surprise indeed. There's no point trying to paraphrase her, so let me simply point you there with this link.

The book also made the roundup of top Science Books for 2017 on NPR's [...]

The post Noted… appeared first on Asymptotia.

by Clifford at December 12, 2017 10:03 PM

CERN Bulletin

Father Christmas came to CERN on Saturday, 2 December!

Every year, ever since its creation, the Staff Association has organised the CERN Children’s Christmas Party, bringing together 5- to 7-year-old children of employed members of the personnel. The success of the party continues to motivate the organizers of the Staff Association.

This year, the party took place on Saturday, 2 December, and no less than 240 children were welcomed in two sessions, at 13.30 and at 15.30. The children attended a show with music, tales and a speaking puppet: “Zéphirine et les légendes de Noël”.

After the show, they enjoyed a snack in Restaurant 1. We would like to thank Novae for their valuable help and generous contribution.

Then, Father Christmas himself came to give the children their presents. The Staff Association would also like to warmly thank him for taking the time to bring happiness and joy to little ones and big ones alike during the busy season!

We would also like to thank all the parents for their valuable collaboration.

Finally, we wish you all happy holidays and look forward to seeing you next year!

December 12, 2017 05:12 PM

CERN Bulletin

52 years of kindergarten – the structure has proved successful and must not disappear – let’s save our nursery and school together!

Since the beginning of 2016, the Staff Association has been in discussions with the Management to save and sustain our Nursery and School, located on the CERN site in Meyrin.

Where are we now with the discussions and what does the future hold for our Children’s Day-Care Centre and School (EVEE)?

A closer look at the creation of the Kindergarten and its management

A group of parents founded the Kindergarten at CERN in the 1960s, and in 1969, the CERN Staff Association took the structure under its aegis. This change in management resulted in a partnership agreement between CERN and the Staff Association. The agreement defined the rights and duties of both parties with regard to the Staff Association operating a kindergarten on the CERN site. Since then, the Staff Association has been the employer and manager of the structure providing early childhood services.

Development of the structure over time

In 1977, the Kindergarten changed premises and a new agreement was signed between CERN and the Staff Association. This agreement is still in force today.

More recently, the Staff Association, concerned with the wellbeing of children and seeking to meet parents’ expectations, has put in place new services in consultation with the CERN Management:

  • in 2009, creation of a canteen to serve approximately 60 children per day;
  • in 2013, creation of a nursery to accommodate children from 4 months to 3 years old (around 35 toddlers);
  • in 2015, creation of a summer camp with the capacity to accommodate 40 children during the month of July.


EVEE facing budgetary difficulties – how is the crisis managed?

The Children’s Day-Care Centre and School (EVEE) structure is facing recurring budgetary difficulties, in large part due to the establishment of the canteen and the nursery. Indeed, these two services have led to an annual structural deficit which the current financial support of the Organization and the increases in school fees do not suffice to cover.

In 2015, the EVEE Steering Committee endeavoured, in spite of great difficulty, to maximise the revenue and to contain the expenses in order to achieve a balanced budget.

In 2016, informed of the precarious situation of the EVEE structure, the CERN Management decided to put in place a working group to take stock of the situation, to assess the needs of the members of the personnel (MPE and MPA) in terms of early childhood, and to make proposals for a sustainable and viable solution together with the Staff Association.

In 2017, at the request of the Staff Association, an audit of the accounts was carried out. This audit shows that the management is sound overall, and that optimization measures alone are not sufficient to return to a balanced budget. An adequate subsidy from our “State”, CERN, is necessary.

At the same time, CERN has agreed to cover the deficit with additional subsidies, allowing the Staff Association to finish the year 2016-2017 and to ensure the start of the school year 2017-2018.

How has CERN responded at the end of 2017?

First response: Privatisation of the Nursery and imminent closure of the School

After more than a year of discussions to enable the Management and the Staff Association to find a sustainable and viable solution for the EVEE structure, CERN decided unilaterally to subcontract the operation of the nursery and to close the school.

Indeed, at the end of November, an Invitation to Tender was sent out to several companies that manage multiple early childhood structures on Swiss territory, in order to take a decision at the beginning of 2018.

It was only on reading this Invitation to Tender, drawn up by the Procurement Services, that the Staff Association learned that it would cease operating the EVEE structure by 31 August 2018 and that the operation of a new structure, no longer including a school, would be entrusted to a contractor as of 1 September 2018.

How do you think the Staff Association welcomed this news after nearly 50 years of partnership with CERN? How could the parents react, let alone the more than 40 employees of the structure?

Following this announcement, and meetings at the highest levels of the Organization, commitments have been made, reassuring, in part, the employees of the structure, the parents and the employer, the Staff Association.

Second response: Outsourcing the Nursery and maintaining the School managed by the Staff Association

On Wednesday, 6 December, at a meeting to which all members of the personnel with children under 4 were invited, the Head of the HR Department, James Purvis, announced the intention of CERN to outsource the nursery and to maintain the School under the management of the Staff Association.

Moreover, the Director for Finance and Human Resources, Martin Steinacher, announced the commitment that there would be no layoffs.

Concerns persist

Despite the latest developments, the Staff Association, the parents and the employees of the EVEE structure remain concerned about the future of the EVEE structure comprising a nursery and a school.

Indeed, during a meeting with the Management, preceding that of 6 December, the Management announced the continuation of the School, contrary to what was originally stated in the Invitation to Tender, but only for a limited duration yet to be determined.

Is this still the intention of the Directorate? If so, the commitment not to dismiss the personnel is, de facto, null and void!

It is not too late to sustain the partnership between CERN and the Staff Association!

The services the Staff Association has provided for many years are above all of a high quality and adapted to an international environment. In the opinion of the parents who currently have children in the structure, as well as those who used to, the service quality is remarkable.

The EVEE is also the only structure within the canton that provides care for children from 4 months to 6 years old. This unique service makes it possible to prepare children, often non-French-speaking, for integration into French or Swiss schools.

The Staff Association strives to do its utmost to save the structure as it is today, not only for the sake of the unique educational offer, but also because the School generates a profit, which helps reduce the deficit of the nursery and the canteen.

Furthermore, why does CERN seek to break a long-standing partnership when no substantial savings can be achieved, and when the rupture would very likely lead to a decrease in the quality of the educational offer? Not to mention the impact such a decision may have on the employees, despite the Directorate’s commitment that there would be no layoffs.

The Staff Association is a responsible and reliable employer that seeks to preserve a unique, high quality educational offer, while committing to provide, with the help of parents and employees, a viable and competitive business model.

SAVE OUR NURSERY AND SCHOOL, this is what the parents, the more than 40 employees of EVEE and the employer, the Staff Association, are calling for.

December 12, 2017 04:12 PM

CERN Bulletin

Cine club

Wednesday 20 December 2017 at 20:00
CERN Council Chamber

Uncle Boonmee Who Can Recall His Past Lives

Directed by Apichatpong Weerasethakul
Thailand, 2010, 114 minutes

Suffering from acute kidney failure, Uncle Boonmee has chosen to spend his final days surrounded by his loved ones in the countryside. Surprisingly, the ghost of his deceased wife appears to care for him, and his long lost son returns home in a non-human form. Contemplating the reasons for his illness, Boonmee treks through the jungle with his family to a mysterious hilltop cave - the birthplace of his first life.

Original version Thai / French / Lao; English subtitles


Wednesday 10 January 2018 at 20:00
CERN Council Chamber

Collateral
Directed by Michael Mann
USA, 2004, 115 minutes

One night in Los Angeles, cab driver Max Durocher picks up a gray-suited man named Vincent. Vincent offers Max a large sum of money to drive him to five locations around LA before the night is up. Max accepts, but realizes that Vincent is a hitman who has been hired to kill five people that night. Max is forced to drive Vincent around the City of Angels, unsure if he'll live to see sunrise.

Original version English; French subtitles


Wednesday 17 January 2018 at 20:00
CERN Council Chamber

Memories of Murder

Directed by Joon-ho Bong
South Korea, 2003, 132 minutes

In a small Korean province in 1986, three detectives struggle with the case of multiple young women being found raped and murdered by an unknown culprit.

Original version Korean; English subtitles

December 12, 2017 02:12 PM

CERN Bulletin

Results of the 2017 elections

The election of the Staff Council for the period 2018-2019 is now over, and the first lesson is a voter turnout of 56.15 %, higher than in the previous election. This clearly shows the interest that members of the Staff Association attach to the work and dedication of their delegates. Of course, we also thank all those who stood as candidates and expressed their commitment to actively defend the interests of the staff and of CERN.

This newly-elected Staff Council (see its composition below) is truly representative of all sectors and professions of the Organization. This will be a major asset when representatives of the Staff Association discuss with the Management and the Member States the issues we will have to address during the next two years.

Bolstered by this vote of confidence, we are certain that we can count on the active and ongoing support of our members and of all personnel at CERN in the future. We know there will be no shortage of challenges. Together we will be stronger and more creative in taking them on.

NEW STAFF COUNCIL - 2018-2019 mandate

1     Group A: benchmark jobs classified in grade spans 1-2-3, 2-3-4, 3-4-5 and 4-5-6.
2     Group B: benchmark jobs classified in grade spans 6-7-8 and 9-10.

December 12, 2017 01:12 PM

CERN Bulletin

Cine club

Wednesday 13 December 2017 at 20:00
CERN Council Chamber

My Winnipeg

Directed by Guy Maddin
Canada, 2007, 80 minutes

Fact, fantasy and memory are woven seamlessly together in this portrait of film-maker Guy Maddin's home town of Winnipeg, Manitoba.

Original version English; French subtitles


December 12, 2017 11:12 AM

December 10, 2017

John Baez - Azimuth

From the Icosahedron to E8

Here’s a draft of a little thing I’m writing for the Newsletter of the London Mathematical Society. The regular icosahedron is connected to many ‘exceptional objects’ in mathematics, and here I describe two ways of using it to construct \mathrm{E}_8. One uses a subring of the quaternions called the ‘icosians’, while the other uses Patrick du Val’s work on the resolution of Kleinian singularities. I leave it as a challenge to find the connection between these two constructions!

You can see a PDF here:

From the icosahedron to E8.

Here’s the story:

From the Icosahedron to E8

In mathematics, every sufficiently beautiful object is connected to all others. Many exciting adventures, of various levels of difficulty, can be had by following these connections. Take, for example, the icosahedron—that is, the regular icosahedron, one of the five Platonic solids. Starting from this it is just a hop, skip and a jump to the \mathrm{E}_8 lattice, a wonderful pattern of points in 8 dimensions! As we explore this connection we shall see that it also ties together many other remarkable entities: the golden ratio, the quaternions, the quintic equation, a highly symmetrical 4-dimensional shape called the 600-cell, and a manifold called the Poincaré homology 3-sphere.

Indeed, the main problem with these adventures is knowing where to stop! The story we shall tell is just a snippet of a longer one involving the McKay correspondence and quiver representations. It would be easy to bring in the octonions, exceptional Lie groups, and more. But it can be enjoyed without these esoteric digressions, so let us introduce the protagonists without further ado.

The icosahedron has a long history. According to a comment in Euclid’s Elements it was discovered by Plato’s friend Theaetetus, a geometer who lived from roughly 415 to 369 BC. Since Theaetetus is believed to have classified the Platonic solids, he may have found the icosahedron as part of this project. If so, it is one of the earliest mathematical objects discovered as part of a classification theorem. It’s hard to be sure. In any event, it was known to Plato: in his Timaeus, he argued that water comes in atoms of this shape.

The icosahedron has 20 triangular faces, 30 edges, and 12 vertices. We can take the vertices to be the four points

\displaystyle{   (0 , \pm 1 , \pm \Phi)  }

and all those obtained from these by cyclic permutations of the coordinates, where

\displaystyle{   \Phi = \frac{\sqrt{5} + 1}{2} }

is the golden ratio. Thus, we can group the vertices into three orthogonal golden rectangles: rectangles whose proportions are \Phi to 1.
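These coordinates are easy to verify computationally. A small Python check (the helper names are mine) confirms that the twelve points lie on a common sphere and span the icosahedron's 30 edges, each of length 2:

```python
import itertools
import math

PHI = (math.sqrt(5) + 1) / 2  # golden ratio

def cyclic(v):
    """The three cyclic permutations of a coordinate triple."""
    x, y, z = v
    return [(x, y, z), (y, z, x), (z, x, y)]

# The 12 vertices: (0, +/-1, +/-PHI) together with cyclic permutations.
vertices = {p for s1 in (1, -1) for s2 in (1, -1)
              for p in cyclic((0.0, s1 * 1.0, s2 * PHI))}
assert len(vertices) == 12

# All vertices lie on a sphere of squared radius 1 + PHI^2.
r2 = 1 + PHI**2
assert all(abs(x*x + y*y + z*z - r2) < 1e-12 for x, y, z in vertices)

# The nearest-neighbour distance is 2, and counting pairs at that
# distance recovers the 30 edges (12 vertices * 5 neighbours / 2).
def d2(u, v):
    return sum((a - b)**2 for a, b in zip(u, v))

edges = [e for e in itertools.combinations(vertices, 2)
         if abs(d2(*e) - 4) < 1e-9]
assert len(edges) == 30
```

The three golden rectangles mentioned above are visible in the coordinates: each set of four vertices sharing a zero coordinate has sides 2 and 2\Phi, in the ratio \Phi to 1.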

In fact, there are five ways to do this. The rotational symmetries of the icosahedron permute these five ways, and any nontrivial rotation gives a nontrivial permutation. The rotational symmetry group of the icosahedron is thus a subgroup of \mathrm{S}_5. Moreover, this subgroup has 60 elements. After all, any rotation is determined by what it does to a chosen face of the icosahedron: it can map this face to any of the 20 faces, and it can do so in 3 ways. The rotational symmetry group of the icosahedron is therefore a 60-element subgroup of \mathrm{S}_5. Group theory therefore tells us that it must be the alternating group \mathrm{A}_5.

The \mathrm{E}_8 lattice is harder to visualize than the icosahedron, but still easy to characterize. Take a bunch of equal-sized spheres in 8 dimensions. Get as many of these spheres to touch a single sphere as you possibly can. Then, get as many to touch those spheres as you possibly can, and so on. Unlike in 3 dimensions, where there is ‘wiggle room’, you have no choice about how to proceed, except for an overall rotation and translation. The balls will inevitably be centered at points of the \mathrm{E}_8 lattice!

We can also characterize the \mathrm{E}_8 lattice as the one giving the densest packing of spheres among all lattices in 8 dimensions. This packing was long suspected to be optimal even among those that do not arise from lattices—but this fact was proved only in 2016, by the young mathematician Maryna Viazovska [V].

We can also describe the \mathrm{E}_8 lattice more explicitly. In suitable coordinates, it consists of vectors for which:

1) the components are either all integers or all integers plus \textstyle{\frac{1}{2}}, and

2) the components sum to an even number.
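A standard fact not stated above: each sphere in the 8-dimensional packing touches exactly 240 others, and the touching points correspond to the shortest nonzero lattice vectors, which turn out to have norm-squared 2. We can recover that count directly from conditions 1) and 2); a sketch:

```python
import itertools

# Shortest nonzero vectors of E8 in the coordinates above: norm^2 = 2.
roots = []

# All-integer vectors with norm^2 = 2: two entries +-1, the rest 0.
# Their coordinate sum is +-2 or 0, which is always even.
for i, j in itertools.combinations(range(8), 2):
    for si, sj in itertools.product((1, -1), repeat=2):
        v = [0.0] * 8
        v[i], v[j] = float(si), float(sj)
        roots.append(tuple(v))

# All-half-integer vectors with norm^2 = 2: every entry +-1/2,
# keeping only those whose coordinates sum to an even number.
for signs in itertools.product((0.5, -0.5), repeat=8):
    if int(sum(signs)) % 2 == 0:
        roots.append(signs)
```

This yields 112 vectors of the first kind and 128 of the second: 240 in all.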

This lattice consists of all integral linear combinations of the 8 rows of this matrix:

\left( \begin{array}{rrrrrrrr}  1&-1&0&0&0&0&0&0 \\  0&1&-1&0&0&0&0&0 \\  0&0&1&-1&0&0&0&0 \\  0&0&0&1&-1&0&0&0 \\  0&0&0&0&1&-1&0&0 \\  0&0&0&0&0&1&-1&0 \\  0&0&0&0&0&1&1&0 \\  -\frac{1}{2}&-\frac{1}{2}&-\frac{1}{2}&-\frac{1}{2}&-\frac{1}{2}&-\frac{1}{2}&-\frac{1}{2}&-\frac{1}{2}   \end{array} \right)

The inner product of any row vector with itself is 2, while the inner product of distinct row vectors is either 0 or -1. Thus, any two of these vectors lie at an angle of either 90° or 120°. If we draw a dot for each vector, and connect two dots by an edge when the angle between their vectors is 120°, we get this pattern:

This is called the \mathrm{E}_8 Dynkin diagram. In the first part of our story we shall find the \mathrm{E}_8 lattice hiding in the icosahedron; in the second part, we shall find this diagram. The two parts of this story must be related—but the relation remains mysterious, at least to me.
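These inner products can be verified mechanically. In the sketch below (my own check, not from the article) the Gram matrix has 2s on the diagonal, off-diagonal entries 0 or -1 with exactly 7 pairs at 120° (the 7 edges of the E8 Dynkin diagram), and the basis matrix has determinant ±1, reflecting the standard fact, not mentioned above, that the E8 lattice is unimodular:

```python
import numpy as np

# The eight rows of the matrix above.
B = np.array([
    [ 1.0, -1.0,  0.0,  0.0,  0.0,  0.0,  0.0,  0.0],
    [ 0.0,  1.0, -1.0,  0.0,  0.0,  0.0,  0.0,  0.0],
    [ 0.0,  0.0,  1.0, -1.0,  0.0,  0.0,  0.0,  0.0],
    [ 0.0,  0.0,  0.0,  1.0, -1.0,  0.0,  0.0,  0.0],
    [ 0.0,  0.0,  0.0,  0.0,  1.0, -1.0,  0.0,  0.0],
    [ 0.0,  0.0,  0.0,  0.0,  0.0,  1.0, -1.0,  0.0],
    [ 0.0,  0.0,  0.0,  0.0,  0.0,  1.0,  1.0,  0.0],
    [-0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5],
])

G = B @ B.T                      # matrix of inner products
diag = np.diag(G)                # each basis vector has norm^2 = 2
off = G[~np.eye(8, dtype=bool)]  # off-diagonal entries: 0 or -1
n_edges = int((off == -1).sum()) // 2  # each Dynkin edge is counted twice
det = abs(np.linalg.det(B))      # covolume 1: the lattice is unimodular
```

The 7 edges connect the 8 dots into the familiar E8 tree.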

The Icosians

The quickest route from the icosahedron to \mathrm{E}_8 goes through the fourth dimension. The symmetries of the icosahedron can be described using certain quaternions; the integer linear combinations of these form a subring of the quaternions called the ‘icosians’, but the icosians can be reinterpreted as a lattice in 8 dimensions, and this is the \mathrm{E}_8 lattice [CS]. Let us see how this works.

The quaternions, discovered by Hamilton, are a 4-dimensional algebra

\displaystyle{ \mathbb{H} = \{a + bi + cj + dk \colon \; a,b,c,d\in \mathbb{R}\}  }

with multiplication given as follows:

\displaystyle{i^2 = j^2 = k^2 = -1, }
\displaystyle{i j = k = - j i  \textrm{ and cyclic permutations} }

It is a normed division algebra, meaning that the norm

\displaystyle{ |a + bi + cj + dk| = \sqrt{a^2 + b^2 + c^2 + d^2} }

obeys

\displaystyle{ |q q'| = |q| |q'| }

for all q,q' \in \mathbb{H}. The unit sphere in \mathbb{H} is thus a group, often called \mathrm{SU}(2) because its elements can be identified with 2 \times 2 unitary matrices with determinant 1. This group acts as rotations of 3-dimensional Euclidean space, since we can see any point in \mathbb{R}^3 as a purely imaginary quaternion x = bi + cj + dk, and the quaternion qxq^{-1} is then purely imaginary for any q \in \mathrm{SU}(2). Indeed, this action gives a double cover

\displaystyle{   \alpha \colon \mathrm{SU}(2) \to \mathrm{SO}(3) }

where \mathrm{SO}(3) is the group of rotations of \mathbb{R}^3.
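The two facts just used, multiplicativity of the norm and the fact that conjugation by a unit quaternion preserves purely imaginary quaternions, are easy to spot-check numerically. A minimal sketch (my own, with the Hamilton product written out by hand and arbitrary sample quaternions):

```python
import math

def qmul(p, q):
    """Hamilton product of quaternions stored as (a, b, c, d)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qnorm(q):
    return math.sqrt(sum(x * x for x in q))

def qconj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)

i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

# Two arbitrary quaternions, to test |q q'| = |q| |q'|.
p, q = (1.0, 2.0, 3.0, 4.0), (0.5, -1.0, 2.5, 0.0)

# For a unit quaternion u, the map x -> u x u^-1 = u x conj(u) fixes the
# real part, so purely imaginary quaternions (vectors in R^3) are sent to
# purely imaginary quaternions, and their length is preserved: a rotation.
s = qnorm((1.0, 1.0, 1.0, 1.0))
u = (1.0 / s, 1.0 / s, 1.0 / s, 1.0 / s)   # a unit quaternion
x = (0.0, 3.0, -1.0, 2.0)                  # purely imaginary
rx = qmul(qmul(u, x), qconj(u))            # the rotated vector
```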

We can thus take any Platonic solid, look at its group of rotational symmetries, get a subgroup of \mathrm{SO}(3), and take its double cover in \mathrm{SU}(2). If we do this starting with the icosahedron, we see that the 60-element group \mathrm{A}_5 \subset \mathrm{SO}(3) is covered by a 120-element group \Gamma \subset \mathrm{SU}(2), called the binary icosahedral group.

The elements of \Gamma are quaternions of norm one, and it turns out that they are the vertices of a 4-dimensional regular polytope: a 4-dimensional cousin of the Platonic solids. It deserves to be called the “hypericosahedron”, but it is usually called the 600-cell, since it has 600 tetrahedral faces. Here is the 600-cell projected down to 3 dimensions, drawn using Robert Webb’s Stella software:

Explicitly, if we identify \mathbb{H} with \mathbb{R}^4, the elements of \Gamma are the points

\displaystyle{    (\pm \textstyle{\frac{1}{2}}, \pm \textstyle{\frac{1}{2}},\pm \textstyle{\frac{1}{2}},\pm \textstyle{\frac{1}{2}}) }

\displaystyle{ (\pm 1, 0, 0, 0) }

\displaystyle{  \textstyle{\frac{1}{2}} (\pm \Phi, \pm 1 , \pm 1/\Phi, 0 ),}

and those obtained from these by even permutations of the coordinates. Since these points are closed under multiplication, if we take integral linear combinations of them we get a subring of the quaternions:

\displaystyle{    \mathbb{I} = \{ \sum_{q \in \Gamma} a_q  q  : \; a_q \in \mathbb{Z} \}  \subset \mathbb{H} .}

Conway and Sloane [CS] call this the ring of icosians. The icosians are not a lattice in the quaternions: they are dense. However, any icosian is of the form a + bi + cj + dk where a,b,c, and d live in the golden field
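The claim that the 120 points listed above are closed under multiplication can be verified by machine. A sketch (my own; the Hamilton product is written out by hand, and computed products are snapped to the finite set of coordinate values that actually occur in \Gamma, to sidestep floating-point comparison):

```python
import itertools
import math

PHI = (math.sqrt(5) + 1) / 2

def qmul(p, q):
    """Hamilton product of quaternions stored as (a, b, c, d)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

# Every coordinate of every element of Gamma is one of these nine numbers,
# so we can snap floating-point results to this alphabet.
VALUES = (0.0, 0.5, -0.5, 1.0, -1.0,
          PHI / 2, -PHI / 2, 1 / (2 * PHI), -1 / (2 * PHI))

def key(q):
    return tuple(min(VALUES, key=lambda v: abs(v - x)) for x in q)

def is_even(perm):
    inversions = sum(1 for a, b in itertools.combinations(range(4), 2)
                     if perm[a] > perm[b])
    return inversions % 2 == 0

gamma = set()
for s in itertools.product((0.5, -0.5), repeat=4):        # 16 points
    gamma.add(key(s))
for pos, s in itertools.product(range(4), (1.0, -1.0)):   # 8 points
    v = [0.0] * 4
    v[pos] = s
    gamma.add(key(v))
for perm in filter(is_even, itertools.permutations(range(4))):  # 96 points
    for s1, s2, s3 in itertools.product((1, -1), repeat=3):
        base = (s1 * PHI / 2, s2 * 0.5, s3 / (2 * PHI), 0.0)
        gamma.add(key(tuple(base[perm[t]] for t in range(4))))

closed = all(key(qmul(p, q)) in gamma for p in gamma for q in gamma)
```

The 16 + 8 + 96 = 120 points come out distinct, and all 120 × 120 products land back in the set.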

\displaystyle{   \mathbb{Q}(\sqrt{5}) = \{ x + \sqrt{5} y : \; x,y \in \mathbb{Q}\} }

Thus we can think of an icosian as an 8-tuple of rational numbers. Such 8-tuples form a lattice in 8 dimensions.

In fact we can put a norm on the icosians as follows. For q \in \mathbb{I} the usual quaternionic norm has

\displaystyle{  |q|^2 =  x + \sqrt{5} y }

for some rational numbers x and y, but we can define a new norm on \mathbb{I} by setting

\displaystyle{ \|q\|^2 = x + y }

With respect to this new norm, the icosians form a lattice that fits isometrically in 8-dimensional Euclidean space. And this is none other than \mathrm{E}_8!

Klein’s Icosahedral Function

Not only is the \mathrm{E}_8 lattice hiding in the icosahedron; so is the \mathrm{E}_8 Dynkin diagram. The space of all regular icosahedra of arbitrary size centered at the origin has a singularity, which corresponds to a degenerate special case: the icosahedron of zero size. If we resolve this singularity in a minimal way we get eight Riemann spheres, intersecting in a pattern described by the \mathrm{E}_8 Dynkin diagram!

This remarkable story starts around 1884 with Felix Klein’s Lectures on the Icosahedron [Kl]. In this work he inscribed an icosahedron in the Riemann sphere, \mathbb{C}\mathrm{P}^1. He thus got the icosahedron’s symmetry group, \mathrm{A}_5, to act as conformal transformations of \mathbb{C}\mathrm{P}^1—indeed, rotations. He then found a rational function of one complex variable that is invariant under all these transformations. This function equals 0 at the centers of the icosahedron’s faces, 1 at the midpoints of its edges, and \infty at its vertices.

Here is Klein’s icosahedral function as drawn by Abdelaziz Nait Merzouk. The color shows its phase, while the contour lines show its magnitude:

We can think of Klein’s icosahedral function as a branched cover of the Riemann sphere by itself with 60 sheets:

\displaystyle{                \mathcal{I} \colon \mathbb{C}\mathrm{P}^1 \to \mathbb{C}\mathrm{P}^1 .}

Indeed, \mathrm{A}_5 acts on \mathbb{C}\mathrm{P}^1, and the quotient space \mathbb{C}\mathrm{P}^1/\mathrm{A}_5 is isomorphic to \mathbb{C}\mathrm{P}^1 again. The function \mathcal{I} gives an explicit formula for the quotient map \mathbb{C}\mathrm{P}^1 \to \mathbb{C}\mathrm{P}^1/\mathrm{A}_5 \cong \mathbb{C}\mathrm{P}^1.

Klein managed to reduce solving the quintic to the problem of solving the equation \mathcal{I}(z) = w for z. A modern exposition of this result is Shurman’s Geometry of the Quintic [Sh]. For a more high-powered approach, see the paper by Nash [N]. Unfortunately, neither of these treatments avoids complicated calculations. But our interest in Klein’s icosahedral function here does not come from its connection to the quintic: instead, we want to see its connection to \mathrm{E}_8.

For this we should actually construct Klein’s icosahedral function. To do this, recall that the Riemann sphere \mathbb{C}\mathrm{P}^1 is the space of 1-dimensional linear subspaces of \mathbb{C}^2. Let us work directly with \mathbb{C}^2. While \mathrm{SO}(3) acts on \mathbb{C}\mathrm{P}^1, this comes from an action of this group’s double cover \mathrm{SU}(2) on \mathbb{C}^2. As we have seen, the rotational symmetry group of the icosahedron, \mathrm{A}_5 \subset \mathrm{SO}(3), is double covered by the binary icosahedral group \Gamma \subset \mathrm{SU}(2). To build an \mathrm{A}_5-invariant rational function on \mathbb{C}\mathrm{P}^1, we should thus look for \Gamma-invariant homogeneous polynomials on \mathbb{C}^2.

It is easy to construct three such polynomials:

V, of degree 12, vanishing on the 1d subspaces corresponding to icosahedron vertices.

E, of degree 30, vanishing on the 1d subspaces corresponding to icosahedron edge midpoints.

F, of degree 20, vanishing on the 1d subspaces corresponding to icosahedron face centers.

Remember, we have embedded the icosahedron in \mathbb{C}\mathrm{P}^1, and each point in \mathbb{C}\mathrm{P}^1 is a 1-dimensional subspace of \mathbb{C}^2, so each icosahedron vertex determines such a subspace, and there is a linear function on \mathbb{C}^2, unique up to a constant factor, that vanishes on this subspace. The icosahedron has 12 vertices, so we get 12 linear functions this way. Multiplying them gives V, a homogeneous polynomial of degree 12 on \mathbb{C}^2 that vanishes on all the subspaces corresponding to icosahedron vertices! The same trick gives E, which has degree 30 because the icosahedron has 30 edges, and F, which has degree 20 because the icosahedron has 20 faces.

A bit of work is required to check that V,E and F are invariant under \Gamma, instead of changing by constant factors under group transformations. Indeed, if we had copied this construction using a tetrahedron or octahedron, this would not be the case. For details, see Shurman’s book [Sh], which is free online, or van Hoboken’s nice thesis [VH].

Since both F^3 and V^5 have degree 60, F^3/V^5 is homogeneous of degree zero, so it defines a rational function \mathcal{I} \colon \mathbb{C}\mathrm{P}^1 \to \mathbb{C}\mathrm{P}^1. This function is invariant under \mathrm{A}_5 because F and V are invariant under \Gamma. Since F vanishes at face centers of the icosahedron while V vanishes at vertices, \mathcal{I} = F^3/V^5 equals 0 at face centers and \infty at vertices. Finally, thanks to its invariance property, \mathcal{I} takes the same value at every edge center, so we can normalize V or F to make this value 1.

Thus, \mathcal{I} has precisely the properties required of Klein’s icosahedral function! And indeed, these properties uniquely characterize that function, so that function is \mathcal{I}.

The Appearance of E8

Now comes the really interesting part. Three polynomials on a 2-dimensional space must obey a relation, and V,E, and F obey a very pretty one, at least after we normalize them correctly:

\displaystyle{      V^5 + E^2 + F^3 = 0. }

We could guess this relation simply by noting that each term must have the same degree. Every \Gamma-invariant polynomial on \mathbb{C}^2 is a polynomial in V, E and F, and indeed

\displaystyle{          \mathbb{C}^2 / \Gamma \cong  \{ (V,E,F) \in \mathbb{C}^3 \colon \; V^5 + E^2 + F^3 = 0 \} . }

This complex surface is smooth except at V = E = F = 0, where it has a singularity. And hiding in this singularity is \mathrm{E}_8!
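This relation is essentially Klein's classical identity. In the classical normalization (coefficients quoted from the literature, not rederived here) the invariant vanishing at the vertices is called f, the one at the face centers H, and the one at the edge midpoints T, and they satisfy T^2 + H^3 = 1728 f^5; rescaling V, E and F by suitable complex constants absorbs the 1728 and the signs, giving the form above. The identity can be checked exactly in Python by dehomogenizing (setting z_2 = 1) and evaluating at 61 integer points, enough to pin down a polynomial of degree 60:

```python
# Klein's icosahedral invariants with z2 = 1 (classical coefficients):
# f vanishes at the vertices (degree 12), H at the face centers (degree 20),
# T at the edge midpoints (degree 30).
def f(z):
    return z * (z**10 + 11 * z**5 - 1)

def H(z):
    return -(z**20 + 1) + 228 * (z**15 - z**5) - 494 * z**10

def T(z):
    return (z**30 + 1) + 522 * (z**25 - z**5) - 10005 * (z**20 + z**10)

# T^2 + H^3 - 1728 f^5 has degree at most 60, so vanishing at 61 distinct
# points proves it vanishes identically. Integer inputs keep this exact.
ok = all(T(z)**2 + H(z)**3 - 1728 * f(z)**5 == 0 for z in range(-30, 31))
```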

To see this, we need to ‘resolve’ the singularity. Roughly, this means that we find a smooth complex surface S and an onto map

\displaystyle{  \pi \colon S \to \mathbb{C}^2/\Gamma }


that is one-to-one away from the singularity. (More precisely, if X is an algebraic variety with singular points X_{\mathrm{sing}} \subset X, \pi \colon S \to X is a resolution of X if S is smooth, \pi is proper, \pi^{-1}(X - X_{\textrm{sing}}) is dense in S, and \pi is an isomorphism between \pi^{-1}(X - X_{\mathrm{sing}}) and X - X_{\mathrm{sing}}. For more details see Lamotke’s book [L].)

There are many such resolutions, but one is minimal, meaning that all others factor uniquely through it:

What sits above the singularity in this minimal resolution? Eight copies of the Riemann sphere \mathbb{C}\mathrm{P}^1, one for each dot here:

Two of these \mathbb{C}\mathrm{P}^1s intersect in a point if their dots are connected by an edge: otherwise they are disjoint.

This amazing fact was discovered by Patrick Du Val in 1934 [DV]. Why is it true? Alas, there is not enough room in the margin, or even in the entire blog article, to explain this. The books by Kirillov [Ki] and Lamotke [L] fill in the details. But here is a clue. The \mathrm{E}_8 Dynkin diagram has ‘legs’ of lengths 5, 2 and 3:

On the other hand,

\displaystyle{   \mathrm{A}_5 \cong \langle v, e, f | v^5 = e^2 = f^3 = v e f = 1 \rangle }

where in terms of the rotational symmetries of the icosahedron:

v is a 1/5 turn around some vertex of the icosahedron,

e is a 1/2 turn around the center of an edge touching that vertex,

f is a 1/3 turn around the center of a face touching that vertex,

and we must choose the sense of these rotations correctly to obtain vef = 1. To get a presentation of the binary icosahedral group we drop one relation:

\displaystyle{  \Gamma \cong \langle v, e, f | v^5 = e^2 = f^3 = vef \rangle }

The dots in the \mathrm{E}_8 Dynkin diagram correspond naturally to conjugacy classes in \Gamma, not counting the conjugacy class of the central element -1 \in \Gamma. Each of these conjugacy classes, in turn, gives a copy of \mathbb{C}\mathrm{P}^1 in the minimal resolution of \mathbb{C}^2/\Gamma.
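This count of conjugacy classes can be checked by brute force. In the sketch below (my own; the two chosen generators, of orders 6 and 10, are one possible generating pair, an assumption verified by the element count), \Gamma is generated by multiplication and then partitioned into conjugacy classes: 9 in all, leaving 8 once the class of -1 is dropped.

```python
import math

PHI = (math.sqrt(5) + 1) / 2

def qmul(p, q):
    """Hamilton product of quaternions stored as (a, b, c, d)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def qconj(q):
    a, b, c, d = q
    return (a, -b, -c, -d)  # for unit quaternions, this is the inverse

# Snap coordinates to the finite alphabet occurring in Gamma, avoiding
# floating-point comparison problems.
VALUES = (0.0, 0.5, -0.5, 1.0, -1.0,
          PHI / 2, -PHI / 2, 1 / (2 * PHI), -1 / (2 * PHI))

def key(q):
    return tuple(min(VALUES, key=lambda v: abs(v - x)) for x in q)

# Two elements of Gamma, of orders 6 and 10.
gens = ((0.5, 0.5, 0.5, 0.5), (PHI / 2, 0.5, 1 / (2 * PHI), 0.0))

group = {key((1.0, 0.0, 0.0, 0.0))}
frontier = list(group)
while frontier:
    new = []
    for q in frontier:
        for g in gens:
            p = key(qmul(q, g))
            if p not in group:
                group.add(p)
                new.append(p)
    frontier = new

# Conjugacy classes: the orbit of each element under q -> g q g^-1.
classes = {frozenset(key(qmul(qmul(g, q), qconj(g))) for g in group)
           for q in group}
minus_one = frozenset({key((-1.0, 0.0, 0.0, 0.0))})
dots = len(classes - {minus_one})
```

The 8 remaining classes, one of them the class of the identity, match the 8 dots of the Dynkin diagram.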

Not only the \mathrm{E}_8 Dynkin diagram, but also the \mathrm{E}_8 lattice, can be found in the minimal resolution of \mathbb{C}^2/\Gamma. Topologically, this space is a 4-dimensional manifold. Its real second homology group is an 8-dimensional vector space with an inner product given by the intersection pairing. The integral second homology is a lattice in this vector space spanned by the 8 copies of \mathbb{C}\mathrm{P}^1 we have just seen—and it is a copy of the \mathrm{E}_8 lattice [KS].

But let us turn to a more basic question: what is \mathbb{C}^2/\Gamma like as a topological space? To tackle this, first note that we can identify a pair of complex numbers with a single quaternion, and this gives a homeomorphism

\mathbb{C}^2/\Gamma \cong \mathbb{H}/\Gamma

where we let \Gamma act by right multiplication on \mathbb{H}. So, it suffices to understand \mathbb{H}/\Gamma.

Next, note that sitting inside \mathbb{H}/\Gamma are the points coming from the unit sphere in \mathbb{H}. These points form the 3-dimensional manifold \mathrm{SU}(2)/\Gamma, which is called the Poincaré homology 3-sphere [KS]. This is a wonderful thing in its own right: Poincaré discovered it as a counterexample to his guess that any compact 3-manifold with the same homology as a 3-sphere is actually diffeomorphic to the 3-sphere, and it is deeply connected to \mathrm{E}_8. But for our purposes, what matters is that we can think of this manifold in another way, since we have a diffeomorphism

\mathrm{SU}(2)/\Gamma \cong \mathrm{SO}(3)/\mathrm{A}_5.

The latter is just the space of all icosahedra inscribed in the unit sphere in 3d space, where we count two as the same if they differ by a rotational symmetry.

This is a nice description of the points of \mathbb{H}/\Gamma coming from points in the unit sphere of \mathbb{H}. But every quaternion lies in some sphere centered at the origin of \mathbb{H}, of possibly zero radius. It follows that \mathbb{C}^2/\Gamma \cong \mathbb{H}/\Gamma is the space of all icosahedra centered at the origin of 3d space—of arbitrary size, including a degenerate icosahedron of zero size. This degenerate icosahedron is the singular point in \mathbb{C}^2/\Gamma. This is where \mathrm{E}_8 is hiding.

Clearly much has been left unexplained in this brief account. Most of the missing details can be found in the references. But it remains unknown—at least to me—how the two constructions of \mathrm{E}_8 from the icosahedron fit together in a unified picture.

Recall what we did. First we took the binary icosahedral group \Gamma \subset \mathbb{H}, took integer linear combinations of its elements, thought of these as forming a lattice in an 8-dimensional rational vector space with a natural norm, and discovered that this lattice is a copy of the \mathrm{E}_8 lattice. Then we took \mathbb{C}^2/\Gamma \cong \mathbb{H}/\Gamma, took its minimal resolution, and found that the integral 2nd homology of this space, equipped with its natural inner product, is a copy of the \mathrm{E}_8 lattice. From the same ingredients we built the same lattice in two very different ways! How are these constructions connected? This puzzle deserves a nice solution.


I thank Tong Yang for inviting me to speak on this topic at the Annual General Meeting of the Hong Kong Mathematical Society on May 20, 2017, and Guowu Meng for hosting me at the HKUST while I prepared that talk. I also thank the many people, too numerous to accurately list, who have helped me understand these topics over the years.


[CS] J. H. Conway and N. J. A. Sloane, Sphere Packings, Lattices and Groups, Springer, Berlin, 2013.

[DV] P. du Val, On isolated singularities of surfaces which do not affect the conditions of adjunction, I, II and III, Proc. Camb. Phil. Soc. 30 (1934), 453–459, 460–465, 483–491.

[KS] R. Kirby and M. Scharlemann, Eight faces of the Poincaré homology 3-sphere, Usp. Mat. Nauk. 37 (1982), 139–159. Available at

[Ki] A. Kirillov, Quiver Representations and Quiver Varieties, AMS, Providence, Rhode Island, 2016.

[Kl] F. Klein, Lectures on the Ikosahedron and the Solution of Equations of the Fifth Degree, Trübner & Co., London, 1888. Available at

[L] K. Lamotke, Regular Solids and Isolated Singularities, Vieweg & Sohn, Braunschweig, 1986.

[N] O. Nash, On Klein’s icosahedral solution of the quintic. Available at

[Sh] J. Shurman, Geometry of the Quintic, Wiley, New York, 1997. Available at

[Sl] P. Slodowy, Platonic solids, Kleinian singularities, and Lie groups, in Algebraic Geometry, Lecture Notes in Mathematics 1008, Springer, Berlin, 1983, pp. 102–138.

[VH] J. van Hoboken, Platonic Solids, Binary Polyhedral Groups, Kleinian Singularities and Lie Algebras of Type A, D, E, Master’s Thesis, University of Amsterdam, 2002. Available at

[V] M. Viazovska, The sphere packing problem in dimension 8, Ann. Math. 185 (2017), 991–1015. Available at

by John Baez at December 10, 2017 06:27 PM

John Baez - Azimuth

Excitonium


In certain crystals you can knock an electron out of its favorite place and leave a hole: a place with a missing electron. Sometimes these holes can move around like particles. And naturally these holes attract electrons, since they are places an electron would want to be.

Since an electron and a hole attract each other, they can orbit each other. An orbiting electron-hole pair is a bit like a hydrogen atom, where an electron orbits a proton. All of this is quantum-mechanical, of course, so you should be imagining smeared-out wavefunctions, not little dots moving around. But imagine dots if it’s easier.

An orbiting electron-hole pair is called an exciton, because while it acts like a particle in its own right, it’s really just a special kind of ‘excited’ electron—an electron with extra energy, not in its lowest energy state where it wants to be.

An exciton usually doesn’t last long: the orbiting electron and hole spiral towards each other, the electron finds the hole it’s been seeking, and it settles down.

But excitons can last long enough to do interesting things. In 1978 the Russian physicist Abrikosov wrote a short and very creative paper in which he raised the possibility that excitons could form a crystal in their own right! He called this new state of matter excitonium.

In fact his reasoning was very simple.

Just as electrons have a mass, so do holes. That sounds odd, since a hole is just a vacant spot where an electron would like to be. But such a hole can move around. It has more energy when it moves faster, and it takes force to accelerate it—so it acts just like it has a mass! The precise mass of a hole depends on the nature of the substance we’re dealing with.

Now imagine a substance with very heavy holes.

When a hole is much heavier than an electron, it will stand almost still when an electron orbits it. So, they form an exciton that’s very similar to a hydrogen atom, where we have an electron orbiting a much heavier proton.
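Quantitatively, the hydrogen analogy can be scaled. A rough sketch of the standard hydrogen-like estimate (the reduced-mass ratio and dielectric constant below are illustrative guesses of mine, not values from the post):

```python
RYDBERG_EV = 13.606   # hydrogen binding energy, eV
BOHR_NM = 0.0529      # hydrogen Bohr radius, nm

def exciton(mu, eps):
    """Hydrogen-like estimate for an exciton: binding energy (eV) and
    orbit radius (nm), given the electron-hole reduced mass mu in units
    of the electron mass and the dielectric constant eps of the crystal."""
    return RYDBERG_EV * mu / eps**2, BOHR_NM * eps / mu

# Illustrative inputs only: mu = 0.2 and eps = 10 give a binding energy of
# a few tens of meV and a radius of a few nanometres, i.e. a much bigger
# and more weakly bound 'atom' than hydrogen.
e_bind, radius = exciton(0.2, 10.0)
```

The binding energy shrinks like mu/eps^2 and the orbit grows like eps/mu, which is why excitons are so much more fragile than hydrogen atoms.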

Hydrogen comes in different forms: gas, liquid, solid… and at extreme pressures, like in the core of Jupiter, hydrogen becomes metallic. So, we should expect that excitons can come in all these different forms too!

We should be able to create an exciton gas… an exciton liquid… an exciton solid… and under the right circumstances, a metallic crystal of excitons. Abrikosov called this metallic excitonium.

People have been trying to create this stuff for a long time. Some claim to have succeeded. But a new paper claims to have found something else: a Bose–Einstein condensate of excitons:

• Anshul Kogar, Melinda S. Rak, Sean Vig, Ali A. Husain, Felix Flicker, Young Il Joe, Luc Venema, Greg J. MacDougall, Tai C. Chiang, Eduardo Fradkin, Jasper van Wezel and Peter Abbamonte, Signatures of exciton condensation in a transition metal dichalcogenide, Science 358 (2017), 1314–1317.

A lone electron acts like a fermion, so I guess a hole does too, and if so that means an exciton acts approximately like a boson. When it’s cold, a gas of bosons will ‘condense’, with a significant fraction of them settling into the lowest energy states available. I guess excitons have been seen to do this!

There’s a fairly good simplified explanation at the University of Illinois website:

• Siv Schwink, Physicists excited by discovery of new form of matter, excitonium, 7 December 2017.

However, the picture on this page, which I used above, shows domain walls moving through crystallized excitonium. I think that’s different from a Bose–Einstein condensate!

I urge you to look at Abrikosov’s paper. It’s short and beautiful:

• Alexei Alexeyevich Abrikosov, A possible mechanism of high temperature superconductivity, Journal of the Less Common Metals 62 (1978), 451–455.

(Cool journal title. Is there a journal of the more common metals?)

In this paper, Abrikosov points out that previous authors had the idea of metallic excitonium. Maybe his new idea was that this might be a superconductor—and that this might explain high-temperature superconductivity. The reason for his guess is that metallic hydrogen, too, is widely suspected to be a superconductor.

Later, Abrikosov won the Nobel prize for some other ideas about superconductors. I think I should read more of his papers. He seems like one of those physicists with great intuitions.

Puzzle 1. If a crystal of excitons conducts electricity, what is actually going on? That is, which electrons are moving around, and how?

This is a fun puzzle because an exciton crystal is a kind of abstract crystal created by the motion of electrons in another, ordinary, crystal. And that leads me to another puzzle, that I don’t know the answer to:

Puzzle 2. Is it possible to create a hole in excitonium? If so, is it possible to create an exciton in excitonium? If so, is it possible to create meta-excitonium: a crystal of excitons in excitonium?

by John Baez at December 10, 2017 02:30 AM

December 08, 2017

Emily Lakdawalla - The Planetary Society Blog

An exoplanet-hunting space telescope turns and takes a photo of Earth
On December 10, Kepler—NASA’s prized exoplanet discovery telescope—will finally turn back and take a picture of the Earth.

December 08, 2017 11:34 PM

Lubos Motl - string vacua and pheno

Crackpots' lies about cosmic string predictions
Some days ago, I was shocked to learn that the "N*t Even Wr*ng" blog based on painful lies about string theory still exists and that its stuttering perpetrator hasn't been jailed or hanged yet.

There are already two new tirades at Peter W*it's notorious website. The newest one celebrates that a non-expert has described the multiverse as the "last refuge of cowards" at a social event. I think much of the research about the multiverse is questionable but slurs like that won't make the possibility go away. Using some irrelevant expletives from a not really scientific event as "arguments" is low-brow, indeed.

The previous text titled "String theory fails another test" is based on W*it's complete lies about the predictions of cosmic strings by state-of-the-art physical theories.

LIGO has just published constraints on the cosmic strings that in principle add some noise of a characteristic color to the oscillations that LIGO can observe. The amount of this noise from cusps and kinks was shown to be smaller than some function of the frequencies and/or the cosmic string tension.

W*it summarizes this paper as a "failure of string theory" and declares David Gross and Joe Polchinski to have made a losing prediction. But those statements are lies – for two main reasons.

First, "cosmic strings" can be explained as objects in string theory – and even fundamental strings of string theory may be stretched in some models and manifest themselves as cosmic strings – but "cosmic strings" are still a notion in cosmology that is independent of string theory. Cosmic strings may exist independently of string theory and are predicted by other theories in high-energy physics, starting from grand unified theories (GUT). Read e.g. the last sentence of the abstract of Tristan's 2005 master thesis. The same scientist is at the LHC now.

Second, it's simply a complete lie that string theorists have made the prediction that cosmic strings would be discovered. The discovery of cosmic strings was always a possibility – and it remains a possibility. No well-known professional string theorist has ever made the prediction that it's "more likely than not" that cosmic strings would be discovered in our lifetime, let alone a foreseeable future.

The famous string theorist that was closest to it is Joe Polchinski. There was a wave of activity surrounding cosmic strings according to string theory around 2004. This excitement was amplified by the observation of CSL-1, a cosmic string candidate, in the telescopes. If you read e.g. this 2006 blog post about CSL-1 that communicated the conclusion that CSL-1 wasn't a cosmic string, you will be reminded that Joe Polchinski had declared the probability that the cosmic strings would be discovered in a reasonable future to be 10%. So it's enough to watch it and be sort of thrilled but the number still says "probably not".

Joe Polchinski was still the most enthusiastic famous string theorist when it came to the discovery prospects for cosmic strings. W*it also tries to claim that David Gross made a failing prediction – when he quotes Gross' sentences from 2007:
String theory is full of qualitative predictions, such as the production of black holes at the LHC or cosmic strings in the sky, and this level of prediction is perfectly acceptable in almost every other field of science. It’s only in particle physics that a theory can be thrown out if the 10th decimal place of a prediction doesn’t agree with experiment.
But it's very clear that this statement contains no prediction that was falsified – after all, W*it has been saying for years that string theorists couldn't ever make such a prediction, so he contradicts himself when he says that string theorists did it.

Gross said that the cosmic strings that exist out there – or that may exist out there – are an example of a qualitative prediction. The adjective "qualitative" is explicitly written there and it has a very good reason. The adjective is there to emphasize that string theorists couldn't calculate the tension or density of cosmic strings in the Universe at the moment when David Gross made the statement. We still cannot. So there was obviously no prediction that would imply that "cosmic strings have to be seen by this or that experiment by the year 2017" or anything of the sort.

David Gross talked about these qualitative predictions exactly because they're such a standard part of all scientific disciplines – and string theory is as scientific as other disciplines of science. He contrasted the situation with particle physics where many predictions are quantitative and extremely accurate and a tiny disagreement is enough to eliminate a theory or a hypothesis. But theories in other disciplines of science – and those include string theory in its present form of our understanding of it – don't depend on the precise quantitative observations in this fatal way. That obviously doesn't mean that the questions are unscientific.

The question whether cosmic strings exist in the Universe is obviously scientific, meaningful, deep, and important regardless of whether lying dishonest savages fail to understand the scientific character, meaning, depth, and importance. And we still don't know whether there are cosmic strings in the Universe and what their tension and/or average density is. And we're still intrigued by the possibility and ready to devour new evidence whenever it emerges. Like previous experiments, LIGO has only imposed some constraints on these numbers. But it didn't falsify the whole concept. It couldn't falsify the concept because the concept is qualitative. That doesn't mean that it's unimportant, shallow, meaningless, or unscientific. So cosmic strings will obviously keep on appearing in papers by cosmologists, GUT theorists, string theorists, and others.

I am staggered by the stupidity of the people who are willing to buy this self-evident W*it-like garbage.

Exactly the same comments apply to readers of Backreaction whose author claims that the estimate of a much higher cosmological constant "isn't even a prediction". Holy cow. It clearly is a prediction, it isn't a good one, but it's justified by the same kind of dimensional analysis etc. that is used all over physics to get estimates of so many things. The failure of this methodology in the case of the cosmological constant is obviously a rather important fact that requires a deep enough qualitative explanation. Ms Hossenfelder may only "denounce" such basic methods of physics because she has never done any real physics in her life. Her readers are constantly served pure feces as well but they don't mind – in fact, these Schweinehunds and pigs smack their lips.

by Luboš Motl ( at December 08, 2017 06:43 PM

Emily Lakdawalla - The Planetary Society Blog

The case for Venus
NASA is about to pick finalists for its next New Frontiers mission. Will Venus make the cut?

December 08, 2017 04:54 PM

December 07, 2017

John Baez - Azimuth

Wigner Crystals

I’d like to explain a conjecture about Wigner crystals, which we came up with in a discussion on Google+. It’s a purely mathematical conjecture that’s pretty simple to state, motivated by the picture above. But let me start at the beginning.

Electrons repel each other, so they don’t usually form crystals. But if you trap a bunch of electrons in a small space, and cool them down a lot, they will try to get as far away from each other as possible—and they can do this by forming a crystal!

This is sometimes called an electron crystal. It’s also called a Wigner crystal, because the great physicist Eugene Wigner predicted in 1934 that this would happen.

Only since the late 1980s have we been able to make electron crystals in the lab. Such a crystal can only form if the electron density is low enough. The reason is that even at absolute zero, a gas of electrons has kinetic energy. At absolute zero the gas will minimize its energy. But it can’t do this by having all the electrons in a state with zero momentum, since you can’t put two electrons in the same state, thanks to the Pauli exclusion principle. So, higher momentum states need to be occupied, and this means there’s kinetic energy. And the gas has more kinetic energy if its density is high: if there’s less room in position space, the electrons are forced to occupy more room in momentum space.

When the density is high, this prevents the formation of a crystal: instead, we have lots of electrons whose wavefunctions are ‘sitting almost on top of each other’ in position space, but with different momenta. They’ll have lots of kinetic energy, so minimizing kinetic energy becomes more important than minimizing potential energy.

When the density is low, this effect becomes unimportant, and the electrons mainly try to minimize potential energy. So, they form a crystal with each electron avoiding the rest. It turns out they form a body-centered cubic crystal: a lattice of cubes, with an extra electron in the middle of each cube.

To know whether a uniform electron gas at zero temperature forms a crystal or not, you need to work out its so-called Wigner-Seitz radius. This is the average inter-particle spacing measured in units of the Bohr radius. The Bohr radius is the unit of length you can cook up from the electron mass, the electron charge and Planck’s constant:

\displaystyle{ a_0=\frac{\hbar^2}{m_e e^2} }

It’s mainly famous as the average distance between the electron and a proton in a hydrogen atom in its lowest energy state.

Simulations show that a 3-dimensional uniform electron gas crystallizes when the Wigner–Seitz radius is at least 106. The picture, however, shows an electron crystal in 2 dimensions, formed by electrons trapped on a thin film shaped like a disk. In 2 dimensions, Wigner crystals form when the Wigner–Seitz radius is at least 31. In the picture, the density is so low that we can visualize the electrons as points with well-defined positions.
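To make the numbers concrete, here is a small sketch (plain Python; the threshold of 106 is the one quoted above, and the helper name is my own) converting an electron density into a Wigner–Seitz radius:

```python
import math

BOHR_RADIUS = 5.29177210903e-11  # meters

def wigner_seitz_ratio(n):
    """Wigner-Seitz radius in units of the Bohr radius: the radius d of the
    sphere that contains one electron on average, (4/3)*pi*d**3 = 1/n."""
    d = (3.0 / (4.0 * math.pi * n)) ** (1.0 / 3.0)
    return d / BOHR_RADIUS

# Critical density below which a 3D electron gas should crystallize
# (the r_s >= 106 threshold quoted above):
n_crit = 3.0 / (4.0 * math.pi * (106 * BOHR_RADIUS) ** 3)
```

Since the radius scales as n^(-1/3), lowering the density by a factor of 8 doubles the Wigner–Seitz radius.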

So, the picture simply shows a bunch of points x_i trying to minimize the potential energy, which is proportional to

\displaystyle{ \sum_{i \ne j} \frac{1}{\|x_i - x_j\|} }

The lines between the dots are just to help you see what's going on. They're showing the Delaunay triangulation: we draw the diagram that divides the plane into regions closer to one electron than to any other, and then take the dual of that graph.
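As a rough illustrative sketch of this minimization (not the method used to produce the picture; plain NumPy, with step size and starting configuration chosen by me), one can lower the potential energy by naive projected gradient descent:

```python
import numpy as np

def coulomb_energy(x):
    """Sum of 1/||x_i - x_j|| over unordered pairs of points (rows of x)."""
    diff = x[:, None, :] - x[None, :, :]
    dist = np.sqrt((diff ** 2).sum(-1))
    i, j = np.triu_indices(len(x), k=1)
    return (1.0 / dist[i, j]).sum()

def descend(x, steps=100, lr=2e-5):
    """Move each point along the net repulsive Coulomb force, then clip
    any point that escapes back onto the unit disk."""
    for _ in range(steps):
        diff = x[:, None, :] - x[None, :, :]
        dist = np.sqrt((diff ** 2).sum(-1))
        np.fill_diagonal(dist, np.inf)          # ignore self-interaction
        force = (diff / dist[..., None] ** 3).sum(axis=1)
        x = x + lr * force
        r = np.sqrt((x ** 2).sum(-1))
        x[r > 1] /= r[r > 1, None]              # project back into the disk
    return x

# Start from a square grid inside the disk and relax it a little.
g = np.linspace(-0.6, 0.6, 7)
start = np.array([(a, b) for a in g for b in g])
relaxed = descend(start.copy())
```

The relaxed configuration has lower Coulomb energy than the grid; counting each point's neighbors would then require computing the Delaunay triangulation (e.g. with scipy.spatial.Delaunay).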

Thanks to energy minimization, this triangulation wants to be a lattice of equilateral triangles. But since such a triangular lattice doesn’t fit neatly into a disk, we also see some ‘defects’:

Most electrons have 6 neighbors. But there are also some red defects, which are electrons with 5 neighbors, and blue defects, which are electrons with 7 neighbors.

Note that there are 6 clusters of defects. In each cluster there is one more red defect than blue defect. I think this is not a coincidence.

Conjecture. When we choose a sufficiently large number of points x_i on a disk in such a way that

\displaystyle{ \sum_{i \ne j} \frac{1}{\|x_i - x_j\|} }

is minimized, and draw the Delaunay triangulation, there will be 6 more vertices with 5 neighbors than vertices with 7 neighbors.

Here’s a bit of evidence for this, which is not at all conclusive. Take a sphere and triangulate it in such a way that each vertex has 5, 6 or 7 neighbors. Then here’s a cool fact: there must be 12 more vertices with 5 neighbors than vertices with 7 neighbors.

Puzzle. Prove this fact.

If we think of the picture above as the top half of a triangulated sphere, then each vertex in this triangulated sphere has 5, 6 or 7 neighbors. So, there must be 12 more vertices on the sphere with 5 neighbors than with 7 neighbors. So, it makes some sense that the top half of the sphere will contain 6 more vertices with 5 neighbors than with 7 neighbors. But this is not a proof.
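For reference, here is the standard counting argument behind that cool fact (so this is a spoiler for the puzzle). For a triangulation of the sphere with $V$ vertices, $E$ edges and $F$ faces:

```latex
V - E + F = 2, \qquad 3F = 2E \;\text{(every face is a triangle)}
\;\Longrightarrow\; E = 3V - 6 .
% Each edge has two ends, so the vertex degrees sum to 2E:
\sum_{v} \deg(v) = 2E = 6V - 12
\;\Longrightarrow\; \sum_{v} \bigl(6 - \deg(v)\bigr) = 12 .
% If every vertex has 5, 6 or 7 neighbors, a 5-vertex contributes +1,
% a 6-vertex contributes 0, and a 7-vertex contributes -1, so
V_5 - V_7 = 12 .
```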

I have a feeling this energy minimization problem has been studied with various numbers of points. So, there may either be a lot of evidence for my conjecture, or some counterexamples that will force me to refine it. The picture shows what happens with 600 points on the disk. Maybe something dramatically different happens with 599! Maybe someone has even proved theorems about this. I just haven't had time to look for such work.

The picture here was drawn by Arunas.rv and placed on Wikicommons under a Creative Commons Attribution-Share Alike 3.0 Unported license.

by John Baez at December 07, 2017 04:41 PM

Tommaso Dorigo - Scientificblogging

Alpha Zero Teaches Itself Chess 4 Hours, Then Beats Dad
Peter Heine Nielsen, a Danish chess Grandmaster, summarized it quite well. "I always wondered, if some superior alien race came to Earth, how they would play chess. Now I know". The architecture that beat humans at the notoriously CPU-impervious game Go, AlphaGo by Google Deep Mind, was converted to allow the machine to tackle other "closed-rules" games. Subsequently, the program was given the rules of chess, and a huge battery of Google's GPUs to train itself on the game. Within four hours, the alien emerged. And it is indeed a new class of player.

read more

by Tommaso Dorigo at December 07, 2017 01:26 PM

December 05, 2017

Clifford V. Johnson - Asymptotia

Anthony Zee’s Joke(?)

So I've been waiting for some time to tell you about this clever joke by eminent physicist Anthony Zee. Well, I think it is a joke, I've not checked with him yet: The final production period for The Dialogues was full of headaches, I must say, but there was one thing that made me laugh out loud, for a long time. I heard that Tony had agreed to write a blurb for the back cover of the book, but I did not see it until I was finally sent a digital copy of the back cover, somewhat after everything had (afaik) gone to print. The blurb was simple, and said:

"This is a fantastic book -- entertaining, informative, enjoyable, and thought-provoking."

I thought this was rather nicely done. Simple, to the point, generous.... but, after a while... strangely familiar. I thought about it for a while, walked over to one of my bookcases, and picked up a book. What book? My 2003 copy of the the first edition of "Quantum Field Theory in a Nutshell", by A. (for Anthony) Zee. I turned it over. The first blurb on the back says:

"This is a fantastic book -- exciting, amusing, unique, and very valuable."

The author of that blurb? Clifford V. Johnson.

Brilliantly done.

Click to continue reading this post

The post Anthony Zee’s Joke(?) appeared first on Asymptotia.

by Clifford at December 05, 2017 10:20 PM

The n-Category Cafe

The 2-Dialectica Construction: A Definition in Search of Examples

An adjunction is a pair of functors $f:A\to B$ and $g:B\to A$ along with a natural isomorphism

$$ A(a, g b) \cong B(f a, b). $$

Question 1: Do we get any interesting things if we replace “isomorphism” in this definition by something else?

  • If we replace it by “function”, then the Yoneda lemma tells us we get just a natural transformation $f g \to 1_B$.
  • If we replace it by “retraction” then we get a unit and counit, as in an adjunction, satisfying one triangle identity but not the other.
  • If $A$ and $B$ are 2-categories and we replace it by “equivalence”, we get a biadjunction.
  • If $A$ and $B$ are 2-categories and we replace it by “adjunction”, we get a sort of lax 2-adjunction (a.k.a. “local adjunction”)

Are there other examples?

Question 2: What if we do the same thing for multivariable adjunctions?

A two-variable adjunction is a triple of functors $f:A\times B\to C$, $g:A^{op}\times C\to B$ and $h:B^{op}\times C\to A$ along with natural isomorphisms

$$ C(f(a,b),c) \cong B(b,g(a,c)) \cong A(a,h(b,c)). $$

What does it mean to “replace ‘isomorphism’ by something else” here? It could mean different things, but one thing it might mean is to ask instead for a function

$$ A(a,h(b,c)) \times B(b,g(a,c)) \to C(f(a,b),c). $$

Even more intriguingly, if $A,B,C$ are 2-categories, we could ask for an ordinary two-variable adjunction between these three hom-categories; this would give a certain notion of “lax two-variable 2-adjunction”. Question 2 is, are notions like this good for anything? Are there any natural examples?

Now, you may, instead, be wondering about

Question 3: In what sense is a function $A(a,h(b,c)) \times B(b,g(a,c)) \to C(f(a,b),c)$ a “replacement” for isomorphisms $C(f(a,b),c) \cong B(b,g(a,c)) \cong A(a,h(b,c))$?

But that question, I can answer; it has to do with comparing the Chu construction and the Dialectica construction.

Last month I told you about how multivariable adjunctions form a polycategory that sits naturally inside the 2-categorical Chu construction $Chu(Cat,Set)$.

Now the classical Chu construction is, among other things, a way to produce $\ast$-autonomous categories, which are otherwise in somewhat short supply. At first, I found that rather disincentivizing to study either one: why would I be interested in a contrived way to construct things that don’t occur naturally? But then I realized that the same sentence would make sense if you replaced “Chu construction” with “sheaves on a site” and “$\ast$-autonomous categories” with “toposes”, and I certainly think those are interesting. So now it doesn’t bother me as much.

Anyway, there is also another general construction of $\ast$-autonomous categories (and, in fact, more general things), which goes by the odd name of the “Dialectica construction”. The categorical Dialectica construction is an abstraction, due to Valeria de Paiva, of a syntactic construction due to Gödel, which in turn is referred to as the “Dialectica interpretation” apparently because it was published in the journal Dialectica. I must say that I cannot subscribe to this as a general principle for the naming of mathematical definitions; fortunately it does not seem to have been very widely adopted.

Anyway, however execrable its name, the Dialectica construction appears quite similar to the Chu construction. Both start from a closed symmetric monoidal category $\mathcal{C}$ equipped with a chosen object, which in this post I’ll call $\Omega$. (Actually, there are various versions of both, but here I’m going to describe two versions that are maximally similar, as de Paiva did in her paper Dialectica and Chu constructions: Cousins?.) Moreover, both $Chu(\mathcal{C},\Omega)$ and $Dial(\mathcal{C},\Omega)$ have the same objects: triples $A = (A^+, A^-, \underline{A})$ where $A^+, A^-$ are objects of $\mathcal{C}$ and $\underline{A} : A^+ \otimes A^- \to \Omega$ is a morphism in $\mathcal{C}$. Finally, the morphisms $f:A\to B$ in both $Chu(\mathcal{C},\Omega)$ and $Dial(\mathcal{C},\Omega)$ consist of a pair of morphisms $f^+ : A^+ \to B^+$ and $f^- : B^- \to A^-$ (note the different directions) subject to some condition.

The only difference is in the conditions. In $Chu(\mathcal{C},\Omega)$, the condition is that the composites

$$ A^+ \otimes B^- \xrightarrow{1\otimes f^-} A^+ \otimes A^- \xrightarrow{\underline{A}} \Omega $$

$$ A^+ \otimes B^- \xrightarrow{f^+\otimes 1} B^+ \otimes B^- \xrightarrow{\underline{B}} \Omega $$

are equal. But in $Dial(\mathcal{C},\Omega)$, we assume that $\Omega$ is equipped with an internal preorder, and require that the first of these composites is $\le$ the second with respect to this preorder.
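To make the two conditions concrete, here is a toy sketch over finite sets (all names are my own, with Omega = {0, 1} ordered by 0 <= 1): objects are triples of two carriers plus a pairing, and the same pair (f+, f-) can satisfy the lax Dialectica condition while failing the strict Chu one.

```python
from itertools import product

# Toy objects of Chu(Set, Omega) / Dial(Set, Omega): two finite carriers
# A+ and A- plus a pairing A+ x A- -> Omega, stored as a dict.
def is_chu_morphism(A, B, fplus, fminus):
    """Chu condition: A(a, f-(b)) == B(f+(a), b) for all a in A+, b in B-."""
    (Ap, Am, pairA), (Bp, Bm, pairB) = A, B
    return all(pairA[a, fminus[b]] == pairB[fplus[a], b]
               for a, b in product(Ap, Bm))

def is_dial_morphism(A, B, fplus, fminus, le):
    """Dialectica condition: A(a, f-(b)) <= B(f+(a), b) in the preorder le."""
    (Ap, Am, pairA), (Bp, Bm, pairB) = A, B
    return all(le(pairA[a, fminus[b]], pairB[fplus[a], b])
               for a, b in product(Ap, Bm))

# Omega = {0, 1} with 0 <= 1.
A = (['a0', 'a1'], ['x'], {('a0', 'x'): 0, ('a1', 'x'): 1})
B = (['b0'], ['y0', 'y1'], {('b0', 'y0'): 1, ('b0', 'y1'): 1})
fplus = {'a0': 'b0', 'a1': 'b0'}
fminus = {'y0': 'x', 'y1': 'x'}
```

Here is_dial_morphism holds (every value on the left is at most 1) but is_chu_morphism fails, since A('a0','x') = 0 while B('b0','y0') = 1; taking le to be equality recovers the Chu condition, matching the discrete-preorder special case discussed below.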

Now you can probably see where Question 1 above comes from. The 2-category of categories and adjunctions sits inside $Chu(Cat,Set)$ as the objects of the form $(A, A^{op}, hom_A)$. The analogous category sitting inside $Dial(Cat,Set)$, where $Set$ is regarded as an internal category in $Cat$ in the obvious way, would consist of “generalized adjunctions” of the first sort, with simple functions $A(a, g b) \to B(f a, b)$ rather than isomorphisms. Other “2-Dialectica constructions” would yield other sorts of generalized adjunction.

What about Questions 2 and 3? Well, back up a moment: the above description of the Chu and Dialectica constructions actually exaggerates their similarity, because it omits their monoidal structures. As a mere category, $Chu(\mathcal{C},\Omega)$ is clearly the special case of $Dial(\mathcal{C},\Omega)$ where $\Omega$ has a discrete preorder (i.e. $x\le y$ iff $x=y$). But $Chu(\mathcal{C},\Omega)$ is always $\ast$-autonomous, as long as $\mathcal{C}$ has pullbacks; whereas for $Dial(\mathcal{C},\Omega)$ to be monoidal, closed, or $\ast$-autonomous we require the preorder $\Omega$ to have those same properties, which a discrete preorder certainly does not always have. And even when a discrete preorder $\Omega$ does have some or all of those properties, the resulting monoidal structure of $Dial(\mathcal{C},\Omega)$ does not coincide with that of $Chu(\mathcal{C},\Omega)$.

As happens so often, the situation is clarified by considering universal properties. That is, rather than comparing the concrete constructions of the tensor products in $Chu(\mathcal{C},\Omega)$ and $Dial(\mathcal{C},\Omega)$, we should compare the functors that they represent. A morphism $A\otimes B\to C$ in $Chu(\mathcal{C},\Omega)$ consists of three morphisms $f:A^+\otimes B^+\to C^+$, $g:A^+ \otimes C^- \to B^-$ and $h:B^+ \otimes C^- \to A^-$ such that a certain three morphisms $A^+ \otimes B^+ \otimes C^- \to \Omega$ are equal. In terms of “formal elements” $a:A^+$, $b:B^+$, $c:C^-$ in the internal type theory of $\mathcal{C}$, these three morphisms can be written as

$$ \underline{C}(f(a,b),c) \qquad \underline{B}(b,g(a,c)) \qquad \underline{A}(a,h(b,c)) $$

just as in a two-variable adjunction. By contrast, a morphism $A\otimes B\to C$ in $Dial(\mathcal{C},\Omega)$ consists of three morphisms $f,g,h$ of the same sorts, but such that

$$ \underline{B}(b,g(a,c)) \boxtimes \underline{A}(a,h(b,c)) \le \underline{C}(f(a,b),c) $$

where $\boxtimes$ denotes the tensor product of the monoidal preorder $\Omega$. Now you can probably see where Question 2 comes from: if in constructing $Dial(Cat,Set)$ we equip $Set$ with its usual monoidal structure, we get generalized 2-variable adjunctions with a function $A(a,h(b,c)) \times B(b,g(a,c)) \to C(f(a,b),c)$, and for other choices of $\Omega$ we get other kinds.

This is already somewhat of an answer to Question 3: the analogy between ordinary adjunctions and these “generalized adjunctions” is the same as between the Chu and Dialectica constructions. But it’s more satisfying to make both of those analogies precise, and we can do that by generalizing the Dialectica construction to allow $\Omega$ to be an internal polycategory rather than merely an internal poset (or category). If this polycategory structure is representable, then we recover the original Dialectica construction. Whereas if we give an arbitrary object $\Omega$ the (non-representable) “Frobenius-discrete” polycategory structure, in which a morphism $(x_1,\dots,x_m) \to (y_1,\dots,y_n)$ is the assertion that $x_1=\cdots=x_m=y_1=\cdots=y_n$, then we recover the original Chu construction.

For a general internal polycategory $\Omega$, the resulting “Dialectica-Chu” construction will be only a polycategory. But it is representable in the Dialectica case if $\Omega$ is representable, and it is representable in the Chu case if $\mathcal{C}$ has pullbacks. This explains why the tensor products in $Chu(\mathcal{C},\Omega)$ and $Dial(\mathcal{C},\Omega)$ look different: they are representing two instances of the same functor, but they represent it for different reasons.

So… what about Questions 1 and 2? In other words: if the reason I care about the Chu construction is because it’s an abstraction of multivariable adjunctions, why should I care about the Dialectica construction?

by shulman at December 05, 2017 06:35 AM

December 04, 2017

Andrew Jaffe - Leaves on the Line

WMAP Breaks Through

It was announced this morning that the WMAP team has won the $3 million Breakthrough Prize. Unlike the Nobel Prize, which infamously is only awarded to three people each year, the Breakthrough Prize was awarded to the whole 27-member WMAP team, led by Chuck Bennett, Gary Hinshaw, Norm Jarosik, Lyman Page, and David Spergel, but including everyone through postdocs and grad students who worked on the project. This is great, and I am happy to send my hearty congratulations to all of them (many of whom I know well and am lucky to count as friends).

I actually knew about the prize last week as I was interviewed by Nature for an article about it. Luckily I didn’t have to keep the secret for long. Although I admit to a little envy, it’s hard to argue that the prize wasn’t deserved. WMAP was ideally placed to solidify the current standard model of cosmology, a Universe dominated by dark matter and dark energy, with strong indications that there was a period of cosmological inflation at very early times, which had several important observational consequences. First, it made the geometry of the Universe — as described by Einstein’s theory of general relativity, which links the contents of the Universe with its shape — flat. Second, it generated the tiny initial seeds which eventually grew into the galaxies that we observe in the Universe today (and the stars and planets within them, of course).

By the time WMAP released its first results in 2003, a series of earlier experiments (including MAXIMA and BOOMERanG, which I had the privilege of being part of) had gone much of the way toward this standard model. Indeed, about ten years ago one of my Imperial colleagues, Carlo Contaldi, and I wanted to make that comparison explicit, so we used what were then considered fancy Bayesian sampling techniques to combine the data from balloons and ground-based telescopes (which are collectively known as “sub-orbital” experiments) and compare the results to WMAP. We got a plot like the following (which we never published), showing the main quantity that these CMB experiments measure, called the power spectrum (which I’ve discussed in a little more detail here). The horizontal axis corresponds to the size of structures in the map (actually, its inverse, so smaller is to the right) and the vertical axis to how large the signal is on those scales.

Grand unified spectrum

As you can see, the suborbital experiments, en masse, had data at least as good as WMAP on most scales except the very largest (leftmost; this is because you really do need a satellite to see the entire sky) and indeed were able to probe smaller scales than WMAP (to the right). Since then, I’ve had the further privilege of being part of the Planck Satellite team, whose work has superseded all of these, giving much more precise measurements over all of these scales.

Am I jealous? Ok, a little bit.

But it’s also true, perhaps for entirely sociological reasons, that the community is more apt to trust results from a single, monolithic, very expensive satellite than an ensemble of results from a heterogeneous set of balloons and telescopes, run on (comparative!) shoestrings. On the other hand, the overall agreement amongst those experiments, and between them and WMAP, is remarkable.

And that agreement remains remarkable, even if much of the effort of the cosmology community is devoted to understanding the small but significant differences that remain, especially between one monolithic and expensive satellite (WMAP) and another (Planck). Indeed, those “real and serious” (to quote myself) differences would be hard to see even if I plotted them on the same graph. But since both are ostensibly measuring exactly the same thing (the CMB sky), any differences, even those much smaller than the error bars, must be accounted for, and almost certainly boil down to differences in the analyses or a misunderstanding of each team’s own data. Somewhat more interesting are differences between CMB results and measurements of cosmology from other, very different, methods, but that’s a story for another day.

by Andrew at December 04, 2017 05:48 PM

December 03, 2017

Lubos Motl - string vacua and pheno

A notorious string critic remains invisible at a string event
Are opinions and tirades as important as research and results?

Two days ago, I described my shock after Quanta Magazine edited a sentence about the set of candidates for a theory of everything according to the order issued by a scientific nobody who had nothing to do with the interview with Witten.

Natalie Wolchover literally thanked (!) Sabine Hossenfelder for providing her absolutely deluded feedback:
Thanks to Sabine, I realized that Edward Witten was just totally misguided. After all, he's just an old white male and those suck. Sabine Hossenfelder told me that there are lots and lots of candidates for a theory of everything and lots and lots of people like Edward Witten, for example the surfer dude Garrett Lisi. I have absolutely no reason not to trust her so I immediately edited the propaganda imprinted into my article by the old white male dinosaur.
She didn't use these words exactly but my words describe more clearly what was actually going on. Crackpots such as Ms Hossenfelder simply control science journalism these days. They have nurtured their contacts, they have the right politics which is what the science journalists actually place at the top, and that's why these disgusting SJWs may overshadow Witten or anyone else.

But I don't want to report just the bad and shocking news of this sort. There are sometimes events that could have evolved insanely but that didn't. On December 1st, Brooklyn saw an example of those. A cultural foundation called the Pioneer Works organized a debate about string theory, Scientific Controversy No 13.

The room was full, mostly of younger people (hundreds of them), Jim Simons funded the event, and popular science writer Janna Levin hosted it. It turns out that one Peter W*it was in the audience, but he didn't talk to anybody and no one noticed him. If you haven't been familiar with these things for years: Peter W*it was one of the most annoying, most hostile, and most dishonest anti-physics demagogues who appeared in the pop science press all the time some 11 years ago.

What could have happened is the following: The host, e.g. Janna Levin, could have said:
Wow, dear visitors, screw our panelists, Clifford Johnson (who got some space to promote his new book, The Dialogues, which has lots of his own impressive illustrations) and especially David Gross. Because of some miracle, we have a true hero here. Let me introduce Peter W*it. Applause. Now, you can go home, Clifford Johnson and David Gross. We will talk to him instead.
Thank God, that didn't happen, the piece of šit was treated as a piece of šit, indeed. But some of the sentences voiced at the event were weird, anyway. We learn that at the very beginning, Janna Levin asked both Johnson and Gross whether they were for or against string theory or neutral.

Peter W*it reports that "it flustered David Gross a bit". Well, it surely flusters me, and not just a little bit. Janna Levin is one of the talking heads who often paints herself as a pundit who is familiar with modern physics and events in it. Does she really need to ask David Gross – and, in fact, even Clifford Johnson – whether they are for string theory or against? In fact, it's worse than that. She must have been really ignorant about everything because an answer by Gross – an answer that perhaps deliberately said nothing clearly – was interpreted by Levin as if Gross were a string agnostic. Wow.

David Gross internally acknowledged that the time for obfuscating jokes was over because her ignorance was serious and responded by saying that he's been married to string theory for 50 years and doesn't plan any divorce now.

I don't want to waste my time with detailed comments about new anti-physics tirades by crackpot Mr Peter W*it – which contain no ideas, let alone new ones – but let me say a few words about one sentence because it conveys some delusion that is widespread:
This flustered Gross a bit (he’s one of the world’s most well-known and vigorous proponents of string theory)
OK, so W*it agrees it's silly for Levin not to know that Gross is pro-string. But look at the detailed explanation why it's silly. It's silly because "Gross is a well-known and vigorous proponent" of string theory. Is it really the first relevant explanation?

What's going on is that Mr W*it is using language that makes the scientific research itself – and its results – look irrelevant. People are important because of their opinions, W*it wants you to think. So one divides the world into proponents and opponents.

But David Gross isn't important because he's an advocate of something, or because he speaks vigorously. He's important because he has moved our knowledge of physics – and even string theory – forward. And by a lot.

What's really relevant – and what a host of physics debates as competent as Janna Levin pretends to be should know – is that he has written lots of papers about physics and also string theory. What's important isn't that he is a proponent of string theory; what's important is that he is an important string theorist!

INSPIRE tells us that he has written some 200+ published scientific papers, and 300+ in total. The total citation count is about 40,000. 19 papers stand above 500 citations.

You know, a citation isn't easily earned. It really means that the author of the follow-up has "mostly or completely" read your paper. Not every reader cites you, so on average, some dozens of highly qualified people spend their time going through the dozens of pages of your paper, which are much more difficult than a romantic novel. That's what it takes to earn one citation, and Gross has some 40,000 of those.

Look at the list of Gross' 19 papers above 500 citations. One paper with Callan is there from the late 1960s. Then you have the Nobel-prize-related papers about QCD (one with Wilczek), asymptotic freedom, the structure of the gauge theory's vacuum, and anomalies (with Jackiw). Aside from others, there are some 5 papers about the heterotic string from 1985-1987 – Gross is one of the players in the "Princeton quartet" that discovered the heterotic string. A Gross-Witten paper discusses the stringy modification of Einstein's equations, and two Gross-Mende papers discuss high-energy scattering in string theory.

If you look at the groups of papers above 250 or 100 citations, you will find some newer ones – newer papers often have fewer citations "mostly" just because they're newer. You will see that Gross has participated in many important developments, including some very recent ones in string theory. He's had important papers about AdS/CFT (including some analysis of Wilson loops and the stringy dual of QCD), two-dimensional string theory, non-commutative gauge theory within and outside string theory, string field theory, and many many other topics.

Note that lots of grad students sometimes face the need to study Gross' papers (and lots of other papers) in quite some detail. And they remain silent and modest. Janna Levin, who paints herself as a physics guru, not only fails to study Gross' papers on string theory and other things. She doesn't even know – one superficial bit of information – whether he is "pro string theory"!

Gross is a charismatic and assertive guy and it's great. But the core of what makes him a "string theorist" isn't his opinion but his contributions and knowledge. The activists such as W*it, Hossenfelder, Horgan, and dozens of others constantly brainwash their mentally crippled audiences into believing that results and research don't matter. What matters is the "right opinions" and just writing a hostile rant full of lies, idiocy, and hatred – e.g. what the "Not Even Wrong" weblog is all about – is pretty much on par with Gross' contributions to physics. W*it's or Hossenfelder's "work" may be opposite in sign to Gross' but it is "work" of equal importance in its absolute value, W*it et al. implicitly claim.

Sorry, comrades, but it's not and that's why Gross is a top physicist while you're just worthless deceitful piles of whining excrements, along with the brain-dead scum that keeps on visiting your blogs as of 2017.

And that's the memo.

by Luboš Motl ( at December 03, 2017 05:53 PM

December 02, 2017

ZapperZ - Physics and Physicists

Atomic Age Began 75 Years Ago Today
December 2, 1942, to be exact.

This is an article on the history of the first controlled nuclear fission that was conducted at the University of Chicago 75 years ago that marked the beginning of the atomic/nuclear age.

They called this 20x6x25-foot setup Chicago Pile Number One, or CP-1 for short – and it was here that they obtained the world's first controlled nuclear chain reaction on December 2, 1942. A single random neutron was enough to start the chain reaction process once the physicists assembled CP-1. The first neutron would induce fission in a uranium nucleus, emitting a set of new neutrons. These secondary neutrons hit carbon nuclei in the graphite and slowed down. Then they'd run into other uranium nuclei and induce a second round of fission reactions, emit even more neutrons, and on and on. The cadmium control rods made sure the process wouldn't continue indefinitely, because Fermi and his team could choose exactly how and where to insert them to control the chain reaction.
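The logic of the paragraph above can be sketched numerically: each fission generation multiplies the neutron population by an effective factor k, and inserting control rods pulls k down. The function and the k values below are purely illustrative, not historical CP-1 parameters.

```python
# Illustrative sketch of neutron multiplication in a chain reaction.
# k is the effective multiplication factor: the average number of
# secondary neutrons from one fission that go on to cause another
# fission.  Control rods absorb neutrons and lower k.

def neutron_population(k: float, generations: int, start: int = 1) -> float:
    """Expected neutron count after a number of fission generations."""
    return start * k ** generations

# Subcritical (rods in): the population dies out.
print(neutron_population(k=0.9, generations=50))   # ~0.005
# Critical: the population is steady.
print(neutron_population(k=1.0, generations=50))   # 1.0
# Supercritical (rods out): exponential growth.
print(neutron_population(k=1.05, generations=50))  # ~11.5
```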

Sadly, other than a commemorative statue/plaque, there's not much left of this historic site. One of the outcomes of this work was the creation of Argonne National Lab just outside of Chicago, where, I believe, the research on nuclear chain reactions continued at that time. Argonne no longer carries out any nuclear research work.


by ZapperZ ( at December 02, 2017 03:17 PM

Lubos Motl - string vacua and pheno

Pure \(AdS_3\) gravity from monster group spin networks
A fifth of my research topics that make me most excited have something to do with the three-dimensional pure Anti de Sitter space gravity. In 2007, Witten pointed out that there is a perfect candidate for the dual boundary CFT, one that has the monster group as the global symmetry.

The monster group is the largest among the 26 or 27 "sporadic groups" in the classification of all the simple finite groups. The CFT – which was the player that proved the "monstrous moonshine" – may be constructed from bosonic strings propagating on the (24-dimensional space divided by) the Leech lattice, the most interesting even self-dual lattice in 24 dimensions, the only one among the 24 such lattices that doesn't have any "length squared equal to two" lattice sites.

I didn't have enough space here for a picture of Witten and a picture of a monster so I merged them. Thanks to Warren Siegel who took the photograph of Cyborg-Witten.

The absence of these sites corresponds to the absence of any massless fields. So the corresponding gravity only has massive objects, the black hole microstates, and they transform as representations of the monster group. I will only discuss the monster group CFT with the "minimum radius" – Davide Gaiotto has proven that the infinite family of the larger CFTs cannot exist, at least not for all the radii and with the required monster group symmetry, because there's no good candidate for a spin field corresponding to a conjugacy class.

I think that the single CFT with the single radius is sufficiently fascinating a playground to test lots of ideas in quantum gravity – and especially the relationship between the continuous and discrete structures (including global and gauge groups) in the bulk and on the boundary.

It's useful to look at the list of irreducible representations of the monster group for at least 10 minutes. There are 194 different irreps – which, by Schur's tricks, means that there are 194 conjugacy classes in the monster group. Don't forget that every prime dividing the order of the monster group has to be a supersingular prime.

However, you will only find 170 different dimensionalities of the irreps. For years ;-), I have assumed that it means that 146 dimensionalities are unique while for 24 others, the degeneracy is two – so the total number of irreps is 146*1+24*2 = 194. It makes sense to think that some of the representations are complex and they're complex conjugate to each other, in pairs.

Well, just very very recently ;-), I looked very very carefully, made a histogram and saw that one dimension of the irreps, namely the dimension
5 514 132 424 881 463 208 443 904,
(5.5 American septillions) appears thrice – like the 8-dimensional representation of \(SO(8)\) appears in "three flavors" due to triality. Why hasn't anyone told me about the "tripled" irrep of the monster group? I am sure that all monster minds know about this triplet of representations in the kindergarten but I didn't. So the right answer is that there are 147 representations uniquely given by their dimension, 22 dimensionalities appear twice, and 1 dimensionality (above) appears thrice.
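A quick sanity check on the bookkeeping shows why the histogram was needed at all: both the old guess (146 unique dimensions plus 24 doubled) and the corrected count (147 unique, 22 doubled, 1 tripled) reproduce 194 irreps and 170 distinct dimensions, so only inspecting the actual list of dimensions can tell them apart. The multiplicity maps below just encode those two hypotheses.

```python
# Check that both multiplicity hypotheses for the monster's irrep
# dimensions are consistent with 194 irreps and 170 distinct
# dimensions -- so the raw counts alone cannot distinguish them.

def totals(multiplicities: dict) -> tuple:
    """Return (number of irreps, number of distinct dimensions) from a
    map {multiplicity: how many dimensions occur with that multiplicity}."""
    irreps = sum(m * count for m, count in multiplicities.items())
    distinct = sum(multiplicities.values())
    return irreps, distinct

print(totals({1: 146, 2: 24}))        # (194, 170) -- the old guess
print(totals({1: 147, 2: 22, 3: 1}))  # (194, 170) -- the corrected count
```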

BTW that 5.5-septillion dimension has a factor of \(2^{43}\) – a majority of the factor \(2^{46}\) in the number of elements of the monster group – and no factors of three. This large power of two is similar to the spinor representations (e.g. in the triality example above).
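The stated factorization claims about this dimension are easy to check by brute force; the p-adic valuation computed below matches the factor \(2^{43}\) claimed in the text.

```python
# Verify the prime factor content of the tripled irrep's dimension:
# a factor of 2^43 and no factor of 3.

def p_adic_valuation(n: int, p: int) -> int:
    """Largest e such that p**e divides n."""
    e = 0
    while n % p == 0:
        n //= p
        e += 1
    return e

d = 5514132424881463208443904  # the dimension that appears thrice
print(p_adic_valuation(d, 2))  # 43
print(p_adic_valuation(d, 3))  # 0
```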

Fine. Among the 194 irreps, there's obviously the 1-dimensional "singlet" representation. Each group has singlets. The first nontrivial representation is 196,883-dimensional. This many states, along with a singlet (so 196,884 in total), appear on the first excited level of the CFT – so there are 196,884 black hole microstates in pure \(AdS_3\) gravity with the minimum positive mass. (This number appears as a coefficient in an expansion of the \(j\)-invariant, a fact that was known as the "monstrous moonshine" decades before this "not so coincidence" was explained.) This level of black holes has some energy and nicely enough,\[

196,883\sim \exp(4\pi),

\] as Witten was very aware, and this approximate relationship is no coincidence. So the entropy at this level is roughly \(S\approx 4\pi\) which corresponds to \(S=A/4G\) i.e. \(A\approx 16\pi G\). Note that the "areas" are really lengths in 2+1 dimensions and Newton's constant has the units of length, too. The entropy proportional to \(\pi\) is almost a matter of common sense for those who have ever calculated entropy of 3D black holes using stringy methods. But it's also fascinating for me because of my research on quasinormal modes and loop quantum gravity.
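A quick numerical check of the approximate relationship: the match is only logarithmic – the dimension itself differs from \(\exp(4\pi)\) by a factor of roughly 1.5 – but for the entropy estimate, \(S\approx \ln 196{,}883 \approx 4\pi\) is close.

```python
import math

# Compare ln(196,883) with 4*pi, the "not so coincidence" behind
# the leading black-hole-entropy estimate S ~ 4*pi.

dim = 196883
print(math.log(dim))                # ~12.19
print(4 * math.pi)                  # ~12.57
print(math.exp(4 * math.pi) / dim)  # ~1.46, the rough mismatch factor
```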

The real part of the asymptotic, highly-damped quasinormal modes was approaching\[

\frac{\ln 3}{8\pi G M}

\] where \(M\) is the Schwarzschild black hole mass. The argument \(3\) in the logarithm could have been interpreted as the degeneracy of some links in \(SO(3)\) spin networks – and that's why I or we were treated as prophets among the loop quantum gravity and other discrete cultists and why my and our papers got overcited (although we still loved them). It's a totally unnatural number that appears there by coincidence, and I – first alone and then with Andy Neitzke – gave full analytic proofs that the number is \(3\) exactly. It's not a big deal, it's a coincidence, and \(3\) is a simple enough number so that it can appear by chance.

But the funny thing is that the quasinormal frequency becomes a more natural expression if \(3\) is replaced with another dimension of an irrep. Fifteen years ago, I would play with its being replaced by \(248\) of \(E_8\) which could have been relevant in 11-dimensional M-theory, and so on. (\(E_8\) appears on boundaries of M-theory, as shown by Hořava and Witten, but is also useful to classify fluxes in the bulk of M-theory spacetimes in a K-theory-like way, as argued by Diaconescu, Moore, and Witten. Note that "Witten" is in all these author lists so there could be some extra unknown dualities.) And while no convincing theory has come out of it, I still find it plausible that something like that might be relevant in M-theory. The probability isn't too high for M-theory, however, because M-theory doesn't seem to be "just" about the fluxes, so the bulk \(E_8\) shouldn't be enough to parameterize all of the physical states.

But let's replace \(3\) with \(196,883\) or \(196,884\), the dimension of the smallest nontrivial irrep of the monster group (perhaps plus one). You will get\[

\frac{\ln 196,883}{8\pi G M} \approx \frac{1}{2GM}

\] The \(\pi\) canceled and the expression for the frequency dramatically simplified. Very generally, this nice behavior may heuristically lead you to study Chern-Simons-like or loop-quantum-gravity-like structures where the groups \(SU(2)\), \(SO(3)\), or \(SL(2,\CC)\), which have 3-dimensional representations, are replaced with the discrete monster group.
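The claimed simplification amounts to the statement that the dimensionless ratio \(\ln 196{,}883 / 8\pi\) is close to \(1/2\), which a one-liner confirms:

```python
import math

# Replacing 3 by 196,883 in ln(N)/(8*pi*G*M) gives approximately
# 1/(2*G*M) iff ln(196,883)/(8*pi) is close to 1/2.

ratio = math.log(196883) / (8 * math.pi)
print(ratio)  # ~0.485, within about 3% of 1/2
```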

A fascinating fact is that aside from this numerological observation, I've had numerous other reasons to consider Chern-Simons-like theories based on the finite, monster group. Which ones?

Well, one reason is simple. The boundary CFT of Witten's has the monster group as its global symmetry. So the monster group is analogous e.g. to \(SO(6)\) in the \({\mathcal N}=4\) supersymmetric gauge theory in \(D=4\) which is dual to the \(AdS_5\) vacuum of type IIB string theory. Just like the \(SO(6)\) becomes a gauge group in the bulk gravitational theory (symmetry groups have to be gauged in quantum gravity theories; this one is a Kaluza-Klein-style local group), the monster group should analogously be viewed as a gauge group in the \(AdS_3\) gravitational bulk.

On top of that, there are gauge groups in \(AdS_3\) gravity. In 1988, the same Witten showed the relationship between Chern-Simons theory and 3D gravity. It was a duality at the 1980s level of precision, although decades later, Witten told us that the duality isn't exact non-perturbatively etc. But that Chern-Simons theory replacing the fields in 3-dimensional gravity could be correct in principle. Just the gauge group could be incorrect.

Well, maybe it's enough to replace \(SL(2,\CC)\) and similar groups with the monster group.

One must understand what we mean by a Chern-Simons theory with a discrete gauge group and how to work with it. Those of us who are loop quantum gravity experts ;-) are extremely familiar with spin networks such as the one pictured below.

This is how the LQG cultists imagine the structure of the 3+1-dimensional spacetime at the Planckian level. There is obviously no evidence that this is the right theory, nothing seems to work, nothing nice happens when the 4D gravity is linked to those gauge fields in this way, no problem or puzzle of quantum gravity is solved by these pictures. But the spin networks are still a cute, important way to parameterize some wave functionals that depend on a gauge field. Well, I guess that Roger Penrose, and not Lee Smolin, should get the credit for the aspects of the spin networks that have a chance to be relevant or correct somewhere.

If you consider an \(SU(2)\) gauge field in a 3-dimensional space, you may calculate the "open Wilson lines", the transformations induced by the path-ordered exponential of the integral of the gauge field over some line interval. Such a Wilson line takes values in the group itself. As an operator, it transforms as \(R\) under the gauge transformations at the initial point and as \(\bar R\) under those at the final point – you need to pick a reference representation in which the transformations are considered. And you may create gauge-invariant operators by connecting these open Wilson lines at vertices that bring you the Clebsch-Gordan coefficients capable of connecting three (or more) representations.

Above, you see a spin network. The edges carry labels like \(j=1/2\), the non-trivial irreps of \(SU(2)\). They're connected at the vertices so that the addition of the angular momentum allows the three values of \(j\) to be "coupled". For \(SU(2)\), the Clebsch-Gordan coefficients are otherwise unique. Each irrep appears at most once in the tensor product of any pair of irreps.
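As a minimal illustrative sketch (of the familiar \(SU(2)\) case only, not of the monstrous proposal), the coupling condition at a trivalent spin network vertex – the triangle inequality plus an integer total spin – can be coded directly:

```python
# Admissibility of three spins (j1, j2, j3) at a trivalent SU(2)
# spin network vertex: j3 must occur in the tensor product j1 x j2,
# i.e. |j1 - j2| <= j3 <= j1 + j2 and j1 + j2 + j3 is an integer.

def admissible(j1: float, j2: float, j3: float) -> bool:
    """True iff j3 occurs in the decomposition of j1 (x) j2."""
    triangle = abs(j1 - j2) <= j3 <= j1 + j2
    integer_sum = (2 * (j1 + j2 + j3)) % 2 == 0
    return triangle and integer_sum

print(admissible(0.5, 0.5, 1.0))  # True:  1/2 x 1/2 contains 1
print(admissible(0.5, 0.5, 0.5))  # False: total spin 3/2 is not an integer
print(admissible(1.0, 1.0, 3.0))  # False: triangle inequality fails
```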

Now, my proposal to derive the right bulk description of the \(AdS_3\) gravity is to identify an \(SO(3)\) Chern-Simons-style description of the 3D gravity and replace all the 3-dimensional representations – in \(SO(3)\), the half-integral-spin irreps are prohibited – with representations of the monster group.

In this replacement, it should be true that a majority of the edges of the spin network carry \(j=1\) i.e. the 3-dimensional representation. And that 3-dimensional representation is replaced with the \(196,883\)-dimensional one in the monster group case. Otherwise the structures should be analogous. I tend to believe that the relevant spin networks should be allowed to be attached to the boundary of the Anti de Sitter space, and therefore resemble what are called Witten diagrams – the appearance of "Witten" seems like another coincidence here ;-) because I don't know of good arguments (older than mine) relating these different ideas from Witten's assorted papers.

Note that the 196,883-dimensional representation is vastly smaller than the larger irreps: the next smallest one is 21-million-dimensional, more than 100 times larger. And it's also useful to see how the tensor product of two copies of the \(d_2=196,883\)-dimensional irrep decomposes into irreps. We have:\[

d_2^2 = 2(d_5+d_4+d_1) + d_2.

\] Both sides are equal to 38,762,915,689, almost 39 billion. So the singlet appears twice, much like the fifth and fourth representation. But the same 196,883-dimensional representation appears exactly once (and the third, 21-million-dimensional one is absent). It means that there's exactly one cubic vertex that couples three 196,883-dimensional representations. On top of that, because of the "two singlets" \(2d_1\) on the right hand side above, there are two ways to define the quadratic form on two 196,883-dimensional representations.
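The dimension count on both sides is easy to verify; the values of \(d_1,\dots,d_5\) below are the standard smallest monster irrep dimensions as listed in character tables, taken here as assumed inputs.

```python
# Dimension check of the quoted decomposition of the tensor square
# of the 196,883-dimensional monster irrep:
#   d2^2 = 2*(d5 + d4 + d1) + d2.

d1 = 1
d2 = 196883
d3 = 21296876       # absent from the decomposition
d4 = 842609326
d5 = 18538750076

lhs = d2 ** 2
rhs = 2 * (d5 + d4 + d1) + d2
print(lhs)          # 38762915689
print(rhs)          # 38762915689
print(lhs == rhs)   # True
```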

I think that in some limit, the spin networks with the "edges 196,883" only will dominate, and the extra subtlety is that each of these edges may or may not include a "quadratic vertex" that switches us to the "less usual" singlet among the two. The presence or absence of this quadratic vertex could basically have the same effect as if there were two different types of the 196,883-dimensional irrep, unless I miss some important detail which I probably do.

Now, there might exist a spin-network-like description of the black hole microstates in \(AdS_3\) and the reason why it works could be a relatively minor variation of the proof of the approximate equivalence of the Chern-Simons theory and the three-dimensional general relativity. The mass of the black hole microstates could be obtained from some "complexity of the spin network" – some weighted number of vertices in the network etc. which could follow from the \(\int A\wedge F\)-style Hamiltonians.

I believe that according to some benchmarks, the \(AdS_3\) pure gravity vacuum should be the "simplest" or "most special" vacuum of quantum gravity. The gauge group is purely discrete which is probably an exception. That's related to the complete absence of the massless fields or excitations which is also an exception. And things just should be more or less solvable and the solution could be a clever variation of the equivalences that have already been written in the literature.

If some deep new conceptual principles are hacked in the case of the monstrous \(AdS_3\) gravity, the remaining work needed to understand the logic of all quantum gravity vacua could be as easy as a generalization of the finite group's representation theory to Lie groups and infinite-dimensional gauged Lie groups. Those also have irreps and conjugacy classes and the relationships between those could be a clever version of the proof that the old matrix model is equivalent to free fermions. Such a unified principle obeyed by all quantum gravity vacua should apply to spacetimes, world sheets, as well as configuration spaces of effective field theories.

by Luboš Motl ( at December 02, 2017 12:58 PM

December 01, 2017

Clifford V. Johnson - Asymptotia

The Geometry Door

Now that #thedialoguesbook is out, I get even more people telling me how they can’t draw. I don’t believe them. Just as with science (and other subjects), everybody has a doorway into a subject. It is just a matter of taking the time to find your individual Door. Individual doors are what make us all wonderfully different. For me it is mostly geometry that is my Door. It gives a powerful way to see things, but isn’t the only way. Moreover, I have to work hard to not be trapped by it sometimes. But it is how I truly see things most often - through geometry. Wonderful geometry everywhere.

-cvj Click to continue reading this post

The post The Geometry Door appeared first on Asymptotia.

by Clifford at December 01, 2017 09:38 PM

Lubos Motl - string vacua and pheno

Hossenfelder has the power to edit, negate Witten's answers in interviews
I've argued that the recent Quanta Magazine interview with Edward Witten has shown some deep differences between the culture of actual top theoretical physicists and the culture of pop science which has largely identified itself with the slogans by the notorious critics of modern physics. Despite all of his diplomacy and caution, Witten simply had to say lots of things that contradict the orthodoxy of the pop science world.
Young Sheldon, off-topic: in the newest episode of the sitcom, we could see Elon Musk, whose SpaceX stole the ideas of how to land on the ocean from the 8-year-old Sheldon Cooper and made Sheldon's notebook disappear. It's great that he played himself because that's how I imagine Musk's companies to operate whenever they do something right. ;-)
Unsurprisingly, these deep disagreements had extra consequences. One of the answers that Edward Witten "dared" to say was that M-theory was our candidate for a description unifying separate theoretical formalisms quantifying particles and forces that exist or may exist in Nature. Wolchover, the journalist, announced her interview on Twitter and one dissatisfied reaction by Ms Hossenfelder was posted soon afterwards:

Hossenfelder repeats the insane 2010 meme by Nude Socialist that Edward Witten is one of approximately 7 similar geniuses – the list includes Garrett Lisi and a former spouse of Lee Smolin – who have proposed comparably promising theories of everything. Needless to say, none of the "alternative theories" above could be called a "candidate for a theory of everything" by a sane person who knows the basic stuff about the limiting and approximate theories that contemporary theoretical physics uses and why they're hard to unify.

But even if you just don't get any of the physics, you should be able to understand some sociology. Take INSPIRE, the database of papers on high-energy physics, and compare the publication and citation record of Garrett Lisi and Edward Witten. They have some 90 and 130,000 citations, respectively – and I need to emphasize that the first number is not 90,000. ;-)

The difference is more than three orders of magnitude. Even a decade after his "groundbreaking" work, the surfer dude's quantifiable impact is more than three orders of magnitude weaker than Witten's. On top of that, every citation among Lisi's 90 citations is vastly less credible than the average followup of Witten's papers. There are very good reasons to say that the people who have followed "theories of everything" for some years and who consider Lisi and Witten to be peers (or even think that Lisi is superior) suffer from a serious psychiatric illness.

I think that this is not Hossenfelder's case – she must know that what she writes are just plain and absurd lies, lies whose propagation would be convenient for herself personally.

OK, so the fringe would-be researcher Ms Sabine Hossenfelder has suggested that it was politically incorrect for Edward Witten to say that the contemporary physics only has one viable candidate for a theory of everything, namely string/M-theory.

Even if you hypothetically imagined that Hossenfelder's proposition about "numerous alternatives" was right while Edward Witten was wrong, it shouldn't have mattered, should it have? Wolchover interviewed Edward Witten, not Sabine Hossenfelder, so the answers should be aligned with the views of Edward Witten, not Sabine Hossenfelder. But look what happened.

(Just to be sure, Connes' theory is in no way a step towards a theory of everything. It has nothing to say about any problem related to quantum gravity. It's an unusual, non-commutative-geometry-based proposal to unify the non-gravitational forces, like in GUT. At the end, it produces some subset of the effective field theories we know. The subset doesn't look natural and it has produced predictions that were falsified. For example, Connes predicted a Higgs boson of mass \(170\GeV\) which was ironically the first value that was ruled out experimentally – by the Tevatron in 2008. Some huge, TOE-like statements can always be made by someone, perhaps even by Connes, but it is spectacularly obvious to everyone who has at least the basic background that these ambitious claims have nothing to do with reality.)

Hossenfelder's dumb comment was not only taken into account. The interview was actually edited and Hossenfelder was even thanked for that! So for some time, Witten's answer in the interview probably contained some statement of the type "M-theory is just one of many similar alternatives and I, who first conjectured the existence of M-theory, am on par with monster minds such as Garrett Lisi". The Quanta Magazine didn't edit the answer quite this colorfully but it did edit it, as Wolchover's tweet says, so that Witten's proposition was severely altered.

Wolchover isn't a bad journalist but even independently of any knowledge of the beef of physics (i.e. even if you assumed that it's just OK for a science journalist not to know that all claims by Hossenfelder about physics of the recent 40 years are wrong), I think that this retroactive edit was an example of a breakdown of the basic journalistic integrity. You simply cannot severely edit the answers by EW in an interview just because someone else, SH, would prefer a totally different answer. Such a change is nothing less than a serious deception of the reader. And note that the Quanta Magazine, funded mainly by Jim Simons, a rather legendary physical mathematician (and a co-author of a theory that Witten wrote lots of important followups about), is surely among the outlets where it's least expected that journalists distort scientists' views. It's almost certainly worse in all truly mainstream media.

Now, I am virtually the only person in the world who reacts – at a frequency that detectably differs from zero – to these constant scams by the various activists, journalists, pop science advocates and hysterical "critics of science". But in this case, I had a pal. David Simmons-Duffin, an assistant professor at Caltech and a co-author of the newest paper co-written by Witten, may still be grateful for the good enough furniture I sold him at Harvard, shortly before escaping the U.S. on the 2007 Independence Day when my visa was expiring.

Or maybe he had deeper reasons than the furniture. Hopefully. ;-)

OK, Wolchover was told – and she could have literally remained ignorant of this elementary fact had David remained silent – that most theoretical physicists would agree with my (scientific) comments about the "encounter of pop science and Edward Witten", and not with the comments by Sabine Hossenfelder. David was careful to restrict the approval by the adjective "scientific" to be sure that he didn't dare to accuse anyone of believing in some right-wing or otherwise politically incorrect views about broader questions. I will return to a discussion of whether these things are really as separate as David suggests.

Right now, Witten's answer in the interview says "M-theory is the candidate for the better description" which sounds OK enough. At the top, the article says (seemingly on behalf of Ms Wolchover) that M-theory is the "leading candidate" and a clarification at the bottom says that an edit was made. Maybe the statements were edited twice and they're tolerable now. But the very fact that Hossenfelder's hostile opinion about the number of theories of everything was incorporated to an interview that should have built on Witten's views on physics is something that I consider absolutely shocking. In effect, we seem to live in a society where a scientific Niemand of Hossenfelder's caliber stands above Edward Witten and has the power to "correct" his statements about Witten's field made in any media outlet on Earth.

Clearly, the main purpose of David's tweet was to inform Natalie Wolchover, the journalist, about some "basic sentiments" that are widespread among actual professional theoretical physicists – and indeed, about actual beliefs of Edward Witten's, too. But David sent a copy to Sabine as well who reacted angrily:

Hossenfelder "doesn't care" what Witten's recent co-authors think about the opinions of theoretical physicists about an interview. In her repetitive, stupid-sounding joke, she has clearly added at least one redundant level of recursion that shouldn't be there. ;-) But she, apparently rightfully, assumes that journalists do care what she thinks about what Witten should think.

Again, the irony is that Hossenfelder believes that Wolchover should care what Hossenfelder thinks about theories of everything but she doesn't care what people around Witten actually believe, even though it was an interview with Witten that is at the center of all these discussions. Thankfully, Wolchover at least replied that "she cared" about David's reports – and blamed Hossenfelder's arrogant "I don't care" reaction to Hossenfelder's wacky sense of humor.

To assure everyone that this is not the end of her jihad against physics, Sabine Hossenfelder posted another obnoxious tirade against modern physics. By studying SUSY, string theory, inflation, or the multiverse, scientists have stopped doing science, she claims, and we need to pick a new word instead of science to describe what they're doing. A fairy-tale science? Higher speculations? Fake physics? She mentions proposals by some of her true soulmates who are considered sources of worthless and hostile pseudointellectual junk by everyone who has a clue.

SUSY, string theory, inflation, and to a lesser degree the multiverse are examples of science par excellence. It's the critics of science like Hossenfelder and the assorted activists who are at most fake scientists and who have nothing to do with the actual science. They're not cracking equations that may link or do link observable quantities with each other. They are doing politics for the stupid masses.

By the way, Hossenfelder's argumentation becomes almost indistinguishable from the deluded tirades by the postmodern pseudo-philosophers and feminist bitches who would say that science is just another cultural habit, a tool of oppression, and so on. Hossenfelder wrote, among many similar things:
“Science,” then, is an emergent concept that arises in communities of people with a shared work practices. “Communities of practice,” as the sociologists say.
Wow, so just like the "sociologists" say, the reader is invited to believe that every "community" that shares work practices is equally justified to describe itself as a "group of scientists" who do "science". If they say that they're looking for "useful descriptions of Nature", that is certainly enough. For this reason, science is on par with the ritualistic dances of savages in the Pacific Ocean, the "sociologists" say.

But this "sociological" view is just flabbergastingly stupid. Science obviously isn't any group of practices of any community. After all, science doesn't really depend on a community at all and some of the most important scientists were true solitaires in their work – and often outside their work, too. The scientific method is a rather particular template how to make progress while learning the truth about Nature. This template had to be discovered or invented – by Galileo and perhaps a few others – and these principles have to be kept, otherwise it's not science and, more importantly, otherwise it's really not working and doesn't systematically bring us closer to a deeper understanding of the world. Hypotheses must imply something that may be expressed sufficiently accurately, ideally and typically in the mathematical language, and the network of these assumptions of the theories and their logical implications are elaborated upon and finally compared to the facts that are known for certain – and the facts that are known to be certain ultimately come from experiments, observations, and mathematically solid proofs.

Only a very small percentage of people in the world actually does science related to theoretical physics and if it makes sense to talk about a community – especially a community of theoretical physicists – there is really just one global community. It's doing science defined by the same general template. The relevant theories and questions have become much more advanced than they were in the past – the theories are more unifying, they require a deeper, more difficult, and more abstract mathematical formalism, the experimental tests of the new things are increasingly expensive and often impossible to be made in a foreseeable future, and this forces the researchers to be even more careful, intellectually penetrating, and employing increasingly indirect strategies to probe questions. But those are quantitative changes reflecting the change of the phenomena that are investigated by the cutting-edge science. They are not negations of the basic meaning of the scientific rigor – and its dependence on honesty, mental power, patience, and the illegitimacy of philosophically justified dogmas.

Let me return to the comment by David that the actual working physicists may only declare agreement with my "scientific" views about the interview and other things. It's true but it's a pity. David implicitly declares the view that politics and science are sharply separated. So I could be thrilled by lots of amazing discoveries made by people who are overwhelmingly politically left-wingers – and they, if they're similarly honest as scientists, should be able to proudly say that they agree with many of the multi-layered comments about physics made by me, a right-winger. Surely some people with Che Guevara T-shirts have been doing so, too. ;-)

When this setup of "politics separated from science" works, it works and it's great if it works. But the problem that David and others seem to overlook is that people like Sabine Hossenfelder – and basically everyone on the list of her "soulmates" who participate in this jihad against physics – are doing everything they can to obfuscate the boundary between science and politics. It wasn't your humble correspondent who would write blog posts addressing both groups of questions, scientific and political ones, because I find their increasingly intense mixing desirable. Those blog posts were a reaction to events that reflected the "dirty" mixing of science and politics. How many times did I have to explain that one can't understand any physics through sociology and similar things?

The likes of Hossenfelder and Šmoits are full of the word "science" but what they're actually up to is a disgusting political movement that is trying to brainwash millions of gullible morons and turn them against science. And to do so, the likes of Hossenfelder find it very convenient to pretend that they are or could be equally excellent scientists – the peers of Witten – themselves. And there are millions of people who are ignorant enough to buy this nonsense. Maybe these brainwashed laymen are honest – they are just sufficiently intellectually limited that they are unable to figure out that all these critics are scientifically worthless relative not only to Witten but also to hundreds of others.

David, you and others don't want to participate in a fight that is political which is an understandable reflection of an idealist, morally pure scientific ethics. But that desire won't make this fight go away. You may deny it but this fight is taking place, anyway (because the likes of Hossenfelder are deliberately waging it), and it has far-reaching consequences for the future of science in the real world.

You may combine this "pure focus on science" with some polite, "nice", politically correct attitudes, while persuading yourself that these things have nothing to do with each other. But in reality, the political attitudes influence the future strength of science in the society – and the ability of wise kids of future generations to do pure science as their job – tremendously, surely much more strongly than some extra $10,000 that a physics grad student will lose according to an (excellent) planned tax reform (I actually saved over $10,000 every year as a grad student, so under the new system, my budget would just have been balanced – which would probably have strengthened my feeling that I simply had to stay in the U.S. for a few years as a postdoc or more). When the public overwhelmingly buys the idea that Edward Witten or the IAS at Princeton don't do anything that would go beyond what a surfer dude may do in Hawaii while surfing, the efforts to selectively defund high-brow science will accelerate.

Every time you're silent when someone like Ms Hossenfelder spreads this hostile garbage, you are helping this movement to win and destroy pure science – as a realistically sustainable occupation that hundreds or thousands of people can do – in the coming years. Every time you are allowing someone to get a degree for political reasons such as her sex or skin color, you are increasing the chances that you have produced a new weapon that will be used to obfuscate the separation of science and politics and to attack science using political tools. If someone got her PhD or jobs because of (identity) politics, be sure that she will be grateful to (identity) politics and will help (identity) politics to defeat science.

One more comment – about the wrong classification of topics. A week ago, I posted two comments under an article, Our Bargain, at the 4gravitons blog. The second one mainly explained that some people's screaming that they have to make this or that amazing progress very soon was just wishful thinking (analogous to the planners in socialist planned economies). And the owner of the blog erased it because he has a "policy not to allow comments that are about politics".

But this justification was obviously completely fraudulent because the whole original text, Our Bargain, was about politics. It was mainly about funding and self-funding of scientists and the period for which a financial injection should last and other things. These topics are actually totally political in character. My deleted comment was actually much less political than his original text. So the explanation of the erasure was just plain dishonest.

It's very obvious what's going on. My comment was erased as a political one because it wasn't sufficiently respectful towards the left-wing orthodoxy. It's even possible – because of his policies, I can't know for certain and I can (and I must) only speculate – that Tetragraviton loves the planned economy and can't tolerate any criticism of it! Right-wing comments are often erased with the explanation that they're political while totally analogous left-wing comments – and even left-wing original articles – may be kept and sometimes even thanked for. The double standards must be absolutely self-evident to any honest person, including a left-winger.

These double standards are not only unjust. They are ultimately very harmful to science. By insisting that they're "primarily" scholars who are loyal to the left-wing orthodoxy of the Academia, the scientists help the actual "leftists-in-chief" to succeed in their plans and one of them involves the obfuscation of the boundary between science and politics and the eradication of pure science as we have known it for centuries. It's bad when you, the leftists who still do excellent research, don't realize what you're actually helping to do with science by your subtle and not so subtle endorsements of the radical left-wing positions. And it's even worse if you realize it but you keep on doing it, anyway.

Idealized science is separate from politics and the best physics groups are still close enough to this ideal. But the actual particular discoveries are being made by real scientists in the real world and that world is affected by politics – and by political movements of science haters such as Ms Hossenfelder, Mr Woit, and hundreds of others who simply don't want to tolerate science because they're shocked by the fact that they're not good enough to practice it themselves. So you had better understand some of this politics and the consequences of some of your actions – including affirmative action – that you incorrectly believe to be harmless.

And that's the memo.

by Luboš Motl ( at December 01, 2017 02:16 PM

Tommaso Dorigo - Scientificblogging

When Ignorance Kills Human Progress, And A Petition You Should Sign
An experiment designed to study neutrinos at the Gran Sasso Laboratories in Italy is under attack by populist media. Why should you care? Because it's a glaring example of the challenges we face in the 21st century in our attempt to foster the progress of the human race.
What is a neutrino? Nothing - it's a particle as close to nothing as you can imagine. Almost massless, almost perfectly non-interacting, and yet incredibly mysterious and the key to the solution of many riddles in fundamental physics and cosmology. But it's really nothing you should worry about, or care about, if you want to lead your life oblivious of the intricacies of subnuclear physics. Which is fine of course - unless you try to use your ignorance to stop progress.

read more

by Tommaso Dorigo at December 01, 2017 11:43 AM

November 30, 2017

ZapperZ - Physics and Physicists

How Valuable Are Scientists In Politics?
Sometimes I read a piece that reflects my sentiments almost to a "T". This is one such example.

In the back page section of this month's (Nov. 2017) APS News called.... wait for it... "The Back Page", Andrew Zwicker of the Princeton Plasma Physics Lab, who is also a legislator in the state of New Jersey, US, reflects on the lack of scientists, and of scientific methodology, in politics and government. I completely agree with this part that I'm quoting here:

As scientists we are, by nature and training, perpetually skeptical yet constantly open to new ideas. We are guided by data, by facts, by evidence to make decisions and eventually come to a conclusion that we immediately question. We strive to understand the "big picture", and we understand the limitations of our conclusions and predictions. Imagine how different the political process would be if everyone in office took a data-driven, scientific approach to creating legislation instead of one based on who can make the best argument for a particular version of the "facts".

Anyone who has followed this blog for a length of time will have noticed my comments many times on this subject, especially in regards to scientists or physicists in the US Congress (right now there's only one left, Bill Foster). I have always pinpointed the major problem with the people that we elect: the public tends to vote for people who agree with their views, rather than individuals who are able to think, who have a clear-cut way of figuring out whom to ask or where to look to seek answers. In other words, if a monkey agrees with their view on a number of issues, even that monkey can get elected, regardless of whether that monkey can think rationally.

It is why we have politicians bunkered in with their views rather than thinking of what is the right or appropriate thing to do based on the facts. This is also why it is so important to teach science, and about science – especially how to arrive at an idea or conclusion rationally and analytically – to students who are NOT going to go into science. Law schools should make it compulsory that their students understand science, not for the sake of the material, but rather as a method to think things through.

Unfortunately, I'm skeptical that any of that will happen, which is why the crap that we are seeing in politics right now will never change.


by ZapperZ ( at November 30, 2017 07:40 PM

Axel Maas - Looking Inside the Standard Model

Reaching closure – completing a review
I did not publish anything here within the last few months, as the review I am writing took up much more time than expected. A lot of interesting project developments also happened during this time. I will write on them as well later, so that nobody will miss out on the insights we gained and the fun we had with them.

But now, I want to write about how the review is coming along. It has by now grown into a veritable document of almost 120 pages. And actually most of it is text and formulas, with only very few figures. This makes for a lot of content. Right now, it has reached the status of release candidate 2. This means I have distributed it to many of my colleagues to comment on. I also used the draft as lecture notes for a lecture on its contents at a winter school in Odense/Denmark (where I actually wrote this blog entry). Why? Because I wanted to have feedback. What can be understood, and what may I have misunderstood? After all, this review not only looks at my own research. Rather, it compiles knowledge from more than a hundred scientists over 45 years. In fact, some of the results I write about were obtained before I was born. Especially, I could have overlooked results. With by now dozens of new papers per day, this can easily happen. I have collected more than 330 relevant articles, which I refer to in the review.

And, of course, I could have misunderstood other people’s results or made mistakes. In a review, this needs to be avoided as much as possible.

Indeed, I have had many discussions by now on various aspects of the research I review. I got comments and was challenged. In the end, there was always either a conclusion or the insight that some points, believed to be clear, are not as entirely clear as they seemed. There are always more loopholes, more subtleties, than one anticipates. By this, the review became better, and could collect more insights from many brilliant scientists. And likewise I myself learned a lot.

In the end, I learned two very important lessons about the physics I review.

The first is that many more things are connected than I expected. Some issues, which looked to me like parenthetical remarks in the beginning, first became remarks at more than one place and ultimately became issues of their own.

The second is that the standard model of particle physics is even more special and more balanced than I thought. I never really thought that the standard model was so terribly special – just one theory among many which happened to fit experiments. But really it is an extremely finely adjusted machinery. Every cog in it is important, and even slight changes will make everything fall apart. All the elements are in constant connection with each other, and influence each other.

Does this mean anything? Good question. Perhaps it is a sign of an underlying ordering principle. But if it is, I cannot see it (yet?). Perhaps this is just an expression of how a law of nature must be – perfectly balanced. At any rate, it gave me a new perspective on what the standard model is.

So, as I anticipated, writing this review gave me a whole new perspective and a lot of insights. Partly by formulating questions and answers more precisely. But, probably more importantly, I had to explain it to others, and to either successfully defend it, adapt it, or even correct it.

In addition, two of the most important lessons about understanding physics I learned were the following:

One: Take your theory seriously. Do not take a shortcut or use some experience. Literally understand what it means and only then start to interpret.

Two: Pose your questions (and answers) clearly. Every statement should have a well-defined meaning. Never be vague when you want to make a scientific statement. Always be able to back up a statement against the question “what do you mean by this?” with a precise definition. This seems obvious, but it is something you tend to be cavalier about. Don’t.

So, writing a review not only helps in summarizing knowledge. It also helps to understand this knowledge and realize its implications. And, probably fortunately, it poses new questions. What they are, and what we will do about them, is something I will write about in the future.

So, how does it proceed now? In two weeks I have to deliver the review to the journal which mandated it. At the same time (watch my twitter account) it will become available on the preprint server, the standard repository of all elementary particle physics knowledge. Then you can see for yourself what I wrote, and wrote about.

by Axel Maas ( at November 30, 2017 05:15 PM

Lubos Motl - string vacua and pheno

Pop science meets Edward Witten
Off-topic, science: China's DAMPE's dark matter signal

Natalie Wolchover is among the best popular writers about theoretical physics. But when I read her interview with Edward Witten at the Quanta Magazine,
A Physicist’s Physicist Ponders the Nature of Reality,
the clash of cultures seemed amusingly obvious to me. Witten is much smarter than I am and he also loves to formulate things in ways that look insanely diplomatic or cautious to me but I can still feel that his underlying sentiments are extremely close to mine.

They have discussed the conceptual and, I would say, emotional aspects of the (2,0) theory, M-theory, dualities, Wheeler's "it from bit", tennis, a hypothetical new overarching description of all of physics, and other things. It looks so obvious that Wolchover "wanted" to hear completely different answers than she did! ;-)

OK, let me start to comment on the interview. Wolchover explains that he is a physicists' physicist, a geniuses' genius, and a Fields Medal winner, among other things. She managed to interview him but he managed to make her invisible on the stone paths. Well, I felt some dissatisfaction already there. We were told about the children's drawings and piles of papers in Witten's office, not to mention a picture of a woman's buttocks in a vase. After some introduction to dualities and M-theory etc., we were told about her first question.

OK, Wolchover asks why Witten was interested in dualities, which physicists have talked about quite a lot recently. Here it's not quite waterproof but even in the first question, I could hear her "why would you study something as boring and ugly as dualities?". Well, exchanges that appear later in the interview have reinforced this initial impression of mine. I surely think that Wolchover is totally turned off by dualities and, like almost all laymen, she doesn't appreciate them at all.

Witten answered that dualities give you new tools to study questions that looked hopelessly hard. Some examples are sketched, including a few words on the AdS/CFT. Wolchover asks about (AdS) holography but Witten redirects the discussion to a more general concept, the dualities of all the kinds, and says that it's often more than two descriptions that are dual to each other. Again, I think that you can see tension between the two people concerning "what should be discussed and/or celebrated". In this situation, Wolchover seems eager to repeat some of the usual clichés about holography in quantum gravity while Witten wants to emphasize some more general features of all dualities and what they mean.

The following question by Wolchover is the aforementioned confirmation of her negative attitude towards dualities:
Given this web of relationships and the issue of how hard it is to characterize all duality, do you feel that this reflects a lack of understanding of the structure, or is it that we’re seeing the structure, only it’s very complicated?
Do you see her proposed two answers? Both of them are "negative". Either dualities mean that we're dumb; or they mean that the structure is "complicated", by which she rather clearly means "disappointingly ugly". But both of these propositions are completely upside down. Dualities mean that physicists are smart, not dumb; and they imply that the underlying structure is more beautiful, robust, constrained, and exceptional than we had thought. I am pretty sure that Witten would basically agree with these words of mine but there's this tendency of his to avoid any disagreements so he prefers not to address the apparent underlying assumptions of such questions directly. He wants the people – e.g. Wolchover in this case – to find the wisdom themselves. But can they? Does it work? If you don't tell Ms Wolchover that dualities should be studied and celebrated rather than spat upon, can she figure it out herself? I doubt it.

Witten's answer is interesting. He doesn't know whether there is some "simplified description" i.e. one that would make dualities and other things manifest. We don't have it so it's obvious that we must accept the possibility that no such description exists. Nati Seiberg seems to believe that such a description exists. It's a matter of faith at this point.

But Witten makes a general important point (which I have made many times on this blog, too): It's not only mathematics that is central in theoretical physics. It's mathematics that is hard for mathematicians. For mathematicians, it's even hard to rigorously define a quantum field theory and/or prove its existence. It's even harder with the concepts that string theory forces you to add. Why is it so? Well, I would say that the need for "mathematics that is hard for mathematicians" simply means that the Universe is even more mathematical than the contemporary mathematicians. Contemporary mathematicians still discuss objects that are too close to the everyday life while the concepts needed to discuss the laws of physics at the fundamental level are even more abstract, more mathematical.

After Wolchover asks him about the relationship between mathematics and physics, Witten returns to the question about the simplified formulation of quantum field theories etc. and says that he tends to believe that nothing of the sort exists, he can't imagine it. What is Wolchover's question right afterwards? You may guess! :-)
You can’t imagine it at all?
Witten has just said that he couldn't imagine. Why would you ask "can't you imagine it at all"? Wasn't his previous sentence sufficiently clear? Or does she believe that the words "at all" provide the conversation with an exceptionally high added value? It's clear what's going on here. The statement that "the simplified universal definition of quantum field theory probably doesn't exist" is a heresy. It's politically incorrect and the question "you can't imagine it at all?" means nothing else than "recant it immediately!". ;-)

If "I can't imagine such a description" is so shocking to Ms Wolchover, can we ask her: And do you know what such a description should look like? ;-) Obviously, she can't. No one can. If someone could, Witten would have probably learned about it already.

Well, after her "recant it", Witten didn't recant it. He said "No, I can't". If he can't imagine it, he can't imagine it. Among reasonable scientists, there just can't be similar taboos about such questions. Of course the view that such a description doesn't exist is entirely possible. It may very well be true. Wolchover asked a question about the (2,0) theory in 5+1 dimensions – it seems to me that it was a pre-engineered, astroturf question because it doesn't seem natural that she would need to ask about the (2,0) theory. And Witten says that it's a theory that we can't define by Lagrangian-like quantization of a known classical system. But there's a huge body of evidence that the theory exists and its existence also makes lots of the properties of theories in lower-dimensional spacetimes manifest – e.g. the \(SL(2,\ZZ)\) S-duality group of the \({\mathcal N}=4\) supersymmetric gauge theory in \(D=4\).
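To recall the standard form of that duality (my addition, not part of the interview): the \({\mathcal N}=4\) gauge theory's complexified coupling transforms under \(SL(2,\ZZ)\) by fractional linear transformations,

\[
\tau \,=\, \frac{\theta}{2\pi} + \frac{4\pi i}{g^2}, \qquad
\tau \,\to\, \frac{a\tau+b}{c\tau+d}, \qquad
\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in SL(2,\ZZ),
\]

and the special element \(S:\tau\to -1/\tau\) (at \(\theta=0\), simply \(g\to 4\pi/g\)) exchanges the electric and magnetic, i.e. the weakly and strongly coupled, descriptions. It is exactly this group structure that becomes manifest when the four-dimensional theory is viewed as the (2,0) theory compactified on a torus.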

Witten ends up saying that the question "is there a six-dimensional theory with a list of properties" is a more fundamental restatement of the statements about the dualities. Well, it's also a "deeper way of thinking" than just constructing some quantum theories by a quantization of a particular classical system. The previous sentence is mine but I think he would agree with it, too.

Wolchover's jihad against dualities apparently continued:
Dualities sometimes make it hard to maintain a sense of what’s real in the world, given that there are radically different ways you can describe a single system. How would you describe what’s real or fundamental?
Great. So Witten was asked "what's real". She clearly wants some of the dual descriptions of the same physics "not to be real", to be banned or killed and declared "unreal" or "blasphemous in physics" – so that the dualities are killed, too. Well, all of the dual descriptions are exactly equally real – that's why we talk about dualities at all. But she doesn't reveal her intent explicitly so the question is just "what's real".

Needless to say, "what's real" is an extremely vague question from a physicist's viewpoint. Almost any question about physics, science, or anything else could be framed as a version of a "what's real" question. "What's real" may be asked as an elaboration building on basically any previous reason. People may ask whether something is real just to confirm that they should trust some answer they were previously given. People may ask whether the eigenvalues of Hermitian operators are real and they are, in the technical sense of "real". They may ask whether quarks are real – they are even though they can't exist in isolation. They may use the word "real" for "useful scientific concepts", for "gauge-invariant observables". Lots of things may be said to be "real" or "unreal" for dozens of reasons that are ultimately very different.
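For instance, the technical sense in which eigenvalues of Hermitian operators are "real" can be checked in a few lines (a minimal illustration of my own, not something from the interview):

```python
import numpy as np

# Build a Hermitian matrix from a random complex one: h = a + a^dagger.
rng = np.random.default_rng(0)
a = rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4))
h = a + a.conj().T

# eigvalsh exploits Hermiticity and returns a real-valued array: even though
# the matrix entries are complex, the spectrum is real in the technical sense.
eigvals = np.linalg.eigvalsh(h)
print(eigvals.dtype)  # float64 - no imaginary parts at all
```

That is a perfectly sharp, answerable meaning of "real" – quite unlike the question as Wolchover posed it.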

The question doesn't mean anything, not even in the context of dualities – except for the fact that I mentioned, namely that concepts used to describe theories on both or all sides of a duality are equally real. OK, what can Witten answer to a question "what's real"? He's not your humble correspondent so he doesn't explode in a profound and vitally important tirade about Wolchover's meaningless vague questions. Instead, he said:
What aspect of what’s real are you interested in? What does it mean that we exist? Or how do we fit into our mathematical descriptions?
This is just Witten's way of saying "Please think about the rubbish question you have asked. Can you see that it has no beef and it can mean anything?" OK, so Witten said that her question could very well be interpreted as a question by the New Age religious people who are constantly high and who ask whether the Universe is real at all, and so on. But he gave her another option: Do you want to keep on discussing our mathematical description of the Universe?

I can only see the written interview, not the emotions. But I would probably bet that the adrenaline was elevated. Wolchover reacted to Witten's answer by a special tweet:

The tweet sounds like "Witten has given the most original answer (a counter-question) to the question what's real" in the history so far. (Well, I actually respond in almost the same way to "what's real" when I am expected to be polite.) But what I actually read in between the lines is "look, Witten has answered my very deep philosophical question disrespectfully, please help me to spread the idea that he's quite a jerk". ;-)

OK, so which kind of "what's real" do you want to discuss, Ms Wolchover? The latter, the mathematical descriptions, she answers.

Witten keeps on talking about the hypothetical "simpler unified description that clarifies everything". At this point of the interview, it's already staggeringly obvious that Wolchover tries to impose the faith in the existence of this description on Witten but shockingly enough, she finds out that Edward Witten doesn't automatically accept beliefs provided by popular writers to him. Witten's answer is a damn good argument – which I have been well aware of for decades – why this whole search for a single universal description of a TOE may be misguided:
Well, unfortunately, even if it’s correct I can’t guarantee it would help. Part of what makes it difficult to help is that the description we have now, even though it’s not complete, does explain an awful lot. And so it’s a little hard to say, even if you had a truly better description or a more complete description, whether it would help in practice.
The point is that we already have some descriptions that simply must be correct at some rather high level of accuracy. They may be close enough to some observations – they are really helpful for explaining the observations. So if you add a new, at least equally correct description of all of physics, you must still explain why that new description basically reduces to the known and successful ones in some situations or limits. In practice, we will always use the limiting, old descriptions when they work and they will almost certainly be the descriptions of choice for some situations even if we find a deeper description.

Dualities relate so many different environments and vacua that the underlying hypothetical "universal description" must be extremely flexible. It just can't say anything "particular" about the spectrum of particles and other things because those properties may be extremely diverse. So if such a deeper universal description exists, it has to be "at most" a paradigm that justifies the known descriptions – and perhaps allows us to compute tiny corrections in these theories even more accurately or completely precisely (at least in principle). But you simply shouldn't expect a new description that is both universal and directly useful (or even "simplified") to analyze the particular situations!

Another exchange is about M-theory. Witten says that it's totally settled that the theory exists today but we still don't know too much more than in the mid 1990s what the theory is. Some new progress in the bulk-based description of gravitational theories would be useful – I completely agree (too much focus has been on the boundary CFT description in this duality) – but Witten doesn't have too much useful stuff to say except that it's probably more abstract and vague about the spacetime than we are used to from existing descriptions. This "I have nothing useful to say" is a sentence he modestly says often. Well, most other people have 500 times less useful things to say but they present themselves as if they were megagods flying above Witten. The contrast between monster mind Witten's almost unlimited modesty and lots of speculative mediocre minds' unlimited hype and narcissism couldn't be more obvious.

Witten mentions that some days ago, he read Wheeler's "it from bit" texts. Now, he's more tolerant towards similar vague stuff, we hear, because he's older. When Witten was younger and wrote his 36th paper, the best thing in the world was immediately the planned 37th paper, of course, and so on, we learn. ;-) But now he's ready to read some less serious stuff such as Wheeler's "it from bit". This increased tolerance may be partly due to the lower relative difference between the numbers 363 and 364, relatively to 36 and 37. ;-)

Nevertheless, his reactions are still basically the same as mine. Wheeler's comments about physics – "information is physical" – are hopelessly vague and carry no information, Witten reacts in the same way as your humble correspondent. On top of that, Wheeler talked about "bits" but he must have meant "qubits" – the term wasn't usual in those times but Wheeler hopefully meant it, otherwise the text would have been really dumb.

And while the spacetime is probably emergent, that's not a good reason to abandon the continuum of the real numbers. Like your humble correspondent, Witten sees evidence that you had better not try to throw away the continuum from physics. Discrete physics with no connections to the continuum just can't do almost anything. To get rid of the reals is an unpromising starting point. Witten also mentions Wheeler's self-observing picture of an eye and suggests that the observer's being a part of the world that is observed could hide some extra wisdom to be understood. Well, I am agnostic about some ill-defined progress of this kind, I am just pretty sure that the particular ideas that have been proposed to prove this meme are bogus.

One of the last questions by Wolchover was "Do you consider Wheeler a hero?". And Witten just answered "No". Witten just wanted to see what "it from bit" could have meant but I am afraid he just confirmed his expectations that the essay had no beef at all. Witten described Wheeler as a guy who wanted to make big jumps by thousands of years while Witten has been making incremental advances. Well, 100,000+ citations worth of those, I would add. That's how Witten confirms Wolchover's point that he prefers progress through calculations rather than vague visions. He also mentioned he likes to play tennis although he doesn't expect to win Wimbledon for several more years.

To summarize, I think that Wolchover must have seen that she comes from a culture that constantly hypes and worships some vague and would-be ambitious statements by big mouths, that constantly needs to worship authors of such vague visions, that is annoyed by mathematics and everything that looks complicated or that has many aspects or many solutions, and so on, and she was forced to see that the methodology and the value system of a top-tier physicist – and, indeed, most top-tier physicists – is extremely different.

And that's the memo.

P.S.: At this moment, there are 8 comments under the interview. By the Pentcho Valev crackpot, by another crackpot who fights QCD, another one that thinks that he has a theory competing with string/M-theory, the fourth crackpot who believes that dualities contradict mathematical logic, and a few more. There's quite a company over there. I am fortunate to have some of you, the brilliant readers, because if I only saw comments like on those websites, I would surely conclude that any writing like that is a waste of time.

P.P.S.: There are 20 comments now. A Vietnamese thinker has an "educated guess" that Carlo Rovelli is more impressive than Witten. ;-) Zarzuelazen talks about Sean Carroll and "nonlocality", and random mixtures of entropy with other things. Someone else quotes Hossenfelder's seven theories of everything – six cranks who are Witten's peers because it was written somewhere in the cesspool of the Internet. Indeed, quite a company.

by Luboš Motl at November 30, 2017 03:19 PM

November 28, 2017

ZapperZ - Physics and Physicists

Employee Used A "Faraday Cage" To Hide His Whereabouts
This is one way to be "invisible".

An employee in Perth, Australia, used the metallic packaging from a snack to shield his device's GPS and hide his whereabouts. He then went golfing... many times, during his work hours.

The tribunal found that the packet was deliberately used to operate  as an elaborate “Faraday cage” - an enclosure which can block electromagnetic fields - and prevented his employer knowing his location. The cage set-up was named after English scientist Michael Faraday, who in 1836 observed that a continuous covering of conductive material could be used to block electromagnetic fields.

Now, if it works for his device, it should work to shield our credit cards as an RFID shield, don't you think? There's no reason to buy those expensive wallets or credit-card envelopes. Next time you have a bag of Cheetos or potato chips, save the bag and wrap your wallet with it! :)


by ZapperZ at November 28, 2017 07:07 PM

November 27, 2017

John Baez - Azimuth

A Universal Snake-like Continuum

It sounds like jargon from a bad episode of Star Trek. But it’s a real thing. It’s a monstrous object that lives in the plane, but is impossible to draw.

Do you want to see how snake-like it is? Okay, but beware… this video clip is a warning:

This snake-like monster is also called the ‘pseudo-arc’. It’s the limit of a sequence of curves that get more and more wiggly. Here are the 5th and 6th curves in the sequence:

Here are the 8th and 10th:

But what happens if you try to draw the pseudo-arc itself, the limit of all these curves? It turns out to be infinitely wiggly—so wiggly that any picture of it is useless.

In fact Wayne Lewis and Piotr Minc wrote a paper about this, called Drawing the pseudo-arc. That’s where I got these pictures. The paper also shows stage 200, and it’s a big fat ugly black blob!

But the pseudo-arc is beautiful if you see through the pictures to the concepts, because it’s a universal snake-like continuum. Let me explain. This takes some math.

The nicest metric spaces are compact metric spaces, and each of these can be written as the union of connected components… so there’s a long history of interest in compact connected metric spaces. Except for the empty set, which probably doesn’t deserve to be called connected, these spaces are called continua.

Like all point-set topology, the study of continua is considered a bit old-fashioned, because people have been working on it for so long, and it’s hard to get good new results. But on the bright side, what this means is that many great mathematicians have contributed to it, and there are lots of nice theorems. You can learn about it here:

• W. T. Ingram, A brief historical view of continuum theory,
Topology and its Applications 153 (2006), 1530–1539.

• Sam B. Nadler, Jr, Continuum Theory: An Introduction, Marcel Dekker, New York, 1992.

Now, if we’re doing topology, we should really talk not about metric spaces but about metrizable spaces: that is, topological spaces where the topology comes from some metric, which is not necessarily unique. This nuance is a way of clarifying that we don’t really care about the metric, just the topology.

So, we define a continuum to be a nonempty compact connected metrizable space. When I think of this I think of a curve, or a ball, or a sphere. Or maybe something bigger like the Hilbert cube: the countably infinite product of closed intervals. Or maybe something full of holes, like the Sierpinski carpet:

or the Menger sponge:

Or maybe something weird like a solenoid:

Very roughly, a continuum is ‘snake-like’ if it’s long and skinny and doesn’t loop around. But the precise definition is a bit harder:

We say that an open cover 𝒰 of a space X refines an open cover 𝒱 if each element of 𝒰 is contained in an element of 𝒱. We call a continuum X snake-like if each open cover of X can be refined by an open cover U1, …, Un such that Ui intersects Uj if and only if |i − j| ≤ 1: each link meets itself and its immediate neighbors, and no others.

Such a cover is called a chain, so a snake-like continuum is also called chainable. But ‘snake-like’ is so much cooler: we should take advantage of any opportunity to bring snakes into mathematics!
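For covers of [0,1] by open intervals, the chain condition is easy to test mechanically. Here is a toy sketch (the function name and the example covers are mine, purely for illustration):

```python
# A list of open intervals (a_i, b_i), ordered as U_1, ..., U_n, forms a
# chain if U_i meets U_j exactly when |i - j| <= 1.
def is_chain(intervals):
    def meets(p, q):                      # two open intervals intersect iff
        (a1, b1), (a2, b2) = p, q         # the overlap has positive length
        return max(a1, a2) < min(b1, b2)
    n = len(intervals)
    return all((abs(i - j) <= 1) == meets(intervals[i], intervals[j])
               for i in range(n) for j in range(n))

# Three consecutive overlapping links: a chain.
print(is_chain([(-0.1, 0.4), (0.3, 0.7), (0.6, 1.1)]))   # True
# First and last links also overlap, so the cover loops around: not a chain.
print(is_chain([(-0.1, 0.7), (0.3, 0.9), (0.6, 1.1)]))   # False
```

The second example is the kind of cover you'd get for a circle rather than an arc, which is why snake-like continua can't loop around.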

The simplest snake-like continuum is the closed unit interval [0,1]. It’s hard to think of others. But here’s what Mioduszewski proved in 1962: the pseudo-arc is a universal snake-like continuum. That is: it’s a snake-like continuum, and it has a continuous map onto every snake-like continuum!

This is a way of saying that the pseudo-arc is the most complicated snake-like continuum possible. A bit more precisely: it bends back on itself as much as possible while still going somewhere! You can see this from the pictures above, or from the construction on Wikipedia:

• Wikipedia, Pseudo-arc.

I like the idea that there’s a subset of the plane with this simple ‘universal’ property, which however is so complicated that it’s impossible to draw.

Here’s the paper where these pictures came from:

• Wayne Lewis and Piotr Minc, Drawing the pseudo-arc, Houston J. Math. 36 (2010), 905–934.

The pseudo-arc has other amazing properties. For example, it’s ‘indecomposable’. A nonempty connected closed subset of a continuum is a continuum in its own right, called a subcontinuum, and we say a continuum is indecomposable if it is not the union of two proper subcontinua.

It takes a while to get used to this idea, since all the examples of continua that I’ve listed so far are decomposable except for the pseudo-arc and the solenoid!

Of course a single point is an indecomposable continuum, but that example is so boring that people sometimes exclude it. The first interesting example was discovered by Brouwer in 1910. It’s the intersection of an infinite sequence of sets like this:

It’s called the Brouwer–Janiszewski–Knaster continuum or buckethandle. Like the solenoid, it shows up as an attractor in some chaotic dynamical systems.

It’s easy to imagine how if you write the buckethandle as the union of two closed proper subsets, at least one will be disconnected. And note: you don’t even need these subsets to be disjoint! So, it’s an indecomposable continuum.

But once you get used to indecomposable continua, you’re ready for the next level of weirdness. An even more dramatic thing is a hereditarily indecomposable continuum: one for which each subcontinuum is also indecomposable.

Apart from a single point, the pseudo-arc is the unique hereditarily indecomposable snake-like continuum! I believe this was first proved here:

• R. H. Bing, Concerning hereditarily indecomposable continua, Pacific J. Math. 1 (1951), 43–51.

Finally, here’s one more amazing fact about the pseudo-arc. To explain it, I need a bunch more nice math:

Every continuum arises as a closed subset of the Hilbert cube. There’s an obvious way to define the distance between two closed subsets of a compact metric space, called the Hausdorff distance—if you don’t know about this already, it’s fun to reinvent it yourself. The set of all closed subsets of a compact metric space thus forms a metric space in its own right—and by the way, the Blaschke selection theorem says this metric space is again compact!
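Before reinventing the general definition, it may help to compute the Hausdorff distance for finite point sets, where the two suprema become plain maxima. A minimal NumPy sketch (the function name is mine):

```python
import numpy as np

def hausdorff(A, B):
    """Hausdorff distance between finite point sets A, B of shape (n, d):
    the farthest any point of one set is from the nearest point of the other."""
    D = np.linalg.norm(A[:, None, :] - B[None, :, :], axis=-1)  # pairwise distances
    return max(D.min(axis=1).max(), D.min(axis=0).max())

A = np.array([[0.0, 0.0], [1.0, 0.0]])
B = np.array([[0.0, 0.0], [0.0, 2.0]])
print(hausdorff(A, B))   # 2.0 — the point (0, 2) is at distance 2 from A
```

Note the asymmetric "max of min" structure: each one-sided distance alone is not a metric, but the maximum of the two is.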

Anyway, this stuff means that there’s a metric space whose points are all subcontinua of the Hilbert cube, and we don’t miss out on any continua by looking at these. So we can call this the space of all continua.

Now for the amazing fact: pseudo-arcs are dense in the space of all continua!

I don’t know who proved this. It’s mentioned here:

• Trevor L. Irwin and Sławomir Solecki, Projective Fraïssé limits and the pseudo-arc.

but they refer to this paper as a good source for such facts:

• Wayne Lewis, The pseudo-arc, Bol. Soc. Mat. Mexicana (3) 5 (1999), 25–77.

Abstract. The pseudo-arc is the simplest nondegenerate hereditarily indecomposable continuum. It is, however, also the most important, being homogeneous, having several characterizations, and having a variety of useful mapping properties. The pseudo-arc has appeared in many areas of continuum theory, as well as in several topics in geometric topology, and is beginning to make its appearance in dynamical systems. In this monograph, we give a survey of basic results and examples involving the pseudo-arc. A more complete treatment will be given in a book dedicated to this topic, currently under preparation by this author. We omit formal proofs from this presentation, but do try to give indications of some basic arguments and construction techniques. Our presentation covers the following major topics: 1. Construction 2. Homogeneity 3. Characterizations 4. Mapping properties 5. Hyperspaces 6. Homeomorphism groups 7. Continuous decompositions 8. Dynamics.

It may seem surprising that one can write a whole book about the pseudo-arc… but if you like continua, it’s a fundamental structure just like spheres and cubes!

by John Baez at November 27, 2017 12:11 AM

November 26, 2017

Clifford V. Johnson - Asymptotia

Pleasant Discovery

So I don’t know about you, but I’ve been really enjoying the new series of Star Trek. I started watching Star Trek Discovery because I was one of the science advisors they talked with from the early writing stages to finish, building some of the science ideas, concepts, and tone for their reimagining of the Star Trek universe.

Over many months after the initial meeting with all the writers, I would take calls from individual writers and researchers and give them ideas or phrases they could use and so forth. But much of the work was done blind, which is to say I had very little context for most of what they were asking me about. I think they do it this way because they want to protect a lot of the material from leaking because, well, it's Star Trek! Yes, you'll know from my various writings and interviews about science advising that this is not usually my preferred way of working as an advisor, but I was happy to help in this case and make an exception because, after all, this is a huge show with a tradition of inspiring people about science over many generations. So it could be of value, just by virtue of some of the little science ideas that I helped sprinkle in there, however accurately or inaccurately they were used. The bottom line is that Star Trek just isn't about accuracy, it's about inspiration and dreams.

Well, needless to say, when it came out I was curious to see how they used [...] Click to continue reading this post

The post Pleasant Discovery appeared first on Asymptotia.

by Clifford at November 26, 2017 11:01 PM

November 24, 2017

Clifford V. Johnson - Asymptotia

Two Events!

(Image above courtesy of Cellar Door Books in Riverside, CA.)

Happy Thanksgiving! This coming week, there'll be two events that might be of interest to people either in the Los Angeles area, or the New York area.

The first is an event (Tues. 28th Nov., 7pm, Co-sponsored by LARB and Chevalier's Books) centered around my new book, the Dialogues. It is the first such LA event, starting with a chat with writer and delightful conversationalist [...] Click to continue reading this post

The post Two Events! appeared first on Asymptotia.

by Clifford at November 24, 2017 05:44 PM

Sean Carroll - Preposterous Universe


This year we give thanks for a simple but profound principle of statistical mechanics that extends the famous Second Law of Thermodynamics: the Jarzynski Equality. (We’ve previously given thanks for the Standard Model Lagrangian, Hubble’s Law, the Spin-Statistics Theorem, conservation of momentum, effective field theory, the error bar, gauge symmetry, Landauer’s Principle, the Fourier Transform, Riemannian Geometry, and the speed of light.)

The Second Law says that entropy increases in closed systems. But really it says that entropy usually increases; thermodynamics is the limit of statistical mechanics, and in the real world there can be rare but inevitable fluctuations around the typical behavior. The Jarzynski Equality is a way of quantifying such fluctuations, which is increasingly important in the modern world of nanoscale science and biophysics.

Our story begins, as so many thermodynamic tales tend to do, with manipulating a piston containing a certain amount of gas. The gas is of course made of a number of jiggling particles (atoms and molecules). All of those jiggling particles contain energy, and we call the total amount of that energy the internal energy U of the gas. Let’s imagine the whole thing is embedded in an environment (a “heat bath”) at temperature T. That means that the gas inside the piston starts at temperature T, and after we manipulate it a bit and let it settle down, it will relax back to T by exchanging heat with the environment as necessary.

Finally, let’s divide the internal energy into “useful energy” and “useless energy.” The useful energy, known to the cognoscenti as the (Helmholtz) free energy and denoted by F, is the amount of energy potentially available to do useful work. For example, the pressure in our piston may be quite high, and we could release it to push a lever or something. But there is also useless energy, which is just the entropy S of the system times the temperature T. That expresses the fact that once energy is in a highly-entropic form, there’s nothing useful we can do with it any more. So the total internal energy is the free energy plus the useless energy,

U = F + TS. \qquad \qquad (1)

Our piston starts in a boring equilibrium configuration a, but we’re not going to let it just sit there. Instead, we’re going to push in the piston, decreasing the volume inside, ending up in configuration b. This squeezes the gas together, and we expect that the total amount of energy will go up. It will typically cost us energy to do this, of course, and we refer to that energy as the work Wab we do when we push the piston from a to b.

Remember that when we’re done pushing, the system might have heated up a bit, but we let it exchange heat Q with the environment to return to the temperature T. So three things happen when we do our work on the piston: (1) the free energy of the system changes; (2) the entropy changes, and therefore the useless energy; and (3) heat is exchanged with the environment. In total we have

W_{ab} = \Delta F_{ab} + T\Delta S_{ab} - Q_{ab}.\qquad \qquad (2)

(There is no ΔT, because T is the temperature of the environment, which stays fixed.) The Second Law of Thermodynamics says that entropy increases (or stays constant) in closed systems. Our system isn’t closed, since it might leak heat to the environment. But really the Second Law says that the total of the last two terms on the right-hand side of this equation add up to a positive number; in other words, the increase in entropy will more than compensate for the loss of heat. (Alternatively, you can lower the entropy of a bottle of champagne by putting it in a refrigerator and letting it cool down; no laws of physics are violated.) One way of stating the Second Law for situations such as this is therefore

W_{ab} \geq \Delta F_{ab}. \qquad \qquad (3)

The work we do on the system is greater than or equal to the change in free energy from beginning to end. We can make this inequality into an equality if we act as efficiently as possible, minimizing the entropy/heat production: that’s an adiabatic process, and in practical terms amounts to moving the piston as gradually as possible, rather than giving it a sudden jolt. That’s the limit in which the process is reversible: we can get the same energy out as we put in, just by going backwards.

Awesome. But the language we’re speaking here is that of classical thermodynamics, which we all know is the limit of statistical mechanics when we have many particles. Let’s be a little more modern and open-minded, and take seriously the fact that our gas is actually a collection of particles in random motion. Because of that randomness, there will be fluctuations over and above the “typical” behavior we’ve been describing. Maybe, just by chance, all of the gas molecules happen to be moving away from our piston just as we move it, so we don’t have to do any work at all; alternatively, maybe there are more than the usual number of molecules hitting the piston, so we have to do more work than usual. The Jarzynski Equality, derived 20 years ago by Christopher Jarzynski, is a way of saying something about those fluctuations.

One simple way of taking our thermodynamic version of the Second Law (3) and making it still hold true in a world of fluctuations is simply to say that it holds true on average. To denote an average over all possible things that could be happening in our system, we write angle brackets \langle \cdots \rangle around the quantity in question. So a more precise statement would be that the average work we do is greater than or equal to the change in free energy:

\displaystyle \left\langle W_{ab}\right\rangle \geq \Delta F_{ab}. \qquad \qquad (4)

(We don’t need angle brackets around ΔF, because F is determined completely by the equilibrium properties of the initial and final states a and b; it doesn’t fluctuate.) Let me multiply both sides by -1, which means we need to flip the inequality sign to go the other way around:

\displaystyle -\left\langle W_{ab}\right\rangle \leq -\Delta F_{ab}. \qquad \qquad (5)

Next I will exponentiate both sides of the inequality. Note that this keeps the inequality sign going the same way, because the exponential is a monotonically increasing function; if x is less than y, we know that ex is less than ey.

\displaystyle e^{-\left\langle W_{ab}\right\rangle} \leq e^{-\Delta F_{ab}}. \qquad\qquad (6)

(More typically we will see the exponents divided by kT, where k is Boltzmann’s constant, but for simplicity I’m using units where kT = 1.)

Jarzynski’s equality is the following remarkable statement: in equation (6), if we exchange the exponential of the average work e^{-\langle W\rangle} for the average of the exponential of the work \langle e^{-W}\rangle, we get a precise equality, not merely an inequality:

\displaystyle \left\langle e^{-W_{ab}}\right\rangle = e^{-\Delta F_{ab}}. \qquad\qquad (7)

That’s the Jarzynski Equality: the average, over many trials, of the exponential of minus the work done, is equal to the exponential of minus the free energies between the initial and final states. It’s a stronger statement than the Second Law, just because it’s an equality rather than an inequality.

In fact, we can derive the Second Law from the Jarzynski equality, using a math trick known as Jensen’s inequality. For our purposes, this says that the exponential of an average is less than the average of an exponential, e^{\langle x\rangle} \leq \langle e^x \rangle. Thus we immediately get

\displaystyle e^{-\left\langle W_{ab}\right\rangle} \leq \left\langle e^{-W_{ab}}\right\rangle = e^{-\Delta F_{ab}}, \qquad\qquad (8)

as we had before. Then just take the log of both sides to get \langle W_{ab}\rangle \geq \Delta F_{ab}, which is one way of writing the Second Law.

So what does it mean? As we said, because of fluctuations, the work we needed to do on the piston will sometimes be a bit less than or a bit greater than the average, and the Second Law says that the average will be greater than the difference in free energies from beginning to end. Jarzynski’s Equality says there is a quantity, the exponential of minus the work, that averages out to be exactly the exponential of minus the free-energy difference. The function e^{-W} is convex and decreasing as a function of W. A fluctuation where W is lower than average, therefore, contributes a greater shift to the average of e^{-W} than a corresponding fluctuation where W is higher than average. To satisfy the Jarzynski Equality, we must have more fluctuations upward in W than downward in W, by a precise amount. So on average, we’ll need to do more work than the difference in free energies, as the Second Law implies.
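This is easy to check numerically in a toy model where the work values are Gaussian, as they are, for instance, for a slowly dragged particle in a harmonic trap. For W drawn from a normal distribution with mean μ and variance σ², the average of e^{-W} is a Gaussian integral you can do exactly, and the Jarzynski Equality then fixes ΔF = μ − σ²/2 (in units where kT = 1). A quick Monte Carlo sketch — the model and all the names are mine, not Jarzynski's:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy model (units kT = 1): the work done over many repeated experiments is
# Gaussian with mean mu and variance sigma**2.  Then <exp(-W)> can be done
# in closed form, and the Jarzynski Equality gives Delta_F = mu - sigma**2/2.
mu, sigma = 2.0, 1.0
delta_F_exact = mu - sigma**2 / 2            # = 1.5 for these numbers

W = rng.normal(mu, sigma, size=1_000_000)    # one work value per "experiment"

# Estimate Delta_F from the fluctuating work values via the Jarzynski average:
delta_F_est = -np.log(np.mean(np.exp(-W)))

print(delta_F_est)   # close to 1.5
print(W.mean())      # close to 2.0 — strictly bigger, as the Second Law demands
```

Notice that the average is dominated by the rare low-W trials, which is exactly why using the Jarzynski Equality on real experimental data takes care: you need enough samples to see those fluctuations.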

It’s a remarkable thing, really. Much of conventional thermodynamics deals with inequalities, with equality being achieved only in adiabatic processes happening close to equilibrium. The Jarzynski Equality is fully non-equilibrium, achieving equality no matter how dramatically we push around our piston. It tells us not only about the average behavior of statistical systems, but about the full ensemble of possibilities for individual trajectories around that average.

The Jarzynski Equality has launched a mini-revolution in nonequilibrium statistical mechanics, the news of which hasn’t quite trickled to the outside world as yet. It’s one of a number of relations, collectively known as “fluctuation theorems,” which also include the Crooks Fluctuation Theorem, not to mention our own Bayesian Second Law of Thermodynamics. As our technological and experimental capabilities reach down to scales where the fluctuations become important, our theoretical toolbox has to keep pace. And that’s happening: the Jarzynski equality isn’t just imagination, it’s been experimentally tested and verified. (Of course, I remain just a poor theorist myself, so if you want to understand this image from the experimental paper, you’ll have to talk to someone who knows more about Raman spectroscopy than I do.)

by Sean Carroll at November 24, 2017 02:04 AM

November 23, 2017

The n-Category Cafe

Real Sets

Good news! Janelidze and Street have tackled some puzzles that are perennial favorites here on the n-Café:

  • George Janelidze and Ross Street, Real sets, Tbilisi Mathematical Journal, 10 (2017), 23–49.

Abstract. After reviewing a universal characterization of the extended positive real numbers published by Denis Higgs in 1978, we define a category which provides an answer to the questions:

• what is a set with half an element?

• what is a set with π elements?

The category of these extended positive real sets is equipped with a countable tensor product. We develop somewhat the theory of categories with countable tensors; we call the commutative such categories series monoidal and conclude by only briefly mentioning the non-commutative possibility called ω-monoidal. We include some remarks on sets having cardinalities in [-\infty,\infty].

First they define a series magma, which is a set A equipped with an element 0 and a summation function

\sum \colon A^{\mathbb{N}} \to A

obeying a nice generalization of the law a + 0 = 0 + a = a. Then they define a series monoid in which this summation function obeys a version of the commutative law.

(Yeah, the terminology here seems a bit weird: their summation function already has associativity built in, so their ‘series magma’ is associative and their ‘series monoid’ is also commutative!)

The forgetful functor from series monoids to sets has a left adjoint, and as you’d expect, the free series monoid on the one-element set is \mathbb{N} \cup \{\infty\}. A more interesting series monoid is [0,\infty], and one early goal of the paper is to recall Higgs’ categorical description of this. That’s Denis Higgs. Peter Higgs has a boson, but Denis Higgs has a nice theorem.

First, some preliminaries:

Countable products of series monoids coincide with countable coproducts, just as finite products of commutative monoids coincide with finite coproducts.

There is a tensor product of series monoids, which is very similar to the tensor product of commutative monoids — or, to a lesser extent, the more familiar tensor product of abelian groups. Monoids with respect to this tensor product are called series rigs. For abstract nonsense reasons, because \mathbb{N} \cup \{\infty\} is the free series monoid on one element, it also becomes a series rig… with the usual multiplication and addition. (Well, more or less usual: if you’re not familiar with this stuff, a good exercise is to figure out what 0 times \infty must be.)

Now for the characterization of [0,\infty]. Given an endomorphism f \colon A \to A of a series monoid A you can define a new endomorphism \overline{f} \colon A \to A by

\overline{f} = f + f \circ f + f \circ f \circ f + \cdots

where the infinite sum is defined using the series monoid structure on A. Following Higgs, Janelidze and Street define a Zeno morphism to be an endomorphism h \colon A \to A such that

\overline{h} = 1_A

The reason for this name is that in [0,\infty] we have

1 = \frac{1}{2} + \left(\frac{1}{2}\right)^2 + \left(\frac{1}{2}\right)^3 + \cdots

putting us in mind of Zeno’s paradox:

That which is in locomotion must arrive at the half-way stage before it arrives at the goal. — Aristotle, Physics VI:9, 239b10.

So, it makes lots of sense to think of any Zeno morphism h \colon A \to A as a ‘halving’ operation. Hence the name h.

In particular, one can show any Zeno morphism obeys

h + h = 1_A

Higgs called a series monoid equipped with a Zeno morphism a magnitude module, and he showed that the free magnitude module on one element is [0,\infty]. By the same flavor of abstract nonsense as before, this implies that [0,\infty] is a series rig… with the usual addition and multiplication.
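In [0,∞] the Zeno morphism is literally "halve", and the defining identity \overline{h} = 1_A is just the geometric series above. A throwaway numerical check (nothing here is from the paper, it's purely illustrative):

```python
# Illustrative only: in the series monoid [0, oo] a Zeno morphism is h(x) = x/2.
# Its series closure  h + h∘h + h∘h∘h + ...  sends x to x/2 + x/4 + x/8 + ...,
# which should be the identity map.
def h(x):
    return x / 2

def h_bar(x, terms=60):
    """Partial sum h(x) + h(h(x)) + ... with `terms` terms."""
    total, y = 0.0, x
    for _ in range(terms):
        y = h(y)            # y = x / 2**n after n steps
        total += y
    return total

print(h_bar(1.0))           # 1.0, up to floating-point truncation
print(h(3.0) + h(3.0))      # 3.0, i.e. h + h = identity
```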


Next, Janelidze and Street categorify the entire discussion so far! They define a ‘series monoidal category’ to be a category A with an object 0 \in A and summation functor

\sum \colon A^{\mathbb{N}} \to A

obeying some reasonable properties… up to natural isomorphisms that themselves obey some reasonable properties. So, it’s a category where we can add infinite sequences of objects. For example, every series monoid gives a series monoidal category with only identity morphisms. The maps between series monoidal categories are called ‘series monoidal functors’.

They define a ‘Zeno functor’ to be a series monoidal functor h \colon A \to A obeying a categorified version of the definition of Zeno morphism. A series monoidal category with a Zeno functor is called a ‘magnitude category’.

As you’d guess, there are also ‘magnitude functors’ and ‘magnitude natural transformations’, giving a 2-category MgnCat. There’s a forgetful 2-functor

U \colon MgnCat \to Cat

and it has a left adjoint (or, as Janelidze and Street say, a left ‘biadjoint’)

F \colon Cat \to MgnCat

Applying F to the terminal category 1, they get a magnitude category RSet_g of positive real sets. These are like sets, but their cardinality can be anything in [0,\infty]!

For example, Janelidze and Street construct a positive real set of cardinality π. Unfortunately they do it starting from the binary expansion of π, so it doesn’t connect in a very interesting way with anything I know about the number π.

What’s that little subscript g? Well, unfortunately RSet_g is a groupoid: the only morphisms between positive real sets we get from this construction are the isomorphisms.

So, there’s a lot of great stuff here, but apparently a lot left to do.

Digressive Postlude

There is more to say, but I need to get going — I have to walk 45 minutes to Paris 7 to talk to Mathieu Anel about symplectic geometry, and then have lunch with him and Paul-André Melliès. Paul-André kindly invited me to participate in his habilitation defense on Monday, along with Gordon Plotkin, André Joyal, Jean-Yves Girard, Thierry Coquand, Pierre-Louis Curien, George Gonthier, and my friend Karine Chemla (an expert on the history of Chinese mathematics). Paul-André has some wonderful ideas on linear logic, Frobenius pseudomonads, game semantics and the like, and we want to figure out more precisely how all this stuff is connected to topological quantum field theory. I think nobody has gotten to the bottom of this! So, I hope to spend more time here, figuring it out with Paul-André.

by john at November 23, 2017 07:58 PM

November 19, 2017

ZapperZ - Physics and Physicists

Can A Simple Physics Error Cast Doubt On A da Vinci Painting?
It seems that the recent auction of a Leonardo da Vinci painting (for $450 million no less) has what everyone seems to call a physics flaw. It involves the crystal orb that is being held in the painting.

A major flaw in the painting — which is the only one of da Vinci's that remains in private hands — makes some historians think it's a fake. The crystal orb in the image doesn't distort light in the way that natural physics does, which would be an unusual error for da Vinci.

My reaction when I first read this was that it is not as if da Vinci painted this from life, with the actual Jesus Christ holding the orb. So either he made a mistake, or he knew what he was doing and didn't think it would matter. I don't think this observation alone is enough to call the painting a fake.

Still, it may make a good class example in Intro Physics optics.


by ZapperZ at November 19, 2017 02:58 AM

November 17, 2017

Tommaso Dorigo - Scientificblogging

My Interview On Physics Today
Following the appearance of Kent Staley's review of my book "Anomaly!" in the November 2017 issue of Physics Today, the online site of the magazine offers, starting today, an interview with yours truly. I think the piece is quite readable and I encourage you to give it a look. Here I only quote a couple of passages for the laziest readers.


by Tommaso Dorigo at November 17, 2017 11:33 PM

ZapperZ - Physics and Physicists

Reviews of "The Quantum Labyrinth"
Paul Halpern's story of "when Feynman met Wheeler" in his book "The Quantum Labyrinth" has two interesting reviews that you can read (here and here). In the history of physics and human civilization, the meeting of the minds of these two giants of physics must rank up there with other partnerships, such as Lennon and McCartney, Hewlett and Packard, peanut butter and jelly, etc....

I have not read the book yet, and probably won't get to it till some time next year. But if you have read it, I'd like to hear what you think of it.


by ZapperZ at November 17, 2017 08:03 PM

November 09, 2017

Robert Helling - atdotde

Why is there a supercontinent cycle?
One of the most influential books of my early childhood was my "Kinderatlas".
There were many things to learn about the world in it (maps actually made up only the last third of the book); for example, I blame my fascination with scuba diving on this book. And last year, when we visited the Mont-Doré in Auvergne and I had to explain to my kids how volcanoes are formed, to make them forget how many stairs still lay ahead of them on the way to the summit, I did so while mentally picturing the pages in that book about plate tectonics.

But there is one thing about tectonics that has been bothering me for a long time, and I still haven't found a good explanation for it (or at least an acknowledgement that there is something to explain): since the days of Alfred Wegener we have known that the continents fit together like jigsaw-puzzle pieces, so well that geologists believe that some hundred million years ago they were all connected as a supercontinent, Pangea.
[Animation: the breakup of Pangea. USGS animation A08, public domain, via en:User:Tbower.]

In fact, that was only the last in a series of supercontinents that keep forming and breaking up in the "supercontinent cycle".
[Illustration: the supercontinent cycle. By SimplisticReps, own work, CC BY-SA 4.0.]

So here is the question: I am happy with the idea of several (say $N$) plates, each roughly containing a continent, floating around on the magma, driven by all kinds of convection processes in the liquid part of the earth. They move around in a pattern that looks pretty chaotic to me (in the non-technical sense), and of course for random motion you would expect that from time to time two of them collide and then maybe stick together for a while.

Then it would also be possible for a third plate to collide with those two, but that would be a coincidence (much as two random lines typically intersect, while three random lines typically intersect in pairs but not in a triple point). But to form a supercontinent, you need all $N$ plates to collide miraculously at more or less the same time. This order-$N$ process seems highly unlikely if the motion is random, let alone the fact that it seems to repeat. So this motion cannot be random (yes, Sabine, this is a naturalness argument). This needs an explanation.
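A toy model (my own, not from the post) makes the order-$N$ coincidence quantitative: scatter $N$ continents as independent, uniformly random points on a circle of longitude and ask how often they all end up within one semicircle. The exact answer, $N/2^{N-1}$, decays exponentially with $N$, and a quick Monte Carlo confirms it; this is only a sketch of the coincidence argument, not a model of actual plate motion:

```python
import math
import random

def all_in_semicircle(angles):
    """True if all points (angles in radians) fit inside some semicircle."""
    a = sorted(angles)
    # gaps between consecutive points, plus the wrap-around gap
    gaps = [a[i + 1] - a[i] for i in range(len(a) - 1)]
    gaps.append(2 * math.pi - (a[-1] - a[0]))
    # all points fit in a semicircle iff the largest empty gap is >= pi
    return max(gaps) >= math.pi

def clustering_probability(n, trials=200_000, seed=1):
    """Monte Carlo estimate of P(n random points share a semicircle)."""
    rng = random.Random(seed)
    hits = sum(
        all_in_semicircle([rng.uniform(0, 2 * math.pi) for _ in range(n)])
        for _ in range(trials)
    )
    return hits / trials

for n in (3, 7, 12):
    exact = n / 2 ** (n - 1)  # classical result for points on a circle
    print(n, exact, clustering_probability(n))
```

For $N = 12$ plates the chance is already below 0.6% per random "snapshot", which is the sense in which a repeating all-$N$ assembly cries out for a mechanism.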

So why, every few hundred million years, do all the land masses of the earth assemble on one side of the earth?

One explanation could be, for example, that during those times the center of mass of the earth is displaced from its symmetry center, so that the water of the oceans flows to one side of the earth and exposes the seabed on the opposite side. Then you would have essentially one big island. But this seems not to be the case, as the continents (the parts that are above sea level) appear to be stable on much longer time scales. It is not that the seabed comes up on one side while the land on the other goes under water; the land masses actually move around to meet on one side.

I have asked this question of whomever I ran into with a geosciences education, but it is still open (and I have to admit that in a non-zero number of cases I failed even to make clear that an $N$-body collision is something needing an explanation). But I am sure that you, my readers, know the answer, or even better can come up with one.

by Robert Helling at November 09, 2017 09:35 AM

October 24, 2017

Andrew Jaffe - Leaves on the Line

The Chandrasekhar Mass and the Hubble Constant


The first direct detection of gravitational waves was announced in February 2016 by the LIGO team, after decades of planning, building and refining their beautiful experiment. Since that time, the US-based LIGO has been joined by the European Virgo gravitational wave telescope (and more are planned around the globe).

The first four events that the teams announced were from the spiralling in and eventual mergers of pairs of black holes, with masses ranging from about seven to about forty times the mass of the sun. These masses are perhaps a bit higher than we expect to be typical, which might raise intriguing questions about how such black holes were formed and evolved, although even comparing the results to the predictions is a hard problem, depending on the details of the statistical properties of the detectors and the astrophysical models for the evolution of black holes and the stars from which (we think) they formed.

Last week, the teams announced the detection of a very different kind of event, the collision of two neutron stars, each about 1.4 times the mass of the sun. Neutron stars are one possible end state of the evolution of a star, when its atoms are no longer able to withstand the pressure of the gravity trying to force them together. This was first understood in 1930 by S. Chandrasekhar, who realised that there was a limit to the mass of a star held up simply by the quantum-mechanical repulsion of the electrons at the outskirts of the atoms making up the star. When you surpass this mass, known, appropriately enough, as the Chandrasekhar mass, the star will collapse in upon itself, combining the electrons and protons into neutrons and likely releasing a vast amount of energy in the form of a supernova explosion. After the explosion, the remnant is likely to be a dense ball of neutrons, whose properties are actually determined fairly precisely by physics similar to that of the Chandrasekhar limit (discussed for this case by Oppenheimer, Volkoff and Tolman), giving us the magic 1.4 solar mass number.
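As a sanity check on that "magic" number, the Chandrasekhar mass can be assembled from fundamental constants. A minimal sketch, assuming the standard prefactor $\omega \approx 2.018$ from the $n=3$ polytrope and $\mu_e = 2$ nucleons per electron:

```python
import math

# fundamental constants, SI units
hbar = 1.054571817e-34   # J s
c    = 2.99792458e8      # m / s
G    = 6.67430e-11       # m^3 / (kg s^2)
m_H  = 1.6735575e-27     # kg, hydrogen atom mass
M_sun = 1.98892e30       # kg

mu_e = 2.0        # nucleons per electron (fully ionized He/C/O core)
omega = 2.018236  # constant from the n = 3 Lane-Emden polytrope

# M_Ch = omega * (sqrt(3*pi)/2) * (hbar*c/G)^(3/2) / (mu_e * m_H)^2
M_ch = (omega * (math.sqrt(3 * math.pi) / 2)
        * (hbar * c / G) ** 1.5 / (mu_e * m_H) ** 2)
print(M_ch / M_sun)   # ~1.4 solar masses
```

Everything in the formula except $\mu_e$ is a constant of nature, which is why the same ~1.4 solar masses keeps showing up.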

(Last week also coincidentally would have seen Chandrasekhar’s 107th birthday, and Google chose to illustrate their home page with an animation in his honour for the occasion. I was a graduate student at the University of Chicago, where Chandra, as he was known, spent most of his career. Most of us students were far too intimidated to interact with him, although it was always seen as an auspicious occasion when you spotted him around the halls of the Astronomy and Astrophysics Center.)

This process can therefore make a single 1.4 solar-mass neutron star, and we can imagine that in some rare cases we can end up with two neutron stars orbiting one another. Indeed, the fact that LIGO saw one, but only one, such event during its year-and-a-half run allows the teams to constrain how often that happens, albeit with very large error bars, between 320 and 4740 events per cubic gigaparsec per year; a cubic gigaparsec is about 3 billion light-years on each side, so these are rare events indeed. These results and many other scientific inferences from this single amazing observation are reported in the teams’ overview paper.
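To get a feel for those rate numbers, one can fold the quoted interval into an assumed detection volume. The ~100 Mpc horizon below is my own illustrative guess for the reach of such a search, not a number taken from the paper:

```python
import math

# quoted LIGO/Virgo rate interval for binary-neutron-star mergers
rate_lo, rate_hi = 320.0, 4740.0   # events per Gpc^3 per year

# assumed detection horizon -- ~100 Mpc, purely illustrative
horizon_gpc = 0.1
volume = 4.0 / 3.0 * math.pi * horizon_gpc ** 3   # ~0.0042 Gpc^3

# expected detections per year within that sphere
print(rate_lo * volume, rate_hi * volume)
```

Under that assumption the interval translates to very roughly one to twenty detectable mergers per year, consistent with having seen a single one so far.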

A series of other papers discuss those results in more detail, covering everything from the physics of neutron stars to limits on departures from Einstein’s theory of gravity (for more on some of these other topics, see this blog, or this story from the NY Times). For a cosmologist, the most exciting of the results was the use of the event as a “standard siren”, an object whose gravitational wave properties are well-enough understood that we can deduce the distance to the object from the LIGO results alone. Although the idea came from Bernard Schutz in 1986, the term “standard siren” was coined somewhat later (by Sean Carroll) in analogy to the (heretofore?) more common cosmological standard candles and standard rulers: objects whose intrinsic brightness or size is known, so that their distances can be measured by observations of their apparent brightness or size, just as you can roughly deduce how far away a light bulb is by how bright it appears, or how far away a familiar object or person is by how big it looks.

Gravitational wave events are standard sirens because our understanding of relativity is good enough that an observation of the shape of the gravitational wave pattern as a function of time can tell us the properties of its source. Knowing that, we also then know the amplitude of that pattern when it was released. Over the time since then, as the gravitational waves have travelled across the Universe toward us, the amplitude has gone down (just as further objects look dimmer, more distant sirens sound quieter); the expansion of the Universe also causes the frequency of the waves to decrease — this is the cosmological redshift that we observe in the spectra of distant objects’ light.

Unlike LIGO’s previous detections of binary-black-hole mergers, this new observation of a binary-neutron-star merger was also seen in photons: first as a gamma-ray burst, and then as a “nova”: a new dot of light in the sky. Indeed, the observation of the afterglow of the merger by teams of literally thousands of astronomers in gamma and x-rays, optical and infrared light, and in the radio, is one of the more amazing pieces of academic teamwork I have seen.

And these observations allowed the teams to identify the host galaxy of the original neutron stars, and to measure the redshift of its light (the lengthening of the light’s wavelength due to the movement of the galaxy away from us). It is most likely a previously unexceptional galaxy called NGC 4993, with a redshift z=0.009, putting it about 40 megaparsecs away, relatively close on cosmological scales.

But this means that we can measure all of the factors in one of the most celebrated equations in cosmology, Hubble’s law: cz=H₀d, where c is the speed of light, z is the redshift just mentioned, and d is the distance measured from the gravitational wave burst itself. This just leaves H₀, the famous Hubble Constant, giving the current rate of expansion of the Universe, usually measured in kilometres per second per megaparsec. The old-fashioned way to measure this quantity is via the so-called cosmic distance ladder, bootstrapping up from nearby objects of known distance to more distant ones whose properties can only be calibrated by comparison with those more nearby. But errors accumulate in this process and we can be susceptible to the weakest rung on the ladder (see recent work by some of my colleagues trying to formalise this process). Alternately, we can use data from cosmic microwave background (CMB) experiments like the Planck Satellite (see here for lots of discussion on this blog); the typical size of the CMB pattern on the sky is something very like a standard ruler. Unfortunately, it, too, needs to be calibrated, implicitly by other aspects of the CMB pattern itself, and so ends up being a somewhat indirect measurement. Currently, the best cosmic-distance-ladder measurement gives something like 73.24 ± 1.74 km/sec/Mpc whereas Planck gives 67.81 ± 0.92 km/sec/Mpc; these numbers disagree by “a few sigma”, enough that it is hard to explain as simply a statistical fluctuation.
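The arithmetic in this paragraph is easy to reproduce. A short sketch using the quoted redshift of NGC 4993 and the two quoted values of H₀ (a careful treatment would also marginalise over peculiar velocity and the binary's inclination, as discussed below):

```python
import math

c = 299792.458   # speed of light, km/s
z = 0.009        # measured redshift of NGC 4993

h_ladder, s_ladder = 73.24, 1.74   # distance-ladder H0 +/- error, km/s/Mpc
h_cmb, s_cmb = 67.81, 0.92         # Planck (CMB) H0 +/- error

# Hubble's law cz = H0 * d  =>  d = c*z / H0
d_ladder = c * z / h_ladder   # ~37 Mpc
d_cmb = c * z / h_cmb         # ~40 Mpc

# size of the disagreement between the two H0 values, in combined sigma
tension = (h_ladder - h_cmb) / math.hypot(s_ladder, s_cmb)

print(d_ladder, d_cmb, tension)
```

Both distances come out near the ~40 Mpc quoted above, and the tension evaluates to roughly 2.8 combined standard deviations, which is the "few sigma" in question.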

Unfortunately, the new LIGO results do not solve the problem. Because we cannot observe the inclination of the neutron-star binary (i.e., the orientation of its orbit), this blows up the error on the distance to the object, due to the Bayesian marginalisation over this unknown parameter (just as the Planck measurement requires marginalisation over all of the other cosmological parameters to fully calibrate the results). Because the host galaxy is relatively nearby, the teams must also account for the fact that the redshift includes the effect not only of the cosmological expansion but also the movement of galaxies with respect to one another due to the pull of gravity on relatively large scales; this so-called peculiar velocity has to be modelled, which adds further to the errors.

This procedure gives a final measurement of H₀ = 70.0 (+12.0, −8.0) km/sec/Mpc, with the full shape of the probability curve shown in the Figure, taken directly from the paper. Both the Planck and distance-ladder results are consistent with these rather large error bars. But this is calculated from a single object; as more of these events are seen, these error bars will go down, typically by something like the square root of the number of events, so it might not be too long before this is the best way to measure the Hubble Constant.
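The expected improvement can be sketched with the usual $1/\sqrt{N}$ scaling, crudely symmetrizing the +12.0/−8.0 error on this single event (a real combination would multiply the full asymmetric likelihoods):

```python
import math

# crude symmetrization of the (+12.0, -8.0) km/s/Mpc error on H0
sigma_one = (12.0 + 8.0) / 2   # ~10 km/s/Mpc per siren

def sigma_after(n_events):
    """Error after averaging n independent standard-siren measurements."""
    return sigma_one / math.sqrt(n_events)

# events needed before the siren error rivals a ~1 km/s/Mpc target,
# comparable to the distance-ladder and Planck error bars
target = 1.0
n_needed = math.ceil((sigma_one / target) ** 2)
print(sigma_after(25), n_needed)
```

So roughly a hundred such events would bring the siren measurement down to the precision of today's other methods, which is why "not too long" is a fair bet.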


[Apologies: too long, too technical, and written late at night while trying to get my wonderful not-quite-three-week-old daughter to sleep through the night.]

by Andrew at October 24, 2017 10:44 AM

October 17, 2017

Matt Strassler - Of Particular Significance

The Significance of Yesterday’s Gravitational Wave Announcement: an FAQ

Yesterday’s post on the results from the LIGO/VIRGO network of gravitational wave detectors was aimed at getting information out, rather than providing the pedagogical backdrop. Today I’m following up with a post that attempts to answer some of the questions that my readers and my personal friends asked me. Some wanted to understand better how to visualize what had happened, while others wanted more clarity on why the discovery was so important. So I’ve put together a post which (1) explains what neutron stars and black holes are and what their mergers are like, (2) clarifies why yesterday’s announcement was important (there were many reasons, which is why it’s hard to reduce it all to a single soundbite), and (3) collects some miscellaneous questions at the end.

First, a disclaimer: I am *not* an expert in the very complex subject of neutron star mergers and the resulting explosions, called kilonovas.  These are much more complicated than black hole mergers.  I am still learning some of the details.  Hopefully I’ve avoided errors, but you’ll notice a few places where I don’t know the answers … yet.  Perhaps my more expert colleagues will help me fill in the gaps over time.

Please, if you spot any errors, don’t hesitate to comment!!  And feel free to ask additional questions whose answers I can add to the list.


What are neutron stars and black holes, and how are they related?

Every atom is made from a tiny atomic nucleus, made of neutrons and protons (which are very similar), and loosely surrounded by electrons. Most of an atom is empty space, so it can, under extreme circumstances, be crushed — but only if every electron and proton convert to a neutron (which remains behind) and a neutrino (which heads off into outer space.) When a giant star runs out of fuel, the pressure from its furnace turns off, and it collapses inward under its own weight, creating just those extraordinary conditions in which the matter can be crushed. Thus: a star’s interior, with a mass one to several times the Sun’s mass, is all turned into a several-mile(kilometer)-wide ball of neutrons — the number of neutrons approaching a 1 with 57 zeroes after it.
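That neutron count is a one-line estimate: divide the star's mass by the mass of a neutron. A sketch, assuming a 1.4-solar-mass remnant made of nothing but neutrons:

```python
M_sun = 1.989e30        # kg
m_neutron = 1.675e-27   # kg

# a 1.4-solar-mass neutron star, treated as pure neutrons
n_neutrons = 1.4 * M_sun / m_neutron
print(n_neutrons)   # ~1.7e57: "a 1 with 57 zeroes after it"
```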

If the star is big but not too big, the neutron ball stiffens and holds its shape, and the star explodes outward, blowing itself to pieces in what is called a core-collapse supernova. The ball of neutrons remains behind; this is what we call a neutron star. It’s a ball of the densest material that we know can exist in the universe — a pure atomic nucleus many miles(kilometers) across. It has a very hard surface; if you tried to go inside a neutron star, your experience would be a lot worse than running into a closed door at a hundred miles per hour.

If the star is very big indeed, the neutron ball that forms may immediately (or soon) collapse under its own weight, forming a black hole. A supernova may or may not result in this case; the star might just disappear. A black hole is very, very different from a neutron star. Black holes are what’s left when matter collapses irretrievably upon itself under the pull of gravity, shrinking down endlessly. While a neutron star has a surface that you could smash your head on, a black hole has no surface — it has an edge that is simply a point of no return, called a horizon. In Einstein’s theory, you can just go right through, as if passing through an open door. You won’t even notice the moment you go in. [Note: this is true in Einstein’s theory. But there is a big controversy as to whether the combination of Einstein’s theory with quantum physics changes the horizon into something novel and dangerous to those who enter; this is known as the firewall controversy, and would take us too far afield into speculation.]  But once you pass through that door, you can never return.

Black holes can form in other ways too, but not those that we’re observing with the LIGO/VIRGO detectors.

Why are their mergers the best sources for gravitational waves?

One of the easiest and most obvious ways to make gravitational waves is to have two objects orbiting each other.  If you put your two fists in a pool of water and move them around each other, you’ll get a pattern of water waves spiraling outward; this is in rough (very rough!) analogy to what happens with two orbiting objects, although, since the objects are moving in space, the waves aren’t in a material like water.  They are waves in space itself.

To get powerful gravitational waves, you want objects each with a very big mass that are orbiting around each other at very high speed. To get the fast motion, you need the force of gravity between the two objects to be strong; and to get gravity to be as strong as possible, you need the two objects to be as close as possible (since, as Isaac Newton already knew, gravity between two objects grows stronger when the distance between them shrinks.) But if the objects are large, they can’t get too close; they will bump into each other and merge long before their orbit can become fast enough. So to get a really fast orbit, you need two relatively small objects, each with a relatively big mass — what scientists refer to as compact objects. Neutron stars and black holes are the most compact objects we know about. Fortunately, they do indeed often travel in orbiting pairs, and do sometimes, for a very brief period before they merge, orbit rapidly enough to produce gravitational waves that LIGO and VIRGO can observe.

Why do we find these objects in pairs in the first place?

Stars very often travel in pairs… they are called binary stars. They can start their lives in pairs, forming together in large gas clouds, or even if they begin solitary, they can end up pairing up if they live in large densely packed communities of stars where it is common for multiple stars to pass nearby. Perhaps surprisingly, their pairing can survive the collapse and explosion of either star, leaving two black holes, two neutron stars, or one of each in orbit around one another.

What happens when these objects merge?

Not surprisingly, there are three classes of mergers which can be detected: two black holes merging, two neutron stars merging, and a neutron star merging with a black hole. The first class was observed in 2015 (and announced in 2016), the second was announced yesterday, and it’s a matter of time before the third class is observed. The two objects may orbit each other for billions of years, very slowly radiating gravitational waves (an effect observed in the 70’s, leading to a Nobel Prize) and gradually coming closer and closer together. Only in the last day of their lives do their orbits really start to speed up. And just before these objects merge, they begin to orbit each other once per second, then ten times per second, then a hundred times per second. Visualize that if you can: objects a few dozen miles (kilometers) across, a few miles (kilometers) apart, each with the mass of the Sun or greater, orbiting each other 100 times each second. It’s truly mind-boggling — a spinning dumbbell beyond the imagination of even the greatest minds of the 19th century. I don’t know any scientist who isn’t awed by this vision. It all sounds like science fiction. But it’s not.
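Those orbital frequencies follow from Kepler's law alone. A rough sketch for two 1.4-solar-mass neutron stars on a circular orbit (the separations below are illustrative round numbers, not values from the detection):

```python
import math

G = 6.674e-11          # m^3 / (kg s^2)
M_sun = 1.989e30       # kg
m_total = 2 * 1.4 * M_sun   # two 1.4-solar-mass neutron stars

def orbits_per_second(separation_km):
    """Keplerian orbital frequency f = sqrt(G*M/a^3) / (2*pi), circular orbit."""
    a = separation_km * 1e3
    return math.sqrt(G * m_total / a ** 3) / (2 * math.pi)

for a_km in (1000, 300, 100):
    print(a_km, orbits_per_second(a_km))   # roughly 3, 19, 97 orbits per second
```

By the time the stars are ~100 km apart, simple Newtonian gravity already gives nearly a hundred orbits per second, matching the mind-boggling picture in the text.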

How do we know this isn’t science fiction?

We know, if we believe Einstein’s theory of gravity (and I’ll give you a very good reason to believe in it in just a moment.) Einstein’s theory predicts that such a rapidly spinning, large-mass dumbbell formed by two orbiting compact objects will produce a telltale pattern of ripples in space itself — gravitational waves. That pattern is both complicated and precisely predicted. In the case of black holes, the predictions go right up to and past the moment of merger, to the ringing of the larger black hole that forms in the merger. In the case of neutron stars, the instants just before, during and after the merger are more complex and we can’t yet be confident we understand them, but during tens of seconds before the merger Einstein’s theory is very precise about what to expect. The theory further predicts how those ripples will cross the vast distances from where they were created to the location of the Earth, and how they will appear in the LIGO/VIRGO network of three gravitational wave detectors. The prediction of what to expect at LIGO/VIRGO thus involves not just one prediction but many: the theory is used to predict the existence and properties of black holes and of neutron stars, the detailed features of their mergers, the precise patterns of the resulting gravitational waves, and how those gravitational waves cross space. LIGO/VIRGO have now detected the telltale patterns of these gravitational waves, and the fact that these wave patterns agree with Einstein’s theory in every detail is the strongest evidence ever obtained that there is nothing wrong with Einstein’s theory when used in these combined contexts. That in turn gives us confidence that our interpretation of the LIGO/VIRGO results is correct, confirming that black holes and neutron stars really exist and really merge.
(Notice the reasoning is slightly circular… but that’s how scientific knowledge proceeds, as a set of detailed consistency checks that gradually and eventually become so tightly interconnected as to be almost impossible to unwind.  Scientific reasoning is not deductive; it is inductive.  We do it not because it is logically ironclad but because it works so incredibly well — as witnessed by the computer, and its screen, that I’m using to write this, and the wired and wireless internet and computer disk that will be used to transmit and store it.)


What makes it difficult to explain the significance of yesterday’s announcement is that it consists of many important results piled up together, rather than a simple takeaway that can be reduced to a single soundbite. (That was also true of the black hole mergers announcement back in 2016, which is why I wrote a long post about it.)

So here is a list of important things we learned.  No one of them, by itself, is earth-shattering, but each one is profound, and taken together they form a major event in scientific history.

First confirmed observation of a merger of two neutron stars: We’ve known these mergers must occur, but there’s nothing like being sure. And since these things are too far away and too small to see in a telescope, the only way to be sure these mergers occur, and to learn more details about them, is with gravitational waves.  We expect to see many more of these mergers in coming years as gravitational wave astronomy increases in its sensitivity, and we will learn more and more about them.

New information about the properties of neutron stars: Neutron stars were proposed almost a hundred years ago and were confirmed to exist in the 60’s and 70’s. But their precise details aren’t known; we believe they are like a giant atomic nucleus, but they’re so vastly larger than ordinary atomic nuclei that we can’t be sure we understand all of their internal properties, and there are debates in the scientific community that can’t be easily answered… until, perhaps, now.

From the detailed pattern of the gravitational waves of this one neutron star merger, scientists have already learned two things. First, we confirm that Einstein’s theory correctly predicts the basic pattern of gravitational waves from orbiting neutron stars, as it does for orbiting and merging black holes. Unlike for black holes, however, there are more questions about what happens to neutron stars when they merge. The question of what happened to this pair after they merged is still open: did they form a neutron star, an unstable neutron star that, slowing its spin, eventually collapsed into a black hole, or a black hole straightaway?

But something important was already learned about the internal properties of neutron stars. The stresses of being whipped around at such incredible speeds would tear you and me apart, and would even tear the Earth apart. We know neutron stars are much tougher than ordinary rock, but how much tougher? If they were too flimsy, they’d have broken apart at some point during LIGO/VIRGO’s observations, and the simple pattern of gravitational waves that was expected would have suddenly become much more complicated. That didn’t happen until perhaps just before the merger. So scientists can use the simplicity of the pattern of gravitational waves to infer some new things about how stiff and strong neutron stars are. More mergers will improve our understanding. Again, there is no other simple way to obtain this information.

First visual observation of an event that produces both immense gravitational waves and bright electromagnetic waves: Black hole mergers aren’t expected to create a brilliant light display, because, as I mentioned above, they’re more like open doors to an invisible playground than they are like rocks, so they merge rather quietly, without a big bright and hot smash-up.  But neutron stars are big balls of stuff, and so the smash-up can indeed create lots of heat and light of all sorts, just as you might naively expect.  By “light” I mean not just visible light but all forms of electromagnetic waves, at all wavelengths (and therefore at all frequencies.)  Scientists divide up the range of electromagnetic waves into categories. These categories are radio waves, microwaves, infrared light, visible light, ultraviolet light, X-rays, and gamma rays, listed from lowest frequency and largest wavelength to highest frequency and smallest wavelength.  (Note that these categories and the dividing lines between them are completely arbitrary, but the divisions are useful for various scientific purposes.  The only fundamental difference between yellow light, a radio wave, and a gamma ray is the wavelength and frequency; otherwise they’re exactly the same type of thing, a wave in the electric and magnetic fields.)

So if and when two neutron stars merge, we expect both gravitational waves and electromagnetic waves, the latter of many different frequencies created by many different effects that can arise when two huge balls of neutrons collide.  But just because we expect them doesn’t mean they’re easy to see.  These mergers are pretty rare — perhaps one every hundred thousand years in each big galaxy like our own — so the ones we find using LIGO/VIRGO will generally be very far away.  If the light show is too dim, none of our telescopes will be able to see it.

But this light show was plenty bright.  Gamma ray detectors out in space detected it instantly, confirming that the gravitational waves from the two neutron stars led to a collision and merger that produced very high frequency light.  Already, that’s a first.  It’s as though one had seen lightning for years but never heard thunder; or as though one had observed the waves from hurricanes for years but never observed one in the sky.  Seeing both allows us a whole new set of perspectives; one plus one is often much more than two.

Over time — hours and days — effects were seen in visible light, ultraviolet light, infrared light, X-rays and radio waves.  Some were seen earlier than others, which itself is a story, but each one contributes to our understanding of what these mergers are actually like.

Confirmation of the best guess concerning the origin of “short” gamma ray bursts:  For many years, bursts of gamma rays have been observed in the sky.  Among them, there seems to be a class of bursts that are shorter than most, typically lasting just a couple of seconds.  They come from all across the sky, indicating that they come from distant intergalactic space, presumably from distant galaxies.  Among other explanations, the most popular hypothesis concerning these short gamma-ray bursts has been that they come from merging neutron stars.  The only way to confirm this hypothesis is with the observation of the gravitational waves from such a merger.  That test has now been passed; it appears that the hypothesis is correct.  That in turn means that we have, for the first time, both a good explanation of these short gamma ray bursts and, because we know how often we observe these bursts, a good estimate as to how often neutron stars merge in the universe.

First distance measurement to a source using both a gravitational wave measure and a redshift in electromagnetic waves, allowing a new calibration of the distance scale of the universe and of its expansion rate:  The pattern over time of the gravitational waves from a merger of two black holes or neutron stars is complex enough to reveal many things about the merging objects, including a rough estimate of their masses and the orientation of the spinning pair relative to the Earth.  The overall strength of the waves, combined with the knowledge of the masses, reveals how far the pair is from the Earth.  That by itself is nice, but the real win comes when the discovery of the object using visible light, or in fact any light with frequency below gamma-rays, can be made.  In this case, the galaxy that contains the neutron stars can be determined.

Once we know the host galaxy, we can do something really important.  We can, by looking at the starlight, determine how rapidly the galaxy is moving away from us.  For distant galaxies, the speed at which the galaxy recedes should be related to its distance because the universe is expanding.

How rapidly the universe is expanding has been recently measured with remarkable precision, but the problem is that there are two different methods for making the measurement, and they disagree.   This disagreement is one of the most important problems for our understanding of the universe.  Maybe one of the measurement methods is flawed, or maybe — and this would be much more interesting — the universe simply doesn’t behave the way we think it does.

What gravitational waves do is give us a third method: the gravitational waves directly provide the distance to the galaxy, and the electromagnetic waves directly provide the speed of recession.  There is no other way to make this type of joint measurement directly for distant galaxies.  The method is not accurate enough to be useful in just one merger, but once dozens of mergers have been observed, the average result will provide important new information about the universe’s expansion.  When combined with the other methods, it may help resolve this all-important puzzle.

Best test so far of Einstein’s prediction that the speed of light and the speed of gravitational waves are identical: Since gamma rays from the merger and the peak of the gravitational waves arrived within two seconds of one another after traveling 130 million years — that is, about 5 thousand million million seconds — we can say that the speed of light and the speed of gravitational waves are both equal to the cosmic speed limit to within one part in 2 thousand million million.  Such a precise test requires the combination of gravitational wave and gamma ray observations.
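The back-of-envelope arithmetic above can be checked in a couple of lines (a sketch only; the published analysis treats possible emission delays more carefully):

```python
# Rough check of the speed-difference bound quoted above.
SECONDS_PER_YEAR = 3.156e7                    # one year in seconds
travel_time = 130e6 * SECONDS_PER_YEAR        # 130 million years of travel, in seconds
arrival_gap = 2.0                             # seconds between gamma-ray and GW peaks

fractional_difference = arrival_gap / travel_time
print(f"travel time ≈ {travel_time:.1e} s")                    # ≈ 4e15 s
print(f"|v_gamma - v_gw| / c ≲ {fractional_difference:.0e}")   # ≈ 5e-16
```

Two seconds out of roughly four thousand million million seconds of travel is indeed about one part in two thousand million million.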

Efficient production of heavy elements confirmed:  It’s long been said that we are star-stuff, or stardust, and it’s been clear for a long time that it’s true.  But there’s been a puzzle when one looks into the details.  While it’s known that all the chemical elements from hydrogen up to iron are formed inside of stars, and can be blasted into space in supernova explosions to drift around and eventually form planets, moons, and humans, it hasn’t been quite as clear how the other elements with heavier atoms — atoms such as iodine, cesium, gold, lead, bismuth, uranium and so on — predominantly formed.  Yes they can be formed in supernovas, but not so easily; and there seem to be more atoms of heavy elements around the universe than supernovas can explain.  There are many supernovas in the history of the universe, but the efficiency for producing heavy chemical elements is just too low.

It was proposed some time ago that the mergers of neutron stars might be a suitable place to produce these heavy elements.  Even though these mergers are rare, they might be much more efficient, because the nuclei of heavy elements contain lots of neutrons and, not surprisingly, a collision of two neutron stars would produce lots of neutrons in its debris, suitable perhaps for making these nuclei.   A key indication that this is going on would be the following: if a neutron star merger could be identified using gravitational waves, and if its location could be determined using telescopes, then one would observe a pattern of light that would be characteristic of what is now called a “kilonova” explosion.   Warning: I don’t yet know much about kilonovas and I may be leaving out important details. A kilonova is powered by the process of forming heavy elements; most of the nuclei produced are initially radioactive — i.e., unstable — and they break down by emitting high energy particles, including the particles of light (called photons) which are in the gamma ray and X-ray categories.  The resulting characteristic glow would be expected to have a pattern of a certain type: it would be initially bright but would dim rapidly in visible light, with a long afterglow in infrared light.  The reasons for this are complex, so let me set them aside for now.  The important point is that this pattern was observed, confirming that a kilonova of this type occurred, and thus that, in this neutron star merger, enormous amounts of heavy elements were indeed produced.  So we now have a lot of evidence, for the first time, that almost all the heavy chemical elements on and around our planet were formed in neutron star mergers.  Again, we could not know this if we did not know that this was a neutron star merger, and that information comes only from the gravitational wave observation.


Did the merger of these two neutron stars result in a new black hole, a larger neutron star, or an unstable rapidly spinning neutron star that later collapsed into a black hole?

We don’t yet know, and maybe we won’t know.  Some scientists involved appear to be leaning toward the possibility that a black hole was formed, but others seem to say the jury is out.  I’m not sure what additional information can be obtained over time about this.

If the two neutron stars formed a black hole, why was there a kilonova?  Why wasn’t everything sucked into the black hole?

Black holes aren’t vacuum cleaners; they pull things in via gravity just the same way that the Earth and Sun do, and don’t suck things in some unusual way.  The only crucial thing about a black hole is that once you go in you can’t come out.  But just as when trying to avoid hitting the Earth or Sun, you can avoid falling in if you orbit fast enough or if you’re flung outward before you reach the edge.

The point in a neutron star merger is that the forces at the moment of merger are so intense that one or both neutron stars are partially ripped apart.  The material that is thrown outward in all directions, at an immense speed, somehow creates the bright, hot flash of gamma rays and eventually the kilonova glow from the newly formed atomic nuclei.  Those details I don’t yet understand, but I know they have been carefully studied both with approximate equations and in computer simulations such as this one and this one.  However, the accuracy of the simulations can only be confirmed through the detailed studies of a merger, such as the one just announced.  It seems, from the data we’ve seen, that the simulations did a fairly good job.  I’m sure they will be improved once they are compared with the recent data.




Filed under: Astronomy, Gravitational Waves Tagged: black holes, Gravitational Waves, LIGO, neutron stars

by Matt Strassler at October 17, 2017 04:03 PM

October 16, 2017

Sean Carroll - Preposterous Universe

Standard Sirens

Everyone is rightly excited about the latest gravitational-wave discovery. The LIGO observatory, recently joined by its European partner VIRGO, had previously seen gravitational waves from coalescing black holes. Which is super-awesome, but also a bit lonely — black holes are black, so we detect the gravitational waves and little else. Since our current gravitational-wave observatories aren’t very good at pinpointing source locations on the sky, we’ve been completely unable to say which galaxy, for example, the events originated in.

This has changed now, as we’ve launched the era of “multi-messenger astronomy,” detecting both gravitational and electromagnetic radiation from a single source. The event was the merger of two neutron stars, rather than black holes, and all that matter coming together in a giant conflagration lit up the sky in a large number of wavelengths simultaneously.

Look at all those different observatories, and all those wavelengths of electromagnetic radiation! Radio, infrared, optical, ultraviolet, X-ray, and gamma-ray — soup to nuts, astronomically speaking.

A lot of cutting-edge science will come out of this, see e.g. this main science paper. Apparently some folks are very excited by the fact that the event produced an amount of gold equal to several times the mass of the Earth. But it’s my blog, so let me highlight the aspect of personal relevance to me: using “standard sirens” to measure the expansion of the universe.

We’re already pretty good at measuring the expansion of the universe, using something called the cosmic distance ladder. You build up distance measures step by step, determining the distance to nearby stars, then to more distant clusters, and so forth. Works well, but of course is subject to accumulated errors along the way. This new kind of gravitational-wave observation is something else entirely, allowing us to completely jump over the distance ladder and obtain an independent measurement of the distance to cosmological objects. See this LIGO explainer.

The simultaneous observation of gravitational and electromagnetic waves is crucial to this idea. You’re trying to compare two things: the distance to an object, and the apparent velocity with which it is moving away from us. Usually velocity is the easy part: you measure the redshift of light, which is easy to do when you have an electromagnetic spectrum of an object. But with gravitational waves alone, you can’t do it — there isn’t enough structure in the spectrum to measure a redshift. That’s why the exploding neutron stars were so crucial; in this event, GW170817, we can for the first time determine the precise redshift of a distant gravitational-wave source.

Measuring the distance is the tricky part, and this is where gravitational waves offer a new technique. The favorite conventional strategy is to identify “standard candles” — objects for which you have a reason to believe you know their intrinsic brightness, so that by comparing to the brightness you actually observe you can figure out the distance. To discover the acceleration of the universe, for example,  astronomers used Type Ia supernovae as standard candles.

Gravitational waves don’t quite give you standard candles; every one will generally have a different intrinsic gravitational “luminosity” (the amount of energy emitted). But by looking at the precise way in which the source evolves — the characteristic “chirp” waveform in gravitational waves as the two objects rapidly spiral together — we can work out precisely what that total luminosity actually is. Here’s the chirp for GW170817, compared to the other sources we’ve discovered — much more data, almost a full minute!

So we have both distance and redshift, without using the conventional distance ladder at all! This is important for all sorts of reasons. An independent way of getting at cosmic distances will allow us to measure properties of the dark energy, for example. You might also have heard that there is a discrepancy between different ways of measuring the Hubble constant, which either means someone is making a tiny mistake or there is something dramatically wrong with the way we think about the universe. Having an independent check will be crucial in sorting this out. Just from this one event, we are able to say that the Hubble constant is 70 kilometers per second per megaparsec, albeit with large error bars (+12, -8 km/s/Mpc). That will get much better as we collect more events.
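The arithmetic behind that single-event estimate is simple once the two measurements are in hand; here is a minimal sketch, using illustrative numbers close to those quoted above (the distance and redshift values are assumptions for the example, not the collaboration's precise inputs):

```python
# Standard-siren Hubble estimate: distance from the gravitational-wave
# amplitude, recession velocity from the host galaxy's redshift.
C_KM_S = 299792.458      # speed of light in km/s
distance_mpc = 42.9      # GW-inferred distance in megaparsecs (~130 million light years)
redshift = 0.0100        # approximate redshift of the host galaxy

recession_velocity = redshift * C_KM_S        # km/s, valid for small redshifts
hubble_constant = recession_velocity / distance_mpc
print(f"H0 ≈ {hubble_constant:.0f} km/s/Mpc")   # ≈ 70
```

The real analysis marginalizes over the source's orientation and the galaxy's peculiar velocity, which is where the large error bars come from.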

So here is my (infinitesimally tiny) role in this exciting story. The idea of using gravitational-wave sources as standard sirens was put forward by Bernard Schutz all the way back in 1986. But it’s been developed substantially since then, especially by my friends Daniel Holz and Scott Hughes. Years ago Daniel told me about the idea, as he and Scott were writing one of the early papers. My immediate response was “Well, you have to call these things `standard sirens.'” And so a useful label was born.

Sadly for my share of the glory, my Caltech colleague Sterl Phinney also suggested the name simultaneously, as the acknowledgments to the paper testify. That’s okay; when one’s contribution is this extremely small, sharing it doesn’t seem so bad.

By contrast, the glory attaching to the physicists and astronomers who pulled off this observation, and the many others who have contributed to the theoretical understanding behind it, is substantial indeed. Congratulations to all of the hard-working people who have truly opened a new window on how we look at our universe.

by Sean Carroll at October 16, 2017 03:52 PM

Matt Strassler - Of Particular Significance

A Scientific Breakthrough! Combining Gravitational and Electromagnetic Waves

Gravitational waves are now the most important new tool in the astronomer’s toolbox.  Already they’ve been used to confirm that large black holes — with masses ten or more times that of the Sun — and mergers of these large black holes to form even larger ones, are not uncommon in the universe.   Today it goes a big step further.

It’s long been known that neutron stars, remnants of collapsed stars that have exploded as supernovas, are common in the universe.  And it’s been known almost as long that sometimes neutron stars travel in pairs.  (In fact that’s how gravitational waves were first discovered, indirectly, back in the 1970s.)  Stars often form in pairs, and sometimes both stars explode as supernovas, leaving their neutron star relics in orbit around one another.  Neutron stars are small — just ten or so kilometers (about six miles) across.  According to Einstein’s theory of gravity, a pair of stars should gradually lose energy by emitting gravitational waves into space, and slowly but surely the two objects should spiral in on one another.   Eventually, after many millions or even billions of years, they collide and merge into a larger neutron star, or into a black hole.  This collision does two things.

  1. It makes some kind of brilliant flash of light — electromagnetic waves — whose details are only guessed at.  Some of those electromagnetic waves will be in the form of visible light, while much of it will be in invisible forms, such as gamma rays.
  2. It makes gravitational waves, whose details are easier to calculate and which are therefore distinctive, but couldn’t have been detected until LIGO and VIRGO started taking data, LIGO over the last couple of years, VIRGO over the last couple of months.

It’s possible that we’ve seen the light from neutron star mergers before, but no one could be sure.  Wouldn’t it be great, then, if we could see gravitational waves AND electromagnetic waves from a neutron star merger?  It would be a little like seeing the flash and hearing the sound from fireworks — seeing and hearing is better than either one separately, with each one clarifying the other.  (Caution: scientists are often speaking as if detecting gravitational waves is like “hearing”.  This is only an analogy, and a vague one!  It’s not at all the same as acoustic waves that we can hear with our ears, for many reasons… so please don’t take it too literally.)  If we could do both, we could learn about neutron stars and their properties in an entirely new way.

Today, we learned that this has happened.  LIGO, with the world’s first two gravitational observatories, detected the waves from two merging neutron stars, 130 million light years from Earth, on August 17th.  (Neutron star mergers last much longer than black hole mergers, so the two are easy to distinguish; and this one was so close, relatively speaking, that it was seen for a long while.)  VIRGO, with the third detector, allows scientists to triangulate and determine roughly where mergers have occurred.  It saw only a very weak signal, but that was extremely important, because it told the scientists that the merger must have occurred in a small region of the sky where VIRGO has a relative blind spot.  That told scientists where to look.

The merger was detected for more than a full minute… to be compared with black holes whose mergers can be detected for less than a second.  It’s not exactly clear yet what happened at the end, however!  Did the merged neutron stars form a black hole or a neutron star?  The jury is out.

At almost exactly the moment at which the gravitational waves reached their peak, a blast of gamma rays — electromagnetic waves of very high frequencies — was detected by a different scientific team, the one from FERMI. FERMI detects gamma rays from the distant universe every day, and a two-second gamma-ray burst is not unusual.  And INTEGRAL, another gamma ray experiment, also detected it.   The teams communicated within minutes.   The FERMI and INTEGRAL gamma ray detectors can only indicate the rough region of the sky from which their gamma rays originate, and LIGO/VIRGO together also only give a rough region.  But the scientists saw that those regions overlapped.  The evidence was clear.  And with that, astronomy entered a new, highly anticipated phase.

Already this was a huge discovery.  Brief gamma-ray bursts have been a mystery for years.  One of the best guesses as to their origin has been neutron star mergers.  Now the mystery is solved; that guess is apparently correct. (Or is it?  Probably, but the gamma-ray burst is surprisingly dim, given how close the source is.  So there are still questions to ask.)

Also confirmed by the fact that these signals arrived within a couple of seconds of one another, after traveling for over 100 million years from the same source, is that, indeed, the speed of light and the speed of gravitational waves are exactly the same — both of them equal to the cosmic speed limit, just as Einstein’s theory of gravity predicts.

Next, these teams quickly told their astronomer friends to train their telescopes in the general area of the source. Dozens of telescopes, from every continent and from space, and looking for electromagnetic waves at a huge range of frequencies, pointed in that rough direction and scanned for anything unusual.  (A big challenge: the object was near the Sun in the sky, so it could be viewed in darkness only for an hour each night!) Light was detected!  At all frequencies!  The object was very bright, making it easy to find the galaxy in which the merger took place.  The brilliant glow was seen in gamma rays, ultraviolet light, infrared light, X-rays, and radio.  (Neutrinos, particles that can serve as another way to observe distant explosions, were not detected this time.)

And with so much information, so much can be learned!

Most important, perhaps, is this: from the pattern of the spectrum of light, the conjecture seems to be confirmed that the mergers of neutron stars are important sources, perhaps the dominant one, for many of the heavy chemical elements — iodine, iridium, cesium, gold, platinum, and so on — that are forged in the intense heat of these collisions.  It used to be thought that the same supernovas that form neutron stars in the first place were the most likely source.  But now it seems that this second stage of neutron star life — merger, rather than birth — is just as important.  That’s fascinating, because neutron star mergers are much more rare than the supernovas that form them.  There’s a supernova in our Milky Way galaxy every century or so, but it’s tens of millennia or more between these “kilonovas”, created in neutron star mergers.

If there’s anything disappointing about this news, it’s this: almost everything that was observed by all these different experiments was predicted in advance.  Sometimes it’s more important and useful when some of your predictions fail completely, because then you realize how much you have to learn.  Apparently our understanding of gravity, of neutron stars, and of their mergers, and of all sorts of sources of electromagnetic radiation that are produced in those mergers, is even better than we might have thought. But fortunately there are a few new puzzles.  The X-rays were late; the gamma rays were dim… we’ll hear more about this shortly, as NASA is holding a second news conference.

Some highlights from the second news conference:

  • New information about neutron star interiors, which affects how large they are and therefore how exactly they merge, has been obtained
  • The first ever visual-light image of a gravitational wave source, from the Swope telescope, at the outskirts of a distant galaxy; the galaxy’s center is the blob of light, and the arrow points to the explosion.

  • The theoretical calculations for a kilonova explosion suggest that debris from the blast should rather quickly block the visual light, so the explosion dims quickly in visible light — but infrared light lasts much longer.  The observations by the visible and infrared light telescopes confirm this aspect of the theory; and you can see evidence for that in the picture above, where four days later the bright spot is both much dimmer and much redder than when it was discovered.
  • Estimate: the total mass of the gold and platinum produced in this explosion is vastly larger than the mass of the Earth.
  • Estimate: these neutron stars were formed about 10 or so billion years ago.  They’ve been orbiting each other for most of the universe’s history, and ended their lives just 130 million years ago, creating the blast we’ve so recently detected.
  • Big Puzzle: all of the previous gamma-ray bursts seen up to now have always shone in ultraviolet light and X-rays as well as gamma rays.   But X-rays didn’t show up this time, at least not initially.  This was a big surprise.  It took 9 days for the Chandra telescope to observe X-rays, too faint for any other X-ray telescope.  Does this mean that the two neutron stars created a black hole, which then created a jet of matter that points not quite directly at us but off-axis, and shines by illuminating the matter in interstellar space?  This had been suggested as a possibility twenty years ago, but this is the first time there’s been any evidence for it.
  • One more surprise: it took 16 days for radio waves from the source to be discovered, with the Very Large Array, the most powerful existing radio telescope.  The radio emission has been growing brighter since then!  As with the X-rays, this seems also to support the idea of an off-axis jet.
  • Nothing quite like this gamma-ray burst has been seen — or rather, recognized — before.  When a gamma ray burst doesn’t have an X-ray component showing up right away, it simply looks odd and a bit mysterious.  It’s harder to observe than most bursts, because without a jet pointing right at us, its afterglow fades quickly.  Moreover, a jet pointing at us is bright, so it blinds us to the more detailed and subtle features of the kilonova.  But this time, LIGO/VIRGO told scientists that “Yes, this is a neutron star merger”, leading to detailed study from all electromagnetic frequencies, including patient study over many days of the X-rays and radio.  In other cases those observations would have stopped after just a short time, and the whole story couldn’t have been properly interpreted.



Filed under: Astronomy, Gravitational Waves

by Matt Strassler at October 16, 2017 03:10 PM

October 13, 2017

Sean Carroll - Preposterous Universe

Mind-Blowing Quantum Mechanics

I’m trying to climb out from underneath a large pile of looming (and missed) deadlines, and in the process I’m hoping to ramp the real blogging back up. In the meantime, here are a couple of videos to tide you over.

First, an appearance a few weeks ago on Joe Rogan’s podcast. Rogan is a professional comedian and mixed-martial arts commentator, but has built a great audience for his wide-ranging podcast series. One of the things that makes him a good interviewer is his sincere delight in the material, as evidenced here by noting repeatedly that his mind had been blown. We talked for over two and a half hours, covering cosmology and quantum mechanics but also some bits about AI and pop culture.

And here’s a more straightforward lecture, this time at King’s College in London. The topic was “Extracting the Universe from the Wave Function,” which I’ve used for a few talks that ended up being pretty different in execution. This one was aimed at undergraduate physics students, some of whom hadn’t even had quantum mechanics. So the first half is a gentle introduction to many-worlds theory and why it’s the best version of quantum mechanics, and the second half tries to explain our recent efforts to emerge space itself out of quantum entanglement.

I was invited to King’s by Eugene Lim, one of my former grad students and now an extremely productive faculty member in his own right. It’s always good to see your kids grow up to do great things!

by Sean Carroll at October 13, 2017 03:01 PM

October 09, 2017

Alexey Petrov - Symmetry factor

Non-linear teaching

I wanted to share some ideas about a teaching method I am trying to develop and implement this semester. Please let me know if you’ve heard of someone doing something similar.

This semester I am teaching our undergraduate mechanics class. This is the first time I am teaching it, so I started looking into the possibility of shaking things up and maybe applying some new method of teaching. And there are plenty on offer: flipped classroom, peer instruction, Just-in-Time teaching, etc.  They all look to “move away from the inefficient old model” where the professor lectures and students take notes. I have things to say about that, but not in this post. It suffices to say that most of those approaches are essentially trying to make students work (both with the lecturer and their peers) in class and outside of it. At the same time those methods attempt to “compartmentalize teaching”, i.e. make large classes “smaller” by bringing out each individual student’s contribution to class activities (by using “clickers”, small discussion groups, etc). For several reasons those approaches did not fit my goal this semester.

Our Classical Mechanics class is a gateway class for our physics majors. It is the first class they take after they are done with general physics lectures. So the students are already familiar with the (simpler version of the) material they are going to be taught. The goal of this class is to start molding physicists out of students: they learn to simplify problems so physics methods can be properly applied (that’s how “a Ford Mustang improperly parked at the top of the icy hill slides down…” turns into “a block slides down the incline…”), learn to always derive the final formula before plugging in numbers, look at the asymptotics of their solutions as a way to see if the solution makes sense, and many other wonderful things.

So with all that I started doing something I’d like to call non-linear teaching. The gist of it is as follows. I give a lecture (and don’t get me wrong, I do make my students talk and work: I ask questions, we do “duels” (students argue different sides of a question), etc — all of that can be done efficiently in a class of 20 students). But instead of one homework with 3-4 problems per week I have two types of homework assignments for them: short homeworks and projects.

Short homework assignments are single-problem assignments given after each class that must be done by the next class. They are designed such that a student needs to re-derive material that we discussed previously in class, with a small new twist added. For example, in the block-down-the-incline problem discussed in class I ask them to choose the coordinate axes in a different way and prove that the result is independent of the choice of coordinate system. Or I ask them to find the angle at which one should throw a stone to get the maximal possible range (including air resistance), etc.  This way, instead of doing an assignment at the last minute at the end of the week, students have to work out what they just learned in class every day! More importantly, I get to change how I teach. Depending on how they did on the previous short homework, I adjust the material (both speed and volume) discussed in class. I also design examples for future sections in such a way that I can repeat parts of topics that students previously found hard. Hence, instead of a linear progression through the course, we move along something akin to helical motion, returning to and spending more time on topics that students find more difficult. That’s why my teaching is “non-linear”.
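One of the short-homework problems mentioned above — the maximal range of a thrown stone including air resistance — is a nice candidate for a numerical sketch, since with drag there is no closed-form answer. All parameters below (launch speed, drag coefficient) are invented for illustration, not taken from the course:

```python
import math

def projectile_range(angle_deg, v0=30.0, drag=0.1, g=9.81, dt=1e-4):
    """Euler-integrate a projectile with linear drag until it lands; return range."""
    theta = math.radians(angle_deg)
    x, y = 0.0, 0.0
    vx, vy = v0 * math.cos(theta), v0 * math.sin(theta)
    while y >= 0.0:
        vx -= drag * vx * dt            # drag opposes horizontal motion
        vy -= (g + drag * vy) * dt      # gravity plus drag on vertical motion
        x += vx * dt
        y += vy * dt
    return x

# Sweep launch angles: with drag, the optimum falls below the vacuum answer of 45°.
best = max(range(20, 60), key=projectile_range)
print(f"best angle ≈ {best}°, range ≈ {projectile_range(best):.1f} m")
```

Asking students to explain *why* the optimal angle drops below 45° once drag is included is exactly the kind of asymptotics-and-sanity-check habit the course is meant to build.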

Project homework assignments are designed to develop understanding of how topics in a given chapter relate to each other. There are as many project assignments as there are chapters. Students get two weeks to complete them.

Overall, students solve exactly the same number of problems they would in a normal lecture class; the problems are simply scheduled differently. With my approach, students learn by constantly re-working what was just discussed in lecture, while I can quickly react (by adjusting lecture material and speed) to the constant feedback I get from students in the form of short homeworks. Win-win!

I will do benchmarking at the end of the class by comparing my class performance to aggregate data from previous years. I’ll report on it later. But for now I would be interested to hear your comments!


by apetrov at October 09, 2017 09:45 PM

October 05, 2017

Symmetrybreaking - Fermilab/SLAC

A radio for dark matter

Instead of searching for dark matter particles, a new device will search for dark matter waves.

Header: A radio for dark matter

Researchers are testing a prototype “radio” that could let them listen to the tune of mysterious dark matter particles. 

Dark matter is an invisible substance thought to be five times more prevalent in the universe than regular matter. According to theory, billions of dark matter particles pass through the Earth each second. We don’t notice them because they interact with regular matter only very weakly, through gravity.

So far, researchers have mostly been looking for dark matter particles. But with the dark matter radio, they want to look for dark matter waves.

Direct detection experiments for dark matter particles use large underground detectors. Researchers hope to see signals from dark matter particles colliding with the detector material. However, this only works if dark matter particles are heavy enough to deposit a detectable amount of energy in the collision.

“If dark matter particles were very light, we might have a better chance of detecting them as waves rather than particles,” says Peter Graham, a theoretical physicist at the Kavli Institute for Particle Astrophysics and Cosmology, a joint institute of Stanford University and the Department of Energy’s SLAC National Accelerator Laboratory. “Our device will take the search in that direction.”

The dark matter radio makes use of a bizarre concept of quantum mechanics known as wave-particle duality: Every particle can also behave like a wave. 

Take, for example, the photon: the massless fundamental particle that carries the electromagnetic force. Streams of them make up electromagnetic radiation, or light, which we typically describe as waves—including radio waves. 

The dark matter radio will search for dark matter waves associated with two particular dark matter candidates.  It could find hidden photons—hypothetical cousins of photons with a small mass. Or it could find axions, which scientists think can be produced out of light and transform back into it in the presence of a magnetic field.

“The search for hidden photons will be completely unexplored territory,” says Saptarshi Chaudhuri, a Stanford graduate student on the project. “As for axions, the dark matter radio will close gaps in the searches of existing experiments.”

Intercepting dark matter vibes

A regular radio intercepts radio waves with an antenna and converts them into sound. What you hear depends on the station. A listener chooses a station by adjusting an electric circuit, in which electricity can oscillate with a certain resonant frequency. If the circuit’s resonant frequency matches the station’s frequency, the radio is tuned in and the listener can hear the broadcast.

The dark matter radio works the same way. At its heart is an electric circuit with an adjustable resonant frequency. If the device were tuned to a frequency that matched the frequency of a dark matter particle wave, the circuit would resonate. Scientists could measure the frequency of the resonance, which would reveal the mass of the dark matter particle. 

The idea is to do a frequency sweep by slowly moving through the different frequencies, as if tuning a radio from one end of the dial to the other.
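The tuning described above is ordinary LC-circuit resonance. A minimal sketch of the relationship, using an assumed 1-microhenry coil and illustrative capacitor settings (the actual dark matter radio's component values are not given in the article):

```python
import math

def resonant_frequency_hz(inductance_h, capacitance_f):
    """Resonant frequency of an LC circuit: f = 1 / (2*pi*sqrt(L*C))."""
    return 1.0 / (2 * math.pi * math.sqrt(inductance_h * capacitance_f))

# Sweeping a tunable element retunes the circuit -- the same way a radio
# dial picks out a station. Capacitor values here are purely illustrative.
for c in (100e-12, 50e-12, 10e-12):
    f = resonant_frequency_hz(1e-6, c)  # assumed 1 uH coil
    print(f"C = {c*1e12:.0f} pF  ->  f = {f/1e6:.1f} MHz")
```

Smaller capacitance means higher resonant frequency, which is why a continuous sweep of one tunable component walks the detector across its whole band.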

The electric signal from dark matter waves is expected to be very weak. Therefore, Graham has partnered with a team led by another KIPAC researcher, Kent Irwin. Irwin’s group is developing highly sensitive magnetometers known as superconducting quantum interference devices, or SQUIDs, which they’ll pair with extremely low-noise amplifiers to hunt for potential signals.

In its final design, the dark matter radio will search for particles in a mass range of trillionths to millionths of an electronvolt. (One electronvolt is about a billionth of the mass of a proton.) This is somewhat problematic because this range includes kilohertz to gigahertz frequencies—frequencies used for over-the-air broadcasting. 
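The mass-to-frequency conversion implied here is just f = mc²/h: a particle of rest energy E oscillates as a wave of frequency E/h. A quick sketch confirms that the quoted mass range maps onto radio frequencies, from a few hundred hertz up to a few hundred megahertz:

```python
# Planck constant in eV*s (CODATA value)
H_EV_S = 4.135667696e-15

def dm_frequency_hz(mass_ev):
    """Oscillation frequency of a dark matter wave of the given mass-energy: f = E/h."""
    return mass_ev / H_EV_S

# Trillionths to millionths of an electronvolt:
for m in (1e-12, 1e-9, 1e-6):
    print(f"{m:.0e} eV  ->  {dm_frequency_hz(m):.3e} Hz")
```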

“Shielding the radio from unwanted radiation is very important and also quite challenging,” Irwin says. “In fact, we would need a several-yards-thick layer of copper to do so. Fortunately we can achieve the same effect with a thin layer of superconducting metal.”

One advantage of the dark matter radio is that it does not need to be shielded from cosmic rays. Whereas direct detection searches for dark matter particles must operate deep underground to block out particles falling from space, the dark matter radio can operate in a university basement.

The researchers are now testing a small-scale prototype at Stanford that will scan a relatively narrow frequency range. They plan on eventually operating two independent, full-size instruments at Stanford and SLAC.

“This is exciting new science,” says Arran Phipps, a KIPAC postdoc on the project. “It’s great that we get to try out a new detection concept with a device that is relatively low-budget and low-risk.” 

The dark matter disc jockeys are taking the first steps now and plan to conduct their dark matter searches over the next few years. Stay tuned for future results.

by Manuel Gnida at October 05, 2017 01:23 PM

October 03, 2017

Symmetrybreaking - Fermilab/SLAC

Nobel recognizes gravitational wave discovery

Scientists Rainer Weiss, Kip Thorne and Barry Barish won the 2017 Nobel Prize in Physics for their roles in creating the LIGO experiment.

Illustration depicting two black holes circling one another and producing gravitational waves

Three scientists who made essential contributions to the LIGO collaboration have been awarded the 2017 Nobel Prize in Physics.

Rainer Weiss will share the prize with Kip Thorne and Barry Barish for their roles in the discovery of gravitational waves, ripples in space-time predicted by Albert Einstein. Weiss and Thorne conceived of LIGO, and Barish is credited with reviving the struggling experiment and making it happen.

“I view this more as a thing that recognizes the work of about 1000 people,” Weiss said during a Q&A after the announcement this morning. “It’s really a dedicated effort that has been going on, I hate to tell you, for as long as 40 years, people trying to make a detection in the early days and then slowly but surely getting the technology together to do it.”

Another founder of LIGO, scientist Ronald Drever, died in March. Nobel Prizes are not awarded posthumously.

According to Einstein’s general theory of relativity, powerful cosmic events release energy in the form of waves traveling through the fabric of existence at the speed of light. LIGO detects these disturbances when they disrupt the symmetry between the passages of identical laser beams traveling identical distances.

The setup for the LIGO experiment looks like a giant L, with each side stretching about 2.5 miles long. Scientists split a laser beam and shine the two halves down the two sides of the L. When each half of the beam reaches the end, it reflects off a mirror and heads back to the place where its journey began.

Normally, the two halves of the beam return at the same time. When there’s a mismatch, scientists know something is going on. Gravitational waves compress space-time in one direction and stretch it in another, giving one half of the beam a shortcut and sending the other on a longer trip. LIGO is sensitive enough to notice a difference between the arms as small as one-thousandth the diameter of an atomic nucleus.
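The back-of-the-envelope arithmetic behind that sensitivity claim is worth spelling out. Taking the article's numbers (2.5-mile arms, a length difference of one-thousandth of a nuclear diameter) and an assumed nuclear diameter of roughly 10 femtometres, the implied strain comes out at the famous 10⁻²¹ level:

```python
MILE_M = 1609.344
arm_length_m = 2.5 * MILE_M              # each LIGO arm, per the article
nucleus_diameter_m = 1e-14               # ~10 fm, rough heavy-nucleus diameter (assumption)
delta_l_m = nucleus_diameter_m / 1000    # smallest detectable arm-length difference
strain = delta_l_m / arm_length_m        # fractional stretching of space-time
print(f"strain ~ {strain:.1e}")
```

The result is of order 10⁻²¹, which matches the strain amplitude LIGO actually measures for black hole mergers.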

Scientists on LIGO and their partner collaboration, called Virgo, reported the first detection of gravitational waves in February 2016. The waves were generated in the collision of two black holes with 29 and 36 times the mass of the sun 1.3 billion years ago. They reached the LIGO experiment as scientists were conducting an engineering test.

“It took us a long time, something like two months, to convince ourselves that we had seen something from outside that was truly a gravitational wave,” Weiss said.

LIGO, which stands for Laser Interferometer Gravitational-Wave Observatory, consists of two of these pieces of equipment, one located in Louisiana and another in Washington state.

The experiment is operated jointly by Weiss’s home institution, MIT, and Barish and Thorne’s home institution, Caltech. The experiment has collaborators from more than 80 institutions from more than 20 countries. A third interferometer, operated by the Virgo collaboration, recently joined LIGO to make the first joint observation of gravitational waves.

by Kathryn Jepsen at October 03, 2017 10:42 AM

September 28, 2017

Symmetrybreaking - Fermilab/SLAC

Conjuring ghost trains for safety

A Fermilab technical specialist recently invented a device that could help alert oncoming trains to large vehicles stuck on the tracks.

Photo of a train traveling along the tracks

Browsing YouTube late at night, Fermilab Technical Specialist Derek Plant stumbled on a series of videos that all begin the same way: a large vehicle—a bus, semi or other low-clearance vehicle—is stuck on a railroad crossing. In the end, the train crashes into the stuck vehicle, destroying it and sometimes even derailing the train. According to the Federal Railroad Administration, hundreds of vehicles are struck this way every year by trains, which can take over a mile to stop.

“I was just surprised at the number of these that I found,” Plant says. “For every accident that’s videotaped, there are probably many more.”

Inspired by a workplace safety class that preached a principle of minimizing the impact of accidents, Plant set about looking for solutions to the problem of trains hitting stuck vehicles.

Railroad tracks are elevated for proper drainage, and the humped profile of many crossings can cause a vehicle to bottom out. “Theoretically, we could lower all the crossings so that they’re no longer a hump. But there are 200,000 crossings in the United States,” Plant says. “Railroads and local governments are trying hard to minimize the number of these crossings by creating overpasses, or elevating roadways. That’s cost-prohibitive, and it’s not going to happen soon.”

Other solutions, such as re-engineering the suspension on vehicles likely to get stuck, seemed equally improbable.

After studying how railroad signaling systems work, Plant came up with an idea: to fake the presence of a train. His invention was developed in his spare time using techniques and principles he learned over his almost two decades at Fermilab. It is currently in the patent application process and being prosecuted by Fermilab’s Office of Technology Transfer.

“If you cross over a railroad track and you look down the tracks, you’ll see red or yellow or green lights,” he says. “Trains have traffic signals too.”

These signals are tied to signal blocks—segments of the tracks that range from a mile to several miles in length. When a train is on the tracks, its metal wheels and axle connect both rails, forming an electric circuit through the tracks to trigger the signals. These signals inform other trains not to proceed while one train occupies a block, avoiding pileups.

Plant thought, “What if other vehicles could trigger the same signal in an emergency?” By faking the presence of a train, a vehicle stuck on the tracks could give advance warning for oncoming trains to stop and stall for time. Hence the name of Plant’s invention: the Ghost Train Generator.
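The track-circuit logic Plant is exploiting can be sketched in a few lines. This is a toy model of a classic DC track circuit, not railroad-grade signaling code: a battery feeds current through the rails to a relay at the far end of the block, and anything that shunts the rails (a train's axle, or a device mimicking one) starves the relay, dropping the signal to stop.

```python
def signal_aspect(rail_shunted: bool) -> str:
    """Toy model of a fail-safe track circuit.

    A shunt across the rails -- a train's wheels and axle, or a ghost-train
    device -- steals the current that normally holds the block relay up.
    A de-energized relay always means STOP, so a broken rail or dead
    battery also fails safe.
    """
    relay_energized = not rail_shunted
    return "CLEAR" if relay_energized else "STOP"

print(signal_aspect(rail_shunted=False))  # empty block
print(signal_aspect(rail_shunted=True))   # train (or ghost train) present
```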

To replicate the train’s presence, Plant knew he had to create a strong electrical connection between the rails. The most straightforward way to do this is with massive amounts of metal, as a train does. But for the Ghost Train Generator to be useful in a pinch, it needs to be small, portable and easily applied. The answer to achieving these features lies in strong magnets and special wire.

“Put one magnet on one rail and one magnet on the other and the device itself mimics—electrically—what a train would look like to the signaling system,” he says. “In theory, this could be carried in vehicles that are at high risk for getting stuck on a crossing: semis, tour buses and first-response vehicles,” Plant says. “Keep it just like you would a fire extinguisher—just behind the seat or in an emergency compartment.”

Once the device is deployed, the train would receive the signal that the tracks were obstructed and stop. Then the driver of the stuck vehicle could call for emergency help using the hotline posted on all crossings.

Plant compares the invention to a seatbelt.

“Is it going to save your life 100 percent of the time? Nope, but smart people wear them,” he says. “It’s designed to prevent a collision when a train is more than two minutes from the crossing.”

And like a seatbelt, part of what makes Plant’s invention so appealing is its simplicity.

“The first thing I thought was that this is a clever invention,” says Aaron Sauers from Fermilab’s technology transfer office, who works with lab staff to develop new technologies for market. “It’s an elegant solution to an existing problem. I thought, ‘This technology could have legs.’”

The organizers of the National Innovation Summit seem to agree.  In May, Fermilab received an Innovation Award from TechConnect for the Ghost Train Generator. The invention will also be featured as a showcase technology in the upcoming Defense Innovation Summit in October.

The Ghost Train Generator is currently in the pipeline to receive a patent with help from Fermilab, and its prospects are promising, according to Sauers. The application is nonprovisional, meaning it includes specific claims and can be licensed. If it passes muster and a patent is granted, Plant will receive a portion of the royalties the invention generates for Fermilab.

Fermilab encourages a culture of scientific innovation and exploration beyond the field of particle physics, according to Sauers, who noted that Plant’s invention is just one of a number of technology transfer initiatives at the lab.

Plant agrees—Fermilab’s environment helped motivate his efforts to find a solution for railroad crossing accidents.

“It’s just a general problem-solving state of mind,” he says. “That’s the philosophy we have here at the lab.”

Editor's note: A version of this article was originally published by Fermilab.

by Daniel Garisto at September 28, 2017 05:33 PM

Symmetrybreaking - Fermilab/SLAC

Fermilab on display

The national laboratory opened usually inaccessible areas of its campus to thousands of visitors to celebrate 50 years of discovery.

Fermi National Accelerator Laboratory’s yearlong 50th anniversary celebration culminated on Saturday with an Open House that drew thousands of visitors despite the unseasonable heat.

On display were areas of the lab not normally open to guests, including neutrino and muon experiments, a portion of the accelerator complex, lab spaces and magnet and accelerator fabrication and testing areas, to name a few. There were also live links to labs around the world, including CERN, a mountaintop observatory in Chile, and the mile-deep Sanford Underground Research Facility that will house the international neutrino experiment, DUNE.

But it wasn’t all physics. In addition to hands-on demos and a STEM fair, visitors could also learn about Fermilab’s art and history, walk the prairie trails or hang out with the ever-popular bison. In all, some 10,000 visitors got to go behind-the-scenes at Fermilab, shuttled around on 80 buses and welcomed by 900 Fermilab workers eager to explain their roles at the lab. Below, see a few of the photos captured as Fermilab celebrated 50 years of discovery.

by Lauren Biron at September 28, 2017 03:47 PM

September 27, 2017

Matt Strassler - Of Particular Significance

LIGO and VIRGO Announce a Joint Observation of a Black Hole Merger

Welcome, VIRGO!  Another merger of two big black holes has been detected, this time by both LIGO’s two detectors and by VIRGO as well.

Aside from the fact that this means that the VIRGO instrument actually works, which is great news, why is this a big deal?  By adding a third gravitational wave detector, built by the VIRGO collaboration, to LIGO’s Washington and Louisiana detectors, the scientists involved in the search for gravitational waves now can determine fairly accurately the direction from which a detected gravitational wave signal is coming.  And this allows them to do something new: to tell their astronomer colleagues roughly where to look in the sky, using ordinary telescopes, for some form of electromagnetic waves (perhaps visible light, gamma rays, or radio waves) that might have been produced by whatever created the gravitational waves.

The point is that with three detectors, one can triangulate.  The gravitational waves travel for billions of years, traveling at the speed of light, and when they pass by, they are detected at both LIGO detectors and at VIRGO.  But because it takes light a few hundredths of a second to travel the diameter of the Earth, the waves arrive at slightly different times at the LIGO Washington site, the LIGO Louisiana site, and the VIRGO site in Italy.  The precise timing tells the scientists what direction the waves were traveling in, and therefore roughly where they came from.  In a similar way, using the fact that sound travels at a known speed, the times that a gunshot is heard at multiple locations can be used by police to determine where the shot was fired.
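The timing geometry is simple enough to sketch. For a plane wave arriving from a given sky direction, each detector's arrival time is set by projecting its position onto the propagation direction; inverting those millisecond-scale delays is what localizes the source. A minimal sketch (detector coordinates are rough Earth-centered values, for illustration only):

```python
import numpy as np

C = 299_792_458.0  # speed of light, m/s

# Approximate detector positions in an Earth-centered frame (metres) -- illustrative.
detectors = {
    "LIGO Hanford":    np.array([-2.162e6, -3.834e6,  4.600e6]),
    "LIGO Livingston": np.array([-7.427e4, -5.496e6,  3.225e6]),
    "Virgo":           np.array([ 4.546e6,  8.429e5,  4.379e6]),
}

def arrival_delays(source_dir):
    """Relative arrival times (s) of a plane wave from direction source_dir."""
    n = source_dir / np.linalg.norm(source_dir)
    # The wavefront reaches position r at time -(n . r) / c relative to Earth's centre.
    t = {name: -np.dot(n, r) / C for name, r in detectors.items()}
    t0 = t["LIGO Hanford"]
    return {name: ti - t0 for name, ti in t.items()}

delays = arrival_delays(np.array([1.0, 0.5, -0.2]))  # an arbitrary example direction
for name, dt in delays.items():
    print(f"{name}: {dt*1e3:+.2f} ms")
```

Running the projection forward for many candidate directions and comparing against the measured delays (plus the detectors' antenna patterns) is, in spirit, how the sky region in the figure below is produced.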

You can see the impact in the picture below, which is an image of the sky drawn as a sphere, as if seen from outside the sky looking in.  In previous detections of black hole mergers by LIGO’s two detectors, the scientists could only determine a large swath of sky where the observed merger might have occurred; those are the four colored regions that stretch far across the sky.  But notice the green splotch at lower left.  That’s the region of sky where the black hole merger announced today occurred.  The fact that this region is many times smaller than the other four reflects what including VIRGO makes possible.  It’s a small enough region that one can search using an appropriate telescope for something that is making visible light, or gamma rays, or radio waves.

Skymap of the LIGO/Virgo black hole mergers.

Image credit: LIGO/Virgo/Caltech/MIT/Leo Singer (Milky Way image: Axel Mellinger)


While a black hole merger isn’t expected to be observable by other telescopes, and indeed nothing was observed by other telescopes this time, other events that LIGO might detect, such as a merger of two neutron stars, may create an observable effect. We can hope for such exciting news over the next year or two.

Filed under: Astronomy, Gravitational Waves Tagged: black holes, Gravitational Waves, LIGO

by Matt Strassler at September 27, 2017 05:50 PM

September 26, 2017

Symmetrybreaking - Fermilab/SLAC

Shining with possibility

As Jordan-based SESAME nears its first experiments, members are connecting in new ways.

Early in the morning, physicist Roy Beck Barkai boards a bus in Tel Aviv bound for Jordan. By 10:30 a.m., he is on site at SESAME, a new scientific facility where scientists plan to use light to study everything from biology to archaeology. He is back home by 7 p.m., in time to have dinner with his children.

Before SESAME opened, the closest facility like it was in Italy. Beck Barkai often traveled for two days by airplane, train and taxi for a day or two of work—an inefficient and expensive process that limited his ability to work with specialized equipment from his home lab and required him to spend days away from his family.  

“For me, having the ability to kiss them goodbye in the morning and just before they went to sleep at night is a miracle,” Beck Barkai says. “It felt like a dream come true. Having SESAME at our doorstep is a big plus."

SESAME, also known as the International Centre for Synchrotron-Light for Experimental Science and Applications in the Middle East, opened its doors in May and is expected to host its first beams of synchrotron light this year. Scientists from around the world will be able to apply for time to use the facility’s powerful light source for their experiments. It’s the first synchrotron in the region. 

Beck Barkai says SESAME provides a welcome dose of convenience, as scientists in the region can now drive to a research center instead of flying with sensitive equipment to another country. It’s also more cost-effective.

Located in Jordan to the northwest of the city of Amman, SESAME was built by a collaboration made up of Cyprus, Egypt, Iran, Israel, Jordan, Pakistan, Turkey and the Palestinian Authority—a partnership members hope will improve relations among the eight neighbors.

“SESAME is a very important step in the region,” says SESAME Scientific Advisory Committee Chair Zehra Sayers. “The language of science is objective. It’s based on curiosity. It doesn’t need to be affected by the differences in cultural and social backgrounds. I hope it is something that we will leave the next generations as a positive step toward stability.”

Protein researcher and University of Jordan professor Areej Abuhammad says she hopes SESAME will provide an environment that encourages collaboration.

“I think through having the chance to interact, the scientists from around this region will learn to trust and respect each other,” she says. “I don’t think that this will result in solving all the problems in the region from one day to the next, but it will be a big step forward.”

The $100 million center is a state-of-the-art research facility that should provide some relief to scientists seeking time at other, overbooked facilities. SESAME plans to eventually host 100 to 200 users at a time. 

SESAME’s first two beamlines will open later this year. About twice per year, SESAME will announce calls for research proposals, the next of which is expected for this fall. Sayers says proposals will be evaluated for originality, preparedness and scientific quality. 

Groups of researchers hoping to join the first round of experiments submitted more than 50 applications. Once the lab is at full operation, Sayers says, the selection committee expects to receive four to five times more than that.

Opening up a synchrotron in the Middle East means that more people will learn about these facilities and have a chance to use them. Because some scientists in the region are new to using synchrotrons or writing the style of applications SESAME requires, Sayers asked the selection committee to provide feedback with any rejections. 

Abuhammad is excited for the learning opportunity SESAME presents for her students—and for the possibility that experiences at SESAME will spark future careers in science. 

She plans to apply for beam time at SESAME to conduct protein crystallography, a field that involves peering inside proteins to learn about their function and aid in pharmaceutical drug discovery. 

Another scientist vying for a spot at SESAME is Iranian chemist Maedeh Darzi, who studies the materials of ancient manuscripts and how they degrade. Synchrotrons are of great value to archaeologists because they minimize the damage to irreplaceable artifacts. Instead of cutting them apart, scientists can take a less damaging approach by probing them with particles. 

Darzi sees SESAME as a chance to collaborate with scientists from the Middle East and to promote science, peace and friendship. For her and others, SESAME could be a place where particles put things back together.

by Signe Brewster at September 26, 2017 02:13 PM

September 21, 2017

Symmetrybreaking - Fermilab/SLAC

Concrete applications for accelerator science

A project called A2D2 will explore new applications for compact linear accelerators.

Tom Kroc, Matteo Quagliotto and Mike Geelhoed set up a sample beneath the A2D2 accelerator to test the electron beam.

Particle accelerators are the engines of particle physics research at Fermi National Accelerator Laboratory. They generate nearly light-speed, subatomic particles that scientists study to get to the bottom of what makes our universe tick. Fermilab experiments rely on a number of different accelerators, including a powerful, 500-foot-long linear accelerator that kick-starts the process of sending particle beams to various destinations.

But if you’re not doing physics research, what’s an accelerator good for?

It turns out, quite a lot: Electron beams generated by linear accelerators have all kinds of practical uses, such as making the wires used in cars melt-resistant or purifying water.

A project called Accelerator Application Development and Demonstration (A2D2) at Fermilab’s Illinois Accelerator Research Center will help Fermilab and its partners to explore new applications for compact linear accelerators, which are only a few feet long rather than a few hundred. These compact accelerators are of special interest because of their small size—they’re cheaper and more practical to build in an industrial setting than particle physics research accelerators—and they can be more powerful than ever.

“A2D2 has two aspects: One is to investigate new applications of how electron beams might be used to change, modify or process different materials,” says Fermilab’s Tom Kroc, an A2D2 physicist. “The second is to contribute a little more to the understanding of how these processes happen.”

To develop these aspects of accelerator applications, A2D2 will employ a compact linear accelerator that was once used in a hospital to treat tumors with electron beams. With a few upgrades to increase its power, the A2D2 accelerator will be ready to embark on a new venture: exploring and benchmarking other possible uses of electron beams, which will help specify the design of a new, industrial-grade, high-power machine under development by IARC and its partners.

It won’t be just Fermilab scientists using the A2D2 accelerator: As part of IARC, the accelerator will be available for use (typically through a formal CRADA or SPP agreement) by anyone who has a novel idea for electron beam applications. IARC’s purpose is to partner with industry to explore ways to translate basic research and tools, including accelerator research, into commercial applications.

“I already have a lot of people from industry asking me, ‘When can I use A2D2?’” says Charlie Cooper, general manager of IARC. “A2D2 will allow us to directly contribute to industrial applications—it’s something concrete that IARC now offers.”

Speaking of concrete, one of the first applications in mind for compact linear accelerators is creating durable pavement for roads that won’t crack in the cold or spread out in the heat. This could be achieved by replacing traditional asphalt with a material that could be strengthened using an accelerator. The extra strength would come from crosslinking, a process that creates bonds between layers of material, almost like applying glue between sheets of paper. A single sheet of paper tears easily, but when two or more layers are linked by glue, the paper becomes stronger.

“Using accelerators, you could have pavement that lasts longer, is tougher and has a bigger temperature range,” says Bob Kephart, director of IARC. Kephart holds two patents for the process of curing cement through crosslinking. “Basically, you’d put the road down like you do right now, and you’d pass an accelerator over it, and suddenly you’d turn it into really tough stuff—like the bed liner in the back of your pickup truck.”

This process has already caught the eye of the U.S. Army Corps of Engineers, which will be one of A2D2’s first partners. Another partner will be the Chicago Metropolitan Water Reclamation District, which will test the utility of compact accelerators for water purification. Many other potential customers are lining up to use the A2D2 technology platform.

“You can basically drive chemical reactions with electron beams—and in many cases those can be more efficient than conventional technology, so there are a variety of applications,” Kephart says. “Usually what you have to do is make a batch of something and heat it up in order for a reaction to occur. An electron beam can make a reaction happen by breaking a bond with a single electron.”

In other words, instead of having to cook a material for a long time to reach a specific heat that would induce a chemical reaction, you could zap it with an electron beam to get the same effect in a fraction of the time.

In addition to exploring the new electron-beam applications with the A2D2 accelerator, scientists and engineers at IARC are using cutting-edge accelerator technology to design and build a new kind of portable, compact accelerator, one that will take applications uncovered with A2D2 out of the lab and into the field. The A2D2 accelerator is already small compared to most accelerators, but the latest R&D allows IARC experts to shrink the size while increasing the power of their proposed accelerator even further.

“The new, compact accelerator that we’re developing will be high-power and high-energy for industry,” Cooper says. “This will enable some things that weren’t possible in the past. For something such as environmental cleanup, you could take the accelerator directly to the site.”

While the IARC team develops this portable accelerator, which should be able to fit on a standard trailer, the A2D2 accelerator will continue to be a place to experiment with how to use electron beams—and study what happens when you do.

“The point of this facility is more development than research, however there will be some research on irradiated samples,” says Fermilab’s Mike Geelhoed, one of the A2D2 project leads. “We’re all excited—at least I am. We and our partners have been anticipating this machine for some time now. We all want to see how well it can perform.”

Editor's note: This article was originally published by Fermilab.

by Leah Poffenberger at September 21, 2017 05:18 PM

September 19, 2017

Symmetrybreaking - Fermilab/SLAC

50 years of stories

To celebrate a half-century of discovery, Fermilab has been gathering tales of life at the lab.

People discussing Fermilab history

Science stories usually catch the eye when there’s big news: the discovery of gravitational waves, the appearance of a new particle. But behind the blockbusters are the thousands of smaller stories of science behind the scenes and daily life at a research institution. 

As the Department of Energy’s Fermi National Accelerator Laboratory celebrates its 50th anniversary year, employees past and present have shared memories of building a lab dedicated to particle physics.

Some shared personal memories: keeping an accelerator running during a massive snowstorm; being too impatient for the arrival of an important piece of detector equipment to stay put and wait for it to arrive; accidentally complaining about the lab to the lab’s director.

Others focused on milestones and accomplishments: the first daycare at a national lab, the Saturday Morning Physics Program built by Nobel laureate Leon Lederman, the birth of the web at Fermilab.

People shared memories of big names that built the lab: charismatic founding director Robert R. Wilson, fiery head of accelerator development Helen Edwards, talented lab artist Angela Gonzales.

And of course, employees told stories about Fermilab’s resident herd of bison.

There are many more stories to peruse. You can watch a playlist of the video anecdotes or find all of the stories (both written and video) collected on Fermilab’s 50th anniversary website.

by Lauren Biron at September 19, 2017 01:00 PM

September 15, 2017

Symmetrybreaking - Fermilab/SLAC

SENSEI searches for light dark matter

Technology proposed 30 years ago to search for dark matter is finally seeing the light.

Two scientists in hard hats stand next to a cart holding detector components.

In a project called SENSEI, scientists are using innovative sensors developed over three decades to look for the lightest dark matter particles anyone has ever tried to detect.

Dark matter—so named because it doesn’t absorb, reflect or emit light—constitutes 27 percent of the universe, but the jury is still out on what it’s made of. The primary theoretical suspect for the main component of dark matter is a particle scientists have descriptively named the weakly interacting massive particle, or WIMP.

But since none of these heavy particles, which are expected to have a mass 100 times that of a proton, have shown up in experiments, it might be time for researchers to think small.

“There is a growing interest in looking for different kinds of dark matter that are alternatives to the standard WIMP model,” says Fermi National Accelerator Laboratory scientist Javier Tiffenberg, a leader of the SENSEI collaboration. “Lightweight, or low-mass, dark matter is a very compelling possibility, and for the first time, the technology is there to explore these candidates.”

Sensing the unseen

In traditional dark matter experiments, scientists look for a transfer of energy that would occur if dark matter particles collided with an ordinary nucleus. But SENSEI is different; it looks for direct interactions of dark matter particles colliding with electrons.

“That is a big difference—you get a lot more energy transferred in this case because an electron is so light compared to a nucleus,” Tiffenberg says.

If dark matter had low mass—much smaller than the WIMP model suggests—then it would be many times lighter than an atomic nucleus. So if it were to collide with a nucleus, the resulting energy transfer would be far too small to tell us anything. It would be like throwing a ping-pong ball at a boulder: The heavy object wouldn’t go anywhere, and there would be no sign the two had come into contact.

An electron is nowhere near as heavy as an atomic nucleus. In fact, a single proton has about 1836 times more mass than an electron. So the collision of a low-mass dark matter particle with an electron has a much better chance of leaving a mark—it’s more bowling ball than boulder.

Bowling balls aren't exactly light, though. An energy transfer between a low-mass dark matter particle and an electron would leave only a blip of energy, one either too small for most detectors to pick up or easily overshadowed by noise in the data.

“The bowling ball will move a very tiny amount,” says Fermilab scientist Juan Estrada, a SENSEI collaborator. “You need a very precise detector to see this interaction of lightweight particles with something that is much heavier.”

That’s where SENSEI’s sensitive sensors come in.

SENSEI will use skipper charge-coupled devices, also called skipper CCDs. CCDs have been used for other dark matter detection experiments, such as the Dark Matter in CCDs (or DAMIC) experiment operating at SNOLAB in Canada. These CCDs were a spinoff from sensors developed for use in the Dark Energy Camera in Chile and other dark energy search projects.

CCDs are typically made of silicon divided into pixels. When a dark matter particle passes through the CCD, it collides with the silicon’s electrons, knocking them free, leaving a net electric charge in each pixel the particle passes through. The electrons then flow through adjacent pixels and are ultimately read as a current in a device that measures the number of electrons freed from each CCD pixel. That measurement tells scientists about the mass and energy of the particle that got the chain reaction going. A massive particle, like a WIMP, would free a gusher of electrons, but a low-mass particle might free only one or two.

Typical CCDs can measure the charge left behind only once, which makes it difficult to tell whether a tiny signal from one or two electrons is real or an artifact of readout noise.

Skipper CCDs are a new generation of the technology that helps eliminate the “iffiness” of a measurement that has a one- or two-electron margin of error. “The big step forward for the skipper CCD is that we are able to measure this charge as many times as we want,” Tiffenberg says.

The charge left behind in the skipper CCD can be sampled multiple times and then averaged, a method that yields a more precise measurement of the charge deposited in each pixel than the measure-one-and-done technique. That’s basic statistics: averaging N independent measurements shrinks the noise by a factor of the square root of N, bringing the result ever closer to the true value.

SENSEI scientists take advantage of the skipper CCD architecture, measuring the number of electrons in a single pixel a whopping 4000 times.

“This is a simple idea, but it took us 30 years to get it to work,” Estrada says.

From idea to reality to beyond

A small SENSEI prototype is currently running at Fermilab in a detector hall 385 feet below ground, and it has demonstrated that this detector design will work in the hunt for dark matter.

Skipper CCD technology and SENSEI were brought to life by Laboratory Directed Research and Development (LDRD) funds at Fermilab and Lawrence Berkeley National Laboratory (Berkeley Lab). LDRD programs are intended to provide funding for development of novel, cutting-edge ideas for scientific discovery.

The Fermilab LDRDs were awarded only recently—less than two years ago—but close collaboration between the two laboratories has already yielded SENSEI’s promising design, thanks in part to Berkeley Lab’s previous work in skipper CCD design.

Fermilab LDRD funds allow researchers to test the sensors and develop detectors based on the science, and the Berkeley Lab LDRD funds support the sensor design, which was originally proposed by Berkeley Lab scientist Steve Holland.

“It is the combination of the two LDRDs that really makes SENSEI possible,” Estrada says.

Future SENSEI research will also receive a boost thanks to a recent grant from the Heising-Simons Foundation.

“SENSEI is very cool, but what’s really impressive is that the skipper CCD will allow the SENSEI science and a lot of other applications,” Estrada says. “Astronomical studies are limited by the sensitivity of their experimental measurements, and having sensors without noise is the equivalent of making your telescope bigger—more sensitive.”

SENSEI technology may also be critical in the hunt for a fourth type of neutrino, called the sterile neutrino, which seems to be even more shy than its three notoriously elusive neutrino family members.

A larger SENSEI detector equipped with more skipper CCDs will be deployed within the year. It’s possible it might not detect anything, sending researchers back to the drawing board in the hunt for dark matter. Or SENSEI might finally make contact with dark matter—and that would be SENSEI-tional.

Editor's note: This article is based on an article published by Fermilab.

by Leah Poffenberger at September 15, 2017 07:00 PM


