# Particle Physics Planet


## December 17, 2017

### Christian P. Robert - xi'an's og

At the end of last August, Jeremy Heng, Adrian Bishop, George Deligiannidis and Arnaud Doucet arXived a paper on controlled sequential Monte Carlo (SMC), which we read today at the BiPS reading group in Paris-Saclay, where I took these notes. The setting is classical SMC, but with a twist in that the proposals at each time iteration are modified by an importance function. (I was quite surprised to discover that this was completely new, as I was under the false impression that it had been tried ages ago!) This importance sampling setting can be interpreted as a change of measures on both the hidden Markov chain and on its observed version, so that the overall normalising constant remains the same. And then, being in an importance sampling setting, there exists an optimal choice for the importance functions, which unsurprisingly results in a zero-variance estimate of the normalising constant. The optimal solution is actually the backward filter familiar to SMC users.

A large part of the paper actually concentrates on figuring out an implementable version of this optimal solution, using dynamic programming and a projection of each local generator onto a simple linear space spanned by Gaussian kernels (aka Gaussian mixtures). This becomes feasible through the particle systems generated at earlier iterations of said dynamic programming.

The paper is massive, both in terms of theoretical results and of the range of simulations, and we could not get through it within the 90 minutes Sylvain Le Corff spent on presenting it. I can only wonder at this stage how much Rao-Blackwellisation or AMIS could improve the performance of the algorithm. (A point I find quite amazing in Proposition 1 is that the normalising constant Z of the filtering distribution does not change along observations when using the optimal importance function, which translates into the estimates being nearly constant after a few iterations.)

Filed under: Books, pictures, Statistics, University life Tagged: BiPS, dynamic programming, hidden Markov models, importance sampling, normalising constant, sequential Monte Carlo

### The n-Category Café

In the comments last time, a conversation got going about *$p$-adic* entropy. But here I’ll return to the original subject: entropy *modulo $p$*. I’ll answer the question:

Given a “probability distribution” mod $p$, that is, a tuple $$\pi = (\pi_1, \ldots, \pi_n) \in (\mathbb{Z}/p\mathbb{Z})^n$$ summing to $1$, what is the right definition of its entropy $$H_p(\pi) \in \mathbb{Z}/p\mathbb{Z}?$$

How will we know when we’ve got the right definition? As I explained last time, the acid test is whether it satisfies the chain rule

$$H_p(\gamma \circ (\pi^1, \ldots, \pi^n)) = H_p(\gamma) + \sum_{i=1}^n \gamma_i H_p(\pi^i).$$

This is supposed to hold for all $\gamma = (\gamma_1, \ldots, \gamma_n) \in \Pi_n$ and $\pi^i = (\pi^i_1, \ldots, \pi^i_{k_i}) \in \Pi_{k_i}$, where $\Pi_n$ is the hyperplane

$$\Pi_n = \{ (\pi_1, \ldots, \pi_n) \in (\mathbb{Z}/p\mathbb{Z})^n : \pi_1 + \cdots + \pi_n = 1 \},$$

whose elements we’re calling “probability distributions” mod $p$. And if God is smiling on us, $H_p$ will be essentially the *only* quantity that satisfies the chain rule. Then we’ll know we’ve got the right definition.

Black belts in functional equations will be able to use the chain rule and nothing else to work out what $H_p$ must be. But the rest of us might like an extra clue, and we have one in the definition of *real* Shannon entropy:

$$H_\mathbb{R}(\pi) = -\sum_{i : \pi_i \neq 0} \pi_i \log \pi_i.$$

Now, we saw last time that there is no logarithm mod $p$; that is, there is no nontrivial group homomorphism

$$(\mathbb{Z}/p\mathbb{Z})^\times \to \mathbb{Z}/p\mathbb{Z}.$$

But there *is* a next-best thing: a homomorphism

$$(\mathbb{Z}/p^2\mathbb{Z})^\times \to \mathbb{Z}/p\mathbb{Z}.$$

This is called the Fermat quotient $q_p$, and it’s given by

$$q_p(n) = \frac{n^{p-1} - 1}{p} \in \mathbb{Z}/p\mathbb{Z}.$$

Let’s go through why this works.

The elements of $(\mathbb{Z}/p^2\mathbb{Z})^\times$ are the congruence classes mod $p^2$ of the integers not divisible by $p$. Fermat’s little theorem says that whenever $n$ is not divisible by $p$,

$$\frac{n^{p-1} - 1}{p}$$

is an integer. This, or rather its congruence class mod $p$, is the Fermat quotient. The congruence class of $n$ mod $p^2$ determines the congruence class of $n^{p-1} - 1$ mod $p^2$, and it therefore determines the congruence class of $(n^{p-1} - 1)/p$ mod $p$. So, $q_p$ defines a function $(\mathbb{Z}/p^2\mathbb{Z})^\times \to \mathbb{Z}/p\mathbb{Z}$. It’s a pleasant exercise to show that it’s a homomorphism. In other words, $q_p$ has the log-like property

$$q_p(mn) = q_p(m) + q_p(n)$$

for all integers $m, n$ not divisible by $p$.
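Since $q_p(n)$ depends only on $n$ mod $p^2$, it can be computed by modular exponentiation mod $p^2$. Here is a quick numerical sketch of this (my own illustration, not from the post), checking the log-like property for a small prime:

```python
def fermat_quotient(n, p):
    """q_p(n) = (n^(p-1) - 1)/p mod p, for n not divisible by p.
    Working mod p^2 suffices, since q_p(n) depends only on n mod p^2."""
    assert n % p != 0
    return ((pow(n, p - 1, p * p) - 1) // p) % p

p = 7
# log-like property: q_p(mn) = q_p(m) + q_p(n) for m, n coprime to p
for m in range(1, 50):
    for n in range(1, 50):
        if m % p != 0 and n % p != 0:
            assert fermat_quotient(m * n, p) == \
                (fermat_quotient(m, p) + fermat_quotient(n, p)) % p
```

The division by `p` is exact because Fermat’s little theorem makes $n^{p-1} - 1$ divisible by $p$.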

In fact, it’s essentially unique as such: any other homomorphism $(\mathbb{Z}/p^2\mathbb{Z})^\times \to \mathbb{Z}/p\mathbb{Z}$ is a scalar multiple of $q_p$. (This follows from the classical theorem that the group $(\mathbb{Z}/p^2\mathbb{Z})^\times$ is cyclic.) It’s just like the fact that up to a scalar multiple, the real logarithm is the unique measurable function $\log : (0, \infty) \to \mathbb{R}$ such that $\log(xy) = \log x + \log y$, but here there’s nothing like measurability complicating things.

So: $q_p$ functions as a kind of logarithm. Given a mod $p$ probability distribution $\pi \in \Pi_n$, we might therefore guess that the right definition of its entropy is

$$-\sum_{i : \pi_i \neq 0} \pi_i q_p(a_i),$$

where $a_i$ is an integer representing $\pi_i \in \mathbb{Z}/p\mathbb{Z}$.

However, this doesn’t work: it depends on the choice of representatives $a_i$.

To get the right answer, we’ll look at real entropy in a slightly different way. Define $\partial_\mathbb{R} : [0, 1] \to \mathbb{R}$ by

$$\partial_\mathbb{R}(x) = \begin{cases} -x \log x & \text{if } x \neq 0, \\ 0 & \text{if } x = 0. \end{cases}$$

Then $\partial_\mathbb{R}$ has the derivative-like property

$$\partial_\mathbb{R}(xy) = x \partial_\mathbb{R}(y) + \partial_\mathbb{R}(x) y.$$

A *linear* map with this property is called a derivation, so it’s reasonable to call $\partial_\mathbb{R}$ a **nonlinear derivation**.

The observation that $\partial_\mathbb{R}$ is a nonlinear derivation turns out to be quite useful. For instance, real entropy is given by

$$H_\mathbb{R}(\pi) = \sum_{i=1}^n \partial_\mathbb{R}(\pi_i)$$

($\pi \in \Pi_n$), and verifying the chain rule for $H_\mathbb{R}$ is done most neatly using the derivation property of $\partial_\mathbb{R}$.

An equivalent formula for real entropy is

$$H_\mathbb{R}(\pi) = \sum_{i=1}^n \partial_\mathbb{R}(\pi_i) - \partial_\mathbb{R}\biggl( \sum_{i=1}^n \pi_i \biggr).$$

This is a triviality: $\sum \pi_i = 1$, so $\partial_\mathbb{R}\bigl( \sum \pi_i \bigr) = 0$, and this is the same as the previous formula. But it’s also quite suggestive: $H_\mathbb{R}(\pi)$ measures the extent to which the nonlinear derivation $\partial_\mathbb{R}$ fails to preserve the sum $\sum \pi_i$.
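The derivation property and the formula $H_\mathbb{R}(\pi) = \sum \partial_\mathbb{R}(\pi_i)$ are easy to confirm numerically. A small sketch (mine, not from the post):

```python
import math

def d_R(x):
    """The nonlinear derivation ∂_R(x) = -x log x, with ∂_R(0) = 0."""
    return -x * math.log(x) if x != 0 else 0.0

def H_R(pi):
    """Shannon entropy as the sum of ∂_R over the probabilities."""
    return sum(d_R(x) for x in pi)

# derivation property: ∂_R(xy) = x ∂_R(y) + ∂_R(x) y
x, y = 0.3, 0.6
assert abs(d_R(x * y) - (x * d_R(y) + d_R(x) * y)) < 1e-12

# the uniform distribution on n elements has entropy log n
n = 8
assert abs(H_R([1 / n] * n) - math.log(n)) < 1e-12
```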

Now let’s try to imitate this in $\mathbb{Z}/p\mathbb{Z}$. Since $q_p$ plays a similar role to $\log$, it’s natural to define

$$\partial_p(n) = -n q_p(n) = \frac{n - n^p}{p}$$

for integers $n$ not divisible by $p$. But the last expression makes sense even if $n$ *is* divisible by $p$. So, we can define a function

$$\partial_p : \mathbb{Z}/p^2\mathbb{Z} \to \mathbb{Z}/p\mathbb{Z}$$

by $\partial_p(n) = (n - n^p)/p$. (This is called a $p$-derivation.) It’s easy to check that $\partial_p$ has the derivative-like property

$$\partial_p(mn) = m \partial_p(n) + \partial_p(m) n.$$
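Since $(n - n^p)/p$ is an exact integer division for every integer $n$ (by Fermat’s little theorem), $\partial_p$ can be computed directly on integers. A small check of the derivative-like property (my own sketch, not from the post):

```python
def d_p(n, p):
    """The p-derivation ∂_p(n) = (n - n^p)/p, reduced mod p.
    Fermat's little theorem guarantees the division is exact."""
    return ((n - n ** p) // p) % p

p = 5
# derivative-like property: ∂_p(mn) = m ∂_p(n) + ∂_p(m) n  (mod p)
for m in range(-20, 21):
    for n in range(-20, 21):
        assert d_p(m * n, p) == (m * d_p(n, p) + d_p(m, p) * n) % p
```

Unlike $q_p$, this works with no coprimality restriction, which is exactly the point of passing from $q_p$ to $\partial_p$.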

And now we arrive at the long-awaited definition. The **entropy mod $p$** of $\pi = (\pi_1, \ldots, \pi_n)$ is

$$H_p(\pi) = \sum_{i=1}^n \partial_p(a_i) - \partial_p\biggl( \sum_{i=1}^n a_i \biggr),$$

where $a_i \in \mathbb{Z}$ represents $\pi_i \in \mathbb{Z}/p\mathbb{Z}$. This is independent of the choice of representatives $a_i$. And when you work it out explicitly, it gives

$$H_p(\pi) = \frac{1}{p}\biggl( 1 - \sum_{i=1}^n a_i^p \biggr).$$

Just as in the real case, $H_p$ satisfies the chain rule, which is most easily shown using the derivation property of $\partial_p$.
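To illustrate (a sketch of my own, not from the post), here is the explicit formula computed from integer representatives, together with a check that shifting representatives by multiples of $p$ leaves the answer unchanged:

```python
def H_p(a, p):
    """Entropy mod p from integer representatives a_i of the probabilities,
    via the explicit formula H_p = (1 - sum a_i^p)/p mod p."""
    assert sum(a) % p == 1, "probabilities must sum to 1 mod p"
    return ((1 - sum(x ** p for x in a)) // p) % p

p = 7
a = [3, 2, 4, 6]                    # 3 + 2 + 4 + 6 = 15 ≡ 1 (mod 7)
b = [3 + 7, 2 - 14, 4, 6 + 21]      # same distribution, other representatives
assert H_p(a, p) == H_p(b, p)       # independent of the choice of a_i
```

The division is again exact: each $a_i^p \equiv a_i$ mod $p$, so $\sum a_i^p \equiv \sum a_i \equiv 1$ mod $p$.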

Before I say any more, let’s have some examples.

In the real case, the uniform distribution $u_n = (1/n, \ldots, 1/n)$ has entropy $\log n$. Mod $p$, this distribution only makes sense if $p$ does not divide $n$ (otherwise $1/n$ is undefined); but assuming that, we do indeed have $H_p(u_n) = q_p(n)$, as we’d expect.

When we take our prime $p$ to be $2$, a probability distribution $\pi$ is just a sequence of bits like $(0, 0, 1, 0, 1, 1, 1, 0, 1)$ with an odd number of $1$s. Its entropy $H_2(\pi) \in \mathbb{Z}/2\mathbb{Z}$ turns out to be $0$ if the number of $1$s is congruent to $1$ mod $4$, and $1$ if the number of $1$s is congruent to $3$ mod $4$.
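This mod-4 description follows directly from the explicit formula: with $p = 2$ and bits $a_i \in \{0, 1\}$, the sum $\sum a_i^2$ is just the number $k$ of $1$s, so $H_2 = (1 - k)/2$ mod $2$. A sketch (mine, not from the post):

```python
def H_2(bits):
    """Entropy mod 2, specializing H_p = (1 - sum a_i^p)/p mod p to p = 2;
    for 0/1 bits, the sum of squares is just the number k of 1s."""
    k = sum(bits)
    assert k % 2 == 1, "needs an odd number of 1s (sum ≡ 1 mod 2)"
    return ((1 - k) // 2) % 2

assert H_2((0, 0, 1, 0, 1, 1, 1, 0, 1)) == 0   # five 1s: 5 ≡ 1 (mod 4)
assert H_2((1, 1, 1)) == 1                     # three 1s: 3 ≡ 3 (mod 4)
```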

What about distributions on two elements? In other words, let $\alpha \in \mathbb{Z}/p\mathbb{Z}$ and put $\pi = (\alpha, 1 - \alpha)$. What is $H_p(\pi)$?

It takes a bit of algebra to figure this out, but it’s not too hard, and the outcome is that for $p \neq 2$,

$$H_p(\alpha, 1 - \alpha) = \sum_{r=1}^{p-1} \frac{\alpha^r}{r}.$$

This function was, in fact, the starting point of Kontsevich’s note, and it’s what he called the $1\tfrac{1}{2}$-logarithm.
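As a numerical check (my own, not from the post), one can compare the explicit formula for $H_p(\alpha, 1 - \alpha)$ with Kontsevich’s $1\tfrac{1}{2}$-logarithm, computing $1/r$ in $\mathbb{Z}/p\mathbb{Z}$ as $r^{p-2}$:

```python
def H_p(a, p):
    """Entropy mod p from integer representatives summing to 1 mod p."""
    return ((1 - sum(x ** p for x in a)) // p) % p

def one_and_a_half_log(alpha, p):
    """Kontsevich's sum_{r=1}^{p-1} alpha^r / r in Z/pZ, with 1/r = r^(p-2)."""
    return sum(pow(alpha, r, p) * pow(r, p - 2, p) for r in range(1, p)) % p

p = 11
for alpha in range(p):
    assert H_p([alpha, 1 - alpha], p) == one_and_a_half_log(alpha, p)
```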

We’ve now succeeded in finding a definition of entropy mod $p$ that satisfies the chain rule. That’s not quite enough, though. In principle, there could be *loads* of things satisfying the chain rule, in which case, what special status would ours have?

But in fact, up to the inevitable constant factor, our definition of entropy mod $p$ is the *one and only* definition satisfying the chain rule:

**Theorem.** Let $(I : \Pi_n \to \mathbb{Z}/p\mathbb{Z})$ be a sequence of functions. Then $I$ satisfies the chain rule if and only if $I = c H_p$ for some $c \in \mathbb{Z}/p\mathbb{Z}$.

This is precisely analogous to the characterization theorem for real entropy, except that in the real case some analytic condition on $I$ has to be imposed (continuity in Faddeev’s theorem, and measurability in the stronger theorem of Lee). So, this is excellent justification for calling $H_p$ the entropy mod $p$.

I’ll say nothing about the proof except the following. In Faddeev’s theorem over $\mathbb{R}$, the tricky part of the proof involves the fact that the sequence $(\log n)_{n \geq 1}$ is *not* uniquely characterized up to a constant factor by the equation $\log(mn) = \log m + \log n$; to make that work, you have to introduce some analytic condition. Over $\mathbb{Z}/p\mathbb{Z}$, the tricky part involves the fact that the domain of the “logarithm” (Fermat quotient) is not $\mathbb{Z}/p\mathbb{Z}$, but $\mathbb{Z}/p^2\mathbb{Z}$. So, analytic difficulties are replaced by number-theoretic difficulties.

Kontsevich didn’t actually write down a definition of entropy mod $p$ in his two-and-a-half-page note. He did exactly enough to show that there must be a unique sensible such definition… and left it there! Of course he could have worked it out if he’d wanted to, and maybe he even did, but he didn’t write it up there.

Anyway, let’s return to the quotation from Kontsevich that I began my first post with:

**Conclusion:** If we have a random variable $\xi$ which takes finitely many values with all probabilities in $\mathbb{Q}$ then we can define not only the transcendental number $H(\xi)$ but also its “residues modulo $p$” for almost all primes $p$!

In the notation of these posts, he’s saying the following. Let

$$\pi = (\pi_1, \ldots, \pi_n)$$

be a real probability distribution in which each $\pi_i$ is rational. There are only finitely many primes that divide one or more of the denominators of $\pi_1, \ldots, \pi_n$. For primes $p$ *not* belonging to this exceptional set, we can interpret $\pi$ as a probability distribution in $\mathbb{Z}/p\mathbb{Z}$. We can therefore take its mod $p$ entropy, $H_p(\pi)$.

Kontsevich is playfully suggesting that we view $H_p(\pi) \in \mathbb{Z}/p\mathbb{Z}$ as the residue class mod $p$ of $H_\mathbb{R}(\pi) \in \mathbb{R}$.

There is more to this than meets the eye! Different real probability distributions can have the same real entropy, so there’s a question of consistency. Kontsevich’s suggestion only makes sense if

$$H_\mathbb{R}(\pi) = H_\mathbb{R}(\pi') \implies H_p(\pi) = H_p(\pi').$$

And this is true! I have a proof, though I’m not convinced it’s optimal. Does anyone see an easy argument for this?

Let’s write $\mathcal{H}^{(p)}$ for the set of real numbers of the form $H_\mathbb{R}(\pi)$, where $\pi$ is a real probability distribution whose probabilities $\pi_i$ can all be expressed as fractions with denominator not divisible by $p$. We’ve just seen that there’s a well-defined map

$$[.] : \mathcal{H}^{(p)} \to \mathbb{Z}/p\mathbb{Z}$$

defined by

$$[H_\mathbb{R}(\pi)] = H_p(\pi).$$

For $x \in \mathcal{H}^{(p)} \subseteq \mathbb{R}$, we view $[x]$ as the congruence class mod $p$ of $x$. This notion of “congruence class” even behaves something like the ordinary notion, in the sense that $[.]$ preserves addition.

(We can even go a bit further. Accompanying the characterization theorem for entropy mod $p$, there is a characterization theorem for information loss mod $p$, strictly analogous to the theorem that John Baez, Tobias Fritz and I proved over $\mathbb{R}$. I won’t review that stuff here, but the point is that an information loss is a *difference* of entropies, and this enables us to define the congruence class mod $p$ of the *difference* of two elements of $\mathcal{H}^{(p)}$. The same additivity holds.)

There’s just one more thing. In a way, the definition of entropy mod $p$ is unsatisfactory. In order to define it, we had to step outside the world of $\mathbb{Z}/p\mathbb{Z}$ by making arbitrary choices of representing integers, and then we had to show that the definition was independent of those choices. Can’t we do it directly?

In fact, we can. It’s a well-known miracle about finite fields $K$ that *any* function $K \to K$ is a polynomial. It’s a slightly less well-known miracle that any function $K^n \to K$, for any $n \geq 0$, is also a polynomial.

Of course, multiple polynomials can induce the same function. For instance, the polynomials $x^p$ and $x$ induce the same function $\mathbb{Z}/p\mathbb{Z} \to \mathbb{Z}/p\mathbb{Z}$. But it’s still possible to make a uniqueness statement. Given a function $F : K^n \to K$, there’s a *unique* polynomial $f \in K[x_1, \ldots, x_n]$ that induces $F$ and is of degree less than the order of $K$ in each variable separately.
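To make the uniqueness statement concrete, here is a minimal Python sketch of my own (not from the post): over $K = \mathbb{Z}/p\mathbb{Z}$, Lagrange interpolation constructs the unique polynomial of degree less than $p$ inducing a given function, and the function induced by $x^p$ interpolates back to the polynomial $x$.

```python
# A sketch (mine, not the post's): over Z/pZ, Lagrange interpolation yields
# the unique polynomial of degree < p inducing a given function K -> K.

def interpolate(values, p):
    """Coefficients (constant term first) of the unique polynomial of
    degree < p with poly(a) = values[a] for each a in Z/pZ."""
    coeffs = [0] * p
    for a in range(p):
        basis, denom = [1], 1                  # build prod_{b != a} (x - b)
        for b in range(p):
            if b == a:
                continue
            new = [0] * (len(basis) + 1)
            for i, c in enumerate(basis):
                new[i] = (new[i] - b * c) % p  # multiply basis by (x - b)
                new[i + 1] = (new[i + 1] + c) % p
            basis = new
            denom = denom * (a - b) % p
        scale = values[a] * pow(denom, p - 2, p) % p  # 1/denom via Fermat
        for i, c in enumerate(basis):
            coeffs[i] = (coeffs[i] + scale * c) % p
    return coeffs

# x^p and x induce the same function Z/pZ -> Z/pZ; the canonical
# representative of degree < p is the polynomial x itself.
p = 5
assert interpolate([pow(a, p, p) for a in range(p)], p) == [0, 1, 0, 0, 0]
```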

So, there must be a polynomial representing entropy, of degree less than $p$ in each variable. And as it turns out, it’s this one:

$$H_p(\pi_1, \ldots, \pi_n) = - \sum_{\substack{0 \leq r_1, \ldots, r_n < p \\ r_1 + \cdots + r_n = p}} \frac{\pi_1^{r_1} \cdots \pi_n^{r_n}}{r_1! \cdots r_n!}.$$

You can check that when $n = 2$, this is in fact the same polynomial $\sum_{r=1}^{p-1} \pi_1^r/r$ as we met before — Kontsevich’s $1\tfrac{1}{2}$-logarithm.
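Here is a small computational check of my own (not from the post): it evaluates the multinomial formula for $H_p$ in $\mathbb{Z}/p\mathbb{Z}$, with inverses taken via Fermat's little theorem, and confirms that for $n = 2$ it agrees with Kontsevich's $1\tfrac{1}{2}$-logarithm.

```python
# A check (mine, not the post's) that the multinomial formula for H_p
# agrees, for n = 2, with sum_{r=1}^{p-1} pi_1^r / r in Z/pZ.
from itertools import product
from math import factorial

def H_p(pi, p):
    """Entropy mod p of a tuple of residues pi with sum(pi) = 1 mod p."""
    total = 0
    for rs in product(range(p), repeat=len(pi)):
        if sum(rs) != p:
            continue
        num = 1
        for x, r in zip(pi, rs):
            num = num * pow(x, r, p) % p
        den = 1
        for r in rs:
            den = den * factorial(r) % p
        total = (total + num * pow(den, p - 2, p)) % p  # 1/den via Fermat
    return -total % p

def kontsevich(a, p):
    """Kontsevich's 1 1/2-logarithm: sum_{r=1}^{p-1} a^r / r mod p."""
    return sum(pow(a, r, p) * pow(r, p - 2, p) for r in range(1, p)) % p

for p in (3, 5, 7):
    for a in range(p):
        assert H_p((a, (1 - a) % p), p) == kontsevich(a, p)
```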

It’s striking that this direct formula for entropy modulo a prime looks quite unlike the formula for real entropy,

$$H_\mathbb{R}(\pi) = - \sum_{i : \pi_i \neq 0} \pi_i \log \pi_i.$$

It’s also striking that in the case $n = 2$, the formula for real entropy is

$$H_\mathbb{R}(\alpha, 1 - \alpha) = - \alpha \log \alpha - (1 - \alpha) \log(1 - \alpha),$$

whereas mod $p$, we get

$$H_p(\alpha, 1 - \alpha) = \sum_{r=1}^{p-1} \frac{\alpha^r}{r},$$

which is a truncation of the Taylor series of $-\log(1 - \alpha)$. And yet, the characterization theorems for entropy over $\mathbb{R}$ and over $\mathbb{Z}/p\mathbb{Z}$ are strictly analogous.
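The real-side half of this analogy is easy to check numerically; a quick sketch of mine (not from the post):

```python
# Over R, the partial sums of sum_{r>=1} alpha^r / r converge to
# -log(1 - alpha); the mod-p entropy H_p(alpha, 1 - alpha) has the
# shape of this same sum truncated at r = p - 1.
from math import log, isclose

alpha = 0.3
partial = sum(alpha ** r / r for r in range(1, 60))  # 59 terms suffice here
assert isclose(partial, -log(1 - alpha), rel_tol=1e-12)
```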

As I see it, there are two or three big open questions:

Entropy over $\mathbb{R}$ can be understood, interpreted and applied in many ways. How can we understand, interpret or apply entropy mod $p$?

Entropy over $\mathbb{R}$ and entropy mod $p$ are defined in roughly analogous ways, and uniquely characterized by strictly analogous theorems. Is there a common generalization? That is, can we unify the two definitions and characterization theorems, perhaps proving a theorem about entropy over suitable fields?

by leinster (Tom.Leinster@gmx.com) at December 17, 2017 09:10 PM

### Peter Coles - In the Dark

This is what a University education meant to the poet and theologian Thomas Traherne (1636-1674), according to his *Centuries of Meditations*. In this astonishing book describing his own voyage of spiritual discovery, Traherne celebrates, among many other things, the beauty and complexity of creation as a manifestation of the power of God. Even a non-religious person like myself can find much to appreciate in his words about the wonder of the natural world and the joy of learning for learning’s sake:

*Having been at the University, and received there the taste and tincture of another education, I saw that there were things in this world of which I never dreamed; glorious secrets, and glorious persons past imagination.*

*There I saw that Logic, Ethics, Physics, Metaphysics, Geometry, Astronomy, Poesy, Medicine, Grammar, Music, Rhetoric, all kinds of Arts, Trades, and Mechanisms that adorned the world pertained to felicity; at least there I saw those things, which afterwards I knew to pertain unto it: and was delighted in it.*

*There I saw into the nature of the Sea, the Heavens, the Sun, the Moon and Stars, the Elements, Minerals, and Vegetables. All which appeared like the King’s Daughter, all glorious within; and those things which my nurses, and parents, should have talked of there were taught unto me.*

## December 16, 2017

### Christian P. Robert - xi'an's og

Thanks to Victor Elvira, I read this fantastic novel by Ramón Sender, a requiem for a Spanish peasant, which tells the story of Pablo, a bright and progressive peasant from Aragon who got shot by the fascists during the Spanish Civil War. The story is short and brilliant, told through the eyes of the parish priest who denounced Pablo to the Franco falangists who eventually executed him. The style is brilliant as well, since the priest keeps returning to his long connection with Pablo, from his years as an altar boy, discovering poverty and injustice when visiting dying parishioners with the priest, to launching rural reform actions against the local landowners. And uselessly, if understandably, trying to justify his own responsibility in the death of the young man, celebrating a mass in his memory that no one from the village attends, except for the landowners themselves. A truly moving evocation of the Spanish Civil War and of the massive support of the Catholic Church for Franco.

Filed under: Books, pictures, Travel Tagged: Aragon, book reviews, Catholic Church, Franco, Ramón Sender, Spain, Spanish Civil War, Spanish history

### Christian P. Robert - xi'an's og

**As** Kristian Lum’s courageous posting of her harrowing experience at ISBA 2010, and of her resulting decision to leave academia, if thankfully not research (as demonstrated by her recent work on the biases in policing software), is hitting the Bayesian community and beyond as a salutary tsunami, I am seeking concrete actions to change ISBA meetings towards preventing sexual harassment to the largest extent possible and helping victims formally as well as informally, as Dan Simpson put it in his blog post. Having discussed the matter intensely with colleagues and friends over the past days, and joined a Task Force set up on Dec 14 by Kerrie Mengersen in her capacity as President of ISBA, I see many avenues in the medium and long term to approach these goals. But I feel the most urgent action is to introduce contact referents (for lack of a better name outside the military or the religious…) who could be reached at all times during each conference, whether in case of need or to report inappropriate conduct of any kind. This may prove difficult to set up, not because of a lack of volunteers but because of the difficulty of representing all attendees well enough that each of them trusts at least one referent enough to reach out and confide. One section of ISBA, j-ISBA, can and definitely does help in this regard, including through its involvement in the Task Force, but we need to reach further. As put by Kerrie in her statement, your input is valued.

Filed under: University life Tagged: ISBA, Kerrie Mengersen, safeisba@bayesian.org, sexual harassment, Spain, Valencia conferences

### The n-Category Cafe

Here’s a draft of a little thing I’m writing for the *Newsletter of the London Mathematical Society*. The regular icosahedron is connected to many ‘exceptional objects’ in mathematics, and here I describe two ways of using it to construct $\mathrm{E}_8$. One uses a subring of the quaternions called the ‘icosians’, while the other uses Du Val’s work on the resolution of Kleinian singularities. I leave it as a challenge to find the connection between these two constructions!

(Dedicated readers of this blog may recall that I was struggling with the second construction in July. David Speyer helped me a lot, but I got distracted by other work and the discussion fizzled. Now I’ve made more progress… but I’ve realized that the details would never fit in the *Newsletter*, so I’m afraid anyone interested will have to wait a bit longer.)

You can get a PDF version here:

• From the icosahedron to E_{8}.

But blogs are more fun.

### From the Icosahedron to E_{8}

In mathematics, every sufficiently beautiful object is connected to all others. Many exciting adventures, of various levels of difficulty, can be had by following these connections.
Take, for example, the icosahedron — that is, the *regular* icosahedron, one of the five Platonic solids. Starting from this it is just a hop, skip and a jump to the $\mathrm{E}_8$ lattice, a wonderful pattern of points in 8 dimensions! As we explore this connection we shall see that it also ties together many other remarkable entities: the golden ratio, the quaternions, the quintic equation, a highly symmetrical 4-dimensional shape called the 600-cell, and a manifold called the Poincaré homology 3-sphere.

Indeed, the main problem with these adventures is knowing where to stop. The story we shall tell is just a snippet of a longer one involving the McKay correspondence and quiver representations. It would be easy to bring in the octonions, exceptional Lie groups, and more. But it can be enjoyed without these digressions, so let us introduce the protagonists without further ado.

The icosahedron has a long history. According to a comment in Euclid’s *Elements* it was discovered by Plato’s friend Theaetetus, a geometer who lived from roughly 415 to 369 BC. Since Theaetetus is believed to have classified the Platonic solids, he may have found the icosahedron as part of this project. If so, it is one of the earliest mathematical objects discovered as part of a classification theorem. In any event, it was known to Plato: in his *Timaeus*, he argued that water comes in atoms of this shape.

The icosahedron has 20 triangular faces, 30 edges, and 12 vertices. We can take the vertices to be the four points

$$(0, \pm 1, \pm \Phi)$$

and all those obtained from these by cyclic permutations of the coordinates, where

$$\Phi = \frac{\sqrt{5} + 1}{2}$$

is the golden ratio. Thus, we can group the vertices into three orthogonal **golden rectangles**: rectangles whose proportions are $\Phi$ to 1.

In fact, there are five ways to do this. The rotational symmetries of the icosahedron permute these five ways, and any nontrivial rotation gives a nontrivial permutation. The rotational symmetry group of the icosahedron is thus a subgroup of $\mathrm{S}_5$. Moreover, this subgroup has 60 elements. After all, any rotation is determined by what it does to a chosen face of the icosahedron: it can map this face to any of the 20 faces, and it can do so in 3 ways. The rotational symmetry group of the icosahedron is therefore a 60-element subgroup of $\mathrm{S}_5$. Group theory therefore tells us that it must be the alternating group $\mathrm{A}_5$.
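These counts are easy to verify mechanically. A sketch of mine (not part of the article): build the 12 vertices from $(0, \pm 1, \pm \Phi)$ and its cyclic permutations, and recover the 30 edges as the pairs at distance 2 (the short side of the golden rectangles).

```python
# A check (mine): 12 vertices, 30 edges, 5 edges per vertex.
from itertools import product, combinations
from math import sqrt, isclose

PHI = (sqrt(5) + 1) / 2  # the golden ratio

# the four points (0, ±1, ±Φ) and their cyclic permutations
verts = []
for s1, s2 in product((1, -1), repeat=2):
    base = (0.0, s1 * 1.0, s2 * PHI)
    for k in range(3):
        verts.append(tuple(base[(i - k) % 3] for i in range(3)))

assert len(verts) == 12

# edges join vertices at distance 2, i.e. squared distance 4
edges = [(u, v) for u, v in combinations(verts, 2)
         if isclose(sum((a - b) ** 2 for a, b in zip(u, v)), 4.0)]
assert len(edges) == 30
```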

The $\mathrm{E}_8$ lattice is harder to visualize than the icosahedron, but still easy to characterize. Take a bunch of equal-sized spheres in 8 dimensions. Get as many of these spheres to touch a single sphere as you possibly can. Then, get as many to touch *those* spheres as you possibly can, and so on. Unlike in 3 dimensions, where there is “wiggle room”, you have no choice about how to proceed, except for an overall rotation and translation. The balls will inevitably be centered at points of the $\mathrm{E}_8$ lattice!

We can also characterize the $\mathrm{E}_8$ lattice as the one giving the densest packing of spheres among all lattices in 8 dimensions. This packing was long suspected to be optimal even among those that do not arise from lattices — but this fact was proved only in 2016, by the young mathematician Maryna Viazovska [V].

We can also describe the $\mathrm{E}_8$ lattice more explicitly. In suitable coordinates, it consists of vectors for which:

• the components are either all integers or all integers plus $\tfrac{1}{2}$, and

• the components sum to an even number.

This lattice consists of all integral linear combinations of the 8 rows of this matrix:

$$\left( \begin{array}{rrrrrrrr} 1 & -1 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & -1 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & -1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & -1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & -1 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 & 1 & 0 \\ -\frac{1}{2} & -\frac{1}{2} & -\frac{1}{2} & -\frac{1}{2} & -\frac{1}{2} & -\frac{1}{2} & -\frac{1}{2} & -\frac{1}{2} \end{array} \right)$$

The inner product of any row vector with itself is 2, while the inner product of distinct row vectors is either 0 or $-1$. Thus, any two of these vectors lie at an angle of either 90° or 120° from each other. If we draw a dot for each vector, and connect two dots by an edge when the angle between their vectors is 120°, we get this pattern:

This is called the **$\mathrm{E}_8$ Dynkin diagram**. In the first part of our story we shall find the $\mathrm{E}_8$ lattice hiding in the icosahedron; in the second part, we shall find this diagram. The two parts of this story must be related — but the relation remains mysterious, at least to me.
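The claims about the rows are mechanical to verify. Here is a short Python sketch of mine (not part of the article) that checks the norms and angles, and also a further well-known property, that $\mathrm{E}_8$ is unimodular, via the determinant of the Gram matrix:

```python
# A check (mine): every row of the stated E8 basis has norm 2, distinct
# rows have inner product 0 or -1, and the Gram determinant is 1.
from fractions import Fraction

half = Fraction(1, 2)
rows = [
    [1, -1, 0, 0, 0, 0, 0, 0],
    [0, 1, -1, 0, 0, 0, 0, 0],
    [0, 0, 1, -1, 0, 0, 0, 0],
    [0, 0, 0, 1, -1, 0, 0, 0],
    [0, 0, 0, 0, 1, -1, 0, 0],
    [0, 0, 0, 0, 0, 1, -1, 0],
    [0, 0, 0, 0, 0, 1, 1, 0],
    [-half] * 8,
]
gram = [[sum(a * b for a, b in zip(u, v)) for v in rows] for u in rows]
assert all(gram[i][i] == 2 for i in range(8))
assert all(gram[i][j] in (0, -1) for i in range(8) for j in range(8) if i != j)

def det(m):
    """Exact determinant by Laplace expansion along the first row."""
    if len(m) == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] * det([r[:j] + r[j + 1:] for r in m[1:]])
               for j in range(len(m)))

assert det(gram) == 1  # unimodular: the fundamental cell has volume 1
```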

### The Icosians

The quickest route from the icosahedron to $\mathrm{E}_8$ goes through the fourth dimension. The symmetries of the icosahedron can be described using certain quaternions; the integer linear combinations of these form a subring of the quaternions called the ‘icosians’. The icosians can be reinterpreted as a lattice in 8 dimensions, and this is the $\mathrm{E}_8$ lattice [CS]. Let us see how this works. The quaternions, discovered by Hamilton, are a 4-dimensional algebra

$$\mathbb{H} = \{ a + b i + c j + d k : \; a, b, c, d \in \mathbb{R} \}$$

with multiplication given as follows:

$$i^2 = j^2 = k^2 = -1,$$

$$i j = k = -j i \quad \text{and cyclic permutations.}$$

It is a normed division algebra, meaning that the norm

$$|a + b i + c j + d k| = \sqrt{a^2 + b^2 + c^2 + d^2}$$

obeys

$$|q q'| = |q| \, |q'|$$

for all $q, q' \in \mathbb{H}$. The unit sphere in $\mathbb{H}$ is thus a group, often called $\mathrm{SU}(2)$ because its elements can be identified with $2 \times 2$ unitary matrices with determinant 1. This group acts as rotations of 3-dimensional Euclidean space, since we can see any point in $\mathbb{R}^3$ as a **purely imaginary** quaternion $x = b i + c j + d k$, and the quaternion $q x q^{-1}$ is then purely imaginary for any $q \in \mathrm{SU}(2)$. Indeed, this action gives a double cover

$$\alpha \colon \mathrm{SU}(2) \to \mathrm{SO}(3)$$

where $\mathrm{SO}(3)$ is the group of rotations of $\mathbb{R}^3$.
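Hamilton’s multiplication table and the norm identity can be checked directly; a minimal Python sketch of mine (not from the article), working with integer quaternions so the check is exact:

```python
# A check (mine) of quaternion multiplication and |qq'| = |q||q'|,
# done with squared norms so all arithmetic stays in the integers.

def qmul(q, r):
    """Product of quaternions given as (a, b, c, d) = a + bi + cj + dk."""
    a, b, c, d = q
    w, x, y, z = r
    return (a*w - b*x - c*y - d*z,
            a*x + b*w + c*z - d*y,
            a*y - b*z + c*w + d*x,
            a*z + b*y - c*x + d*w)

def norm2(q):
    """Squared quaternionic norm a^2 + b^2 + c^2 + d^2."""
    return sum(t * t for t in q)

i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)
assert qmul(i, j) == k and qmul(j, i) == (0, 0, 0, -1)  # ij = k = -ji

q, r = (1, 2, 3, 4), (5, -6, 7, -8)
assert norm2(qmul(q, r)) == norm2(q) * norm2(r)          # the norm identity
```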

We can thus take any Platonic solid, look at its group of rotational symmetries, get a subgroup of $\mathrm{SO}(3)$, and take its double cover in $\mathrm{SU}(2)$. If we do this starting with the icosahedron, we see that the 60-element group $\mathrm{A}_5 \subset \mathrm{SO}(3)$ is covered by a 120-element group $\Gamma \subset \mathrm{SU}(2)$, called the **binary icosahedral group**.

The elements of $\Gamma$ are quaternions of norm one, and it turns out that they are the vertices of a 4-dimensional regular polytope: a 4-dimensional cousin of the Platonic solids. It deserves to be called the ‘hypericosahedron’, but it is usually called the 600-cell, since it has 600 tetrahedral faces. Here is the 600-cell projected down to 3 dimensions, drawn using Robert Webb’s Stella software:

Explicitly, if we identify $\mathbb{H}$ with $\mathbb{R}^4$, the elements of $\Gamma$ are the points

$$\left( \pm \tfrac{1}{2}, \pm \tfrac{1}{2}, \pm \tfrac{1}{2}, \pm \tfrac{1}{2} \right),$$

$$(\pm 1, 0, 0, 0),$$

$$\tfrac{1}{2} \left( \pm \Phi, \pm 1, \pm 1/\Phi, 0 \right),$$

and those obtained from these by even permutations of the coordinates. Since these points are closed under multiplication, if we take integral linear combinations of them we get a subring of the quaternions:

$$\mathbb{I} = \Big\{ \sum_{q \in \Gamma} a_q q : \; a_q \in \mathbb{Z} \Big\} \subset \mathbb{H}.$$
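The closure claim can be tested by brute force. A Python sketch of mine (not from the article) builds the 120 points listed above, keyed by rounded coordinates to sidestep floating-point noise, and multiplies them all pairwise:

```python
# A check (mine): the 120 unit icosians are closed under multiplication.
from itertools import permutations, product
from math import sqrt

PHI = (sqrt(5) + 1) / 2

def qmul(q, r):
    a, b, c, d = q
    w, x, y, z = r
    return (a*w - b*x - c*y - d*z, a*x + b*w + c*z - d*y,
            a*y - b*z + c*w + d*x, a*z + b*y - c*x + d*w)

def key(q):
    # rounded coordinates as a hashable fingerprint of a group element
    return tuple(round(t, 6) for t in q)

def parity(p):
    return sum(p[i] > p[j] for i in range(4) for j in range(i + 1, 4)) % 2

gamma = {}
for signs in product((1, -1), repeat=4):              # (±1/2, ±1/2, ±1/2, ±1/2)
    q = tuple(s / 2 for s in signs)
    gamma[key(q)] = q
for i, s in product(range(4), (1, -1)):               # (±1, 0, 0, 0) and friends
    q = tuple(float(s) if n == i else 0.0 for n in range(4))
    gamma[key(q)] = q
for s1, s2, s3 in product((1, -1), repeat=3):         # even perms of ½(±Φ, ±1, ±1/Φ, 0)
    base = (s1 * PHI / 2, s2 * 0.5, s3 / (2 * PHI), 0.0)
    for p in permutations(range(4)):
        if parity(p) == 0:
            q = tuple(base[n] for n in p)
            gamma[key(q)] = q

assert len(gamma) == 120
# closed under multiplication: the unit icosians form a group of order 120
assert all(key(qmul(q, r)) in gamma for q in gamma.values() for r in gamma.values())
```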

Conway and Sloane [CS] call this the ring of **icosians**. The icosians are not a lattice in the quaternions: they are dense. However, any icosian is of the form $a + b i + c j + d k$ where $a, b, c$, and $d$ live in the **golden field**

$$\mathbb{Q}(\sqrt{5}) = \{ x + \sqrt{5} \, y : \; x, y \in \mathbb{Q} \}.$$

Thus we can think of an icosian as an 8-tuple of rational numbers. Such 8-tuples form a lattice in 8 dimensions.

In fact we can put a norm on the icosians as follows. For $q \in \mathbb{I}$ the usual quaternionic norm has

$$|q|^2 = x + \sqrt{5} \, y$$

for some rational numbers $x$ and $y$, but we can define a new norm on $\mathbb{I}$ by setting

$$\|q\|^2 = x + y.$$

With respect to this new norm, the icosians form a lattice that fits isometrically in 8-dimensional Euclidean space. And this is none other than $\mathrm{E}_8$!

### Klein’s Icosahedral Function

Not only is the $\mathrm{E}_8$ lattice hiding in the icosahedron; so is the $\mathrm{E}_8$ Dynkin diagram. The space of all regular icosahedra of arbitrary size centered at the origin has a singularity, which corresponds to a degenerate special case: the icosahedron of zero size. If we resolve this singularity in a minimal way we get eight Riemann spheres, intersecting in a pattern described by the $\mathrm{E}_8$ Dynkin diagram!

This remarkable story starts around 1884 with Felix Klein’s *Lectures on the Icosahedron* [Kl]. In this work he inscribed an icosahedron in the Riemann sphere, $\mathbb{C}\mathrm{P}^1$. He thus got the icosahedron’s symmetry group, $\mathrm{A}_5$, to act as conformal transformations of $\mathbb{C}\mathrm{P}^1$ — indeed, rotations. He then found a rational function of one complex variable that is invariant under all these transformations. This function equals $0$ at the centers of the icosahedron’s faces, $1$ at the midpoints of its edges, and $\infty$ at its vertices.

Here is Klein’s icosahedral function as drawn by Abdelaziz Nait Merzouk. The color shows its phase, while the contour lines show its magnitude:

We can think of Klein’s icosahedral function as a branched cover of the Riemann sphere by itself with 60 sheets:

$$ \mathcal{I} \colon \mathbb{C}\mathrm{P}^1 \to \mathbb{C}\mathrm{P}^1 . $$

Indeed, $\mathrm{A}_5$ acts on $\mathbb{C}\mathrm{P}^1$, and the quotient space $\mathbb{C}\mathrm{P}^1/\mathrm{A}_5$ is isomorphic to $\mathbb{C}\mathrm{P}^1$ again. The function $\mathcal{I}$ gives an explicit formula for the quotient map $\mathbb{C}\mathrm{P}^1 \to \mathbb{C}\mathrm{P}^1/\mathrm{A}_5 \cong \mathbb{C}\mathrm{P}^1$.

Klein managed to reduce solving the quintic to the problem of solving the equation $\mathcal{I}(z) = w$ for $z$. A modern exposition of this result is Shurman’s *Geometry of the Quintic* [Sh]. For a more high-powered approach, see the paper by Nash [N]. Unfortunately, neither of these treatments avoids complicated calculations. But our interest in Klein’s icosahedral function here does not come from its connection to the quintic: instead, we want to see its connection to $\mathrm{E}_8$.

For this we should actually construct Klein’s icosahedral function. To do this, recall that the Riemann sphere $\mathbb{C}\mathrm{P}^1$ is the space of 1-dimensional linear subspaces of $\mathbb{C}^2$. Let us work directly with $\mathbb{C}^2$. While $\mathrm{SO}(3)$ acts on $\mathbb{C}\mathrm{P}^1$, this comes from an action of this group’s double cover $\mathrm{SU}(2)$ on $\mathbb{C}^2$. As we have seen, the rotational symmetry group of the icosahedron, $\mathrm{A}_5 \subset \mathrm{SO}(3)$, is double covered by the binary icosahedral group $\Gamma \subset \mathrm{SU}(2)$. To build an $\mathrm{A}_5$-invariant rational function on $\mathbb{C}\mathrm{P}^1$, we should thus look for $\Gamma$-invariant homogeneous polynomials on $\mathbb{C}^2$.

It is easy to construct three such polynomials:

• $V$, of degree $12$, vanishing on the 1d subspaces corresponding to icosahedron vertices.

• $E$, of degree $30$, vanishing on the 1d subspaces corresponding to icosahedron edge midpoints.

• $F$, of degree $20$, vanishing on the 1d subspaces corresponding to icosahedron face centers.

Remember, we have embedded the icosahedron in $\mathbb{C}\mathrm{P}^1$, and each point in $\mathbb{C}\mathrm{P}^1$ is a 1-dimensional subspace of $\mathbb{C}^2$, so each icosahedron vertex determines such a subspace, and there is a linear function on $\mathbb{C}^2$, unique up to a constant factor, that vanishes on this subspace. The icosahedron has $12$ vertices, so we get $12$ linear functions this way. Multiplying them gives $V$, a homogeneous polynomial of degree $12$ on $\mathbb{C}^2$ that vanishes on all the subspaces corresponding to icosahedron vertices! The same trick gives $E$, which has degree $30$ because the icosahedron has $30$ edges, and $F$, which has degree $20$ because the icosahedron has $20$ faces.

A bit of work is required to check that $V, E$ and $F$ are invariant under $\Gamma$, instead of changing by constant factors under group transformations. Indeed, if we had copied this construction using a tetrahedron or octahedron, this would not be the case. For details, see Shurman’s book [Sh], which is free online, or van Hoboken’s nice thesis [VH].

Since both $F^3$ and $V^5$ have degree $60$, $F^3/V^5$ is homogeneous of degree zero, so it defines a rational function $\mathcal{I} \colon \mathbb{C}\mathrm{P}^1 \to \mathbb{C}\mathrm{P}^1$. This function is invariant under $\mathrm{A}_5$ because $F$ and $V$ are invariant under $\Gamma$. Since $F$ vanishes at face centers of the icosahedron while $V$ vanishes at vertices, $\mathcal{I} = F^3/V^5$ equals $0$ at face centers and $\infty$ at vertices. Finally, thanks to its invariance property, $\mathcal{I}$ takes the same value at every edge center, so we can normalize $V$ or $F$ to make this value $1$. Thus, $\mathcal{I}$ has precisely the properties required of Klein’s icosahedral function!

### The Appearance of $\mathrm{E}_8$

Now comes the really interesting part. Three polynomials on a 2-dimensional space must obey a relation, and $V, E$, and $F$ obey a very pretty one, at least after we normalize them correctly:

$$ V^5 + E^2 + F^3 = 0. $$

We could guess this relation simply by noting that each term must have the same degree. Every $\Gamma$-invariant polynomial on $\mathbb{C}^2$ is a polynomial in $V, E$ and $F$, and indeed

$$ \mathbb{C}^2/\Gamma \cong \{ (V,E,F) \in \mathbb{C}^3 \colon V^5 + E^2 + F^3 = 0 \}. $$

This complex surface is smooth except at $V = E = F = 0$, where it has a singularity. And hiding in this singularity is $\mathrm{E}_8$!
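The relation can be seen concretely. In classical coordinates $(z, w)$ on $\mathbb{C}^2$, suitably scaled versions of the three invariants are the well-known Klein forms $f$ (degree 12), $H$ (degree 20) and $T$ (degree 30), which satisfy the syzygy $T^2 + H^3 = 1728\, f^5$; absorbing complex constants into $V, E, F$ turns this into $V^5 + E^2 + F^3 = 0$. Here is a quick Python check of the identity at a few integer points (my verification sketch, not part of the original article):

```python
# Klein's classical icosahedral forms in homogeneous coordinates (z, w).
# f vanishes at the 12 vertices, H at the 20 face centers, T at the 30 edge midpoints.
def f(z, w):
    return z*w*(z**10 + 11*z**5*w**5 - w**10)                                     # degree 12

def H(z, w):
    return -(z**20 + w**20) + 228*(z**15*w**5 - z**5*w**15) - 494*z**10*w**10     # degree 20

def T(z, w):
    return (z**30 + w**30) + 522*(z**25*w**5 - z**5*w**25) \
           - 10005*(z**20*w**10 + z**10*w**20)                                    # degree 30

# The classical syzygy T^2 + H^3 = 1728 f^5, checked exactly at integer points.
for z, w in [(1, 0), (0, 1), (1, 1), (1, -1), (2, 1)]:
    assert T(z, w)**2 + H(z, w)**3 == 1728 * f(z, w)**5
```

In this normalization Klein’s function itself is $\mathcal{I} = H^3/(1728 f^5)$: it is $0$ at face centers (where $H = 0$), $\infty$ at vertices (where $f = 0$), and at edge midpoints $T = 0$, so the syzygy forces $H^3 = 1728 f^5$ and $\mathcal{I} = 1$ there.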

To see this, we need to ‘resolve’ the singularity. Roughly, this means that we find a smooth complex surface $S$ and an onto map

$$ \pi \colon S \to \mathbb{C}^2/\Gamma $$

that is one-to-one away from the singularity. (More precisely, if $X$ is an algebraic variety with singular points $X_{\mathrm{sing}} \subset X$, $\pi \colon S \to X$ is a **resolution** of $X$ if $S$ is smooth, $\pi$ is proper, $\pi^{-1}(X - X_{\mathrm{sing}})$ is dense in $S$, and $\pi$ is an isomorphism between $\pi^{-1}(X - X_{\mathrm{sing}})$ and $X - X_{\mathrm{sing}}$. For more details see Lamotke’s book [L].)

There are many such resolutions, but one **minimal** resolution, meaning that all others factor uniquely through this one:

What sits above the singularity in this minimal resolution? Eight copies of the Riemann sphere $\mathbb{C}\mathrm{P}^1$, one for each dot here:

Two of these $\mathbb{C}\mathrm{P}^1$s intersect in a point if their dots are connected by an edge: otherwise they are disjoint.

This amazing fact was discovered by Patrick Du Val in 1934 [DV]. Why is it true? Alas, there is not enough room in the margin, or even in the entire blog article, to explain this. The books by Kirillov [Ki] and Lamotke [L] fill in the details. But here is a clue. The $\mathrm{E}_8$ Dynkin diagram has ‘legs’ of lengths $5$, $2$ and $3$:

On the other hand,

$$ \mathrm{A}_5 \cong \langle v, e, f \mid v^5 = e^2 = f^3 = v e f = 1 \rangle $$

where in terms of the rotational symmetries of the icosahedron:

• $v$ is a $1/5$ turn around some vertex of the icosahedron,

• $e$ is a $1/2$ turn around the center of an edge touching that vertex,

• $f$ is a $1/3$ turn around the center of a face touching that vertex,

and we must choose the sense of these rotations correctly to obtain $v e f = 1$. To get a presentation of the binary icosahedral group we drop one relation:

$$ \Gamma \cong \langle v, e, f \mid v^5 = e^2 = f^3 = v e f \rangle $$
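The first presentation is easy to realize with explicit even permutations and check by computer. In the sketch below (my own choice of generators, not from the article), $v$ is a 5-cycle, $e$ a product of two transpositions, and $f$ is defined as $(ve)^{-1}$ so that $vef = 1$; the three relations hold, and $v$ and $e$ generate all 60 even permutations of 5 letters:

```python
ID = (0, 1, 2, 3, 4)

def compose(p, q):                 # apply the permutation q first, then p
    return tuple(p[q[i]] for i in range(5))

def inverse(p):
    r = [0] * 5
    for i, x in enumerate(p):
        r[x] = i
    return tuple(r)

def power(p, n):
    r = ID
    for _ in range(n):
        r = compose(p, r)
    return r

v = (1, 2, 3, 4, 0)                # the 5-cycle (0 1 2 3 4)
e = (4, 1, 3, 2, 0)                # (0 4)(2 3), an order-2 element
f = inverse(compose(v, e))         # forces v e f = 1

assert power(v, 5) == ID and power(e, 2) == ID and power(f, 3) == ID
assert compose(compose(v, e), f) == ID

# v and e generate the whole rotation group A5, of order 60.
G = {ID}
while True:
    bigger = G | {compose(g, h) for g in G for h in (v, e)}
    if bigger == G:
        break
    G = bigger
assert len(G) == 60
```

Dropping the relation $vef = 1$ cannot be checked with permutations of 5 letters: in $\Gamma$ the common value $v^5 = e^2 = f^3 = vef$ is the central element of order 2, which is invisible in $\mathrm{A}_5$.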

The dots in the $\mathrm{E}_8$ Dynkin diagram correspond naturally to conjugacy classes in $\Gamma$, not counting the conjugacy class of the central element $-1$. Each of these conjugacy classes, in turn, gives a copy of $\mathbb{C}\mathrm{P}^1$ in the minimal resolution of $\mathbb{C}^2/\Gamma$.

Not only the $\mathrm{E}_8$ Dynkin diagram, but also the $\mathrm{E}_8$ lattice, can be found in the minimal resolution of $\mathbb{C}^2/\Gamma$. Topologically, this space is a 4-dimensional manifold. Its real second homology group is an 8-dimensional vector space with an inner product given by the intersection pairing. The integral second homology is a lattice in this vector space spanned by the 8 copies of $\mathbb{C}\mathrm{P}^1$ we have just seen—and it is a copy of the $\mathrm{E}_8$ lattice [KS].

But let us turn to a more basic question: what is $\mathbb{C}^2/\Gamma$ like as a topological space? To tackle this, first note that we can identify a pair of complex numbers with a single quaternion, and this gives a homeomorphism

$$ \mathbb{C}^2/\Gamma \cong \mathbb{H}/\Gamma $$

where we let $\Gamma$ act by right multiplication on $\mathbb{H}$. So, it suffices to understand $\mathbb{H}/\Gamma$.

Next, note that sitting inside $\mathbb{H}/\Gamma$ are the points coming from the unit sphere in $\mathbb{H}$. These points form the 3-dimensional manifold $\mathrm{SU}(2)/\Gamma$, which is called the **Poincaré homology 3-sphere** [KS]. This is a wonderful thing in its own right: Poincaré discovered it as a counterexample to his guess that any compact 3-manifold with the same homology as a 3-sphere is actually diffeomorphic to the 3-sphere, and it is deeply connected to $\mathrm{E}_8$. But for our purposes, what matters is that we can think of this manifold in another way, since we have a diffeomorphism

$$ \mathrm{SU}(2)/\Gamma \cong \mathrm{SO}(3)/\mathrm{A}_5 . $$

The latter is just *the space of all icosahedra inscribed in the unit sphere in 3d space*, where we count two as the same if they differ by a rotational symmetry.

This is a nice description of the points of $\mathbb{H}/\Gamma$ coming from points in the unit sphere of $\mathbb{H}$. But every quaternion lies in *some* sphere centered at the origin of $\mathbb{H}$, of possibly zero radius. It follows that $\mathbb{C}^2/\Gamma \cong \mathbb{H}/\Gamma$ is the space of *all* icosahedra centered at the origin of 3d space — of arbitrary size, including a degenerate icosahedron of zero size. This degenerate icosahedron is the singular point in $\mathbb{C}^2/\Gamma$. This is where $\mathrm{E}_8$ is hiding.

Clearly much has been left unexplained in this brief account. Most of the missing details can be found in the references. But it remains unknown — at least to me — how the two constructions of $\mathrm{E}_8$ from the icosahedron fit together in a unified picture.

Recall what we did. First we took the binary icosahedral group $\Gamma \subset \mathbb{H}$, took integer linear combinations of its elements, thought of these as forming a lattice in an 8-dimensional rational vector space with a natural norm, and discovered that this lattice is a copy of the $\mathrm{E}_8$ lattice. Then we took $\mathbb{C}^2/\Gamma \cong \mathbb{H}/\Gamma$, took its minimal resolution, and found that the integral 2nd homology of this space, equipped with its natural inner product, is a copy of the $\mathrm{E}_8$ lattice. From the same ingredients we built the same lattice in two very different ways! How are these constructions connected? This puzzle deserves a nice solution.

#### Acknowledgements

I thank Tong Yang for inviting me to speak on this topic at the Annual General Meeting of the Hong Kong Mathematical Society on May 20, 2017, and Guowu Meng for hosting me at the HKUST while I prepared that talk. I also thank the many people, too numerous to accurately list, who have helped me understand these topics over the years.

#### Bibliography

[CS] J. H. Conway and N. J. A. Sloane, *Sphere Packings, Lattices and Groups*, Springer, Berlin, 2013.

[DV] P. du Val, On isolated singularities of surfaces which do not affect the conditions of adjunction, I, II and III, *Proc. Camb. Phil. Soc.* **30** (1934), 453–459, 460–465, 483–491.

[KS] R. Kirby and M. Scharlemann, Eight faces of the Poincaré homology 3-sphere, *Usp. Mat. Nauk.* **37** (1982), 139–159. Available at https://tinyurl.com/ybrn4pjq.

[Ki] A. Kirillov, *Quiver Representations and Quiver Varieties*, AMS, Providence, Rhode Island, 2016.

[Kl] F. Klein, *Lectures on the Ikosahedron and the Solution of Equations of the Fifth Degree*, Trübner & Co., London, 1888. Available at https://archive.org/details/cu31924059413439.

[L] K. Lamotke, *Regular Solids and Isolated Singularities*, Vieweg & Sohn, Braunschweig, 1986.

[N] O. Nash, On Klein’s icosahedral solution of the quintic. Available as arXiv:1308.0955.

[Sh] J. Shurman, *Geometry of the Quintic*, Wiley, New York, 1997. Available at http://people.reed.edu/~jerry/Quintic/quintic.html.

[Sl] P. Slodowy, Platonic solids, Kleinian singularities, and Lie groups, in *Algebraic Geometry*, Lecture Notes in Mathematics **1008**, Springer, Berlin, 1983, pp. 102–138.

[VH] J. van Hoboken, *Platonic Solids, Binary Polyhedral Groups, Kleinian Singularities and Lie Algebras of Type A, D, E*, Master’s Thesis, University of Amsterdam, 2002. Available at http://math.ucr.edu/home/baez/joris_van_hoboken_platonic.pdf.

[V] M. Viazovska, The sphere packing problem in dimension 8, *Ann. Math.* **185** (2017), 991–1015. Available at https://arxiv.org/abs/1603.04246.

### John Baez - Azimuth

I can’t stop thinking about the 600-cell:

It’s a ‘Platonic solid in 4 dimensions’ with 600 tetrahedral faces and 120 vertices. One reason I like it is that you can think of these vertices as forming a *group*: a double cover of the rotational symmetry group of the icosahedron. Another reason is that it’s a halfway house between the icosahedron and the $\mathrm{E}_8$ lattice. I explained all this in my last post here:

I wrote that post as a spinoff of an article I was writing for the *Newsletter of the London Mathematical Society*, which had a deadline attached to it. Now I should be writing something else, for another deadline. But somehow deadlines strongly demotivate me—they make me want to do *anything else*. So I’ve been continuing to think about the 600-cell. I posed some puzzles about it in the comments to my last post, and they led me to some interesting thoughts, which I feel like explaining. But they’re not quite solidified, so right now I just want to give a fairly concrete picture of the 600-cell, or at least its vertices.

This will be a much less demanding post than the last one—and correspondingly less rewarding. Remember the basic idea:

Points in the 3-sphere can be seen as quaternions of norm 1, and these form a group that double covers $\mathrm{SO}(3)$, the rotation group of 3d space. The vertices of the 600-cell are the points of a subgroup that double covers the rotational symmetry group of the icosahedron. This group is the famous **binary icosahedral group**.

Thus, we can name the vertices of the 600-cell by rotations of the icosahedron—as long as we remember to distinguish between a rotation by $\theta$ and a rotation by $\theta + 360°$. Let’s do it!

• 0° (1 of these). We can take the identity rotation as our chosen ‘favorite’ vertex of the 600-cell.

• 72° (12 of these). The nearest neighbors of our chosen vertex correspond to the rotations by the smallest angles that are symmetries of the icosahedron; these correspond to taking any of its 12 vertices and giving it a 1/5 turn clockwise.

• 120° (20 of these). The next nearest neighbors correspond to taking one of the 20 faces of the icosahedron and giving it a 1/3 turn clockwise.

• 144° (12 of these). These correspond to taking one of the vertices of the icosahedron and giving it a 2/5 turn clockwise.

• 180° (30 of these). These correspond to taking one of the edges and giving it a 1/2 turn clockwise. (Note that since we’re working in the double cover $\mathrm{SU}(2)$ rather than $\mathrm{SO}(3)$, giving one edge a half turn clockwise counts as different than giving the opposite edge a half turn clockwise.)

• 216° (12 of these). These correspond to taking one of the vertices of the icosahedron and giving it a 3/5 turn clockwise. (Again, this counts as different than rotating the opposite vertex by a 2/5 turn clockwise.)

• 240° (20 of these). These correspond to taking one of the faces of the icosahedron and giving it a 2/3 turn clockwise. (Again, this counts as different than rotating the opposite face by a 1/3 turn clockwise.)

• 288° (12 of these). These correspond to taking any of the vertices and giving it a 4/5 turn clockwise.

• 360° (1 of these). This corresponds to a full turn in any direction.

Let’s check:

$$ 1 + 12 + 20 + 12 + 30 + 12 + 20 + 12 + 1 = 120. $$

Good! We need a total of 120 vertices.
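The tally can be automated. This Python sketch (mine, not from the post; it assumes the standard explicit coordinates for the 120 unit quaternions of the binary icosahedral group) rebuilds the group in floating point, recovers each rotation angle from the real part via $\mathrm{Re}\, q = \cos(\theta/2)$, and reproduces the counts above:

```python
import math
from collections import Counter
from itertools import permutations, product

PHI = (1 + math.sqrt(5)) / 2       # golden ratio

def binary_icosahedral():
    els = set()
    for i, s in product(range(4), (1.0, -1.0)):      # +-1, +-i, +-j, +-k
        q = [0.0] * 4; q[i] = s; els.add(tuple(q))
    for signs in product((0.5, -0.5), repeat=4):     # (+-1 +-i +-j +-k)/2
        els.add(signs)
    base = (0.0, 1.0, 1 / PHI, PHI)                  # even permutations of (0, 1, 1/phi, phi)/2
    for p in permutations(range(4)):
        if sum(p[i] > p[j] for i in range(4) for j in range(i + 1, 4)) % 2:
            continue                                 # even permutations only
        for signs in product((1, -1), repeat=4):
            els.add(tuple(round(signs[k] * base[p[k]] / 2, 9) for k in range(4)))
    return els

G = binary_icosahedral()
assert len(G) == 120

# A unit quaternion q rotates 3d space by the angle theta with Re q = cos(theta/2).
def angle(q):
    return round(math.degrees(2 * math.acos(max(-1.0, min(1.0, q[0])))))

counts = Counter(angle(q) for q in G)
print(dict(sorted(counts.items())))
# {0: 1, 72: 12, 120: 20, 144: 12, 180: 30, 216: 12, 240: 20, 288: 12, 360: 1}
```
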

This calculation also shows that if we move a hyperplane through the 3-sphere, which hits our favorite vertex the moment it touches the 3-sphere, it will give the following slices of the 600-cell:

• Slice 1: a point (our favorite vertex),

• Slice 2: an icosahedron (its 12 nearest neighbors),

• Slice 3: a dodecahedron (the 20 next-nearest neighbors),

• Slice 4: an icosahedron (the 12 third-nearest neighbors),

• Slice 5: an icosidodecahedron (the 30 fourth-nearest neighbors),

• Slice 6: an icosahedron (the 12 fifth-nearest neighbors),

• Slice 7: a dodecahedron (the 20 sixth-nearest neighbors),

• Slice 8: an icosahedron (the 12 seventh-nearest neighbors),

• Slice 9: a point (the vertex opposite our favorite).

Here’s a picture drawn by J. Gregory Moxness, illustrating this:

Note that there are 9 slices. Each corresponds to a different conjugacy class in the binary icosahedral group, the double cover of the rotation group of the icosahedron. These in turn correspond to the dots in the *extended* Dynkin diagram of $\mathrm{E}_8$, which has the usual 8 dots and one more.

The usual $\mathrm{E}_8$ Dynkin diagram has ‘legs’ of lengths 5, 2 and 3, each counted so as to include the central branch dot.

The three legs correspond to conjugacy classes in the binary icosahedral group that map to rotational symmetries of an icosahedron preserving a vertex (5 conjugacy classes), an edge (2 conjugacy classes), and a face (3 conjugacy classes)… not counting the element $-1$. That last element gives the extra dot in the *extended* Dynkin diagram.

## December 15, 2017

### Christian P. Robert - xi'an's og

**I**n Le Monde edition of Nov 5, an article on the difficulty of maths departments to attract students, especially in master programs and in the training of secondary school maths teachers (Agrégation & CAPES), where the number of candidates usually does not reach the number of potential positions… And also on the deep changes in the training of secondary school pupils, who over the past five years have lost a considerable amount of maths basics and hence are found wanting when entering university. (Or, put otherwise, have a lower level in maths that implies a strong modification of our own programs and possibly the addition of an extra year, or at least an extra semester, to the bachelor degree…) For instance, a few weeks ago, I realised that my third year class had little idea of a conditional density, and teaching measure theory at this level becomes more and more of a challenge!


### Tommaso Dorigo - Scientificblogging

*Hadron Collider Searches for Diboson Resonances*". The article, which will be published in the prestigious "Progress in Particle and Nuclear Physics", an Elsevier journal with an impact factor above 11 (compare with Physics Letters B, IF=4.8, or Physical Review Letters, IF=8.5, to see why it's relevant), is currently in peer review, but that does not mean that I cannot make a short summary of its contents here.

### Peter Coles - In the Dark

I’m starting to get the hang of some of the differences between things here in Ireland and the United Kingdom, both domestically and in the world of work.

One of the most important points of variation that concerns academic life is the school system students go through before going to University. In England and Wales the standard qualification for entry is the GCE A-level. Most students take A-levels in three subjects, which gives them a relatively narrow focus, although the range of subjects to choose from is rather large. In Ireland the standard qualification is the Leaving Certificate, which comprises a minimum of six subjects, giving students a broader range of knowledge at the sacrifice (perhaps) of a certain amount of depth. For admissions purposes it has been decreed that an Irish Leaving Certificate subject counts as about 2/3 of an A-level, so Irish students do the equivalent of at least four A-levels, and many do more than this.

There’s a lot to be said for the increased breadth of subjects undertaken for the leaving certificate, but I have no direct experience of teaching first-year university students here yet so I can’t comment on their level of preparedness.

Coincidentally, though, one of the first emails I received this week referred to a consultation about proposed changes to the Leaving Certificate in Applied Mathematics. Not knowing much about the old syllabus, I didn’t feel there was much I could add, but I had a look at the new one and was surprised to see a whole `Strand’ on *Mathematical Modelling with networks and graphs*.

The introductory blurb reads:

In this strand students learn about networks or graphs as mathematical models which can be used to investigate a wide range of real-world problems. They learn about graphs and adjacency matrices and how useful these are in solving problems. They are given further opportunity to consolidate their understanding that mathematical ideas can be represented in multiple ways. They are introduced to dynamic programming as a quantitative analysis technique used to solve large, complex problems that involve the need to make a sequence of decisions. As they progress in their understanding they will explore and appreciate the use of algorithms in problem solving as well as considering some of the wider issues involved with the use of such techniques.

Among the specific topics listed you will find:

- Minimal Spanning Trees applied to problems involving optimising networks, and algorithms associated with finding these (Kruskal, Prim);
- Bellman’s Optimality Principle to find the shortest paths in a weighted directed network, and to be able to formulate the process algebraically;
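To give a flavour of what this strand involves, here is a minimal sketch of Kruskal's algorithm in Python (my own illustration, not code from the draft syllabus): it builds a minimal spanning tree by scanning edges in order of increasing weight and keeping only those that join previously unconnected components.

```python
def kruskal(n, edges):
    """Minimal spanning tree of a graph on vertices 0..n-1.

    edges: list of (weight, u, v) tuples.  Returns the chosen edges.
    """
    parent = list(range(n))

    def find(x):  # union-find root lookup with path halving
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    for w, u, v in sorted(edges):      # cheapest edges first
        ru, rv = find(u), find(v)
        if ru != rv:                   # keep the edge only if it joins two pieces
            parent[ru] = rv
            tree.append((w, u, v))
    return tree

# A square 0-1-2-3 with one diagonal: the MST avoids the two heaviest edges.
edges = [(1, 0, 1), (2, 1, 2), (3, 2, 3), (4, 3, 0), (5, 0, 2)]
print(kruskal(4, edges))  # [(1, 0, 1), (2, 1, 2), (3, 2, 3)]
```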

For the record I should say that I’ve actually used Minimal Spanning Trees in a research context (see, e.g., this paper) and have read (and still have) a number of books on graph theory, which I find a truly fascinating subject. The topics listed above are all interesting, and all useful in a range of contexts, but they seem rather advanced to me for a pre-university student, and they will be unfamiliar to a great many potential teachers of Applied Mathematics too. It may turn out, therefore, that students will end up getting a very superficial knowledge of this very trendy subject when they would actually be better off getting a more solid grounding in traditional mathematical methods, so I wonder what the reaction will be to this proposal!

## December 14, 2017

### The n-Category Cafe

In 1995, the German geometer Friedrich Hirzebruch retired, and a private booklet was put together to mark the occasion. That booklet included a short note by Maxim Kontsevich entitled “The $1\tfrac{1}{2}$-logarithm”.

Kontsevich’s note didn’t become publicly available until five years later, when it was included as an appendix to a paper on polylogarithms by Philippe Elbaz-Vincent and Herbert Gangl. Towards the end, it contains the following provocative words:

Conclusion: If we have a random variable $\xi$ which takes finitely many values with all probabilities in $\mathbb{Q}$ then we can define not only the transcendental number $H(\xi)$ but also its “residues modulo $p$” for almost all primes $p$!

Kontsevich’s note was very short and omitted many details. I’ll put some flesh on those bones, showing how to make sense of the sentence above, and much more.

The “$H$” that Kontsevich uses here is the symbol for entropy, or more exactly, Shannon entropy. So, I’ll begin by recalling what that is. That will pave the way for what I *really* want to talk about, which is a kind of entropy for probability distributions where the “probabilities” are not real numbers, but elements of the field $\mathbb{Z}/p\mathbb{Z}$ of integers modulo a prime $p$.

Let $\pi = (\pi_1, \ldots, \pi_n)$ be a finite probability distribution. (It would be more usual to write a probability distribution as $p$, but I want to reserve that letter for prime numbers.) The **entropy** of $\pi$ is

$$H_{\mathbb{R}}(\pi) = - \sum_{i : \pi_i \neq 0} \pi_i \log \pi_i.$$

Usually this is just written as $H$, but I want to emphasize the role of the real numbers here: both the probabilities $\pi_i$ and the entropy $H_{\mathbb{R}}(\pi)$ belong to $\mathbb{R}$.
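Concretely, the definition is a one-liner (a small illustration of my own, not from the post):

```python
from math import log

def entropy(pi):
    """Shannon entropy H_R(pi), with the usual convention 0 log 0 = 0."""
    return -sum(p * log(p) for p in pi if p != 0)

# The uniform distribution on n outcomes has entropy log n:
print(entropy([0.25, 0.25, 0.25, 0.25]))  # log(4) = 1.3862943611198906
# A certain outcome carries no surprise:
assert entropy([1, 0, 0, 0]) == 0
```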

There are applications of entropy in dozens of branches of science… but none will be relevant here! This is purely a mathematical story, though if anyone can think of any possible application or interpretation of entropy modulo a prime, I’d love to hear it.

The challenge now is to find the correct analogue of entropy when the field $\mathbb{R}$ is replaced by the field $\mathbb{Z}/p\mathbb{Z}$ of integers mod $p$, for any prime $p$. So, we want to define a kind of entropy

$$H_p(\pi_1, \ldots, \pi_n) \in \mathbb{Z}/p\mathbb{Z}$$

when $\pi_i \in \mathbb{Z}/p\mathbb{Z}$.

We immediately run into an obstacle. Over $\mathbb{R}$, probabilities are required to be nonnegative. Indeed, the logarithm in the definition of entropy doesn’t make sense otherwise. But in $\mathbb{Z}/p\mathbb{Z}$, there is no notion of positive or negative. So, what are we even going to define the entropy *of*?

We take the simplest way out: ignore the problem. So, writing

$$\Pi_n = \bigl\{ (\pi_1, \ldots, \pi_n) \in (\mathbb{Z}/p\mathbb{Z})^n : \pi_1 + \cdots + \pi_n = 1 \bigr\},$$

we’re going to try to define

$$H_p(\pi) \in \mathbb{Z}/p\mathbb{Z}$$

for each $\pi = (\pi_1, \ldots, \pi_n) \in \Pi_n$.

Let’s try the most direct approach to doing this. That is, let’s stare at the formula defining real entropy…

$$H_{\mathbb{R}}(\pi) = - \sum_{i : \pi_i \neq 0} \pi_i \log \pi_i$$

… and try to write down the analogous formula over $\mathbb{Z}/p\mathbb{Z}$.

The immediate question is: what should play the role of the logarithm mod $p$?

The crucial property of the ordinary logarithm is that it converts multiplication into addition. Specifically, we’re concerned here with logarithms of nonzero probabilities, and $\log$ defines a homomorphism from the multiplicative group $(0, 1]$ of nonzero probabilities to the additive group $\mathbb{R}$.

Mod $p$, then, we want a homomorphism from the multiplicative group $(\mathbb{Z}/p\mathbb{Z})^\times$ of nonzero probabilities to the additive group $\mathbb{Z}/p\mathbb{Z}$. And here we hit another obstacle: a simple argument using Lagrange’s theorem shows that apart from the zero map, no such homomorphism exists. (The image would be a subgroup of $\mathbb{Z}/p\mathbb{Z}$ whose order divides both $p$ and $p - 1$, so it must be trivial.)
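For small primes this is easy to confirm by brute force (a quick sketch of my own, not from the post):

```python
def additive_characters(p):
    """Count maps f from ((Z/pZ)*, x) to (Z/pZ, +) with f(ab) = f(a) + f(b).

    (Z/pZ)* is cyclic, so any such f is determined by c = f(g) for a
    generator g, via f(g^k) = k*c mod p.  We brute-force all p choices of c
    and keep those that actually give a well-defined homomorphism.
    """
    # find a generator of the multiplicative group (Z/pZ)*
    g = next(a for a in range(2, p)
             if len({pow(a, k, p) for k in range(p - 1)}) == p - 1)
    count = 0
    for c in range(p):
        f = {pow(g, k, p): (k * c) % p for k in range(p - 1)}
        if all(f[(a * b) % p] == (f[a] + f[b]) % p for a in f for b in f):
            count += 1
    return count

print([additive_characters(p) for p in (3, 5, 7, 11, 13)])  # [1, 1, 1, 1, 1]: only the zero map
```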

So, we seem to be stuck. Actually, we’re stuck in a way that often happens when you try to construct something new, working by analogy with something old: slavishly imitating the old situation, symbol for symbol, often doesn’t work. In the most interesting analogies, there are wrinkles.

To make some progress, instead of looking at the *formula* for entropy,
let’s look at the *properties* of entropy.

The most important property is a kind of recursivity. In the language
spoken by many patrons of the Café, finite probability distributions form
an *operad*. Explicitly, this means the following.

Suppose I flip a coin. If it’s heads, I roll a die, and if it’s tails, I draw from a pack of cards. This is a two-stage process with 58 possible final outcomes: either the face of a die or a playing card. Assuming that the coin toss, die roll and card draw are all fair, the probability distribution on the 58 outcomes is

$$(1/12, \ldots, 1/12, 1/104, \ldots, 1/104),$$

with $6$ copies of $1/12$ and $52$ copies of $1/104$. Generally, given a probability distribution $\gamma = (\gamma_1, \ldots, \gamma_n)$ on $n$ elements and, for each $i \in \{1, \ldots, n\}$, a probability distribution $\pi^i = (\pi^i_1, \ldots, \pi^i_{k_i})$ on $k_i$ elements, we get a composite distribution

$$\gamma \circ (\pi^1, \ldots, \pi^n) = (\gamma_1 \pi^1_1, \ldots, \gamma_1 \pi^1_{k_1}, \ldots, \gamma_n \pi^n_1, \ldots, \gamma_n \pi^n_{k_n})$$

on $k_1 + \cdots + k_n$ elements.

For example, take the coin-die-card process above. Writing $u_n$ for the uniform distribution on $n$ elements, the final distribution on $58$ elements is $u_2 \circ (u_6, u_{52})$, which I wrote out explicitly above.
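In code, this composition is just scaling and concatenation (a sketch of my own using exact rationals):

```python
from fractions import Fraction

def compose(gamma, *pis):
    """The composite distribution gamma ∘ (pi^1, ..., pi^n)."""
    assert len(gamma) == len(pis)
    return [g * p for g, pi in zip(gamma, pis) for p in pi]

def uniform(n):
    return [Fraction(1, n)] * n

# The coin-die-card example: u_2 ∘ (u_6, u_52).
dist = compose(uniform(2), uniform(6), uniform(52))
print(len(dist), dist[0], dist[-1], sum(dist))  # 58 1/12 1/104 1
```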

The important recursivity property of entropy is called the
**chain rule**, and it states that

$$H_{\mathbb{R}}\bigl(\gamma \circ (\pi^1, \ldots, \pi^n)\bigr) = H_{\mathbb{R}}(\gamma) + \sum_{i = 1}^n \gamma_i H_{\mathbb{R}}(\pi^i).$$
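For the coin-die-card example, the two sides can be compared numerically (a quick check of my own, not from the post):

```python
from math import log

def H(pi):
    """Shannon entropy, skipping zero probabilities."""
    return -sum(p * log(p) for p in pi if p)

def compose(gamma, *pis):
    """Composite distribution: scale each pi^i by gamma_i and concatenate."""
    return [g * p for g, pi in zip(gamma, pis) for p in pi]

u = lambda n: [1.0 / n] * n
gamma, dists = u(2), [u(6), u(52)]

lhs = H(compose(gamma, *dists))                                # entropy of the composite
rhs = H(gamma) + sum(g * H(pi) for g, pi in zip(gamma, dists)) # chain-rule formula
print(abs(lhs - rhs) < 1e-12)  # True
```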

It’s easy to check that this is true. (It’s also nice to understand it in terms of information… but if I follow every tempting explanatory byway, I’ll run out of energy too soon.) And in fact, it characterizes entropy almost uniquely:

**Theorem**  Let $I$ be a function assigning a real number $I(\pi)$ to each finite probability distribution $\pi$. The following are equivalent:

- $I$ is continuous in $\pi$ and satisfies the chain rule;

- $I = c H_{\mathbb{R}}$ for some constant $c \in \mathbb{R}$.

The theorem as stated is due to Faddeev, and I blogged about it earlier this year. In fact, you can weaken “continuous” to “measurable” (a theorem of Lee), but that refinement won’t be important here.

What *is* important is this. In our quest to imitate real entropy in $\mathbb{Z}/p\mathbb{Z}$, we now have something to aim for. Namely: we want a sequence of functions $H_p : \Pi_n \to \mathbb{Z}/p\mathbb{Z}$ satisfying the obvious analogue of the chain rule. And if we’re really lucky, there will be essentially only *one* such sequence.

We’ll discover that this is indeed the case. Once we’ve found the right definition of $H_p$ and proved this, we can very legitimately baptize $H_p$ as “entropy mod $p$”, no matter what weird and wonderful formula might be used to define it, because it has the same characteristic properties as entropy over $\mathbb{R}$.

I think I’ll leave you on that cliff-hanger. If you’d like to guess what the definition of entropy mod $p$ is, go ahead! Otherwise, I’ll tell you next time.

by leinster (Tom.Leinster@gmx.com) at December 14, 2017 11:18 PM

### Emily Lakdawalla - The Planetary Society Blog

### Peter Coles - In the Dark

The programming language *Python* has established itself as the industry standard for researchers in physics and astronomy (as well as in many other fields, including most of those covered by the Data Innovation Research Institute, which employs me part-time). It has also become the standard vehicle for teaching coding skills to undergraduates in many disciplines. In fact it looks like the first module I will be teaching in Maynooth next term is in Computational Physics, and that will be delivered using Python too. It’s been a while since I last did any significant hands-on programming, so this will provide me with a good refresher. The best way to learn something well is to have to teach it to others!

But I digress. This morning I noticed a paper by Benedikt Diemer on the arXiv with the title *COLOSSUS: A python toolkit for cosmology, large-scale structure, and dark matter halos*. Here is the abstract:

This paper introduces Colossus, a public, open-source python package for calculations related to cosmology, the large-scale structure of matter in the universe, and the properties of dark matter halos. The code is designed to be fast and easy to use, with a coherent, well-documented user interface. The cosmology module implements FLRW cosmologies including curvature, relativistic species, and different dark energy equations of state, and provides fast computations of the linear matter power spectrum, variance, and correlation function. The large-scale structure module is concerned with the properties of peaks in Gaussian random fields and halos in a statistical sense, including their peak height, peak curvature, halo bias, and mass function. The halo module deals with spherical overdensity radii and masses, density profiles, concentration, and the splashback radius. To facilitate the rapid exploration of these quantities, Colossus implements about 40 different fitting functions from the literature. I discuss the core routines in detail, with a particular emphasis on their accuracy. Colossus is available at bitbucket.org/bdiemer/colossus.

The software can be downloaded here. It looks a very useful package that includes code to calculate many of the bits and pieces used by cosmologists working on the theory of large-scale structure and galaxy evolution. It is also, I hope, an example of a trend towards greater use of open-source software, for which I congratulate the author! I think this is an important part of the campaign to create truly open science, as I blogged about here.

An important aspect of the way science works is that when a given individual or group publishes a result, it should be possible for others to reproduce it (or not, as the case may be). At present, this can’t always be done. In my own field of astrophysics/cosmology, for example, results in traditional scientific papers are often based on very complicated analyses of large data sets. This is increasingly the case in other fields too. A basic problem obviously arises when data are not made public. Fortunately in astrophysics these days researchers are pretty good at sharing their data, although this hasn’t always been the case.

However, even allowing open access to data doesn’t always solve the reproducibility problem. Often extensive numerical codes are needed to process the measurements and extract meaningful output. Without access to these pipeline codes it is impossible for a third party to check the path from input to output without writing their own version, assuming that there is sufficient information to do that in the first place. That researchers should publish their software as well as their results is quite a controversial suggestion, but I think it’s the best practice for science. There isn’t a uniform policy in astrophysics and cosmology, but I sense that quite a few people out there agree with me. Cosmological numerical simulations, for example, can be performed by anyone with a sufficiently big computer using GADGET, the source codes of which are freely available. Likewise, for CMB analysis, there is the excellent CAMB code, which can be downloaded at will; this is in a long tradition of openly available numerical codes, including CMBFAST and HealPix.

I suspect some researchers might be reluctant to share the codes they have written because they feel they won’t get sufficient credit for work done using them. I don’t think this is true, as researchers are generally very appreciative of such openness and publications describing the corresponding codes are generously cited. In any case I don’t think it’s appropriate to withhold such programs from the wider community, which prevents them being either scrutinized or extended as well as being used to further scientific research. In other words excessively proprietorial attitudes to data analysis software are detrimental to the spirit of open science.

Anyway, my views aren’t guaranteed to be representative of the community, so I’d like to ask for a quick show of hands via a poll…

…and you are of course welcome to comment via the usual box.

### Robert Helling - atdotde

I would like to call it Summers' problem:

Let's have two real random variables $M$ and $F$ that are drawn according to two probability distributions $\rho_{M/F}(x)$ (for starters you may assume both to be Gaussians, possibly with different mean and variance). Take $N$ draws from each and order the $2N$ results. What is the probability that the $k$ largest ones are all from $M$ rather than $F$? Express your results in terms of the $\rho_{M/F}(x)$. We are also interested in asymptotic results for $N$ large and $k$ fixed, as well as for $N$ and $k$ large but $k/N$ fixed.
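A quick Monte Carlo sketch of the problem (my own illustration; all parameter values and names are hypothetical, not from the post):

```python
import random
from math import comb

def prob_top_k_all_M(N, k, mu=(0.0, 0.0), sigma=(1.0, 1.0),
                     trials=10_000, seed=1):
    """Estimate P(the k largest of the 2N Gaussian draws all come from M)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        draws = [(rng.gauss(mu[0], sigma[0]), 'M') for _ in range(N)] \
              + [(rng.gauss(mu[1], sigma[1]), 'F') for _ in range(N)]
        draws.sort(reverse=True)
        hits += all(tag == 'M' for _, tag in draws[:k])
    return hits / trials

# Sanity check: for identical distributions every ordering is equally
# likely, so the exact answer is C(N, k) / C(2N, k).
print(prob_top_k_all_M(20, 2), comb(20, 2) / comb(40, 2))

# A slightly larger variance for M already skews the extremes towards M:
print(prob_top_k_all_M(20, 2, sigma=(1.2, 1.0)))
```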

Last bonus question: How many of the people who say they hire only based on merit, and end up with an all-male board, realise that by this they are saying that women are not as good by quite a margin?

by Robert Helling (noreply@blogger.com) at December 14, 2017 08:58 AM

## December 13, 2017

### Emily Lakdawalla - The Planetary Society Blog

### Peter Coles - In the Dark

I see that the Minister responsible for UK universities, Jo Johnson, has decided that universities should offer two-year degrees, claiming that this will somehow attract more students into higher education.

The idea seems to be that students will get the same `amount’ of teaching, but concentrated in two full calendar years rather than spread over three academic years. This fast-track degree will be offered at a lower level of fee than a normal three-year Bachelors programme.

I can just about accept that this will work in some disciplines and at some universities. The (private) University of Buckingham, for example, already offers such programmes. On the other hand, the University of Buckingham did not participate in the latest Research Excellence Framework, no doubt because teaching all year round leaves its academic staff no time to do research or even attend conferences, which these days is only possible during the summer recess.

Call me old-fashioned, but I think an institution that does not combine teaching and research – and indeed one in which the teaching is not led by research – does not merit the name of `University’. The old polytechnics offered a range of valuable opportunities that complemented the traditional honours degree, but that capacity was basically eliminated in 1992 when all such institutions became universities.

Though my main objection to two-year degrees is their impact on research, there are problems from the teaching side too. One is that keeping up the intensity of full-time study throughout a whole year will, in my opinion, exacerbate the difficulty many students have managing their workload without stress or other mental health difficulties. Moreover, many students currently use the long summer vacation either to work, earning money to help offset the cost of study, or to participate in placements, internships or other activities to help make them more employable after graduation.

It would be particularly difficult to manage two-year degrees in STEM disciplines, as the teaching laboratories need maintenance and installation of new equipment, for which the proposed system allows no time. And how would project work fit into the fast-track system? On top of all that there’s the fact that the current fee level does not cover the cost of teaching in STEM disciplines, so having to do it faster and for less money is not going to be possible. Incidentally, many STEM students currently pursue undergraduate programmes that last four years, not three…

These points have no doubt been made before, but there is another point that is less widely understood. The fact is that a two-year Bachelors degree *may not be a recognised qualification outside the UK*. This is, in fact, already a problem with the four-year undergraduate programmes we call, e.g., MPhys, and regard as Masters level in this country: these are not regarded as Masters qualifications in many European countries. Perhaps this is part of some cunning plan to stop graduates leaving the UK after Brexit?

In the light of these difficulties it is no surprise to me that not a single undergraduate I’ve spoken to thinks that a two-year degree is a sensible option. If the government wants to make studying cheaper, said one Physics student I was chatting to, why don’t they just cut the fees for normal degree programmes?

The impression one gets from all this `thinking’ is that the Government increasingly regards universities as businesses that trade in a commodity called `education’, where the word ‘education’ is narrowly construed as `training’ in the skills needed for future employment. I believe a University education is (or should be) far more about developing critical thinking, problem-solving ability and intellectual curiosity than it is about teaching students, e.g., programming skills. Skills are important, of course, but we also need to educate students in what to use them for.

## December 12, 2017

### Clifford V. Johnson - Asymptotia

Last week the always-interesting Maria Popova of Brain Pickings wrote a piece about the book. I was pleased to see what she wrote because it was clear that she really understood many of the several things I was trying to do in making the book. (I say this because my expectation is usually that people aren't going to click with it because it does not fit narrow presuppositions for either a non-fiction science book or for a graphic novel.) So this was a very pleasant surprise indeed. There's no point trying to paraphrase her, so let me simply point you there with this link.

The book also made the roundup of top Science Books for 2017 on NPR's [...] Click to continue reading this post

The post Noted… appeared first on Asymptotia.

### CERN Bulletin

**Every year, ever since its creation, the Staff Association has organised the CERN Children’s Christmas Party, bringing together 5- to 7-year-old children of employed members of the personnel. The success of the party continues to motivate the organizers of the Staff Association.**

This year, the party took place on Saturday, 2 December, and no less than 240 children were welcomed in two sessions, at 13.30 and at 15.30. The children attended a show with music, tales and a speaking puppet: “Zéphirine et les légendes de Noël”.

After the show, they enjoyed a snack in Restaurant 1. We would like to thank Novae for their valuable help and generous contribution.

Then, Father Christmas himself came to give the children their presents. The Staff Association would also like to warmly thank him for taking the time to bring happiness and joy to little ones and big ones alike during the busy season!

We would also like to thank all the parents for their valuable collaboration.

Finally, we wish you all happy holidays and look forward to seeing you next year!

### CERN Bulletin

**Since the beginning of 2016, the Staff Association has been in discussions with the Management to save and sustain our Nursery and School, located on the CERN site in Meyrin.**

**Where are we now with the discussions and what does the future hold for our Children’s Day-Care Centre and School (EVEE)?**

**A closer look at the creation of the Kindergarten and its management**

A group of parents founded the Kindergarten at CERN in the 1960s, and in 1969, the CERN Staff Association took the structure under its aegis. This change in management resulted in a partnership agreement between CERN and the Staff Association. The agreement defined the rights and duties of both parties with regard to the Staff Association operating a kindergarten on the CERN site. Since then, the Staff Association has been the employer and manager of the structure providing early childhood services.

**Development of the structure over time**

In 1977, the Kindergarten changed premises and a new agreement was signed between CERN and the Staff Association. This agreement is still in force today.

More recently, the Staff Association, concerned with the wellbeing of children, and to meet the parents’ expectations, has put in place new services in consultation with the CERN Management:

- in 2009, creation of a canteen to serve approximately 60 children per day;
- in 2013, creation of a nursery to accommodate children from 4 months to 3 years old (around 35 toddlers);
- in 2015, creation of a summer camp with the capacity to accommodate 40 children during the month of July.

**EVEE facing budgetary difficulties – how is the crisis managed?**

The Children’s Day-Care Centre and School (EVEE) structure is facing recurring budgetary difficulties, in large part due to the establishment of the canteen and the nursery. Indeed, these two services have led to an annual structural deficit which the current financial support of the Organization and the increases in school fees do not suffice to cover.

In 2015, the EVEE Steering Committee endeavoured, in spite of great difficulty, to maximise the revenue and to contain the expenses in order to achieve a balanced budget.

In 2016, informed of the precarious situation of the EVEE structure, the CERN Management decided to put in place a working group to take stock of the situation, to assess the needs of the members of the personnel (MPE and MPA) in terms of early childhood, and to make proposals for a sustainable and viable solution together with the Staff Association.

In 2017, upon the request of the Staff Association, an audit of the accounts was carried out. This audit shows that the management is sound overall, but that optimization measures alone are not sufficient to return to a balanced budget. An appropriate subsidy from our “State”, CERN, is necessary.

At the same time, CERN has agreed to cover the deficit with additional subsidies, allowing the Staff Association to finish the year 2016-2017 and to ensure the start of the school year 2017-2018.

**How has CERN responded at the end of 2017?**

**First response: Privatisation of the Nursery and imminent closure of the School**

After more than a year of discussions to enable the Management and the Staff Association to find a sustainable and viable solution for the EVEE structure, CERN decided unilaterally to subcontract the operation of the nursery and to close the school.

Indeed, at the end of November, an *Invitation to Tender* was sent out to several companies that manage multiple early childhood structures on Swiss territory, in order to take a decision at the beginning of 2018.

It was only on reading this *Invitation to Tender*, drawn up by the Procurement Services, that the Staff Association learned that it would cease operating the EVEE structure by 31 August 2018 and that the operation of a new structure, no longer including a school, would be entrusted to a contractor as of 1 September 2018.

How do you think the Staff Association welcomed this news after nearly 50 years of partnership with CERN? How could the parents react, and even more so the more than 40 employees of the structure?

Following this announcement, and meetings at the highest levels of the Organization, commitments have been made, reassuring, in part, the employees of the structure, the parents and the employer, the Staff Association.

**Second response: Outsourcing the Nursery and maintaining the School managed by the Staff Association**

On Wednesday, 6 December, at a meeting to which all members of the personnel with children under 4 were invited, the Head of the HR Department, James Purvis, announced the intention of CERN to outsource the nursery and to maintain the School under the management of the Staff Association.

Moreover, the Director for Finance and Human Resources, Martin Steinacher, announced the commitment that there would be no layoffs.

**Concerns persist**

Despite the latest developments, the Staff Association, the parents and the employees of the EVEE structure remain concerned about the future of the EVEE structure comprising a nursery and a school.

Indeed, during a meeting with the Management, preceding that of 6 December, the Management announced the continuation of the School, contrary to what was originally stated in the *Invitation to Tender*, but only for a limited duration yet to be determined.

Is this still the intention of the Directorate? If so, the commitment not to dismiss the personnel is, de facto, void!

**It is not too late to sustain the partnership between CERN and the Staff Association!**

The services the Staff Association has provided for many years are above all of a high quality and adapted to an international environment. In the opinion of the parents who currently have children in the structure, as well as those who used to, the service quality is remarkable.

The EVEE is also the only structure within the canton that provides care for children from 4 months to 6 years old. This unique service makes it possible to prepare children, often non-French-speaking, for integration into French or Swiss schools.

The Staff Association strives to do the utmost to save the structure as it is today, not only for the sake of the unique educational offer, but also because the School generates a profit, which helps reduce the deficit of the nursery and the canteen.

Furthermore, why does CERN seek to break a long-standing partnership when no substantial savings can be achieved and the rupture would very likely lead to a decrease in the quality of the educational offer? Not to mention the impact such a decision may have on the employees, despite the Directorate’s commitment that there would be no layoffs.

The Staff Association is a responsible and reliable employer that seeks to preserve a unique, high quality educational offer, while committing to provide, with the help of parents and employees, a viable and competitive business model.

**SAVE OUR NURSERY AND SCHOOL**, this is what the parents, the more than 40 employees of EVEE and the employer, the Staff Association, are calling for.

### CERN Bulletin

**Wednesday 20 December 2017 at 20:00**

CERN Council Chamber

**Uncle Boonmee Who Can Recall His Past Lives**

**Directed by Apichatpong Weerasethakul**

Thailand, 2010, 114 minutes

Suffering from acute kidney failure, Uncle Boonmee has chosen to spend his final days surrounded by his loved ones in the countryside. Surprisingly, the ghost of his deceased wife appears to care for him, and his long lost son returns home in a non-human form. Contemplating the reasons for his illness, Boonmee treks through the jungle with his family to a mysterious hilltop cave - the birthplace of his first life.

Original version Thai / French / Lao; English subtitles

**Wednesday 10 January 2018 at 20:00**

CERN Council Chamber

**Collateral**

**Directed by Michael Mann**

USA, 2004, 115 minutes

One night in Los Angeles, cab driver Max Durocher picks up a gray-suited man named Vincent. Vincent offers Max a large sum of money to drive him to five locations around LA before the night is up. Max accepts, but realizes that Vincent is a hitman who has been hired to kill five people that night. Max is forced to drive Vincent around the City of Angels, unsure if he'll live to see sunrise.

Original version English; French subtitles

**Wednesday 17 January 2018 at 20:00**

CERN Council Chamber

**Memories of Murder**

**Directed by Joon-ho Bong**

South Korea, 2003, 132 minutes

In a small Korean province in 1986, three detectives struggle with the case of multiple young women being found raped and murdered by an unknown culprit.

Original version Korean; English subtitles

### CERN Bulletin

**The election of the Staff Council for the period 2018-2019 is now over, and the first lesson is a voter turnout of 56.15 %, higher than for the previous election. This clearly shows the interest that members of the Staff Association attach to the work and dedication of their delegates. Of course, we also thank all those who stood as candidates and expressed their commitment to actively defend the interests of the staff and of CERN.**

This newly-elected Staff Council (see its composition below) is truly representative of all sectors and professions of the Organization. This will be a major asset when representatives of the Staff Association discuss with Management and Member States on issues which we will have to address during the next two years.

Strengthened by this vote of confidence, we are certain that we can count on the active and ongoing support of our members and all personnel at CERN in the future. We know there will be no shortage of challenges. Together we will be stronger and more creative in taking them on.

**NEW STAFF COUNCIL - 2018-2019 mandate**

^{1} Group A: benchmark jobs classified in grade spans 1-2-3, 2-3-4, 3-4-5 and 4-5-6.

^{2} Group B: benchmark jobs classified in grade spans 6-7-8 and 9-10.

### CERN Bulletin

**Wednesday 13 December 2017 at 20:00**

CERN Council Chamber

**My Winnipeg**

**Directed by Guy Maddin**

Canada, 2007, 80 minutes

Fact, fantasy and memory are woven seamlessly together in this portrait of film-maker Guy Maddin's home town of Winnipeg, Manitoba.

Original version English; French subtitles


## December 10, 2017

### John Baez - Azimuth

Here’s a draft of a little thing I’m writing for the *Newsletter of the London Mathematical Society*. The regular icosahedron is connected to many ‘exceptional objects’ in mathematics, and here I describe two ways of using it to construct E_{8}. One uses a subring of the quaternions called the ‘icosians’, while the other uses Patrick du Val’s work on the resolution of Kleinian singularities. I leave it as a challenge to find the connection between these two constructions!

You can see a PDF here:

• From the icosahedron to E_{8}.

Here’s the story:

### From the Icosahedron to E_{8}

In mathematics, every sufficiently beautiful object is connected to all others. Many exciting adventures, of various levels of difficulty, can be had by following these connections. Take, for example, the icosahedron—that is, the *regular* icosahedron, one of the five Platonic solids. Starting from this it is just a hop, skip and a jump to the E_{8} lattice, a wonderful pattern of points in 8 dimensions! As we explore this connection we shall see that it also ties together many other remarkable entities: the golden ratio, the quaternions, the quintic equation, a highly symmetrical 4-dimensional shape called the 600-cell, and a manifold called the Poincaré homology 3-sphere.

Indeed, the main problem with these adventures is knowing where to stop! The story we shall tell is just a snippet of a longer one involving the McKay correspondence and quiver representations. It would be easy to bring in the octonions, exceptional Lie groups, and more. But it can be enjoyed without these esoteric digressions, so let us introduce the protagonists without further ado.

The icosahedron has a long history. According to a comment in Euclid’s *Elements* it was discovered by Plato’s friend Theaetetus, a geometer who lived from roughly 415 to 369 BC. Since Theaetetus is believed to have classified the Platonic solids, he may have found the icosahedron as part of this project. If so, it is one of the earliest mathematical objects discovered as part of a classification theorem. It’s hard to be sure. In any event, it was known to Plato: in his *Timaeus*, he argued that water comes in atoms of this shape.

The icosahedron has 20 triangular faces, 30 edges, and 12 vertices. We can take the vertices to be the four points

(0, ±1, ±Φ)

and all those obtained from these by cyclic permutations of the coordinates, where

Φ = (1 + √5)/2 ≈ 1.618

is the golden ratio. Thus, we can group the vertices into three orthogonal **golden rectangles**: rectangles whose proportions are Φ to 1.
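These coordinates are easy to check by machine. Here is a quick numerical sketch (plain Python, no external libraries): it builds the 12 vertices from cyclic permutations of (0, ±1, ±Φ) and confirms that the four vertices lying in one coordinate plane form a rectangle with proportions Φ to 1:

```python
from itertools import product

PHI = (1 + 5 ** 0.5) / 2  # the golden ratio, about 1.618

# The 12 vertices: (0, +-1, +-PHI) and cyclic permutations of the coordinates.
verts = []
for s1, s2 in product((1, -1), repeat=2):
    x, y, z = 0.0, s1 * 1.0, s2 * PHI
    verts += [(x, y, z), (y, z, x), (z, x, y)]

assert len(set(verts)) == 12

# The four vertices with first coordinate 0 form one golden rectangle:
# side lengths 2 and 2*PHI, so the proportions are PHI to 1.
rect = [v for v in verts if v[0] == 0.0]
assert len(rect) == 4
width, height = 2.0, 2.0 * PHI
assert abs(height / width - PHI) < 1e-12
```

The other two golden rectangles are obtained the same way from the vertices with second or third coordinate zero.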

In fact, there are five ways to do this. The rotational symmetries of the icosahedron permute these five ways, and any nontrivial rotation gives a nontrivial permutation. The rotational symmetry group of the icosahedron is thus a subgroup of S_{5}, the group of permutations of five things. Moreover, this subgroup has 60 elements. After all, any rotation is determined by what it does to a chosen face of the icosahedron: it can map this face to any of the 20 faces, and it can do so in 3 ways. The rotational symmetry group of the icosahedron is therefore a 60-element subgroup of S_{5}. Group theory then tells us that it must be the alternating group A_{5}.

The E_{8} lattice is harder to visualize than the icosahedron, but still easy to characterize. Take a bunch of equal-sized spheres in 8 dimensions. Get as many of these spheres to touch a single sphere as you possibly can. Then, get as many to touch *those* spheres as you possibly can, and so on. Unlike in 3 dimensions, where there is ‘wiggle room’, you have no choice about how to proceed, except for an overall rotation and translation. The balls will inevitably be centered at points of the E_{8} lattice!

We can also characterize the E_{8} lattice as the one giving the densest packing of spheres among all lattices in 8 dimensions. This packing was long suspected to be optimal even among those that do not arise from lattices—but this fact was proved only in 2016, by the young mathematician Maryna Viazovska [V].

We can also describe the E_{8} lattice more explicitly. In suitable coordinates, it consists of vectors for which:

1) the components are either all integers or all integers plus ½, and

2) the components sum to an even number.
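These two conditions translate directly into code. A small sketch using exact rational arithmetic (the function name `in_E8` is mine, for illustration):

```python
from fractions import Fraction

def in_E8(v):
    """Test the two membership conditions for an 8-component vector."""
    v = [Fraction(x) for x in v]
    if len(v) != 8:
        return False
    all_int = all(x.denominator == 1 for x in v)
    all_half = all(x.denominator == 2 for x in v)  # each component is an integer plus 1/2
    total = sum(v)
    return (all_int or all_half) and total.denominator == 1 and total % 2 == 0

half = Fraction(1, 2)
assert in_E8([1, 1, 0, 0, 0, 0, 0, 0])      # all integers, even sum
assert in_E8([half] * 8)                    # all integers plus 1/2, sum 4
assert not in_E8([1, 0, 0, 0, 0, 0, 0, 0])  # odd sum
assert not in_E8([half] + [0] * 7)          # mixes half-integers and integers
```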

This lattice consists of all integral linear combinations of the 8 rows of this matrix:

The inner product of any row vector with itself is 2, while the inner product of distinct row vectors is either 0 or -1. Thus, any two of these vectors lie at an angle of either 90° or 120°. If we draw a dot for each vector, and connect two dots by an edge when the angle between their vectors is 120°, we get this pattern:

This is called the E_{8} Dynkin diagram. In the first part of our story we shall find the E_{8} lattice hiding in the icosahedron; in the second part, we shall find this diagram. The two parts of this story must be related—but the relation remains mysterious, at least to me.
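The matrix itself is not reproduced above, but any basis of simple roots for E_{8} has the stated properties. Here is a sketch using one standard choice (the Bourbaki basis, my substitution), verifying the inner products and reading off the 7 edges of the Dynkin diagram:

```python
import numpy as np

# One standard basis of simple roots for E8 (Bourbaki's numbering);
# this is a stand-in for the matrix in the article, which is an image.
roots = np.array([
    [ 0.5, -0.5, -0.5, -0.5, -0.5, -0.5, -0.5,  0.5],  # alpha_1
    [ 1.0,  1.0,  0.0,  0.0,  0.0,  0.0,  0.0,  0.0],  # alpha_2
    [-1.0,  1.0,  0.0,  0.0,  0.0,  0.0,  0.0,  0.0],  # alpha_3
    [ 0.0, -1.0,  1.0,  0.0,  0.0,  0.0,  0.0,  0.0],  # alpha_4
    [ 0.0,  0.0, -1.0,  1.0,  0.0,  0.0,  0.0,  0.0],  # alpha_5
    [ 0.0,  0.0,  0.0, -1.0,  1.0,  0.0,  0.0,  0.0],  # alpha_6
    [ 0.0,  0.0,  0.0,  0.0, -1.0,  1.0,  0.0,  0.0],  # alpha_7
    [ 0.0,  0.0,  0.0,  0.0,  0.0, -1.0,  1.0,  0.0],  # alpha_8
])

gram = roots @ roots.T
assert np.allclose(np.diag(gram), 2)               # each root has norm^2 = 2
off = gram[~np.eye(8, dtype=bool)]
assert set(np.round(off).astype(int)) <= {0, -1}   # angles of 90 or 120 degrees

# Join two dots by an edge whenever the inner product is -1:
edges = [(i + 1, j + 1) for i in range(8) for j in range(i + 1, 8)
         if round(gram[i, j]) == -1]
assert len(edges) == 7
assert round(np.linalg.det(gram)) == 1             # the E8 lattice is unimodular
print(edges)   # [(1, 3), (2, 4), (3, 4), (4, 5), (5, 6), (6, 7), (7, 8)]
```

Eight dots, seven edges, with the short leg attached at the fourth node: this is exactly the E_{8} Dynkin diagram.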

### The Icosians

The quickest route from the icosahedron to E_{8} goes through the fourth dimension. The symmetries of the icosahedron can be described using certain quaternions; the integer linear combinations of these form a subring of the quaternions called the ‘icosians’. The icosians can be reinterpreted as a lattice in 8 dimensions, and this lattice is E_{8} [CS]. Let us see how this works.

The quaternions, discovered by Hamilton, are a 4-dimensional algebra

ℍ = {a + bi + cj + dk : a, b, c, d ∈ ℝ}

with multiplication given as follows:

i² = j² = k² = −1,  ij = k = −ji and cyclic permutations.

It is a normed division algebra, meaning that the norm

|a + bi + cj + dk| = √(a² + b² + c² + d²)

obeys

|qq′| = |q| |q′|

for all q, q′ ∈ ℍ. The unit sphere in ℍ is thus a group, often called SU(2), because its elements can be identified with 2 × 2 unitary complex matrices with determinant 1. This group acts as rotations of 3-dimensional Euclidean space, since we can see any point in ℝ³ as a **purely imaginary** quaternion x = bi + cj + dk, and the quaternion qxq⁻¹ is then purely imaginary for any unit quaternion q. Indeed, this action gives a double cover

ρ: SU(2) → SO(3),

where SO(3) is the group of rotations of ℝ³.
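Both facts just used, multiplicativity of the norm and the conjugation action on purely imaginary quaternions, are easy to sanity-check numerically (a sketch; the helper names are mine):

```python
import random

def qmul(p, q):
    """Multiply quaternions given as (a, b, c, d) = a + bi + cj + dk."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

def norm(q):
    return sum(x * x for x in q) ** 0.5

def conj(q):
    return (q[0], -q[1], -q[2], -q[3])

random.seed(0)
p = tuple(random.uniform(-1, 1) for _ in range(4))
q = tuple(random.uniform(-1, 1) for _ in range(4))

# |pq| = |p| |q|: the norm is multiplicative.
assert abs(norm(qmul(p, q)) - norm(p) * norm(q)) < 1e-12

# A unit quaternion u maps purely imaginary x to u x u^{-1}, which is again
# purely imaginary and of the same length; so u acts as a rotation of R^3.
u = tuple(x / norm(p) for x in p)   # unit quaternion, so u^{-1} = conj(u)
x = (0.0, q[1], q[2], q[3])         # purely imaginary
y = qmul(qmul(u, x), conj(u))
assert abs(y[0]) < 1e-12
assert abs(norm(y) - norm(x)) < 1e-12
```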

We can thus take any Platonic solid, look at its group of rotational symmetries, get a subgroup of SO(3), and take its double cover in SU(2). If we do this starting with the icosahedron, we see that the 60-element group A_{5} ⊂ SO(3) is covered by a 120-element group Γ ⊂ SU(2), called the **binary icosahedral group**.

The elements of Γ are quaternions of norm one, and it turns out that they are the vertices of a 4-dimensional regular polytope: a 4-dimensional cousin of the Platonic solids. It deserves to be called the “hypericosahedron”, but it is usually called the 600-cell, since it has 600 tetrahedral faces. Here is the 600-cell projected down to 3 dimensions, drawn using Robert Webb’s Stella software:

Explicitly, if we identify ℍ with ℝ⁴, the elements of Γ are the points

±1, ±i, ±j, ±k, ½(±1 ± i ± j ± k),

together with the 96 points obtained from ½(±Φ ± i ± Φ⁻¹j) by even permutations of the four coordinates. Since these points are closed under multiplication, taking integral linear combinations of them gives a subring of the quaternions.
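One can verify by machine that these 120 points really are closed under multiplication. The sketch below works exactly, representing each coordinate as a pair of rationals (x, y) standing for x + y√5; the 96-point family is built from even permutations of ½(±Φ, ±1, ±Φ⁻¹, 0), one standard coordinate convention for the 600-cell:

```python
from fractions import Fraction as F
from itertools import permutations, product

# Exact arithmetic in the golden field Q(sqrt5): (x, y) stands for x + y*sqrt(5).
def fadd(u, v): return (u[0] + v[0], u[1] + v[1])
def fneg(u):    return (-u[0], -u[1])
def fmul(u, v): return (u[0]*v[0] + 5*u[1]*v[1], u[0]*v[1] + u[1]*v[0])

ZERO = (F(0), F(0))
ONE  = (F(1), F(0))
HALF = (F(1, 2), F(0))
PHI  = (F(1, 2), F(1, 2))   # golden ratio (1 + sqrt5)/2
IPHI = (F(-1, 2), F(1, 2))  # its inverse, (sqrt5 - 1)/2

def qmul(p, q):
    """Quaternion product, with coordinates in Q(sqrt5)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    def s(*terms):
        out = ZERO
        for sign, x, y in terms:
            t = fmul(x, y)
            out = fadd(out, t if sign > 0 else fneg(t))
        return out
    return (s((1,a1,a2), (-1,b1,b2), (-1,c1,c2), (-1,d1,d2)),
            s((1,a1,b2), (1,b1,a2), (1,c1,d2), (-1,d1,c2)),
            s((1,a1,c2), (-1,b1,d2), (1,c1,a2), (1,d1,b2)),
            s((1,a1,d2), (1,b1,c2), (-1,c1,b2), (1,d1,a2)))

gamma = set()
# 8 points: +-1, +-i, +-j, +-k
for i in range(4):
    for sgn in (1, -1):
        q = [ZERO] * 4
        q[i] = ONE if sgn > 0 else fneg(ONE)
        gamma.add(tuple(q))
# 16 points: (+-1 +- i +- j +- k)/2
for signs in product((1, -1), repeat=4):
    gamma.add(tuple(HALF if sgn > 0 else fneg(HALF) for sgn in signs))
# 96 points: even coordinate permutations of (+-PHI, +-1, +-1/PHI, 0)/2
base = [fmul(HALF, PHI), HALF, fmul(HALF, IPHI), ZERO]
evens = [p for p in permutations(range(4))
         if sum(p[i] > p[j] for i in range(4) for j in range(i + 1, 4)) % 2 == 0]
for p in evens:
    for signs in product((1, -1), repeat=4):
        gamma.add(tuple(base[p[i]] if signs[i] > 0 else fneg(base[p[i]])
                        for i in range(4)))

assert len(gamma) == 120
# Spot-check closure under multiplication (the full check also passes):
elems = sorted(gamma)
assert all(qmul(a, b) in gamma for a in elems[:24] for b in elems)
```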

Conway and Sloane [CS] call this the ring of **icosians**. The icosians are not a lattice in the quaternions: they are dense. However, any icosian is of the form a + bi + cj + dk where a, b, c and d live in the **golden field**

ℚ(√5) = {x + √5 y : x, y ∈ ℚ}.

Thus we can think of an icosian as an 8-tuple of rational numbers. Such 8-tuples form a lattice in 8 dimensions.

In fact we can put a norm on the icosians as follows. For an icosian q, the usual quaternionic norm has

|q|² = x + √5 y

for some rational numbers x and y, but we can define a new norm by setting

‖q‖² = x + y.

With respect to this new norm, the icosians form a lattice that fits isometrically in 8-dimensional Euclidean space. And this lattice is none other than E_{8}!
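Here is the norm trick on a single example icosian (my choice of element, for illustration), again with exact rational pairs (x, y) standing for x + y√5:

```python
from fractions import Fraction as F

# A pair (x, y) of rationals stands for x + y*sqrt(5) in the golden field.
def fadd(u, v): return (u[0] + v[0], u[1] + v[1])
def fmul(u, v): return (u[0]*v[0] + 5*u[1]*v[1], u[0]*v[1] + u[1]*v[0])

HALF = (F(1, 2), F(0))
PHI  = (F(1, 2), F(1, 2))    # golden ratio
IPHI = (F(-1, 2), F(1, 2))   # 1/PHI

# Example icosian q = (i + PHI*j + (1/PHI)*k)/2, written through its four
# coefficients a, b, c, d in the golden field. Flattening each coefficient
# into its two rational parts turns q into an 8-tuple of rationals.
a = (F(0), F(0))
b = HALF
c = fmul(HALF, PHI)
d = fmul(HALF, IPHI)
eight_tuple = a + b + c + d
assert len(eight_tuple) == 8

# The usual quaternionic norm: |q|^2 = a^2 + b^2 + c^2 + d^2 = x + sqrt(5)*y.
n = (F(0), F(0))
for t in (a, b, c, d):
    n = fadd(n, fmul(t, t))
x, y = n
assert (x, y) == (F(1), F(0))   # this q happens to be a unit quaternion

# The new norm just adds the two rational parts: ||q||^2 = x + y.
new_norm_sq = x + y
print(new_norm_sq)   # 1
```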

### Klein’s Icosahedral Function

Not only is the E_{8} lattice hiding in the icosahedron; so is the E_{8} Dynkin diagram. The space of all regular icosahedra of arbitrary size centered at the origin has a singularity, which corresponds to a degenerate special case: the icosahedron of zero size. If we resolve this singularity in a minimal way we get eight Riemann spheres, intersecting in a pattern described by the E_{8} Dynkin diagram!

This remarkable story starts around 1884 with Felix Klein’s *Lectures on the Icosahedron* [Kl]. In this work he inscribed an icosahedron in the Riemann sphere, ℂP¹. He thus got the icosahedron’s rotational symmetry group, A_{5}, to act as conformal transformations of ℂP¹—indeed, rotations. He then found a rational function of one complex variable, I, that is invariant under all these transformations. This function equals 0 at the centers of the icosahedron’s faces, 1 at the midpoints of its edges, and ∞ at its vertices.

Here is Klein’s icosahedral function as drawn by Abdelaziz Nait Merzouk. The color shows its phase, while the contour lines show its magnitude:

We can think of Klein’s icosahedral function as a branched cover of the Riemann sphere by itself with 60 sheets:

I: ℂP¹ → ℂP¹.

Indeed, A_{5} acts on ℂP¹, and the quotient space ℂP¹/A_{5} is isomorphic to ℂP¹ again. The function I gives an explicit formula for the quotient map ℂP¹ → ℂP¹/A_{5} ≅ ℂP¹.

Klein managed to reduce solving the quintic to the problem of solving the equation I(z) = w for z. A modern exposition of this result is Shurman’s *Geometry of the Quintic* [Sh]. For a more high-powered approach, see the paper by Nash [N]. Unfortunately, neither of these treatments avoids complicated calculations. But our interest in Klein’s icosahedral function here does not come from its connection to the quintic: instead, we want to see its connection to E_{8}.

For this we should actually construct Klein’s icosahedral function. To do this, recall that the Riemann sphere ℂP¹ is the space of 1-dimensional linear subspaces of ℂ². Let us work directly with ℂ². While A_{5} acts on ℂP¹, this action comes from an action of this group’s double cover on ℂ². As we have seen, the rotational symmetry group of the icosahedron, A_{5}, is double covered by the binary icosahedral group Γ. To build an A_{5}-invariant rational function on ℂP¹, we should thus look for Γ-invariant homogeneous polynomials on ℂ².

It is easy to construct three such polynomials:

• V, of degree 12, vanishing on the 1d subspaces corresponding to icosahedron vertices.

• E, of degree 30, vanishing on the 1d subspaces corresponding to icosahedron edge midpoints.

• F, of degree 20, vanishing on the 1d subspaces corresponding to icosahedron face centers.

Remember, we have embedded the icosahedron in ℂP¹, and each point in ℂP¹ is a 1-dimensional subspace of ℂ², so each icosahedron vertex determines such a subspace, and there is a linear function on ℂ², unique up to a constant factor, that vanishes on this subspace. The icosahedron has 12 vertices, so we get 12 linear functions this way. Multiplying them gives V, a homogeneous polynomial of degree 12 on ℂ² that vanishes on all the subspaces corresponding to icosahedron vertices! The same trick gives E, which has degree 30 because the icosahedron has 30 edges, and F, which has degree 20 because the icosahedron has 20 faces.

A bit of work is required to check that V, E and F are invariant under Γ, instead of changing by constant factors under group transformations. Indeed, if we had copied this construction using a tetrahedron or octahedron, this would not be the case. For details, see Shurman’s book [Sh], which is free online, or van Hoboken’s nice thesis [VH].

Since both F³ and V⁵ have degree 60, F³/V⁵ is homogeneous of degree zero, so it defines a rational function I: ℂP¹ → ℂP¹. This function is invariant under A_{5} because F and V are invariant under Γ. Since F vanishes at face centers of the icosahedron while V vanishes at vertices, I = F³/V⁵ equals 0 at face centers and ∞ at vertices. Finally, thanks to its invariance property, I takes the same value at every edge center, so we can normalize V or F to make this value 1.

Thus, I has precisely the properties required of Klein’s icosahedral function! And indeed, these properties uniquely characterize that function, so that function is I = F³/V⁵.
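The whole construction can be tested numerically: project the vertices, edge midpoints and face centers to the complex plane by stereographic projection, build I as a ratio of products of linear factors, and check its special values. (A sketch; the chart and normalisation are my choices, and one edge midpoint lands at infinity and is skipped.)

```python
import numpy as np
from itertools import combinations, product

PHI = (1 + 5 ** 0.5) / 2

# Icosahedron vertices (cyclic permutations of (0, +-1, +-PHI)), normalised
# to the unit sphere.
verts = []
for s1, s2 in product((1, -1), repeat=2):
    x, y, z = 0.0, s1 * 1.0, s2 * PHI
    verts += [(x, y, z), (y, z, x), (z, x, y)]
verts = np.array(verts)
verts /= np.linalg.norm(verts, axis=1, keepdims=True)

# Edges join vertices at the minimal distance; faces are mutually adjacent triples.
def d2(i, j):
    return float(np.sum((verts[i] - verts[j]) ** 2))
dmin = min(d2(i, j) for i, j in combinations(range(12), 2))
edges = [(i, j) for i, j in combinations(range(12), 2) if d2(i, j) < dmin + 1e-9]
faces = [t for t in combinations(range(12), 3)
         if all(p in edges for p in combinations(t, 2))]
assert len(edges) == 30 and len(faces) == 20

def stereo(p):
    # stereographic projection from the north pole onto the complex plane
    return (p[0] + 1j * p[1]) / (1 - p[2])

def unit(p):
    return p / np.linalg.norm(p)

vz = [stereo(v) for v in verts]                       # all 12 are finite here
fz = [stereo(unit(verts[list(t)].mean(axis=0))) for t in faces]
ez = [stereo(unit(verts[i] + verts[j])) for i, j in edges
      if abs(1 - unit(verts[i] + verts[j])[2]) > 1e-9]  # one midpoint is at the pole
assert len(ez) == 29

def I(z, c=1.0):
    return c * np.prod([(z - f) ** 3 for f in fz]) / np.prod([(z - v) ** 5 for v in vz])

c = 1 / I(ez[0])                                  # normalise: value 1 at edge midpoints
assert all(abs(I(z, c) - 1) < 1e-6 for z in ez)   # 1 at every finite edge midpoint
assert all(abs(I(z, c)) < 1e-12 for z in fz)      # 0 at every face centre
```

The poles of I sit exactly at the 12 projected vertices, where the V⁵ in the denominator vanishes.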

### The Appearance of E_{8}

Now comes the really interesting part. Three polynomials on a 2-dimensional space must obey a relation, and V, E and F obey a very pretty one, at least after we normalize them correctly:

V⁵ + E² + F³ = 0.

We could guess this relation simply by noting that each term must have the same degree. Every Γ-invariant polynomial on ℂ² is a polynomial in V, E and F, and indeed

ℂ²/Γ ≅ {(x, y, z) ∈ ℂ³ : x⁵ + y² + z³ = 0}.

This complex surface is smooth except at x = y = z = 0, where it has a singularity. And hiding in this singularity is E_{8}!

To see this, we need to ‘resolve’ the singularity. Roughly, this means that we find a smooth complex surface S̃ and an onto map

π: S̃ → S,

where S = ℂ²/Γ is our singular surface, with π one-to-one away from the singularity. (More precisely, if S is an algebraic variety with singular points S_{sing} ⊂ S, then π: S̃ → S is a **resolution** of S if S̃ is smooth, π is proper, π⁻¹(S − S_{sing}) is dense in S̃, and π is an isomorphism between π⁻¹(S − S_{sing}) and S − S_{sing}. For more details see Lamotke’s book [L].)

There are many such resolutions, but one is **minimal**, meaning that all others factor uniquely through it.

What sits above the singularity in this minimal resolution? Eight copies of the Riemann sphere ℂP¹, one for each dot here:

Two of these ℂP¹s intersect in a point if their dots are connected by an edge; otherwise they are disjoint.

This amazing fact was discovered by Patrick du Val in 1934 [DV]. Why is it true? Alas, there is not enough room in the margin, or even in the entire blog article, to explain this. The books by Kirillov [Ki] and Lamotke [L] fill in the details. But here is a clue. The E_{8} Dynkin diagram has ‘legs’ of lengths 5, 2 and 3:

On the other hand,

A_{5} ≅ ⟨v, e, f | v⁵ = e² = f³ = vef = 1⟩

where in terms of the rotational symmetries of the icosahedron:

• v is a 1/5 turn around some vertex of the icosahedron,

• e is a 1/2 turn around the center of an edge touching that vertex,

• f is a 1/3 turn around the center of a face touching that vertex,

and we must choose the sense of these rotations correctly to obtain A_{5}. To get a presentation of the binary icosahedral group we drop one relation:

Γ ≅ ⟨v, e, f | v⁵ = e² = f³ = vef⟩.

The dots in the E_{8} Dynkin diagram correspond naturally to conjugacy classes in Γ, not counting the conjugacy class of the central element −1 ∈ Γ. Each of these conjugacy classes, in turn, gives a copy of ℂP¹ in the minimal resolution of ℂ²/Γ.
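These generators can be exhibited concretely in the quaternions: searching over the two lifts of a 1/5 turn about a vertex, a 1/2 turn about an incident edge and a 1/3 turn about an incident face finds a triple with v⁵ = e² = f³ = vef = −1, the central element (a numerical sketch; the brute-force search over incident edges, faces and senses is mine):

```python
import numpy as np
from itertools import combinations, product

PHI = (1 + 5 ** 0.5) / 2

# Icosahedron vertices, normalised to the unit sphere.
verts = []
for s1, s2 in product((1, -1), repeat=2):
    x, y, z = 0.0, s1 * 1.0, s2 * PHI
    verts += [(x, y, z), (y, z, x), (z, x, y)]
verts = np.array(verts)
verts /= np.linalg.norm(verts, axis=1, keepdims=True)

def d2(i, j):
    return float(np.sum((verts[i] - verts[j]) ** 2))
dmin = min(d2(i, j) for i, j in combinations(range(12), 2))
edges = [(i, j) for i, j in combinations(range(12), 2) if d2(i, j) < dmin + 1e-9]
faces = [t for t in combinations(range(12), 3)
         if all(p in edges for p in combinations(t, 2))]

def qmul(p, q):
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return np.array([a1*a2 - b1*b2 - c1*c2 - d1*d2,
                     a1*b2 + b1*a2 + c1*d2 - d1*c2,
                     a1*c2 - b1*d2 + c1*a2 + d1*b2,
                     a1*d2 + b1*c2 - c1*b2 + d1*a2])

def lift(axis, angle):
    """One of the two unit-quaternion lifts of a rotation about `axis`."""
    n = axis / np.linalg.norm(axis)
    return np.array([np.cos(angle / 2), *(np.sin(angle / 2) * n)])

def power(q, k):
    out = np.array([1.0, 0.0, 0.0, 0.0])
    for _ in range(k):
        out = qmul(out, q)
    return out

MINUS_ONE = np.array([-1.0, 0.0, 0.0, 0.0])
vtx = 0
edge_axes = [verts[i] + verts[j] for i, j in edges if vtx in (i, j)]
face_axes = [verts[list(t)].mean(axis=0) for t in faces if vtx in t]

found = False
for sv, eax, se, fax, sf in product((1, -1), edge_axes, (1, -1), face_axes, (1, -1)):
    v = lift(sv * verts[vtx], 2 * np.pi / 5)   # 1/5 turn about the vertex
    e = lift(se * eax, np.pi)                  # 1/2 turn about an incident edge
    f = lift(sf * fax, 2 * np.pi / 3)          # 1/3 turn about an incident face
    if (np.allclose(power(v, 5), MINUS_ONE) and np.allclose(power(e, 2), MINUS_ONE)
            and np.allclose(power(f, 3), MINUS_ONE)
            and np.allclose(qmul(qmul(v, e), f), MINUS_ONE)):
        found = True
        break
assert found  # v^5 = e^2 = f^3 = vef = -1 for a suitable incident triple
```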

Not only the E_{8} Dynkin diagram, but also the E_{8} lattice, can be found in the minimal resolution of ℂ²/Γ. Topologically, this resolution is a 4-dimensional manifold. Its real second homology group is an 8-dimensional vector space with an inner product given by the intersection pairing. The integral second homology is a lattice in this vector space spanned by the 8 copies of ℂP¹ we have just seen—and it is a copy of the E_{8} lattice [KS].

But let us turn to a more basic question: what is ℂ²/Γ like as a topological space? To tackle this, first note that we can identify a pair of complex numbers with a single quaternion, and this gives a homeomorphism

ℂ²/Γ ≅ ℍ/Γ,

where we let Γ act by right multiplication on ℍ. So, it suffices to understand ℍ/Γ.

Next, note that sitting inside ℍ/Γ are the points coming from the unit sphere in ℍ. These points form the 3-dimensional manifold S³/Γ, which is called the **Poincaré homology 3-sphere** [KS]. This is a wonderful thing in its own right: Poincaré discovered it as a counterexample to his guess that any compact 3-manifold with the same homology as a 3-sphere is actually diffeomorphic to the 3-sphere, and it is deeply connected to E_{8}. But for our purposes, what matters is that we can think of this manifold in another way, since we have a diffeomorphism

S³/Γ ≅ SO(3)/A_{5}.

The latter is just *the space of all icosahedra inscribed in the unit sphere in 3d space*, where we count two as the same if they differ by a rotational symmetry.

This is a nice description of the points of ℍ/Γ coming from points in the unit sphere of ℍ. But every quaternion lies in *some* sphere centered at the origin of ℍ, of possibly zero radius. It follows that ℂ²/Γ ≅ ℍ/Γ is the space of *all* icosahedra centered at the origin of 3d space—of arbitrary size, including a degenerate icosahedron of zero size. This degenerate icosahedron is the singular point in ℂ²/Γ. This is where E_{8} is hiding.

Clearly much has been left unexplained in this brief account. Most of the missing details can be found in the references. But it remains unknown—at least to me—how the two constructions of from the icosahedron fit together in a unified picture.

Recall what we did. First we took the binary icosahedral group Γ ⊂ SU(2), took integer linear combinations of its elements, thought of these as forming a lattice in an 8-dimensional rational vector space with a natural norm, and discovered that this lattice is a copy of the E_{8} lattice. Then we took ℂ²/Γ, took its minimal resolution, and found that the integral 2nd homology of this space, equipped with its natural inner product, is a copy of the E_{8} lattice. From the same ingredients we built the same lattice in two very different ways! How are these constructions connected? This puzzle deserves a nice solution.

#### Acknowledgements

I thank Tong Yang for inviting me to speak on this topic at the Annual General Meeting of the Hong Kong Mathematical Society on May 20, 2017, and Guowu Meng for hosting me at the HKUST while I prepared that talk. I also thank the many people, too numerous to accurately list, who have helped me understand these topics over the years.

#### Bibliography

[CS] J. H. Conway and N. J. A. Sloane, *Sphere Packings, Lattices and Groups*, Springer, Berlin, 2013.

[DV] P. du Val, On isolated singularities of surfaces which do not affect the conditions of adjunction, I, II and III, *Proc. Camb. Phil. Soc.* **30** (1934), 453–459, 460–465, 483–491.

[KS] R. Kirby and M. Scharlemann, Eight faces of the Poincaré homology 3-sphere, *Usp. Mat. Nauk.* **37** (1982), 139–159. Available at https://tinyurl.com/ybrn4pjq

[Ki] A. Kirillov, *Quiver Representations and Quiver Varieties*, AMS, Providence, Rhode Island, 2016.

[Kl] F. Klein, *Lectures on the Ikosahedron and the Solution of Equations of the Fifth Degree*, Trübner & Co., London, 1888. Available at https://archive.org/details/cu31924059413439

[L] K. Lamotke, *Regular Solids and Isolated Singularities*, Vieweg & Sohn, Braunschweig, 1986.

[N] O. Nash, On Klein’s icosahedral solution of the quintic. Available at https://arxiv.org/abs/1308.0955

[Sh] J. Shurman, *Geometry of the Quintic*, Wiley, New York, 1997. Available at http://people.reed.edu/~jerry/Quintic/quintic.html

[Sl] P. Slodowy, Platonic solids, Kleinian singularities, and Lie groups, in *Algebraic Geometry*, Lecture Notes in Mathematics **1008**, Springer, Berlin, 1983, pp. 102–138.

[VH] J. van Hoboken, *Platonic Solids, Binary Polyhedral Groups, Kleinian Singularities and Lie Algebras of Type A, D, E*, Master’s Thesis, University of Amsterdam, 2002. Available at http://math.ucr.edu/home/baez/joris_van_hoboken_platonic.pdf

[V] M. Viazovska, The sphere packing problem in dimension 8, *Ann. Math.* **185** (2017), 991–1015. Available at https://arxiv.org/abs/1603.04246

### John Baez - Azimuth

In certain crystals you can knock an electron out of its favorite place and leave a **hole**: a place with a missing electron. Sometimes these holes can move around like particles. And naturally these holes attract electrons, since they are *places an electron would want to be*.

Since an electron and a hole attract each other, they can orbit each other. An orbiting electron-hole pair is a bit like a hydrogen atom, where an electron orbits a proton. All of this is quantum-mechanical, of course, so you should be imagining smeared-out wavefunctions, not little dots moving around. But imagine dots if it’s easier.

An orbiting electron-hole pair is called an **exciton**, because while it acts like a particle in its own right, it’s really just a special kind of ‘excited’ electron—an electron with extra energy, not in its lowest energy state where it wants to be.

An exciton usually doesn’t last long: the orbiting electron and hole spiral towards each other, the electron finds the hole it’s been seeking, and it settles down.

But excitons can last long enough to do interesting things. In 1978 the Russian physicist Abrikosov wrote a short and very creative paper in which he raised the possibility that *excitons could form a crystal in their own right!* He called this new state of matter **excitonium**.

In fact his reasoning was very simple.

Just as electrons have a mass, so do holes. That sounds odd, since a hole is just a vacant spot where an electron would like to be. But such a hole can move around. It has more energy when it moves faster, and it takes force to accelerate it—so it acts just like it has a mass! The precise mass of a hole depends on the nature of the substance we’re dealing with.

Now imagine a substance with very heavy holes.

When a hole is much heavier than an electron, it will stand almost still when an electron orbits it. So, they form an exciton that’s *very* similar to a hydrogen atom, where we have an electron orbiting a much heavier proton.

Hydrogen comes in different forms: gas, liquid, solid… and at extreme pressures, like in the core of Jupiter, hydrogen becomes *metallic*. So, we should expect that excitons can come in all these different forms too!

We should be able to create an exciton gas… an exciton liquid… an exciton solid… and under the right circumstances, a *metallic crystal of excitons*. Abrikosov called this **metallic excitonium**.

People have been trying to create this stuff for a long time. Some claim to have succeeded. But a new paper claims to have found something else: a Bose–Einstein condensate of excitons:

• Anshul Kogar, Melinda S. Rak, Sean Vig, Ali A. Husain, Felix Flicker, Young Il Joe, Luc Venema, Greg J. MacDougall, Tai C. Chiang, Eduardo Fradkin, Jasper van Wezel and Peter Abbamonte, Signatures of exciton condensation in a transition metal dichalcogenide, *Science* **358** (2017), 1314–1317.

A lone electron acts like a fermion, so I guess a hole does too, and if so that means an exciton acts approximately like a boson. When it’s cold, a gas of bosons will ‘condense’, with a significant fraction of them settling into the lowest energy states available. I guess excitons have been seen to do this!

There’s a fairly good simplified explanation at the University of Illinois website:

• Siv Schwink, Physicists excited by discovery of new form of matter, excitonium, 7 December 2017.

However, the picture on this page, which I used above, shows domain walls moving through crystallized excitonium. I think that’s different from a Bose–Einstein condensate!

I urge you to look at Abrikosov’s paper. It’s short and beautiful:

• Alexei Alexeyevich Abrikosov, A possible mechanism of high temperature superconductivity, *Journal of the Less Common Metals* **62** (1978), 451–455.

(Cool journal title. Is there a journal of the *more* common metals?)

In this paper, Abrikosov points out that previous authors had the idea of metallic excitonium. Maybe his new idea was that this might be a superconductor—and that this might explain high-temperature superconductivity. The reason for his guess is that metallic hydrogen, too, is widely suspected to be a superconductor.

Later, Abrikosov won the Nobel prize for some other ideas about superconductors. I think I should read more of his papers. He seems like one of those physicists with great intuitions.

**Puzzle 1.** If a crystal of excitons conducts electricity, what is actually going on? That is, which electrons are moving around, and how?

This is a fun puzzle because an exciton crystal is a kind of *abstract* crystal created by the motion of electrons in another, ordinary, crystal. And that leads me to another puzzle, one that I don’t know the answer to:

**Puzzle 2.** Is it possible to create a hole in excitonium? If so, is it possible to create an exciton in excitonium? If so, is it possible to create **meta-excitonium**: a crystal of excitons in excitonium?

## December 08, 2017

### Emily Lakdawalla - The Planetary Society Blog

### Lubos Motl - string vacua and pheno

There are already two new tirades at Peter W*it's notorious website. The newest one celebrates that a non-expert has described the multiverse as the "last refuge of cowards" at a social event. I think much of the research about the multiverse is questionable but slurs like that won't make the possibility go away. Using some irrelevant expletives from a not really scientific event as "arguments" is low-brow, indeed.

The previous text titled "String theory fails another test" is based on W*it's complete lies about the predictions of cosmic strings by state-of-the-art physical theories.

LIGO has just published constraints on the cosmic strings that in principle add some noise of a characteristic color to the oscillations that LIGO can observe. The amount of this noise from cusps and kinks was shown to be smaller than some function of the frequencies and/or the cosmic string tension.

W*it summarizes this paper as a "failure of string theory" and declares David Gross and Joe Polchinski to have made a losing prediction. But those statements are lies – for two main reasons.

First, "cosmic strings" can be explained as objects in string theory – and even fundamental strings of string theory may be stretched in some models and manifest themselves as cosmic strings – but "cosmic strings" are still a notion in cosmology that is independent of string theory. Cosmic strings may exist independently of string theory and are predicted by other theories in high-energy physics, starting from grand unified theories (GUT). Read e.g. the last sentence of the abstract of Tristan's 2005 master thesis. The same scientist is at the LHC now.

Second, it's simply a complete lie that string theorists have made the prediction that cosmic strings would be discovered. The discovery of cosmic strings was always a possibility – and it remains a possibility. No well-known professional string theorist has ever made the prediction that it's "more likely than not" that cosmic strings would be discovered in our lifetime, let alone a foreseeable future.

The famous string theorist that was closest to it is Joe Polchinski. There was a wave of activity surrounding cosmic strings according to string theory around 2004. This excitement was amplified by the observation of CSL-1, a cosmic string candidate, in the telescopes. If you read e.g. this 2006 blog post about CSL-1 that communicated the conclusion that CSL-1 wasn't a cosmic string, you will be reminded that Joe Polchinski had declared the probability that the cosmic strings would be discovered in a reasonable future to be 10%. So it's enough to watch it and be sort of thrilled but the number still says "probably not".

Joe Polchinski was still the most enthusiastic famous string theorist when it came to the discovery prospects for cosmic strings. W*it also tries to claim that David Gross made a failing prediction – when he quotes Gross' sentences from 2007:

> String theory is full of qualitative predictions, such as the production of black holes at the LHC or cosmic strings in the sky, and this level of prediction is perfectly acceptable in almost every other field of science. It’s only in particle physics that a theory can be thrown out if the 10th decimal place of a prediction doesn’t agree with experiment.

But it's very clear that this statement contains no prediction that was falsified – after all, W*it has been saying for years that string theorists couldn't ever make such a prediction, so he contradicts himself when he says that string theorists did it.

Gross said that the cosmic strings that exist out there – or that may exist out there – are an example of a *qualitative prediction*. The adjective "qualitative" is explicitly written there and it has a very good reason. The adjective is there to emphasize that string theorists couldn't calculate the tension or density of cosmic strings in the Universe at the moment when David Gross made the statement. We still cannot. So there was obviously no prediction that would imply that "cosmic strings have to be seen by this or that experiment by the year 2017" or anything of the sort.

David Gross talked about these *qualitative predictions* exactly because they're such a standard part of all scientific disciplines – and string theory is as scientific as other disciplines of science. He contrasted the situation with particle physics where many predictions are quantitative and extremely accurate and a tiny disagreement is enough to eliminate a theory or a hypothesis. But theories in other disciplines of science – and those include string theory in its present form of our understanding of it – don't depend on the precise *quantitative* observations in this fatal way. That obviously doesn't mean that the questions are unscientific.

The question whether cosmic strings exist in the Universe is obviously scientific, meaningful, deep, and important regardless of whether lying dishonest savages fail to understand the scientific character, meaning, depth, and importance. And we still don't know whether there are cosmic strings in the Universe and what their tension and/or average density is. And we're still intrigued by the possibility and ready to devour new evidence whenever it emerges. Like previous experiments, LIGO has only imposed some constraints on these numbers. But it didn't falsify the whole concept. It couldn't falsify the concept because the concept is *qualitative*. That doesn't mean that it's unimportant, shallow, meaningless, or unscientific. So cosmic strings will obviously keep on appearing in papers by cosmologists, GUT theorists, string theorists, and others.

I am staggered by the stupidity of the people who are willing to buy this self-evident W*it-like garbage.

Exactly the same comments apply to readers of Backreaction whose author claims that the estimate of a much higher cosmological constant "isn't even a prediction". Holy cow. It clearly is a prediction; it isn't a good one, but it's justified by the same kind of dimensional analysis etc. that is used all over physics to get estimates of so many things. The failure of this methodology in the case of the cosmological constant is obviously a rather important fact that requires a deep enough qualitative explanation. Ms Hossenfelder may only "denounce" such basic methods of physics because she has never done any real physics in her life. Her readers are constantly served pure feces as well but they don't mind – in fact, these Schweinehunds and pigs smack their lips.

by Luboš Motl (noreply@blogger.com) at December 08, 2017 06:43 PM

### Emily Lakdawalla - The Planetary Society Blog

## December 07, 2017

### John Baez - Azimuth

I’d like to explain a conjecture about Wigner crystals, which we came up with in a discussion on Google+. It’s a purely mathematical conjecture that’s pretty simple to state, motivated by the picture above. But let me start at the beginning.

Electrons repel each other, so they don’t usually form crystals. But if you trap a bunch of electrons in a small space, and cool them down a lot, they will try to get as far away from each other as possible—and they can do this by forming a crystal!

This is sometimes called an **electron crystal**. It’s also called a **Wigner crystal**, because the great physicist Eugene Wigner predicted in 1934 that this would happen.

Only since the late 1980s have we been able to make electron crystals in the lab. Such a crystal can only form if the electron density is low enough. The reason is that even at absolute zero, a gas of electrons has kinetic energy. At absolute zero the gas will minimize its energy. But it can’t do this by having all the electrons in a state with zero momentum, since you can’t put two electrons in the same state, thanks to the Pauli exclusion principle. So, higher momentum states need to be occupied, and this means there’s kinetic energy. And there’s more of it if the density is high: if there’s less room in position space, the electrons are forced to occupy more room in momentum space.

When the density is high, this prevents the formation of a crystal: instead, we have lots of electrons whose wavefunctions are ‘sitting almost on top of each other’ in position space, but with different momenta. They’ll have lots of kinetic energy, so minimizing kinetic energy becomes more important than minimizing potential energy.

When the density is low, this effect becomes unimportant, and the electrons mainly try to minimize potential energy. So, they form a crystal with each electron avoiding the rest. It turns out they form a **body-centered cubic**: a crystal lattice formed of cubes, with an extra electron in the middle of each cube.

To know whether a uniform electron gas at zero temperature forms a crystal or not, you need to work out its so-called **Wigner–Seitz radius**. This is the average inter-particle spacing measured in units of the Bohr radius. The **Bohr radius** is the unit of length you can cook up from the electron mass, the electron charge and Planck’s constant:

$$a_0 = \frac{\hbar^2}{m_e e^2}$$

(in Gaussian units). It’s mainly famous as the average distance between the electron and the proton in a hydrogen atom in its lowest energy state.

Simulations show that a 3-dimensional uniform electron gas crystallizes when the Wigner–Seitz radius is at least 106. The picture, however, shows an electron crystal in *2 dimensions*, formed by electrons trapped on a thin film shaped like a disk. In 2 dimensions, Wigner crystals form when the Wigner–Seitz radius is at least 31. In the picture, the density is so low that we can visualize the electrons as points with well-defined positions.
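To make the 3D threshold concrete, here is a small Python sketch; the function names and the CODATA value of the Bohr radius are my additions, not from the post. It converts a number density into the dimensionless Wigner–Seitz radius, and inverts the quoted threshold $r_s \ge 106$ into a maximum density for crystallization.

```python
import math

A0 = 5.29177210903e-11  # Bohr radius in metres (CODATA value)

def wigner_seitz_radius_3d(n):
    """Dimensionless Wigner-Seitz radius r_s for a 3D electron gas of
    number density n (electrons per cubic metre): the radius of a sphere
    containing one electron on average, measured in Bohr radii."""
    return (3.0 / (4.0 * math.pi * n)) ** (1.0 / 3.0) / A0

def critical_density_3d(rs_crit=106.0):
    """Largest density at which a 3D Wigner crystal can form, obtained
    by inverting r_s = rs_crit (the simulation threshold quoted above)."""
    return 3.0 / (4.0 * math.pi * (rs_crit * A0) ** 3)

n_max = critical_density_3d()
print(f"3D Wigner crystal needs n <= {n_max:.3e} electrons per cubic metre")
print(f"check: r_s at that density = {wigner_seitz_radius_3d(n_max):.1f}")
```

The two functions are inverses, so the check line recovers $r_s = 106$ at the critical density.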

So, the picture simply shows a bunch of points $x_1, \dots, x_n$ trying to minimize the potential energy, which is proportional to

$$\sum_{i < j} \frac{1}{\|x_i - x_j\|}$$

The lines between the dots are just to help you see what’s going on. They’re showing the **Delaunay triangulation**, where we draw a graph that divides the plane into regions closer to one electron than all the rest, and then take the dual of that graph.

Thanks to energy minimization, this triangulation wants to be a lattice of equilateral triangles. But since such a triangular lattice doesn’t fit neatly into a disk, we also see some ‘defects’:

Most electrons have 6 neighbors. But there are also some red defects, which are electrons with 5 neighbors, and blue defects, which are electrons with 7 neighbors.

Note that there are 6 clusters of defects. In each cluster there is one more red defect than blue defect. I think this is not a coincidence.

**Conjecture.** When we choose a sufficiently large number of points $x_1, \dots, x_n$ on a disk in such a way that

$$\sum_{i < j} \frac{1}{\|x_i - x_j\|}$$

is minimized, and draw the Delaunay triangulation, there will be 6 more vertices with 5 neighbors than vertices with 7 neighbors.

Here’s a bit of evidence for this, which is not at all conclusive. Take a sphere and triangulate it in such a way that each vertex has 5, 6 or 7 neighbors. Then here’s a cool fact: there must be 12 more vertices with 5 neighbors than vertices with 7 neighbors.

**Puzzle.** Prove this fact.
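Here is a quick numerical check of this fact (not a proof), on a hand-built combinatorial icosahedron: two poles plus two offset pentagons joined by an antiprism band, so that all 12 vertices have 5 neighbors. The construction and helper function are my own sketch.

```python
from collections import Counter

def degree_counts(faces):
    """Given the triangles of a triangulated sphere, return a Counter
    mapping vertex degree -> number of vertices with that degree,
    after checking Euler's formula V - E + F = 2."""
    edges = set()
    for a, b, c in faces:
        edges |= {frozenset((a, b)), frozenset((b, c)), frozenset((a, c))}
    vertices = {v for f in faces for v in f}
    assert len(vertices) - len(edges) + len(faces) == 2  # Euler's formula
    deg = Counter(v for e in edges for v in e)            # vertex degrees
    return Counter(deg.values())

# Combinatorial icosahedron: poles T, B and two pentagons u0..u4, l0..l4
# joined by a band of 10 triangles (an antiprism).
u = [f"u{i}" for i in range(5)]
l = [f"l{i}" for i in range(5)]
faces = []
for i in range(5):
    j = (i + 1) % 5
    faces += [("T", u[i], u[j]), ("B", l[i], l[j]),
              (u[i], l[i], u[j]), (l[i], l[j], u[j])]

counts = degree_counts(faces)
print(counts)                               # all 12 vertices have degree 5
print(counts.get(5, 0) - counts.get(7, 0))  # → 12, as the fact predicts
```

The same `degree_counts` check can be pointed at any other triangulated sphere with degrees in {5, 6, 7}; the difference should always come out to 12.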

If we think of the picture above as the top half of a triangulated sphere, then each vertex in this triangulated sphere has 5, 6 or 7 neighbors. So, there must be 12 more vertices on the sphere with 5 neighbors than with 7 neighbors. So, it makes some sense that the *top half* of the sphere will contain 6 more vertices with 5 neighbors than with 7 neighbors. But this is not a proof.

I have a feeling this energy minimization problem has been studied with various numbers of points. So, there may either be a lot of evidence for my conjecture, or some counterexamples that will force me to refine it. The picture shows what happens with 600 points on the disk. Maybe something dramatically different happens with 599! Maybe someone has even proved theorems about this. I just haven’t had time to look for such work.

The picture here was drawn by Arunas.rv and placed on Wikicommons under a Creative Commons Attribution-Share Alike 3.0 Unported license.

### Tommaso Dorigo - Scientificblogging

## December 05, 2017

### Clifford V. Johnson - Asymptotia

So I've been waiting for some time to tell you about this clever joke by eminent physicist Anthony Zee. Well, I think it is a joke – I've not checked with him yet. The final production period for *The Dialogues* was full of headaches, I must say, but there was one thing that made me laugh out loud, for a long time. I heard that Tony had agreed to write a blurb for the back cover of the book, but I did not see it until I was finally sent a digital copy of the back cover, somewhat after everything had (afaik) gone to print. The blurb was simple, and said:

"This is a fantastic book -- entertaining, informative, enjoyable, and thought-provoking."

I thought this was rather nicely done. Simple, to the point, generous.... but, after a while... *strangely familiar*. I thought about it for a while, walked over to one of my bookcases, and picked up a book. What book? My 2003 copy of the first edition of "Quantum Field Theory in a Nutshell", by A. (for Anthony) Zee. I turned it over. The first blurb on the back says:

"This is a fantastic book -- exciting, amusing, unique, and very valuable."

The author of that blurb? Clifford V. Johnson.

Brilliantly done.

-cvj


The post Anthony Zee’s Joke(?) appeared first on Asymptotia.

### The n-Category Cafe

An adjunction is a pair of functors $f:A\to B$ and $g:B\to A$ along with a natural *isomorphism*

$$A(a,g b) \cong B(f a,b).$$

**Question 1:** Do we get any interesting things if we replace “isomorphism” in this definition by something else?

- If we replace it by “function”, then the Yoneda lemma tells us we get just a natural transformation $f g \to 1_B$.
- If we replace it by “retraction” then we get a unit and counit, as in an adjunction, satisfying one triangle identity but not the other.
- If $A$ and $B$ are 2-categories and we replace it by “equivalence”, we get a biadjunction.
- If $A$ and $B$ are 2-categories and we replace it by “adjunction”, we get a sort of lax 2-adjunction (a.k.a. “local adjunction”).

Are there other examples?

**Question 2:** What if we do the same thing for multivariable adjunctions?

A two-variable adjunction is a triple of functors $f:A\times B\to C$ and $g:A^{op}\times C\to B$ and $h:B^{op}\times C\to A$ along with natural isomorphisms

$$C(f(a,b),c) \cong B(b,g(a,c)) \cong A(a,h(b,c)).$$

What does it mean to “replace ‘isomorphism’ by something else” here? It could mean different things, but one thing it might mean is to ask instead for a *function*

$$A(a,h(b,c)) \times B(b,g(a,c)) \to C(f(a,b),c).$$

Even more intriguingly, if $A,B,C$ are 2-categories, we could ask for an ordinary *two-variable adjunction* between these three hom-categories; this would give a certain notion of “lax two-variable 2-adjunction”. Question 2 is, are notions like this good for anything? Are there any natural examples?

Now, you may, instead, be wondering about

**Question 3:** In what sense is a function $A(a,h(b,c)) \times B(b,g(a,c)) \to C(f(a,b),c)$ a “replacement” for isomorphisms $C(f(a,b),c) \cong B(b,g(a,c)) \cong A(a,h(b,c))$?

But that question, I can answer; it has to do with comparing the Chu construction and the Dialectica construction.

Last month I told you about how multivariable adjunctions form a polycategory that sits naturally inside the 2-categorical Chu construction $Chu(Cat,Set)$.

Now the classical Chu construction is, among other things, a way to produce $\ast$-autonomous categories, which are otherwise in somewhat short supply. At first, I found that rather disincentivizing to study either one: why would I be interested in a contrived way to construct things that don’t occur naturally? But then I realized that the same sentence would make sense if you replaced “Chu construction” with “sheaves on a site” and “$\ast$-autonomous categories” with “toposes”, and I certainly think *those* are interesting. So now it doesn’t bother me as much.

Anyway, there is also another general construction of $\ast$-autonomous categories (and, in fact, more general things), which goes by the odd name of the “Dialectica construction”. The categorical Dialectica construction is an abstraction, due to Valeria de Paiva, of a syntactic construction due to Gödel, which in turn is referred to as the “Dialectica interpretation” apparently because it was *published* in the journal *Dialectica*. I must say that I cannot subscribe to this as a general principle for the naming of mathematical definitions; fortunately it does not seem to have been very widely adopted.

Anyway, however execrable its name, the Dialectica construction *appears* quite similar to the Chu construction. Both start from a closed symmetric monoidal category $\mathcal{C}$ equipped with a chosen object, which in this post I’ll call $\Omega$. (Actually, there are various versions of both, but here I’m going to describe two versions that are maximally similar, as de Paiva did in her paper Dialectica and Chu constructions: Cousins?.) Moreover, both $Chu(\mathcal{C},\Omega)$ and $Dial(\mathcal{C},\Omega)$ have the same objects: triples $A=(A^+,A^-,\underline{A})$ where $A^+,A^-$ are objects of $\mathcal{C}$ and $\underline{A} : A^+ \otimes A^- \to \Omega$ is a morphism in $\mathcal{C}$. Finally, the morphisms $f:A\to B$ in both $Chu(\mathcal{C},\Omega)$ and $Dial(\mathcal{C},\Omega)$ consist of a pair of morphisms $f^+ : A^+ \to B^+$ and $f^- : B^- \to A^-$ (note the different directions), subject to some condition.

The only difference is in the conditions. In $Chu(\mathcal{C},\Omega)$, the condition is that the composites

$$A^+ \otimes B^- \xrightarrow{1\otimes f^-} A^+ \otimes A^- \xrightarrow{\underline{A}} \Omega$$

$$A^+ \otimes B^- \xrightarrow{f^+\otimes 1} B^+ \otimes B^- \xrightarrow{\underline{B}} \Omega$$

are *equal*. But in $Dial(\mathcal{C},\Omega)$, we assume that $\Omega$ is equipped with an internal preorder, and require that the first of these composites is $\le$ the second with respect to this preorder.
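To see the two conditions side by side, here is a toy finite-set sketch, not the categorical construction itself; all the names (`is_chu_morphism`, `is_dialectica_morphism`) are hypothetical helpers of mine. An object is a pair of finite sets with an evaluation table into $\Omega$; the Chu condition demands equality of the two evaluations, while Dialectica only demands a preorder inequality.

```python
from itertools import product

# Toy finite model: an object is (A_plus, A_minus, r), where r is a dict
# giving the evaluation A_plus x A_minus -> Omega.

def is_chu_morphism(A, B, f_plus, f_minus):
    """Chu condition: the two evaluations agree, i.e.
    rA(a, f_minus(b')) == rB(f_plus(a), b') for all a, b'."""
    (Ap, Am, rA), (Bp, Bm, rB) = A, B
    return all(rA[a, f_minus(b)] == rB[f_plus(a), b]
               for a, b in product(Ap, Bm))

def is_dialectica_morphism(A, B, f_plus, f_minus, le):
    """Dialectica condition: same data, but equality is relaxed to the
    preorder `le` on Omega."""
    (Ap, Am, rA), (Bp, Bm, rB) = A, B
    return all(le(rA[a, f_minus(b)], rB[f_plus(a), b])
               for a, b in product(Ap, Bm))

# Omega = {0, 1}; a tiny object with evaluation r(a, x) = [a == x].
Ap = Am = (0, 1)
r = {(a, x): int(a == x) for a in Ap for x in Am}
A = (Ap, Am, r)

ident = lambda v: v
flip = lambda v: 1 - v

print(is_chu_morphism(A, A, ident, ident))  # True: the identity morphism
print(is_chu_morphism(A, A, flip, ident))   # False: the evaluations differ
# With the discrete preorder (le = equality), Dialectica agrees with Chu:
print(is_dialectica_morphism(A, A, flip, ident, lambda x, y: x == y))  # False
```

Swapping in a non-discrete `le` (say `<=` on `{0, 1}`) is exactly the move from the Chu condition to the Dialectica one.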

Now you can probably see where Question 1 above comes from. The 2-category of categories and *adjunctions* sits inside $Chu(Cat,Set)$ as the objects of the form $(A,A^{op},\mathrm{hom}_A)$. The analogous category sitting inside $Dial(Cat,Set)$, where $Set$ is regarded as an internal *category* in $Cat$ in the obvious way, would consist of “generalized adjunctions” of the first sort, with simple functions $A(a,g b) \to B(f a,b)$ rather than isomorphisms. Other “2-Dialectica constructions” would yield other sorts of generalized adjunction.

What about Questions 2 and 3? Well, back up a moment: the above description of the Chu and Dialectica constructions actually exaggerates their similarity, because it omits their monoidal structures. As a mere category, $Chu(\mathcal{C},\Omega)$ is clearly the special case of $Dial(\mathcal{C},\Omega)$ where $\Omega$ has a discrete preorder (i.e. $x\le y$ iff $x=y$). But $Chu(\mathcal{C},\Omega)$ is always $\ast$-autonomous, as long as $\mathcal{C}$ has pullbacks; whereas for $Dial(\mathcal{C},\Omega)$ to be monoidal, closed, or $\ast$-autonomous we require the preorder $\Omega$ to have those same properties, which a discrete preorder certainly does not always have. And even when a discrete preorder $\Omega$ does have some or all of those properties, the resulting monoidal structure of $Dial(\mathcal{C},\Omega)$ does not coincide with that of $Chu(\mathcal{C},\Omega)$.

As happens so often, the situation is clarified by considering universal properties. That is, rather than comparing the concrete constructions of the tensor products in $Chu(\mathcal{C},\Omega)$ and $Dial(\mathcal{C},\Omega)$, we should compare the functors that they represent. A morphism $A\otimes B\to C$ in $Chu(\mathcal{C},\Omega)$ consists of three morphisms $f:A^+\otimes B^+\to C^+$ and $g:A^+ \otimes C^- \to B^-$ and $h:B^+ \otimes C^- \to A^-$ such that a certain three morphisms $A^+ \otimes B^+ \otimes C^- \to \Omega$ are equal. In terms of “formal elements” $a:A^+, b:B^+, c:C^-$ in the internal type theory of $\mathcal{C}$, these three morphisms can be written as

$$\underline{C}(f(a,b),c) \qquad \underline{B}(b,g(a,c)) \qquad \underline{A}(a,h(b,c))$$

just as in a two-variable adjunction. By contrast, a morphism $A\otimes B\to C$ in $\mathrm{Dial}(\mathcal{C},\Omega)$ consists of three morphisms $f,g,h$ of the same sorts, but such that

$$\underline{B}(b,g(a,c)) \boxtimes \underline{A}(a,h(b,c)) \le \underline{C}(f(a,b),c)$$

where $\boxtimes$ denotes the tensor product of the monoidal preorder $\Omega$. Now you can probably see where Question 2 comes from: if in constructing $\mathrm{Dial}(\mathrm{Cat},\mathrm{Set})$ we equip $\mathrm{Set}$ with its usual monoidal structure, we get generalized 2-variable adjunctions with a function $A(a,h(b,c)) \times B(b,g(a,c)) \to C(f(a,b),c)$, and for other choices of $\Omega$ we get other kinds.
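To see the contrast on toy data, here is a small Python sketch (not from the post; $\Omega$ is just the booleans ordered False $\le$ True with $\boxtimes$ = logical AND, and the objects, pairings and maps $f,g,h$ are all illustrative choices). It exhibits data satisfying the lax Dialectica inequality but failing the strict Chu equality:

```python
from itertools import product

# An "object" is a triple (plus_set, minus_set, pairing), with the
# pairing landing in Omega = {False, True}.

def chu_ok(A, B, C, f, g, h):
    """Chu condition: the three composite maps to Omega are *equal*."""
    return all(
        C[2](f(a, b), c) == B[2](b, g(a, c)) == A[2](a, h(b, c))
        for a, b, c in product(A[0], B[0], C[1]))

def dial_ok(A, B, C, f, g, h):
    """Dialectica condition: B(b,g(a,c)) AND A(a,h(b,c)) <= C(f(a,b),c)."""
    return all(
        (B[2](b, g(a, c)) and A[2](a, h(b, c))) <= C[2](f(a, b), c)
        for a, b, c in product(A[0], B[0], C[1]))

bits = (0, 1)
eq = (bits, bits, lambda x, y: x == y)    # pairing: equality
top = (bits, bits, lambda x, y: True)     # pairing: constantly True

f = lambda a, b: a
g = lambda a, c: 1 - a    # chosen so that the strict Chu equality fails
h = lambda b, c: b

print(dial_ok(eq, eq, top, f, g, h))  # the inequality always holds
print(chu_ok(eq, eq, top, f, g, h))   # the three maps are not all equal
```

So every Chu morphism is in particular a Dialectica morphism here, but not conversely, which is one concrete way the two tensor products come apart.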

This is already somewhat of an answer to Question 3: the analogy between ordinary adjunctions and these "generalized adjunctions" is the same as between the Chu and Dialectica constructions. But it's more satisfying to make both of those analogies precise, and we can do that by generalizing the Dialectica construction to allow $\Omega$ to be an internal *polycategory* rather than merely an internal poset (or category). If this polycategory structure is representable, then we recover the original Dialectica construction, whereas if we give an arbitrary object $\Omega$ the (non-representable) "Frobenius-discrete" polycategory structure, in which a morphism $(x_1,\dots,x_m) \to (y_1,\dots,y_n)$ is the assertion that $x_1=\cdots=x_m=y_1=\cdots=y_n$, then we recover the original Chu construction.

For a general internal polycategory $\Omega$, the resulting "Dialectica-Chu" construction will be only a polycategory. But it is representable in the Dialectica case if $\Omega$ is representable, and it is representable in the Chu case if $\mathcal{C}$ has pullbacks. This explains why the tensor products in $\mathrm{Chu}(\mathcal{C},\Omega)$ and $\mathrm{Dial}(\mathcal{C},\Omega)$ look different: they are representing two instances of the same functor, but they represent it for different reasons.

So… what about Questions 1 and 2? In other words: if the reason I care about the Chu construction is because it’s an abstraction of multivariable adjunctions, why should I care about the Dialectica construction?

by shulman (viritrilbia@gmail.com) at December 05, 2017 06:35 AM

## December 04, 2017

### Andrew Jaffe - Leaves on the Line

It was announced this morning that the WMAP team has won the $3 million Breakthrough Prize. Unlike the Nobel Prize, which infamously is only awarded to three people each year, the Breakthrough Prize was awarded to the whole 27-member WMAP team, led by Chuck Bennett, Gary Hinshaw, Norm Jarosik, Lyman Page, and David Spergel, but including everyone through postdocs and grad students who worked on the project. This is great, and I am happy to send my hearty congratulations to all of them (many of whom I know well and am lucky to count as friends).

I actually knew about the prize last week as I was interviewed by Nature for an article about it. Luckily I didn’t have to keep the secret for long. Although I admit to a little envy, it’s hard to argue that the prize wasn’t deserved. WMAP was ideally placed to solidify the current standard model of cosmology, a Universe dominated by dark matter and dark energy, with strong indications that there was a period of cosmological inflation at very early times, which had several important observational consequences. First, it made the geometry of the Universe — as described by Einstein’s theory of general relativity, which links the contents of the Universe with its shape — flat. Second, it generated the tiny initial seeds which eventually grew into the galaxies that we observe in the Universe today (and the stars and planets within them, of course).

By the time WMAP released its first results in 2003, a series of earlier experiments (including MAXIMA and BOOMERanG, which I had the privilege of being part of) had gone much of the way toward this standard model. Indeed, about ten years ago one of my Imperial colleagues, Carlo Contaldi, and I wanted to make that comparison explicit, so we used what were then considered fancy Bayesian sampling techniques to combine the data from balloons and ground-based telescopes (which are collectively known as "sub-orbital" experiments) and compare the results to WMAP. We got a plot like the following (which we never published), showing the main quantity that these CMB experiments measure, called the power spectrum (which I've discussed in a little more detail here). The horizontal axis corresponds to the size of structures in the map (actually, its inverse, so smaller is to the right) and the vertical axis to how large the signal is on those scales.

As you can see, the suborbital experiments, en masse, had data at least as good as WMAP on most scales except the very largest (leftmost; this is because you really do need a satellite to see the entire sky) and indeed were able to probe smaller scales than WMAP (to the right). Since then, I’ve had the further privilege of being part of the Planck Satellite team, whose work has superseded all of these, giving much more precise measurements over all of these scales:

Am I jealous? Ok, a little bit.

But it’s also true, perhaps for entirely sociological reasons, that the community is more apt to trust results from a single, monolithic, very expensive satellite than an ensemble of results from a heterogeneous set of balloons and telescopes, run on (comparative!) shoestrings. On the other hand, the overall agreement amongst those experiments, and between them and WMAP, is remarkable.

And that agreement remains remarkable, even if much of the effort of the cosmology community is devoted to understanding the small but significant differences that remain, especially between one monolithic and expensive satellite (WMAP) and another (Planck). Indeed, those "real and serious" (to quote myself) differences would be hard to see even if I plotted them on the same graph. But since both are ostensibly measuring exactly the same thing (the CMB sky), any differences — even those much smaller than the error bars — must be accounted for, and almost certainly boil down to differences in the analyses or to each team's misunderstanding of its own data. Somewhat more interesting are differences between CMB results and measurements of cosmology from other, very different, methods, but that's a story for another day.

## December 03, 2017

### Lubos Motl - string vacua and pheno

**Are opinions and tirades as important as research and results?**

Two days ago, I described my shock after the Quanta Magazine edited a sentence about the set of candidates for a theory of everything according to the order issued by a scientific nobody who had nothing to do with the interview with Witten.

Natalie Wolchover literally thanked (!) Sabine Hossenfelder for providing her absolutely deluded feedback:

Thanks to Sabine, I realized that Edward Witten was just totally misguided. After all, he's just an old white male and those suck. Sabine Hossenfelder told me that there are lots and lots of candidates for a theory of everything and lots and lots of people like Edward Witten, for example the surfer dude Garrett Lisi. I have absolutely no reason not to trust her so I immediately edited the propaganda imprinted into my article by the old white male dinosaur.

She didn't use these words *exactly* but my words describe more clearly what was *actually* going on. Crackpots such as Ms Hossenfelder simply control science journalism these days. They have nurtured their contacts, they have the right politics which is what the science journalists actually place at the top, and that's why these disgusting SJWs may overshadow Witten or anyone else.

But I don't want to report just the bad and shocking news of this sort. There are sometimes events that could have evolved insanely but that didn't. On December 1st, Brooklyn saw an example of those. A cultural foundation called the Pioneer Works organized a debate about string theory, Scientific Controversy No 13.

The room was full, mostly of younger people (hundreds of them), Jim Simons funded the event, and popular science writer Janna Levin hosted it. It turns out that a Peter W*it was in the audience but he didn't talk to anybody and no one noticed him. If you haven't followed these things for years: Peter W*it was one of the most annoying, most hostile, and most dishonest anti-physics demagogues who appeared in the pop science press all the time some 11 years ago.

What could have happened is the following: The host, e.g. Janna Levin, could have said:

Wow, dear visitors, screw our panelists, Clifford Johnson (who got some space to promote his new book, The Dialogues, which has lots of his own impressive illustrations) and especially David Gross. Because of some miracle, we have a true hero here. Let me introduce Peter W*it. Applause. Now, you can go home, Clifford Johnson and David Gross. We will talk to him instead.

Thank God, that didn't happen; the piece of šit was treated as a piece of šit, indeed. But some of the sentences voiced at the event were weird, anyway. We learn that at the very beginning, Janna Levin asked both Johnson and Gross whether they were for or against string theory or neutral.

Peter W*it reports that "it flustered David Gross a bit". Well, it surely flusters me, and not just a little bit. Janna Levin is one of the talking heads who often paints herself as a pundit who is familiar with modern physics and events in it. Does she really need to ask David Gross – and, in fact, even Clifford Johnson – whether they are for string theory or against? In fact, it's worse than that. She must have been really ignorant about everything because an answer by Gross – an answer that perhaps deliberately said nothing clearly – was interpreted by Levin as if Gross were a *string agnostic*. Wow.

David Gross internally acknowledged that the time for obfuscating jokes was over because her ignorance was serious and responded by saying that he's been married to string theory for 50 years and doesn't plan any divorce now.

I don't want to waste my time with detailed comments about new anti-physics tirades by crackpot Mr Peter W*it – which contain no ideas, let alone new ones – but let me say a few words about one sentence because it conveys some delusion that is widespread:

This flustered Gross a bit (he’s one of the world’s most well-known and vigorous proponents of string theory)

OK, so W*it agrees it's silly for Levin not to know that Gross is pro-string. But look at the detailed explanation why it's silly. It's silly because "Gross is a well-known and vigorous proponent" of string theory. Is it really the first relevant explanation?

What's going on is that Mr W*it is using language that makes the scientific research itself – and its results – look irrelevant. People are important because of their *opinions*, W*it wants you to think. So one divides the world into proponents and opponents.

But David Gross isn't important because he's an advocate of something, or because he speaks vigorously. He's important because he has moved our knowledge of physics – and even string theory – forward. And by a lot.

What's really relevant – and what a host of physics debates as competent as Janna Levin pretends to be should know – is that he has written lots of papers about physics and about string theory in particular. What's important isn't that he is a proponent of string theory; what's important is that he is an important string theorist!

INSPIRE tells us that he has written some 200+ published papers and some 300+ scientific papers in total. The total citation count is about 40,000. 19 papers stand above 500 citations.

You know, a citation isn't an easy job. It really means that the author of the followup "mostly or completely" reads your paper. Not every reader cites you, so on average, some dozens of highly qualified people spend their time going through the dozens of pages of your paper, which are much more difficult than a romantic novel. That's what it takes to earn one citation, and Gross has some 40,000 of those.

Look at the list of Gross' 19 papers above 500 citations. One paper with Callan is there from the late 1960s. Then you have the Nobel-prize-related papers about QCD (one with Wilczek), asymptotic freedom, and the structure of the gauge theory's vacuum, anomalies (with Jackiw). Aside from others, there are some 5 papers about the heterotic string from 1985-1987 – Gross is one of the players in the "Princeton quartet" that discovered the heterotic string. A Gross-Witten paper is there to discuss stringy modification of Einstein's equations, and two Gross-Mende papers discussed the high-energy scattering in string theory.

If you look at the groups of papers above 250 or 100 citations, you will find some newer ones – newer papers often have fewer citations "mostly" just because they're newer. You will see that Gross has participated in many important developments, including some very recent ones in string theory. He's had important papers about AdS/CFT (including some analysis of Wilson loops and the stringy dual of QCD), two-dimensional string theory, non-commutative gauge theory within and outside string theory, string field theory, and many many other topics.

Note that lots of grad students sometimes face the need to study Gross' papers (and lots of other papers) in quite some detail. And they remain silent and modest. Janna Levin who paints herself as a physics guru not only fails to study Gross' papers on string theory and other things. She doesn't even know (one superficial bit of information) whether he is "pro string theory"!

Gross is a charismatic and assertive guy and it's great. But the core of what makes him a "string theorist" isn't his opinion but his contributions and knowledge. The activists such as W*it, Hossenfelder, Horgan, and dozens of others constantly brainwash their mentally crippled audiences into believing that results and research don't matter. What matters is the "right opinions" and just writing a hostile rant full of lies, idiocy, and hatred – e.g. what the "Not Even Wrong" weblog is all about – is pretty much on par with Gross' contributions to physics. W*it's or Hossenfelder's "work" may be opposite in sign to Gross' but it is "work" of equal importance in its absolute value, W*it et al. implicitly claim.

Sorry, comrades, but it's not and that's why Gross is a top physicist while you're just worthless deceitful piles of whining excrements, along with the brain-dead scum that keeps on visiting your blogs as of 2017.

And that's the memo.

by Luboš Motl (noreply@blogger.com) at December 03, 2017 05:53 PM

### Tommaso Dorigo - Scientificblogging

## December 02, 2017

### ZapperZ - Physics and Physicists

This is an article on the history of the first controlled nuclear fission that was conducted at the University of Chicago 75 years ago that marked the beginning of the atomic/nuclear age.

They called this 20x6x25-foot setup Chicago Pile Number One, or CP-1 for short – and it was here they obtained the world's first controlled nuclear chain reaction on December 2, 1942. A single random neutron was enough to start the chain reaction process once the physicists assembled CP-1. The first neutron would induce fission on a uranium nucleus, emitting a set of new neutrons. These secondary neutrons hit carbon nuclei in the graphite and slowed down. Then they'd run into other uranium nuclei and induce a second round of fission reactions, emit even more neutrons, and on and on. The cadmium control rods made sure the process wouldn't continue indefinitely, because Fermi and his team could choose exactly how and where to insert them to control the chain reaction.
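The generation-by-generation bookkeeping described above can be caricatured in a few lines of Python. This is only a toy sketch: the multiplication factors and generation count below are illustrative, not CP-1's actual numbers.

```python
# Toy chain-reaction model: each fission neutron induces, on average,
# k new fissions per generation (k = effective multiplication factor).

def neutron_population(k, generations, start=1.0):
    """Average neutron count after each generation, for a given k."""
    pops, n = [], start
    for _ in range(generations):
        n *= k
        pops.append(n)
    return pops

# k slightly above 1: supercritical, the population grows ...
grows = neutron_population(k=1.05, generations=100)
# ... inserting cadmium rods absorbs neutrons, pushing k below 1,
# and the reaction dies out.
dies = neutron_population(k=0.95, generations=100)

print(grows[-1] > 1.0, dies[-1] < 1.0)
```

Controlling the reaction amounts to steering k across the critical value 1, which is exactly what moving the control rods in and out did.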

Sadly, other than a commemorative statue/plaque, there's not much left of this historic site. One of the outcomes of this work was the creation of Argonne National Lab just outside of Chicago, where, I believe, the research on nuclear chain reactions continued at that time. Argonne no longer carries out any nuclear research work.

Zz.

by ZapperZ (noreply@blogger.com) at December 02, 2017 03:17 PM

### Lubos Motl - string vacua and pheno

The monster group is the largest among the 26 or 27 "sporadic groups" in the classification of all the simple finite groups. The CFT – which was the player that proved the "monstrous moonshine" – may be constructed from bosonic strings propagating on the 24-dimensional space divided by the Leech lattice, the most interesting even self-dual lattice in 24 dimensions, the only one among the 24 of those that doesn't have any "length squared equals two" lattice sites.

*I didn't have enough space here for a picture of Witten and a picture of a monster so I merged them. Thanks to Warren Siegel who took the photograph of Cyborg-Witten.*

The absence of these sites corresponds to the absence of any massless fields. So the corresponding gravity only has massive objects, the black hole microstates, and they transform as representations of the monster group. I will only discuss the monster group CFT with the "minimum radius" – Davide Gaiotto has proven that the infinite family of the larger CFTs cannot exist, at least not for all the radii and with the required monster group symmetry, because there's no good candidate for a spin field corresponding to a conjugacy class.

I think that the single CFT with the single radius is sufficiently fascinating a playground to test lots of ideas in quantum gravity – and especially the relationship between the continuous and discrete structures (including global and gauge groups) in the bulk and on the boundary.

It's useful to look at the list of irreducible representations of the monster group for at least 10 minutes. There are 194 different irreps – which, by Schur's tricks, means that there are 194 conjugacy classes in the monster group. Don't forget that the primes dividing the order of the monster group are exactly the supersingular primes.

However, you will only find 170 different dimensionalities of the irreps. For years ;-), I have assumed that it means that 146 dimensionalities are unique while for 24 others, the degeneracy is two – so the total number of irreps is 146*1+24*2 = 194. It makes sense to think that some of the representations are complex and they're complex conjugate to each other, in pairs.

Well, just very very recently ;-), I looked very very carefully, made a histogram and saw that one dimension of the irreps, namely

5 514 132 424 881 463 208 443 904

(5.5 American septillions), appears thrice – like the 8-dimensional representation of \(SO(8)\) appears in "three flavors" due to triality. Why hasn't anyone told me about the "tripled" irrep of the monster group? I am sure that all monster minds know about this triplet of representations in the kindergarten but I didn't. So the right answer is that there are 147 representations uniquely given by their dimension, 22 dimensionalities appear twice, and 1 dimensionality (above) appears thrice.
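The counting here is easy to check with the numbers quoted above; amusingly, the earlier wrong guess (146 unique dimensionalities plus 24 doubled) reproduces exactly the same two totals, which is why a histogram was needed to tell the hypotheses apart:

```python
# Sanity check of the irrep counting, using only the numbers in the text.
unique, doubled, tripled = 147, 22, 1

assert unique * 1 + doubled * 2 + tripled * 3 == 194  # total irreps
assert unique + doubled + tripled == 170              # distinct dimensions

# The earlier guess (146 unique, 24 doubled) matches both totals too:
assert 146 * 1 + 24 * 2 == 194
assert 146 + 24 == 170
print("both hypotheses give 194 irreps and 170 distinct dimensions")
```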

BTW those 5.5 septillions have a factor of \(2^{43}\) – a majority of the factor \(2^{46}\) in the number of elements of the monster group – and no factors of three. This large power of two is similar to the spinor representations (e.g. in the triality example above).

Fine. Among the 194 irreps, there's obviously the 1-dimensional "singlet" representation. Each group has singlets. The first nontrivial representation is 196,883-dimensional. This many states, along with a singlet (so 196,884 in total), appear on the first excited level of the CFT – so there are 196,884 black hole microstates in pure \(AdS_3\) gravity with the minimum positive mass (this number appears as a coefficient in an expansion of the \(j\)-invariant, a fact that was known as the "monstrous moonshine" decades before this "not so coincidence" was explained). This level of black holes has some energy and nicely enough,\[

196,883\sim \exp(4\pi),

\] as Witten was very aware, and this approximate relationship is no coincidence. So the entropy at this level is roughly \(S\approx 4\pi\) which corresponds to \(S=A/4G\) i.e. \(A\approx 16\pi G\). Note that the "areas" are really lengths in 2+1 dimensions and Newton's constant has the units of length, too. The entropy proportional to \(\pi\) is almost a matter of common sense for those who have ever calculated entropy of 3D black holes using stringy methods. But it's also fascinating for me because of my research on quasinormal modes and loop quantum gravity.
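The quoted approximation is a two-line numerical check (nothing here beyond the numbers already in the text):

```python
import math

# Compare ln(196,883) with 4*pi: the ratio being close to 1 is the
# content of the approximate relation 196,883 ~ exp(4*pi).
ratio = math.log(196883) / (4 * math.pi)
print(round(ratio, 2))  # 0.97
```

So the relation holds to about 3 percent, which is the sense in which \(S\approx 4\pi\) for this level.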

The real part of the asymptotic, highly-damped quasinormal modes was approaching\[

\frac{\ln 3}{8\pi G M}

\] where \(M\) is the Schwarzschild black hole mass. The argument \(3\) in the logarithm could have been interpreted as the degeneracy of some links in \(SO(3)\) spin networks – and that's why I or we were treated as prophets among the loop quantum gravity and other discrete cultists and why my and our papers got overcited (although we still loved them). It's a totally unnatural number that appears there by coincidence, and I – first alone, and then together with Andy Neitzke – gave full analytic proofs that the number is \(3\) exactly. It's not a big deal, it's a coincidence, and \(3\) is a simple enough number so that it can appear by chance.

But the funny thing is that the quasinormal frequency becomes a more natural expression if \(3\) is replaced with another dimension of an irrep. Fifteen years ago, I would play with its being replaced by \(248\) of \(E_8\) which could have been relevant in 11-dimensional M-theory, and so on. (\(E_8\) appears on boundaries of M-theory, as shown by Hořava and Witten, but is also useful to classify fluxes in the bulk of M-theory spacetimes in a K-theory-like way, as argued by Diaconescu, Moore, and Witten. Note that "Witten" is in all these author lists so there could be some extra unknown dualities.) And while no convincing theory has come out of it, I still find it plausible that something like that might be relevant in M-theory. The probability isn't too high for M-theory, however, because M-theory doesn't seem to be "just" about the fluxes, so the bulk \(E_8\) shouldn't be enough to parameterize all of the physical states.

But let's replace \(3\) with \(196,883\) or \(196,884\), the dimension of the smallest nontrivial irrep of the monster group (perhaps plus one). You will get\[

\frac{\ln 196,883}{8\pi G M} \approx \frac{1}{2GM}

\] The \(\pi\) canceled and the expression for the frequency dramatically simplified. Very generally, this nice behavior may heuristically lead you to study Chern-Simons-like or loop-quantum-gravity-like structures where the groups \(SU(2)\) or \(SO(3)\) or \(SL(2,\CC)\), which have 3-dimensional representations, are replaced with the discrete monster group.
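The claimed cancellation is again easy to verify numerically:

```python
import math

# ln(196,883)/(8*pi) is numerically close to 1/2, so the would-be
# frequency ln(196,883)/(8*pi*G*M) indeed simplifies to roughly 1/(2*G*M).
value = math.log(196883) / (8 * math.pi)
print(round(value, 3))  # 0.485
```

The substitution turns the unexplained \(\ln 3\) into something within about 3 percent of the clean value \(1/2\), which is what makes the numerology tempting.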

A fascinating fact is that aside from this numerological observation, I've had numerous other reasons to consider Chern-Simons-like theories based on the finite, monster group. Which ones?

Well, one reason is simple. The boundary CFT of Witten's has the monster group as its global symmetry. So the monster group is analogous e.g. to \(SO(6)\) in the \({\mathcal N}=4\) supersymmetric gauge theory in \(D=4\) which is dual to the \(AdS_5\) vacuum of type IIB string theory. Just like the \(SO(6)\) becomes a *gauge group* in the bulk gravitational theory (symmetry groups have to be gauged in quantum gravity theories; this one is a Kaluza-Klein-style local group), the monster group should analogously be viewed as a gauge group in the \(AdS_3\) gravitational bulk.

On top of that, there are gauge groups in \(AdS_3\) gravity. In 1988, the same Witten showed the relationship between Chern-Simons theory and 3D gravity. It was a duality at the level of precision of the 1980s, although decades later, Witten told us that the duality isn't exact non-perturbatively etc. But that Chern-Simons theory replacing the fields of 3-dimensional gravity could be correct in principle. Just the gauge group could be incorrect.

Well, maybe it's enough to replace \(SL(2,\CC)\) and similar groups with the monster group.

One must understand what we mean by a Chern-Simons theory with a discrete gauge group and how to work with it. Those of us who are loop quantum gravity experts ;-) are extremely familiar with spin networks.

A spin network is how the LQG cultists imagine the structure of the 3+1-dimensional spacetime at the Planckian level. There is obviously no evidence that this is the right theory, nothing seems to work, nothing nice happens when the 4D gravity is linked to those gauge fields in this way, no problem or puzzle of quantum gravity is solved by these pictures. But the spin networks are still a cute, important way to parameterize some wave functionals that depend on a gauge field. Well, I guess that Roger Penrose, and not Lee Smolin, should get the credit for the aspects of the spin networks that have a chance to be relevant or correct somewhere.

If you consider an \(SU(2)\) gauge field in a 3-dimensional space, you may calculate the "open Wilson lines", the transformations induced by the path-ordered exponential of the integral of the gauge field over some line interval. It takes values in the group itself. As an operator, it transforms as \(R\) according to the transformations at the initial point, \(\bar R\) according to the final point – you need to pick a reference representation where the transformations are considered. And you may create gauge-invariant operators by connecting these open Wilson lines – whose edges are specified by a transformation – using vertices that bring you the Clebsch-Gordan coefficients capable of connecting three (or more) representations at the vertex.

In a spin network, the edges carry labels like \(j=1/2\), the non-trivial irreps of \(SU(2)\). They're connected at the vertices so that the addition of the angular momentum allows the three values of \(j\) to be "coupled". For \(SU(2)\), the Clebsch-Gordan coefficients are otherwise unique: each irrep appears at most once in the tensor product of any pair of irreps.
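The \(SU(2)\) coupling rule just invoked (each irrep appears at most once in a tensor product) is the Clebsch-Gordan series; a minimal sketch, using doubled spins \(2j\) so everything stays an integer:

```python
def couple(tj1, tj2):
    """Doubled spins 2j appearing in j1 (x) j2; each occurs exactly once."""
    return list(range(abs(tj1 - tj2), tj1 + tj2 + 1, 2))

# 1/2 (x) 1/2 = 0 (+) 1, i.e. doubled spins [0, 2]
print(couple(1, 1))   # [0, 2]
# 1 (x) 1 = 0 (+) 1 (+) 2, i.e. doubled spins [0, 2, 4]
print(couple(2, 2))   # [0, 2, 4]

# A vertex may join (j1, j2, j3) iff 2*j3 appears in couple(2*j1, 2*j2);
# dimensions check out: sum of (2j+1) equals (2j1+1)*(2j2+1).
assert sum(tj + 1 for tj in couple(2, 2)) == 3 * 3
```

For the monster group the analogous decompositions have multiplicities, which is exactly the subtlety discussed below for the 196,883-dimensional irrep.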

Now, my proposal to derive the right bulk description of the \(AdS_3\) gravity is to identify an \(SO(3)\) Chern-Simons-style description of the 3D gravity and replace all the 3-dimensional representations – in \(SO(3)\), the half-integral spin irreps are prohibited – with the monster group.

In this replacement, it should be true that a majority of the edges of the spin network carry \(j=1\), i.e. the 3-dimensional representation. And that 3-dimensional representation is replaced with the \(196,883\)-dimensional one in the monster group case. Otherwise the structures should be analogous. I tend to believe that the relevant spin networks should be allowed to be attached to the boundary of the Anti de Sitter space, and therefore resemble what are called Witten diagrams – the appearance of "Witten" seems like another coincidence here ;-) because I don't know of good arguments (older than mine) relating these different ideas from Witten's assorted papers.

Note that the 196,883-dimensional representation is vastly smaller than the larger irreps: the next smallest one is 21-million-dimensional, more than 100 times larger. And it's also useful to see how the tensor product of two copies of the \(d_2=196,883\)-dimensional irrep decompose to irreps. We have:\[

d_2^2 = 2(d_5+d_4+d_1) + d_2.

\] Both sides are equal to 38,762,915,689, almost 39 billion. So the singlet appears twice, much like the fifth and fourth representations. But the same 196,883-dimensional representation appears exactly once (and the third, 21-million-dimensional one is absent). It means that there's exactly one cubic vertex that couples three 196,883-dimensional representations. On top of that, because of the "two singlets" \(2d_1\) on the right hand side above, there are two ways to define the quadratic form on two 196,883-dimensional representations.
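The dimension count is a one-line arithmetic check; the values of \(d_4\) and \(d_5\) below are the standard dimensions of the fourth and fifth smallest monster irreps from the character table, quoted here as given rather than derived:

```python
# Smallest monster irrep dimensions (standard character-table values).
d1, d2, d3 = 1, 196883, 21296876
d4, d5 = 842609326, 18538750076

lhs = d2 * d2                    # dimension of the tensor square
rhs = 2 * (d5 + d4 + d1) + d2    # the decomposition quoted above
assert lhs == rhs == 38762915689  # both sides: almost 39 billion
print(lhs)
```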

I think that in some limit, the spin networks with the "edges 196,883" only will dominate, and the extra subtlety is that each of these edges may or may not include a "quadratic vertex" that switches us to the "less usual" singlet among the two. The presence or absence of this quadratic vertex could basically have the same effect as if there were two different types of the 196,883-dimensional irrep, unless I miss some important detail which I probably do.

Now, there might exist a spin-network-like description of the black hole microstates in \(AdS_3\) and the reason why it works could be a relatively minor variation of the proof of the approximate equivalence of the Chern-Simons theory and the three-dimensional general relativity. The mass of the black hole microstates could be obtained from some "complexity of the spin network" – some weighted number of vertices in the network etc. which could follow from the \(\int A\wedge F\)-style Hamiltonians.

I believe that according to some benchmarks, the \(AdS_3\) pure gravity vacuum should be the "simplest" or "most special" vacuum of quantum gravity. The gauge group is purely discrete which is probably an exception. That's related to the complete absence of the massless fields or excitations which is also an exception. And things just should be more or less solvable and the solution could be a clever variation of the equivalences that have already been written in the literature.

If some deep new conceptual principles are cracked in the case of the monstrous \(AdS_3\) gravity, the remaining work needed to understand the logic of all quantum gravity vacua could be as easy as a generalization of the finite group's representation theory to Lie groups and infinite-dimensional gauged Lie groups. Those also have irreps and conjugacy classes, and the relationships between them could be a clever version of the proof that the old matrix model is equivalent to free fermions. Such a unified principle obeyed by all quantum gravity vacua should apply to spacetimes, world sheets, as well as configuration spaces of effective field theories.

by Luboš Motl (noreply@blogger.com) at December 02, 2017 12:58 PM

## December 01, 2017

### Clifford V. Johnson - Asymptotia

Now that #thedialoguesbook is out, I get even more people telling me how they can’t draw. I don’t believe them. Just as with science (and other subjects), everybody has a doorway in to a subject. It is just a matter of taking the time to find your individual Door. Individual doors are what make us all wonderfully different. For me, it is mostly geometry that is my Door. It gives a powerful way to see things, but it isn’t the only way. Moreover, I have to work hard not to be trapped by it sometimes. But it is how I truly see things most often - through geometry. Wonderful geometry everywhere.

-cvj Click to continue reading this post

The post The Geometry Door appeared first on Asymptotia.

### Lubos Motl - string vacua and pheno

Unsurprisingly, these deep disagreements had extra consequences. One of the answers that Edward Witten "dared" to give was that M-theory is our candidate for a description unifying the separate theoretical formalisms that quantify the particles and forces that exist or may exist in Nature. Wolchover, the journalist, announced her interview on Twitter and one dissatisfied reaction by Ms Hossenfelder was posted soon afterwards. (Young Sheldon, off-topic: in the newest episode of the sitcom, we could see Elon Musk, whose SpaceX stole the idea of how to land on the ocean from the 8-year-old Sheldon Cooper and made Sheldon's notebook disappear. It's great he played it because that's how I imagine Musk's companies to operate whenever they do something right. ;-))

There are various other candidates for a theory of everything, eg Alain Connes' noncommutative geometry, Asymptotically Safe Gravity, causal fermion systems, E8 theory. The statement that M-theory is the only candidate isn't only misleading, it's plainly wrong.

— Sabine Hossenfelder (@skdh) November 28, 2017

Hossenfelder repeats the insane 2010 meme by Nude Socialist that Edward Witten is one of approximately 7 similar geniuses – the list includes Garrett Lisi and a former spouse of Lee Smolin – who have proposed comparably promising theories of everything. Needless to say, none of the "alternative theories" above could be called a "candidate for a theory of everything" by a sane person who knows the basic stuff about the limiting and approximate theories that contemporary theoretical physics uses and why they're hard to unify.

But even if you just don't get any of the physics, you should be able to understand some sociology. Take INSPIRE, the database of papers on high-energy physics, and compare the publication and citation records of Garrett Lisi and Edward Witten. They have some 90 and 130,000 citations, respectively – and I need to emphasize that the first number is *not* 90,000. ;-)

The difference is more than three orders of magnitude. Even a decade after his "groundbreaking" work, the surfer dude's quantifiable impact is more than three orders of magnitude weaker than Witten's. On top of that, every one of Lisi's 90 citations is vastly less credible than the average followup of Witten's papers. There are very good reasons to say that the people who have followed "theories of everything" for some years and who consider Lisi and Witten to be peers (or even think that Lisi is superior) suffer from a serious psychiatric illness.
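The "more than three orders of magnitude" claim follows from the two counts quoted above; a quick sketch (the 90 and 130,000 figures are the approximate numbers in the text, not fresh INSPIRE data):

```python
import math

# Compare the approximate citation counts quoted in the text.
witten, lisi = 130_000, 90
ratio = witten / lisi  # about 1444
print(f"ratio ≈ {ratio:.0f}, i.e. {math.log10(ratio):.2f} orders of magnitude")
# ratio ≈ 1444, i.e. 3.16 orders of magnitude
```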

I think that this is not Hossenfelder's case – she must know that what she writes are just plain and absurd lies, lies whose propagation would be convenient for herself personally.

OK, so the fringe would-be researcher Ms Sabine Hossenfelder has suggested that it was politically incorrect for Edward Witten to say that the contemporary physics only has one viable candidate for a theory of everything, namely string/M-theory.

Even if you hypothetically imagined that Hossenfelder's proposition about "numerous alternatives" was right while Edward Witten was wrong, it shouldn't have mattered, should it have? Wolchover interviewed Edward Witten, not Sabine Hossenfelder, so the answers should be aligned with the views of Edward Witten, not Sabine Hossenfelder. But look what happened.

Thanks to Sabine, we've changed the article to describe M-theory as the leading TOE candidate, noting that other ideas are out there claiming to unify the fundamental forces. Most are duds, but what do people think of Connes' noncommutative geometry? https://t.co/ajTgIrzTsO https://t.co/09ZYtgDpXI

— Natalie Wolchover (@nattyover) November 29, 2017

(Just to be sure, Connes' theory is in no way a step towards a theory of everything. It has nothing to say about any problem related to quantum gravity. It's an unusual, non-commutative-geometry-based proposal to unify the non-gravitational forces, like in GUT. At the end, it produces some subset of the effective field theories we know. The subset doesn't look natural and it has produced predictions that were falsified. For example, Connes predicted a Higgs boson of mass \(170\GeV\) which was ironically the first value that was ruled out experimentally – by the Tevatron in 2008. Some huge, TOE-like statements can always be made by someone, perhaps even by Connes, but it is spectacularly obvious to everyone who has at least the basic background that these ambitious claims have nothing to do with reality.)

Hossenfelder's dumb comment was not only taken into account. The interview was actually edited and Hossenfelder was even thanked for that! So for some time, Witten's answer in the interview probably contained some statement of the type "M-theory is just one of many similar alternatives and I, who first conjectured the existence of M-theory, am on par with monster minds such as Garrett Lisi". The Quanta Magazine didn't edit the answer quite this colorfully but it did edit it, as Wolchover's tweet says, so that Witten's proposition was severely altered.

Wolchover isn't a bad journalist but even independently of any knowledge of the beef of physics (i.e. even if you assumed that it's just OK for a science journalist not to know that all claims by Hossenfelder about physics of the recent 40 years are wrong), I think that this retroactive edit was an example of a breakdown of the basic journalistic integrity. You simply cannot severely edit the answers by EW in an interview just because someone else, SH, would prefer a totally different answer. Such a change is nothing less than a serious deception of the reader. And note that the Quanta Magazine, funded mainly by Jim Simons, a rather legendary physical mathematician (and a co-author of a theory that Witten wrote lots of important followups about), is surely among the outlets where it's least expected that journalists distort scientists' views. It's almost certainly worse in all truly mainstream media.

Now, I am virtually the only person in the world who reacts – at a frequency that detectably differs from zero – to these constant scams by the various activists, journalists, pop science advocates and hysterical "critics of science". But in this case, I had a pal. David Simmons-Duffin, an assistant professor at Caltech and a co-author of the newest paper co-written by Witten, may still be grateful for the good enough furniture I sold him at Harvard, shortly before escaping the U.S. on the 2007 Independence Day when my visa was expiring.

Or maybe he had deeper reasons than the furniture. Hopefully. ;-)

I think most high energy theorists would agree with Lubos Motl's (scientific) comments on that article.

— David Simmons-Duffin (@davidsd) November 30, 2017

OK, Wolchover was told – and she could literally have remained *ignorant* of this elementary fact if David had remained silent – that most theoretical physicists would agree with my (scientific) comments about the "encounter of pop science and Edward Witten", and not with the comments by Sabine Hossenfelder. David was careful to restrict the approval with the adjective "scientific" to be sure that he didn't dare to accuse anyone of believing in some right-wing or otherwise politically incorrect views about broader questions. I will return to a discussion of whether these things are really as separate as David suggests.

Right now, Witten's answer in the interview says "M-theory is the candidate for the better description", which sounds OK enough. At the top, the article says (seemingly on behalf of Ms Wolchover) that M-theory is the "leading candidate" and a clarification at the bottom says that an edit was made. Maybe the statements were edited twice and they're tolerable now. But the very fact that Hossenfelder's *hostile opinion* about the number of theories of everything was incorporated into an interview that should have been built on *Witten's views on physics* is something that I consider absolutely shocking. In effect, we seem to live in a society where a scientific Niemand of Hossenfelder's caliber stands above Edward Witten and has the power to "correct" his statements about Witten's field made in any media outlet on Earth.

Clearly, the main purpose of David's tweet was to inform Natalie Wolchover, the journalist, about some "basic sentiments" that are widespread among actual professional theoretical physicists – and indeed, about actual beliefs of Edward Witten's, too. But David sent a copy to Sabine as well who reacted angrily:

I think I don't care what you think about what other people think about what someone thinks about someone's thoughts about something.

— Sabine Hossenfelder (@skdh) November 30, 2017

Hossenfelder "doesn't care" what Witten's recent co-authors think about the opinions of theoretical physicists about an interview. In the repetitive, stupid-sounding joke, she has clearly added at least one redundant level of recursion that shouldn't be there. ;-) But she, apparently rightfully, assumes that journalists *do care* what *she thinks* about what Witten *should think*.

Again, the irony is that Hossenfelder believes that Wolchover should care what Hossenfelder thinks about theories of everything but she doesn't care what people around Witten actually believe, even though it is an interview with Witten that is at the center of all these discussions. Thankfully, Wolchover at least replied that she "cared" about David's reports – and blamed Hossenfelder's arrogant "I don't care" reaction on Hossenfelder's wacky sense of humor.

To assure everyone that this is not the end of her jihad against physics, Sabine Hossenfelder posted another obnoxious tirade against modern physics. By studying SUSY, string theory, inflation, or the multiverse, scientists have stopped doing science, she claims, and we need to pick a new word instead of science to describe what they're doing. A fairy-tale science? Higher speculations? Fake physics? She mentions proposals by some of her true soulmates who are considered sources of worthless and hostile pseudointellectual junk by everyone who has a clue.

SUSY, string theory, inflation, and to a lesser degree the multiverse are examples of *science par excellence*. It's the critics of science like Hossenfelder and the assorted activists who are at most fake scientists and who have nothing to do with the actual science. They're not cracking equations that may link or do link observable quantities with each other. They are doing politics for the stupid masses.

By the way, Hossenfelder's argumentation becomes almost indistinguishable from the deluded tirades by the postmodern pseudo-philosophers and feminist bitches who would say that science is just another cultural habit, a tool of oppression, and so on. Hossenfelder wrote, among many similar things:

“Science,” then, is an emergent concept that arises in communities of people with shared work practices. “Communities of practice,” as the sociologists say.

Wow, so just as the "sociologists" say, the reader is invited to believe that every "community" that shares work practices is equally justified to describe itself as a "group of scientists" doing "science". Perhaps if they say that they're looking for "useful descriptions of Nature", that is certainly enough. For this reason, science is on par with the ritualistic dances of savages in the Pacific Ocean, the "sociologists" say.

But this "sociological" view is just flabbergastingly stupid. Science obviously isn't *any* group of practices of *any* community. After all, science doesn't really depend on a *community* at all and some of the most important scientists were true solitaires in their work – and often outside their work, too. The scientific method is a rather particular template for how to make progress while learning the truth about Nature. This template had to be discovered or invented – by Galileo and perhaps a few others – and these principles have to be kept, otherwise it's not science and, more importantly, otherwise it's really not working and doesn't systematically bring us closer to a deeper understanding of the world. Hypotheses must imply something that may be expressed sufficiently accurately, ideally and typically in the mathematical language, and the network of these assumptions of the theories and their logical implications is elaborated upon and finally compared to the facts that are known for certain – and the facts that are known for certain ultimately come from experiments, observations, and mathematically solid proofs.

Only a very small percentage of people in the world actually does science related to theoretical physics and if it makes sense to talk about a community – especially a community of theoretical physicists – there is really just one global community. It's doing science defined by the same general template. The relevant theories and questions have become much more advanced than they were in the past – the theories are more unifying, they require a deeper, more difficult, and more abstract mathematical formalism, the experimental tests of the new things are increasingly expensive and often impossible to be made in a foreseeable future, and this forces the researchers to be even more careful, intellectually penetrating, and employing increasingly indirect strategies to probe questions. But those are quantitative changes reflecting the change of the phenomena that are investigated by the cutting-edge science. They are not negations of the basic meaning of the scientific rigor – and its dependence on honesty, mental power, patience, and the illegitimacy of philosophically justified dogmas.

Let me return to the comment by David that the actual working physicists may only declare agreement with my "scientific" views about the interview and other things. It's true but it's a pity. David implicitly declares the view that politics and science are sharply separated. So I could be thrilled by lots of amazing discoveries made by people who are overwhelmingly political left-wingers – and they, if they're similarly honest as scientists, should be able to proudly say that they agree with many of my (a right-winger's) multi-layered comments about physics. Surely some people with Che Guevara T-shirts have been doing so, too. ;-)

When this setup of "politics separated from science" works, it works and it's great if it works. But the problem that David and others seem to overlook is that people like Sabine Hossenfelder – and basically everyone on the list of her "soulmates" who participate in this jihad against physics – are doing everything they can to obfuscate the boundary between science and politics. It isn't because your humble correspondent finds their increasingly intense mixing desirable that I write blog posts addressing both groups of questions, scientific and political ones. Those blog posts were a reaction to events that reflected the "dirty" mixing of science and politics. How many times did I have to explain that one can't understand any physics through sociology and similar things?

The likes of Hossenfelder and the Šmoits are full of the word "science" but what they're actually up to is a disgusting political movement that is trying to brainwash millions of gullible morons and turn them against science. And to do so, the likes of Hossenfelder find it very convenient to pretend that they are, or could be, equally excellent scientists – the peers of Witten – themselves. And there are millions of people who are ignorant enough that they buy this nonsense. Maybe these brainwashed laymen are honest – they are just sufficiently intellectually limited that they're unable to figure out that all these critics are scientifically worthless relative to Witten but also to hundreds of others.

David, you and others don't want to participate in a fight that is political which is an understandable reflection of an idealist, morally pure scientific ethics. But that desire won't make this fight go away. You may deny it but this fight is taking place, anyway (because the likes of Hossenfelder are deliberately waging it), and it has far-reaching consequences for the future of science in the real world.

You may combine this "pure focus on science" with some polite, "nice", politically correct attitudes, while persuading yourself that these things have nothing to do with each other. But in reality, the political attitudes influence the future strength of science in the society – and the ability of wise kids of future generations to do pure science as their job – tremendously, surely much more strongly than the extra $10,000 that a physics grad student will lose according to an (excellent) planned tax reform (I actually saved over $10,000 every year as a grad student, so under the new system, my budget would have been just balanced – which would probably have increased my feeling that I simply had to stay in the U.S. for a few years as a postdoc or more). When the public overwhelmingly buys the idea that Edward Witten or the IAS at Princeton don't do anything that goes beyond what a surfer dude may do in Hawaii while surfing, the efforts to selectively defund high-brow science will accelerate.

Every time you're silent when someone like Ms Hossenfelder spreads this hostile garbage, you are helping this movement to win and destroy pure science – as a realistically sustainable occupation that hundreds or thousands of people can do – in the coming years. Every time you are allowing someone to get a degree for political reasons such as her sex or skin color, you are increasing the chances that you have produced a new weapon that will be used to obfuscate the separation of science and politics and to attack science using political tools. If someone got her PhD or jobs because of (identity) politics, be sure that she will be grateful to (identity) politics and will help (identity) politics to defeat science.

One more comment – about the wrong classification of topics. A week ago, I posted two comments under an article, Our Bargain, at the 4gravitons blog. The second one mainly explained that some people's screaming that they have to make this or that amazing progress very soon was just wishful thinking (analogous to the planners in socialist planned economies). And the owner of the blog erased it because he has a "policy not to allow comments that are about politics".

But this justification was obviously completely fraudulent because the whole original text, Our Bargain, was about politics. It was mainly about funding and self-funding of scientists and the period for which a financial injection should last and other things. These topics are actually totally political in character. My deleted comment was actually much less political than his original text. So the explanation of the erasure was just plain dishonest.

It's very obvious what's going on. My comment was erased as a political one because it wasn't sufficiently respectful towards the left-wing orthodoxy. It's even possible – because of his policies, I can't know for certain and I can (and I must) only speculate – that Tetragraviton loves the planned economy and can't tolerate any criticism of it! Right-wing comments are often erased with the explanation that they're political while totally analogous left-wing comments – and even left-wing original articles – may be kept and their authors sometimes even thanked. The double standards must be absolutely self-evident to any honest person, including a left-winger.

These double standards are not only unjust. They are ultimately very harmful to science. By insisting that they're "primarily" scholars who are loyal to the left-wing orthodoxy of the Academia, the scientists help the actual "leftists-in-chief" to succeed in their plans and one of them involves the obfuscation of the boundary between science and politics and the eradication of pure science as we have known it for centuries. It's bad when you, the leftists who still do excellent research, don't realize what you're actually helping to do with science by your subtle and not so subtle endorsements of the radical left-wing positions. And it's even worse if you realize it but you keep on doing it, anyway.

Idealized science is separate from politics and the best physics groups are still close enough to this ideal. But the actual particular discoveries are being made by real scientists in the real world and that world is affected by politics – and by political movements of science haters such as Ms Hossenfelder, Mr Woit, and hundreds of others who simply don't want to tolerate science because they're shocked by the fact that they're not good enough to practice it themselves. So you had better understand some of this politics and the consequences of some of your actions – including affirmative action – that you incorrectly believe to be harmless.

And that's the memo.

by Luboš Motl (noreply@blogger.com) at December 01, 2017 02:16 PM

### Tommaso Dorigo - Scientificblogging

What is a neutrino? Nothing - it's a particle as close to nothing as you can imagine. Almost massless, almost perfectly non-interacting, and yet incredibly mysterious and the key to the solution of many riddles in fundamental physics and cosmology. But it's really nothing you should worry about, or care about, if you want to lead your life oblivious of the intricacies of subnuclear physics. Which is fine of course - unless you try to use your ignorance to stop progress.

## November 30, 2017

### ZapperZ - Physics and Physicists

In the Back Page section of this month's (Nov. 2017) APS News, called... wait for it... "The Back Page", Andrew Zwicker of the Princeton Plasma Physics Lab, who is also a legislator in the state of New Jersey, US, reflects on the lack of scientists and scientific methodology in politics and government. I completely agree with this part that I'm quoting here:

As scientists we are, by nature and training, perpetually skeptical yet constantly open to new ideas. We are guided by data, by facts, by evidence to make decisions and eventually come to a conclusion that we immediately question. We strive to understand the "big picture", and we understand the limitations of our conclusions and predictions. Imagine how different the political process would be if everyone in office took a data-driven, scientific approach to creating legislation instead of one based on who can make the best argument for a particular version of the "facts".

Anyone who has followed this blog for a length of time would have noticed my many comments on this subject, especially in regard to scientists or physicists in the US Congress (right now there's only one left, Bill Foster). I have always pinpointed the major problem with the people we elect: the public tends to vote for people who agree with their views, rather than individuals who are able to think, who have a clear-cut way of figuring out whom to ask or where to look to seek answers. In other words, if a monkey agrees with their views on a number of issues, even that monkey can get elected, regardless of whether that monkey can think rationally.

It is why we have politicians bunkered-in with their views rather than thinking of what is the right or appropriate thing to do based on the facts. This is also why it is so important to teach science, and about science, especially on arriving at an idea or conclusion rationally and analytically, to students who are NOT going to go into science. Law schools should make it compulsory that their students understand science, not for the sake of the material, but rather as a method to think things through.

Unfortunately, I'm skeptical that any of that will happen, which is why the crap that we are seeing in politics right now will never change.

Zz.

by ZapperZ (noreply@blogger.com) at November 30, 2017 07:40 PM

### Axel Maas - Looking Inside the Standard Model

by Axel Maas (noreply@blogger.com) at November 30, 2017 05:15 PM

### Lubos Motl - string vacua and pheno

**Off-topic, science:**China's DAMPE's dark matter signal

Natalie Wolchover is among the best popular writers about theoretical physics. But when I read her interview with Edward Witten at the Quanta Magazine,

A Physicist’s Physicist Ponders the Nature of Reality, the clash of cultures seemed amusingly obvious to me. Witten is much smarter than I am and he also loves to formulate things in ways that look insanely diplomatic or cautious to me, but I can still feel that his underlying sentiments are extremely close to mine.

They have discussed the conceptual and, I would say, emotional aspects of the (2,0) theory, M-theory, dualities, Wheeler's "it from bit", tennis, a hypothetical new overarching description of all of physics, and other things. It looks so obvious that Wolchover "wanted" to hear completely different answers than she did! ;-)

OK, let me start to comment on the interview. Wolchover explains that he is a physicists' physicist, a geniuses' genius, and a Fields Medal winner, among other things. She managed to interview him but he managed to make her invisible on the stone paths. Well, I felt some dissatisfaction already there. We were told about the children's drawings and piles of papers in Witten's office, not to mention a picture of a woman's buttocks in a vase. After some introduction to dualities and M-theory etc., we were told about her first question.

OK, Wolchover asks why Witten was interested in dualities, which physicists have talked about a lot recently. Here it's not quite waterproof but even in the first question, I could hear her "why would you study something as boring and ugly as dualities?". Well, exchanges that appear later in the interview have reinforced this initial impression of mine. I surely think that Wolchover is totally turned off by dualities and, like almost all laymen, she doesn't appreciate them at all.

Witten answered that dualities give you new tools to study questions that looked hopelessly hard. Some examples are sketched, including a few words on the AdS/CFT. Wolchover asks about (AdS) holography but Witten redirects the discussion to a more general concept, the dualities of all the kinds, and says that it's often more than two descriptions that are dual to each other. Again, I think that you can see tension between the two people concerning "what should be discussed and/or celebrated". In this situation, Wolchover seems eager to repeat some of the usual clichés about holography in quantum gravity while Witten wants to emphasize some more general features of all dualities and what they mean.

The following question by Wolchover is the aforementioned confirmation of her negative attitude towards dualities:

Given this web of relationships and the issue of how hard it is to characterize all duality, do you feel that this reflects a lack of understanding of the structure, or is it that we’re seeing the structure, only it’s very complicated?

Do you see her two proposed answers? Both of them are "negative". Either dualities mean that we're dumb; or they mean that the structure is "complicated", by which she rather clearly means "disappointingly ugly". But both of these propositions are completely upside down. Dualities mean that physicists are smart, not dumb; and they imply that the underlying structure is more beautiful, robust, constrained, and exceptional than we had thought. I am pretty sure that Witten would basically agree with these words of mine but there's this tendency of his to avoid any disagreements, so he prefers not to address the apparent underlying assumptions of such questions directly. He wants the people – e.g. Wolchover in this case – to find the wisdom themselves. But can they? Does it work? If you don't tell Ms Wolchover that dualities should be studied and celebrated rather than spat upon, can she figure it out herself? I doubt it.

Witten's answer is interesting. He doesn't know whether there is some "simplified description" i.e. one that would make dualities and other things manifest. We don't have it so it's obvious that we must accept the possibility that no such description exists. Nati Seiberg seems to believe that such a description exists. It's a matter of faith at this point.

But Witten makes an important general point (which I have made many times on this blog, too): It's not only mathematics that is central in theoretical physics. It's mathematics that is *hard for mathematicians*. For mathematicians, it's even hard to rigorously define a quantum field theory and/or prove its existence. It's even harder with the concepts that string theory forces you to add. Why is it so? Well, I would say that the need for "mathematics that is hard for mathematicians" simply means that the Universe is even more mathematical than the contemporary mathematicians. Contemporary mathematicians still discuss objects that are too close to everyday life while the concepts needed to discuss the laws of physics at the fundamental level are even more abstract, more mathematical.

After Wolchover asks him about the relationship between mathematics and physics, Witten returns to the question about the simplified formulation of quantum field theories etc. and says that he tends to believe that nothing of the sort exists, he can't imagine it. What is Wolchover's question right afterwards? You may guess! :-)

You can’t imagine it at all?

Witten had just said that he couldn't imagine it. Why would you ask "can't you imagine it at all"? Wasn't his previous sentence sufficiently clear? Or does she believe that the words "at all" provide the conversation with exceptionally high added value? It's clear what's going on here. The statement that "the simplified universal definition of quantum field theory probably doesn't exist" is a heresy. It's politically incorrect and the question "you can't imagine it at all?" means nothing else than "recant it immediately!". ;-)

If "I can't imagine such a description" is so shocking to Ms Wolchover, can we ask her: And do *you* know what such a description should look like? ;-) Obviously, she can't. No one can. If someone could, Witten would have probably learned about it already.

Well, after her "recant it", Witten didn't recant it. He said "No, I can't". If he can't imagine it, he can't imagine it. Among reasonable scientists, there just can't be similar taboos about such questions. Of course the view that such a description doesn't exist is entirely possible. It may very well be true. Wolchover asked a question about the (2,0) theory in 5+1 dimensions – it seems to me that it was a pre-engineered, astroturf question because it doesn't seem natural that she would need to ask about the (2,0) theory. And Witten says that it's a theory that we can't define by Lagrangian-like quantization of a known classical system. But there's a huge body of evidence that the theory exists and its existence also makes lots of the properties of theories in lower-dimensional spacetimes manifest – e.g. the \(SL(2,\ZZ)\) S-duality group of the \({\mathcal N}=4\) supersymmetric gauge theory in \(D=4\).

Witten ends up saying that the question "is there a six-dimensional theory with a list of properties" is a more fundamental restatement of the statements about the dualities. Well, it's also a "deeper way of thinking" than just constructing some quantum theories by a quantization of a particular classical system. The previous sentence is mine but I think he would agree with it, too.

Wolchover's jihad against dualities apparently continued:

> Dualities sometimes make it hard to maintain a sense of what’s real in the world, given that there are radically different ways you can describe a single system. How would you describe what’s real or fundamental?

Great. So Witten was asked "what's real". She clearly wants some of the dual descriptions of the same physics "not to be real", to be banned or killed and declared "unreal" or "blasphemous in physics" – so that the dualities are killed, too. Well, all of the dual descriptions are exactly equally real – that's why we talk about dualities at all. But she doesn't reveal her intent explicitly so the question is just "what's real".

Needless to say, "what's real" is an extremely vague question from a physicist's viewpoint. Almost any question about physics, science, or anything else could be framed as a version of a "what's real" question. "What's real" may be asked as an elaboration building on basically any previous reason. People may ask whether something is real just to confirm that they should trust some answer they were previously given. People may ask whether the eigenvalues of Hermitian operators are real and they are, in the technical sense of "real". They may ask whether quarks are real – they are even though they can't exist in isolation. They may use the word "real" for "useful scientific concepts", for "gauge-invariant observables". Lots of things may be said to be "real" or "unreal" for dozens of reasons that are ultimately very different.

The question doesn't mean anything, not even in the context of dualities – except for the fact that I mentioned, namely that concepts used to describe theories on both or all sides of a duality are equally real. OK, what can Witten answer to a question "what's real"? He's not your humble correspondent so he doesn't explode in a profound and vitally important tirade about Wolchover's meaningless vague questions. Instead, he said:

> What aspect of what’s real are you interested in? What does it mean that we exist? Or how do we fit into our mathematical descriptions?

This is just Witten's way of saying "Please think about the rubbish question you have asked. Can you see that it has no beef and it can mean anything?" OK, so Witten said that her question could very well be interpreted as a question by the New Age religious people who are constantly high and who ask whether the Universe is real at all, and so on. But he gave her another option: Do you want to keep on discussing our mathematical description of the Universe?

I can only see the written interview, not the emotions. But I would probably bet that the adrenaline was elevated. Wolchover reacted to Witten's answer by a special tweet:

I asked Witten what’s real, and he asked me what aspect of what’s real I’m interested in. Has anyone ever asked anyone that before, I wonder? https://t.co/pWm3cUF3DL

— Natalie Wolchover (@nattyover) November 29, 2017

The tweet sounds like "Witten has given the most original answer (a counter-question) to the question what's real" in the history so far. (Well, I actually respond in almost the same way to "what's real" when I am expected to be polite.) But what I actually read in between the lines is "look, Witten has answered my very deep philosophical question disrespectfully, please help me to spread the idea that he's quite a jerk". ;-)

OK, so which kind of "what's real" do you want to discuss, Ms Wolchover? The latter, the mathematical descriptions, she answers.

Witten keeps on talking about the hypothetical "simpler unified description that clarifies everything". At this point of the interview, it's already staggeringly obvious that Wolchover tries to impose the *faith* in the existence of this description on Witten but, shockingly enough, she finds out that Edward Witten doesn't automatically accept beliefs provided by popular writers to him. Witten's answer is a damn good argument – which I have been well aware of for decades – why this whole search for a single universal description of a TOE may be misguided:

> Well, unfortunately, even if it’s correct I can’t guarantee it would help. Part of what makes it difficult to help is that the description we have now, even though it’s not complete, does explain an awful lot. And so it’s a little hard to say, even if you had a truly better description or a more complete description, whether it would help in practice.

The point is that we already have some descriptions that simply must be correct at some rather high level of accuracy. They may be close enough to some observations – they are really helpful to explain the observations. So if you add a new, at least equally correct description of all of physics, you must still explain why that new description basically reduces to the known and successful ones in some situations or limits. In practice, we will always use the limiting, old descriptions when they work and they will almost certainly be the descriptions of choice for some situations even if we find a deeper description.

Dualities relate so many different environments and vacua that the underlying hypothetical "universal description" must be extremely flexible. It just can't say anything "particular" about the spectrum of particles and other things because those properties may be extremely diverse. So if such a deeper universal description exists, it has to be "at most" a paradigm that justifies the known descriptions – and perhaps allows us to compute tiny corrections in these theories even more accurately or completely precisely (at least in principle). But you simply shouldn't expect a new description that is both universal *and* directly useful (or even "simplified") to analyze the particular situations!

Another exchange is about M-theory. Witten says that it's totally settled that the theory exists but we still don't know much more about what the theory is than we did in the mid 1990s. Some new progress in the bulk-based description of gravitational theories would be useful – I completely agree (too much focus has been on the boundary CFT description in this duality) – but Witten doesn't have too much useful stuff to say except that it's probably more abstract and vague about the spacetime than we are used to from existing descriptions. This "I have nothing useful to say" is a sentence he modestly says often. Well, most other people have 500 times fewer useful things to say but they present themselves as if they were megagods flying above Witten. The contrast between monster mind Witten's almost unlimited modesty and lots of speculative mediocre minds' unlimited hype and narcissism couldn't be more obvious.

Witten mentions that some days ago, he read Wheeler's "it from bit" texts. Now, he's more tolerant towards similar vague stuff, we hear, because he's older. When Witten was younger and wrote his 36th paper, the best thing in the world was immediately the planned 37th paper, of course, and so on, we learn. ;-) But now he's ready to read some less serious stuff such as Wheeler's "it from bit". This increased tolerance may be partly due to the lower relative difference between the numbers 363 and 364, compared to 36 and 37. ;-)

Nevertheless, his reactions are still basically the same as mine. Wheeler's comments about physics – "information is physical" – are hopelessly vague and carry no information, Witten reacts in the same way as your humble correspondent. On top of that, Wheeler talked about "bits" but he must have meant "qubits" – the term wasn't usual in those times but Wheeler hopefully meant it, otherwise the text would have been really dumb.

And while the spacetime is probably emergent, that's not a good reason to abandon the continuum of the real numbers. Like your humble correspondent, Witten sees evidence that you had better not try to throw away the continuum from physics. Discrete physics with no connections to the continuum just can't do almost anything. To get rid of the reals is an unpromising starting point. Witten also mentions Wheeler's picture of an eye observing itself and suggests that the observer's being a part of the world that is observed could hide some extra wisdom to be understood. Well, I am agnostic about some ill-defined progress of this kind, I am just pretty sure that the particular ideas that have been proposed to prove this meme are bogus.

One of the last questions by Wolchover was "Do you consider Wheeler a hero?". And Witten just answered "No". Witten just wanted to see what "it from bit" could have meant but I am afraid he just confirmed his expectations that the essay had no beef at all. Witten described Wheeler as a guy who wanted to make big jumps by thousands of years while Witten has been doing incremental advances. Well, 100,000+ citations worth of those, I would add. That's how Witten confirms Wolchover's point that he preferred progress through calculations rather than vague visions. He also mentioned he likes to play tennis although he doesn't expect to win Wimbledon for several more years.

To summarize, I think that Wolchover must have seen that she comes from a culture that constantly hypes and worships some vague and would-be ambitious statements by big mouths, that constantly needs to worship authors of such vague visions, that is annoyed by mathematics and everything that looks complicated or that has many aspects or many solutions, and so on, and she was forced to see that the methodology and the value system of a top-tier physicist – and, indeed, most top-tier physicists – is extremely different.

And that's the memo.

P.S.: At this moment, there are 8 comments under the interview. By the Pentcho Valev crackpot, by another crackpot who fights QCD, another one that thinks that he has a theory competing with string/M-theory, the fourth crackpot who believes that dualities contradict mathematical logic, and a few more. There's quite a company over there. I am fortunate to have some of you, the brilliant readers, because if I only saw comments like on those websites, I would surely conclude that any writing like that is a waste of time.

P.P.S.: There are 20 comments now. A Vietnamese thinker has an "educated guess" that Carlo Rovelli is more impressive than Witten. ;-) Zarzuelazen talks about Sean Carroll and "nonlocality", and random mixtures of entropy with other things. Someone else quotes Hossenfelder's seven theories of everything – six cranks who are Witten's peers because it was written somewhere in the cesspool of the Internet. Indeed, quite a company.

by Luboš Motl (noreply@blogger.com) at November 30, 2017 03:19 PM

## November 28, 2017

### ZapperZ - Physics and Physicists

An employee in Perth, Australia, used the metallic packaging from a snack to shield his GPS-equipped work device so that it could not report his whereabouts. He then went golfing... many times, during his work hours.

The tribunal found that the packet was deliberately used to operate as an elaborate “Faraday cage” - an enclosure which can block electromagnetic fields - and prevented his employer knowing his location. The cage set-up was named after English scientist Michael Faraday, who in 1836 observed that a continuous covering of conductive material could be used to block electromagnetic fields.

Now, if it works for his device, it should work to shield our credit cards as an RFID shield, don't you think? There's no reason to buy those expensive RFID-blocking wallets or credit-card envelopes. Next time you have a bag of Cheetos or potato chips, save the bag and wrap your wallet with it! :)
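A quick back-of-the-envelope check of why a thin metallized bag can plausibly block an RFID reader: at the common 13.56 MHz RFID carrier, the skin depth of aluminum works out to a few tens of microns. A minimal sketch (the constants below are textbook values, not anything from the article):

```python
import math

# Rough skin-depth estimate for aluminum at the 13.56 MHz RFID band:
# delta = sqrt(2 * rho / (omega * mu0))
rho = 2.65e-8             # resistivity of aluminum, ohm·m
mu0 = 4 * math.pi * 1e-7  # vacuum permeability, H/m
f = 13.56e6               # common RFID carrier frequency, Hz
omega = 2 * math.pi * f
delta = math.sqrt(2 * rho / (omega * mu0))
print(f"skin depth ≈ {delta * 1e6:.0f} µm")
```

Whether a particular snack bag attenuates enough depends on how thick its metallization actually is; many bags carry far less than a skin depth of metal, which is presumably why the employee needed a deliberately built wrap rather than a single layer.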

Zz.

by ZapperZ (noreply@blogger.com) at November 28, 2017 07:07 PM

## November 27, 2017

### John Baez - Azimuth

It sounds like jargon from a bad episode of *Star Trek*. But it’s a real thing. It’s a monstrous object that lives in the plane, but is impossible to draw.

Do you want to see how snake-like it is? Okay, but beware… this video clip is a warning:

This snake-like monster is also called the ‘pseudo-arc’. It’s the limit of a sequence of curves that get more and more wiggly. Here are the 5th and 6th curves in the sequence:

Here are the 8th and 10th:

But what happens if you try to draw the pseudo-arc itself, the limit of all these curves? It turns out to be infinitely wiggly—*so wiggly that any picture of it is useless.*

In fact Wayne Lewis and Piotr Minc wrote a paper about this, called Drawing the pseudo-arc. That’s where I got these pictures. The paper also shows stage 200, and it’s a big fat ugly black blob!

But the pseudo-arc is beautiful if you see through the pictures to the concepts, because it’s a universal snake-like continuum. Let me explain. This takes some math.

The nicest metric spaces are compact metric spaces, and each of these can be written as the union of connected components… so there’s a long history of interest in compact connected metric spaces. Except for the empty set, which probably doesn’t deserve to be called connected, these spaces are called **continua**.

Like all point-set topology, the study of continua is considered a bit old-fashioned, because people have been working on it for so long, and it’s hard to get good new results. But on the bright side, what this means is that many great mathematicians have contributed to it, and there are lots of nice theorems. You can learn about it here:

• W. T. Ingram, A brief historical view of continuum theory, *Topology and its Applications* **153** (2006), 1530–1539.

• Sam B. Nadler, Jr, *Continuum Theory: An Introduction*, Marcel Dekker, New York, 1992.

Now, if we’re doing topology, we should really talk not about metric spaces but about **metrizable** spaces: that is, topological spaces where the topology comes from *some* metric, which is not necessarily unique. This nuance is a way of clarifying that we don’t really care about the metric, just the topology.

So, we define a **continuum** to be a nonempty compact connected metrizable space. When I think of this I think of a curve, or a ball, or a sphere. Or maybe something bigger like the **Hilbert cube**: the countably infinite product of closed intervals. Or maybe something full of holes, like the Sierpinski carpet:

or the Menger sponge:

Or maybe something weird like a solenoid:

Very roughly, a continuum is ‘snake-like’ if it’s long and skinny and doesn’t loop around. But the precise definition is a bit harder:

We say that an open cover 𝒰 of a space X **refines** an open cover 𝒱 if each element of 𝒰 is contained in an element of 𝒱. We call a continuum X **snake-like** if each open cover of X can be refined by an open cover U_{1}, …, U_{n} such that the intersection of U_{i} and U_{j} is nonempty iff |i − j| ≤ 1, that is, iff i and j are equal or right next to each other.

Such a cover is called a **chain**, so a snake-like continuum is also called **chainable**. But ‘snake-like’ is so much cooler: we should take advantage of any opportunity to bring snakes into mathematics!
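To make the chain condition concrete, here is a small numerical sketch (my own illustration, not from the post): an explicit cover of [0, 1] by overlapping intervals, relatively open in [0, 1], built so that consecutive links overlap and non-consecutive ones don't:

```python
# A concrete chain: n overlapping intervals covering [0, 1], constructed so
# that U_i meets U_j exactly when |i - j| <= 1. The widths are ad hoc.
n = 5
U = [(max(0.0, (i - 0.3) / n), min(1.0, (i + 1.3) / n)) for i in range(n)]

def meets(u, v):
    """Two intervals intersect iff the larger left endpoint
    lies strictly below the smaller right endpoint."""
    return max(u[0], v[0]) < min(u[1], v[1])

is_chain = all(meets(U[i], U[j]) == (abs(i - j) <= 1)
               for i in range(n) for j in range(n))
print(is_chain)  # True
```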

The simplest snake-like continuum is the closed unit interval [0,1]. It’s hard to think of others. But here’s what Mioduszewski proved in 1962: the pseudo-arc is a **universal** snake-like continuum. That is: it’s a snake-like continuum, and it has a continuous map onto *every* snake-like continuum!

This is a way of saying that the pseudo-arc is the most complicated snake-like continuum possible. A bit more precisely: it bends back on itself as much as possible while still going somewhere! You can see this from the pictures above, or from the construction on Wikipedia:

• Wikipedia, Pseudo-arc.

I like the idea that there’s a subset of the plane with this simple ‘universal’ property, which however is so complicated that it’s impossible to draw.

Here’s the paper where these pictures came from:

• Wayne Lewis and Piotr Minc, Drawing the pseudo-arc, *Houston J. Math.* **36** (2010), 905–934.

The pseudo-arc has other amazing properties. For example, it’s ‘indecomposable’. A nonempty connected closed subset of a continuum is a continuum in its own right, called a **subcontinuum**, and we say a continuum is **indecomposable** if it is not the union of two proper subcontinua.

It takes a while to get used to this idea, since all the examples of continua that I’ve listed so far are decomposable except for the pseudo-arc and the solenoid!

Of course a single point is an indecomposable continuum, but that example is so boring that people sometimes exclude it. The first interesting example was discovered by Brouwer in 1910. It’s the intersection of an infinite sequence of sets like this:

It’s called the **Brouwer–Janiszewski–Knaster continuum** or **buckethandle**. Like the solenoid, it shows up as an attractor in some chaotic dynamical systems.

It’s easy to imagine how if you write the buckethandle as the union of two closed proper subsets, at least one will be disconnected. And note: you don’t even need these subsets to be disjoint! So, it’s an indecomposable continuum.

But once you get used to indecomposable continua, you’re ready for the next level of weirdness. An even more dramatic thing is a **hereditarily indecomposable** continuum: one for which each subcontinuum is also indecomposable.

Apart from a single point, the pseudo-arc is the unique hereditarily indecomposable snake-like continuum! I believe this was first proved here:

• R. H. Bing, Concerning hereditarily indecomposable continua, *Pacific J. Math.* **1** (1951), 43–51.

Finally, here’s one more amazing fact about the pseudo-arc. To explain it, I need a bunch more nice math:

Every continuum arises as a closed subset of the Hilbert cube. There’s an obvious way to define the distance between two closed subsets of a compact metric space, called the Hausdorff distance—if you don’t know about this already, it’s fun to reinvent it yourself. The set of all closed subsets of a compact metric space thus forms a metric space in its own right—and by the way, the **Blaschke selection theorem** says this metric space is again compact!
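If you'd rather not reinvent it, here is the Hausdorff distance in miniature for finite point sets in the plane (a sketch of mine; for genuine closed subsets the min and max become infima and suprema):

```python
import math

def hausdorff(A, B):
    """Hausdorff distance between two finite point sets: the larger of the
    two one-sided 'farthest nearest-neighbor' distances."""
    d = math.dist  # Euclidean distance between two points
    one_sided = lambda X, Y: max(min(d(x, y) for y in Y) for x in X)
    return max(one_sided(A, B), one_sided(B, A))

# A two-point segment vs. the same segment plus a far-away outlier:
A = [(0, 0), (1, 0)]
B = [(0, 0), (1, 0), (1, 3)]
print(hausdorff(A, B))  # 3.0 — the outlier dominates
```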

Anyway, this stuff means that there’s a metric space whose points are all subcontinua of the Hilbert cube, and we don’t miss out on any continua by looking at these. So we can call this the **space of all continua**.

Now for the amazing fact: *pseudo-arcs are dense in the space of all continua!*

I don’t know who proved this. It’s mentioned here:

• Trevor L. Irwin and Sławomir Solecki, Projective Fraïssé limits and the pseudo-arc.

but they refer to this paper as a good source for such facts:

• Wayne Lewis, The pseudo-arc, *Bol. Soc. Mat. Mexicana (3)* **5** (1999), 25–77.

Abstract. The pseudo-arc is the simplest nondegenerate hereditarily indecomposable continuum. It is, however, also the most important, being homogeneous, having several characterizations, and having a variety of useful mapping properties. The pseudo-arc has appeared in many areas of continuum theory, as well as in several topics in geometric topology, and is beginning to make its appearance in dynamical systems. In this monograph, we give a survey of basic results and examples involving the pseudo-arc. A more complete treatment will be given in a book dedicated to this topic, currently under preparation by this author. We omit formal proofs from this presentation, but do try to give indications of some basic arguments and construction techniques. Our presentation covers the following major topics: 1. Construction 2. Homogeneity 3. Characterizations 4. Mapping properties 5. Hyperspaces 6. Homeomorphism groups 7. Continuous decompositions 8. Dynamics.

It may seem surprising that one can write a whole book about the pseudo-arc… but if you like continua, it’s a fundamental structure just like spheres and cubes!

## November 26, 2017

### Clifford V. Johnson - Asymptotia

So I don’t know about you, but I’ve been really enjoying the new series of Star Trek. I started watching Star Trek Discovery because I was one of the science advisors they talked with from the early writing stages to finish, building some of the science ideas, concepts, and tone for their reimagining of the Star Trek universe.

Over many months after the initial meeting with all the writers, I would take calls from individual writers and researchers and give them ideas or phrases they could use and so forth. But much of the work was done blind, which is to say I had very little context for most of what they were asking advice about. I think they did it this way because they wanted to protect a lot of the material from leaking because, well, it's Star Trek! Yes, you'll know from my various writings and interviews about science advising that this is not usually my preferred way of working as an advisor, but I was happy to help in this case and make an exception because, after all, this is a huge show that has a tradition of inspiring people about science over many generations, so it could be of value just by virtue of some of the little science ideas that I helped sprinkle in there, however accurately or inaccurately. The bottom line is that Star Trek just isn’t about accuracy, it’s about inspiration and dreams.

Well, needless to say, when it came out I was curious to see how they used [...] Click to continue reading this post

The post Pleasant Discovery appeared first on Asymptotia.

## November 24, 2017

### Clifford V. Johnson - Asymptotia

(Image above courtesy of Cellar Door Books in Riverside, CA.)

Happy Thanksgiving! This coming week, there'll be two events that might be of interest to people either in the Los Angeles area, or the New York area.

The first is an event (Tues. 28th Nov., 7pm, Co-sponsored by LARB and Chevalier's Books) centered around my new book, the Dialogues. It is the first such LA event, starting with a chat with writer and delightful conversationalist [...] Click to continue reading this post

The post Two Events! appeared first on Asymptotia.

### Sean Carroll - Preposterous Universe

This year we give thanks for a simple but profound principle of statistical mechanics that extends the famous Second Law of Thermodynamics: the Jarzynski Equality. (We’ve previously given thanks for the Standard Model Lagrangian, Hubble’s Law, the Spin-Statistics Theorem, conservation of momentum, effective field theory, the error bar, gauge symmetry, Landauer’s Principle, the Fourier Transform, Riemannian Geometry, and the speed of light.)

The Second Law says that entropy increases in closed systems. But really it says that entropy *usually* increases; thermodynamics is the limit of statistical mechanics, and in the real world there can be rare but inevitable fluctuations around the typical behavior. The Jarzynski Equality is a way of quantifying such fluctuations, which is increasingly important in the modern world of nanoscale science and biophysics.

Our story begins, as so many thermodynamic tales tend to do, with manipulating a piston containing a certain amount of gas. The gas is of course made of a number of jiggling particles (atoms and molecules). All of those jiggling particles contain energy, and we call the total amount of that energy the internal energy *U* of the gas. Let’s imagine the whole thing is embedded in an environment (a “heat bath”) at temperature *T*. That means that the gas inside the piston starts at temperature *T*, and after we manipulate it a bit and let it settle down, it will relax back to *T* by exchanging heat with the environment as necessary.

Finally, let’s divide the internal energy into “useful energy” and “useless energy.” The useful energy, known to the cognoscenti as the (Helmholtz) free energy and denoted by *F*, is the amount of energy potentially available to do useful work. For example, the pressure in our piston may be quite high, and we could release it to push a lever or something. But there is also useless energy, which is just the entropy *S* of the system times the temperature *T.* That expresses the fact that once energy is in a highly-entropic form, there’s nothing useful we can do with it any more. So the total internal energy is the free energy plus the useless energy,

\[ U = F + T S. \qquad (1) \]

Our piston starts in a boring equilibrium configuration *a*, but we’re not going to let it just sit there. Instead, we’re going to push in the piston, decreasing the volume inside, ending up in configuration *b*. This squeezes the gas together, and we expect that the total amount of energy will go up. It will typically cost us energy to do this, of course, and we refer to that energy as the work *W_{ab}* we do when we push the piston from *a* to *b*.

Remember that when we’re done pushing, the system might have heated up a bit, but we let it exchange heat *Q* with the environment to return to the temperature *T*. So three things happen when we do our work on the piston: (1) the free energy of the system changes; (2) the entropy changes, and therefore the useless energy; and (3) heat is exchanged with the environment. In total we have

\[ W_{ab} = \Delta F_{ab} + T\,\Delta S_{ab} + Q_{ab}. \qquad (2) \]

(There is no Δ*T*, because *T* is the temperature of the environment, which stays fixed.) The Second Law of Thermodynamics says that entropy increases (or stays constant) in closed systems. Our system isn’t closed, since it might leak heat to the environment. But really the Second Law says that the last two terms on the right-hand side of this equation add up to a positive number; in other words, the increase in entropy will more than compensate for the loss of heat. (Alternatively, you can lower the entropy of a bottle of champagne by putting it in a refrigerator and letting it cool down; no laws of physics are violated.) One way of stating the Second Law for situations such as this is therefore

\[ W_{ab} \geq \Delta F_{ab}. \qquad (3) \]

The work we do on the system is greater than or equal to the change in free energy from beginning to end. We can make this inequality into an equality if we act as efficiently as possible, minimizing the entropy/heat production: that’s an *adiabatic* process, and in practical terms amounts to moving the piston as gradually as possible, rather than giving it a sudden jolt. That’s the limit in which the process is reversible: we can get the same energy out as we put in, just by going backwards.

Awesome. But the language we’re speaking here is that of classical thermodynamics, which we all know is the limit of statistical mechanics when we have many particles. Let’s be a little more modern and open-minded, and take seriously the fact that our gas is actually a collection of particles in random motion. Because of that randomness, there will be *fluctuations* over and above the “typical” behavior we’ve been describing. Maybe, just by chance, all of the gas molecules happen to be moving away from our piston just as we move it, so we don’t have to do any work at all; alternatively, maybe there are more than the usual number of molecules hitting the piston, so we have to do more work than usual. The Jarzynski Equality, derived 20 years ago by Christopher Jarzynski, is a way of saying something about those fluctuations.

One simple way of taking our thermodynamic version of the Second Law (3) and making it still hold true in a world of fluctuations is simply to say that it holds true on average. To denote an average over all possible things that could be happening in our system, we write angle brackets around the quantity in question. So a more precise statement would be that the *average* work we do is greater than or equal to the change in free energy:

\[ \langle W \rangle \geq \Delta F. \qquad (4) \]

(We don’t need angle brackets around Δ*F*, because *F* is determined completely by the equilibrium properties of the initial and final states *a* and *b*; it doesn’t fluctuate.) Let me multiply both sides by −1, which means we need to flip the inequality sign to go the other way around:

\[ -\langle W \rangle \leq -\Delta F. \qquad (5) \]

Next I will exponentiate both sides of the inequality. Note that this keeps the inequality sign going the same way, because the exponential is a monotonically increasing function; if *x* is less than *y*, we know that \(e^{x}\) is less than \(e^{y}\):

\[ e^{-\langle W \rangle} \leq e^{-\Delta F}. \qquad (6) \]

(More typically we will see the exponents divided by *kT*, where *k* is Boltzmann’s constant, but for simplicity I’m using units where *kT* = 1.)

Jarzynski’s equality is the following remarkable statement: in equation (6), if we exchange the exponential of the average work \(e^{-\langle W \rangle}\) for the average of the exponential of the work \(\langle e^{-W} \rangle\), we get a precise **equality**, not merely an inequality:

\[ \langle e^{-W} \rangle = e^{-\Delta F}. \qquad (7) \]

That’s the Jarzynski Equality: the average, over many trials, of the exponential of minus the work done, is equal to the exponential of minus the free energies between the initial and final states. It’s a stronger statement than the Second Law, just because it’s an equality rather than an inequality.

In fact, we can *derive* the Second Law from the Jarzynski equality, using a math trick known as Jensen’s inequality. For our purposes, this says that the exponential of an average is less than the average of an exponential, \( e^{\langle x \rangle} \leq \langle e^{x} \rangle \). Thus we immediately get

\[ e^{-\langle W \rangle} \leq \langle e^{-W} \rangle = e^{-\Delta F}, \]

as we had before. Then just take the log of both sides to get \( \langle W \rangle \geq \Delta F \), which is one way of writing the Second Law.

So what does it mean? As we said, because of fluctuations, the work we needed to do on the piston will sometimes be a bit less than or a bit greater than the average, and the Second Law says that the average will be greater than the difference in free energies from beginning to end. Jarzynski’s Equality says there is a quantity, the exponential of minus the work, that averages out to be exactly the exponential of minus the free-energy difference. The function \(e^{-W}\) is convex and decreasing as a function of *W*. A fluctuation where *W* is lower than average, therefore, contributes a greater shift to the average of \(e^{-W}\) than a corresponding fluctuation where *W* is higher than average. To satisfy the Jarzynski Equality, we must have more fluctuations upward in *W* than downward in *W*, by a precise amount. So on average, we’ll need to do more work than the difference in free energies, as the Second Law implies.

It’s a remarkable thing, really. Much of conventional thermodynamics deals with inequalities, with equality being achieved only in adiabatic processes happening close to equilibrium. The Jarzynski Equality is fully non-equilibrium, achieving equality no matter how dramatically we push around our piston. It tells us not only about the average behavior of statistical systems, but about the full ensemble of possibilities for individual trajectories around that average.
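Because the equality holds no matter how we drive the system, it is also easy to probe numerically. In the toy sketch below (my own, not from the post), the work values are simply drawn from a Gaussian; for a Gaussian work distribution in *kT* = 1 units, the Jarzynski Equality pins down ΔF = ⟨W⟩ − σ²/2 exactly, and the sampling estimate recovers it:

```python
import math
import random

# Toy numerical check of the Jarzynski Equality for Gaussian work values
# (kT = 1 units). For Gaussian W: <e^{-W}> = e^{-<W> + sigma^2/2}, so the
# equality forces dF = <W> - sigma^2 / 2.
random.seed(0)
mean_W, sigma = 2.0, 1.0
dF_exact = mean_W - sigma**2 / 2                     # = 1.5
samples = [random.gauss(mean_W, sigma) for _ in range(200_000)]
avg_exp = sum(math.exp(-w) for w in samples) / len(samples)
dF_est = -math.log(avg_exp)                          # Jarzynski estimator
avg_W = sum(samples) / len(samples)
print(dF_est)            # close to 1.5
print(avg_W >= dF_est)   # Second Law holds on average, with slack sigma^2/2
```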

The Jarzynski Equality has launched a mini-revolution in nonequilibrium statistical mechanics, the news of which hasn’t quite trickled to the outside world as yet. It’s one of a number of relations, collectively known as “fluctuation theorems,” which also include the Crooks Fluctuation Theorem, not to mention our own Bayesian Second Law of Thermodynamics. As our technological and experimental capabilities reach down to scales where the fluctuations become important, our theoretical toolbox has to keep pace. And that’s happening: the Jarzynski equality isn’t just imagination, it’s been experimentally tested and verified. (Of course, I remain just a poor theorist myself, so if you want to understand this image from the experimental paper, you’ll have to talk to someone who knows more about Raman spectroscopy than I do.)

## November 23, 2017

### The n-Category Cafe

Good news! Janelidze and Street have tackled some puzzles that are perennial favorites here on the $n$-Café:

- George Janelidze and Ross Street, Real sets, *Tbilisi Mathematical Journal* **10** (2017), 23–49.

Abstract. After reviewing a universal characterization of the extended positive real numbers published by Denis Higgs in 1978, we define a category which provides an answer to the questions:

• what is a set with half an element?

• what is a set with π elements?

The category of these extended positive real sets is equipped with a countable tensor product. We develop somewhat the theory of categories with countable tensors; we call the commutative such categories *series monoidal* and conclude by only briefly mentioning the non-commutative possibility called *ω-monoidal*. We include some remarks on sets having cardinalities in $[-\infty,\infty]$.

First they define a **series magma**, which is a set $A$ equipped with an element $0$ and a summation function

$$\sum \colon A^{\mathbb{N}} \to A$$

obeying a nice generalization of the law $a + 0 = 0 + a = a$. Then they define a **series monoid** in which this summation function obeys a version of the commutative law.

(Yeah, the terminology here seems a bit weird: their summation function already has associativity built in, so their ‘series magma’ is associative and their ‘series monoid’ is also commutative!)

The forgetful functor from series monoids to sets has a left adjoint, and as you’d expect, the free series monoid on the one-element set is $\mathbb{N} \cup \{\infty\}$. A more interesting series monoid is $[0,\infty]$, and one early goal of the paper is to recall Higgs’ categorical description of this. That’s Denis Higgs. Peter Higgs has a boson, but Denis Higgs has a nice theorem.

First, some preliminaries:

Countable products of series monoids coincide with countable coproducts, just as finite products of commutative monoids coincide with finite coproducts.

There is a tensor product of series monoids, which is very similar to the tensor product of commutative monoids — or, to a lesser extent, the more familiar tensor product of abelian groups. Monoids with respect to this tensor product are called **series rigs**. For abstract nonsense reasons, because $\mathbb{N} \cup \{\infty\}$ is the free series monoid on one element, it also becomes a series rig… with the usual multiplication and addition. (Well, more or less usual: if you’re not familiar with this stuff, a good exercise is to figure out what $0$ times $\infty$ must be.)

Now for the characterization of $[0,\infty]$. Given an endomorphism $f \colon A \to A$ of a series monoid $A$ you can define a new endomorphism $\overline{f} \colon A \to A$ by

$$\overline{f} = f + f \circ f + f \circ f \circ f + \cdots$$

where the infinite sum is defined using the series monoid structure on $A$. Following Higgs, Janelidze and Street define a **Zeno morphism** to be an endomorphism $h \colon A \to A$ such that

$$\overline{h} = 1_A$$

The reason for this name is that in $[0,\infty]$ we have

$$1 = \frac{1}{2} + \left(\frac{1}{2}\right)^2 + \left(\frac{1}{2}\right)^3 + \cdots$$

putting us in mind of Zeno’s paradox:

That which is in locomotion must arrive at the half-way stage before it arrives at the goal. — Aristotle, *Physics* VI:9, 239b10.

So, it makes lots of sense to think of any Zeno morphism $h \colon A \to A$ as a ‘halving’ operation. Hence the name $h$.

In particular, one can show any Zeno morphism obeys

$$h + h = 1_A$$

Higgs called a series monoid equipped with a Zeno morphism a **magnitude module**, and he showed that the free magnitude module on one element is $[0,\infty]$. By the same flavor of abstract nonsense as before, this implies that $[0,\infty]$ is a series rig… with the usual addition and multiplication.
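The Zeno morphism axioms are easy to sanity-check numerically in the motivating example $A = [0,\infty]$ with $h$ the halving map (a toy illustration of mine, not from the paper): a truncation of the infinite sum $\overline{h} = h + h \circ h + \cdots$ recovers the identity, and $h + h = 1_A$.

```python
def h(x: float) -> float:
    """The 'halving' Zeno morphism on the series monoid [0, infinity)."""
    return x / 2


def h_bar(x: float, terms: int = 60) -> float:
    """Truncation of h-bar = h + h∘h + h∘h∘h + ..., i.e. the sum of x/2^n."""
    total, y = 0.0, x
    for _ in range(terms):
        y = h(y)        # y is h applied n times to x
        total += y
    return total


x = 3.7
print(h_bar(x))        # numerically recovers x: h-bar acts as the identity
print(h(x) + h(x))     # recovers x: the derived law h + h = 1_A
```

Of course the actual theorem is about the universal property, not floating-point arithmetic; this only illustrates why "Zeno" and "halving" are the right mental pictures.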

### Categorification

Next, Janelidze and Street *categorify* the entire discussion so far! They define a ‘series monoidal category’ to be a category $A$ with an object $0 \in A$ and summation functor

$$\sum \colon A^{\mathbb{N}} \to A$$

obeying some reasonable properties… up to natural isomorphisms that themselves obey some reasonable properties. So, it’s a category where we can add infinite sequences of objects. For example, every series monoid gives a series monoidal category with only identity morphisms. The maps between series monoidal categories are called ‘series monoidal functors’.

They define a ‘Zeno functor’ to be a series monoidal functor $h \colon A \to A$ obeying a categorified version of the definition of Zeno morphism. A series monoidal category with a Zeno functor is called a ‘magnitude category’.

As you’d guess, there are also ‘magnitude functors’ and ‘magnitude natural transformations’, giving a 2-category $\mathrm{MgnCat}$. There’s a forgetful 2-functor

$$U \colon \mathrm{MgnCat} \to \mathrm{Cat}$$

and it has a left adjoint (or, as Janelidze and Street say, a left ‘biadjoint’)

$$F \colon \mathrm{Cat} \to \mathrm{MgnCat}$$

Applying $F$ to the terminal category $1$, they get a magnitude category $\mathrm{RSet}_g$ of **positive real sets**. These are like sets, but their cardinality can be anything in $[0,\infty]$!

For example, Janelidze and Street construct a positive real set of cardinality $\pi$. Unfortunately they do it starting from the binary expansion of $\pi$, so it doesn’t connect in a very interesting way with anything I know about the number $\pi$.

What’s that little subscript $g$? Well, unfortunately $\mathrm{RSet}_g$ is a groupoid: the only morphisms between positive real sets we get from this construction are the isomorphisms.

So, there’s a lot of great stuff here, but apparently a lot left to do.

### Digressive Postlude

There is more to say, but I need to get going — I have to walk 45 minutes to Paris 7 to talk to Mathieu Anel about symplectic geometry, and then have lunch with him and Paul-André Melliès. Paul-André kindly invited me to participate in his habilitation defense on Monday, along with Gordon Plotkin, André Joyal, Jean-Yves Girard, Thierry Coquand, Pierre-Louis Curien, Georges Gonthier, and my friend Karine Chemla (an expert on the history of Chinese mathematics). Paul-André has some wonderful ideas on linear logic, Frobenius pseudomonads, game semantics and the like, and we want to figure out more precisely how all this stuff is connected to topological quantum field theory. I think nobody has gotten to the bottom of this! So, I hope to spend more time here, figuring it out with Paul-André.

## November 19, 2017

### ZapperZ - Physics and Physicists

A major flaw in the painting — which is the only one of da Vinci's that remains in private hands — makes some historians think it's a fake. The crystal orb in the image doesn't distort light in the way that natural physics does, which would be an unusual error for da Vinci.

My reaction when I first read this was that it is not as if da Vinci was painting this live with the actual Jesus Christ holding the orb. So either he made a mistake, or he knew what he was doing and didn't think it would matter. I don't think this observation is enough to call the painting a fake.

Still, it may make a good class example in Intro Physics optics.

Zz.

by ZapperZ (noreply@blogger.com) at November 19, 2017 02:58 AM

## November 17, 2017

### Tommaso Dorigo - Scientificblogging

### ZapperZ - Physics and Physicists

I have not read the book yet, and probably won't get to it till some time next year. But if you have read it, I'd like to hear what you think of it.

Zz.

by ZapperZ (noreply@blogger.com) at November 17, 2017 08:03 PM

## November 16, 2017

## November 09, 2017

### Robert Helling - atdotde

By Original upload by en:User:Tbower - USGS animation A08, Public Domain, Link

In fact, that was only the last in a series of supercontinents that keep forming and breaking up in the "supercontinent cycle".

By SimplisticReps - Own work, CC BY-SA 4.0, Link

So here is the question: I am happy with the idea of several (say $N$) plates, roughly containing a continent each, that are floating around on the magma, driven by all kinds of convection processes in the liquid part of the earth. They are moving around in a pattern that looks to me to be pretty chaotic (in the non-technical sense), and of course for random motion you would expect that from time to time two of them collide and then maybe stick together for a while.

Then it would be possible that a third plate also collides with the two, but that would be a coincidence (just as two random lines typically intersect, while three random lines typically intersect in pairs but not in a triple intersection). But to form a supercontinent, you need all $N$ plates to miraculously collide at the same time. This order-$N$ process seems highly unlikely if the motion is random, let alone the fact that it seems to repeat. So this motion cannot be random (yes, Sabine, this is a naturalness argument). This needs an explanation.
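How unlikely is such an all-$N$ coincidence for genuinely random positions? As a crude proxy (my own toy model, not part of the original argument), Wendel's classical formula gives the probability that $N$ independent points placed uniformly at random on a sphere all lie in some common hemisphere: $(N^2 - N + 2)/2^N$, which decays exponentially in $N$.

```python
def hemisphere_prob(n: int) -> float:
    """Wendel's formula in three dimensions: probability that n uniform
    random points on a sphere all fit inside some common hemisphere."""
    return (n * n - n + 2) / 2**n


for n in [2, 4, 8, 12]:
    print(n, hemisphere_prob(n))
# Already for a dozen "plates" the chance of a random all-on-one-side
# configuration is below 4%, and it roughly halves with each extra plate.
```

Real plates are extended objects with correlated, driven motion, so this drastically understates the physics; it only quantifies why an order-$N$ coincidence cries out for a mechanism.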

So, why, every few hundred million years, do all the land masses of the earth assemble on one side of the earth?

One explanation could for example be that during those times, the center of mass of the earth is not at the symmetry center, so the water of the oceans flows to one side of the earth and reveals the seabed on the opposite side. Then you would have essentially one big island. But this seems not to be the case, as the continents (those parts that are above sea-level) appear to be stable on much longer time scales. It is not that the seabed comes up on one side while the land on the other goes under water; the land masses actually move around to meet on one side.

I have already asked this question whenever I ran into people with a geosciences education, but it is still open (and I have to admit that in a non-zero number of cases I failed even to make it clear that an $N$-body collision needs an explanation). But I am sure you, my readers, know the answer, or even better can come up with one.

by Robert Helling (noreply@blogger.com) at November 09, 2017 09:35 AM

## October 28, 2017

## October 24, 2017

### Andrew Jaffe - Leaves on the Line

The first direct detection of gravitational waves was announced in February of 2016 by the LIGO team, after decades of planning, building and refining their beautiful experiment. Since that time, the US-based LIGO has been joined by the European Virgo gravitational wave telescope (and more are planned around the globe).

The first four events that the teams announced were from the spiralling in and eventual mergers of pairs of black holes, with masses ranging from about seven to about forty times the mass of the sun. These masses are perhaps a bit higher than we expect to be typical, which might raise intriguing questions about how such black holes were formed and evolved, although even comparing the results to the predictions is a hard problem depending on the details of the statistical properties of the detectors and the astrophysical models for the evolution of black holes and the stars from which (we think) they formed.

Last week, the teams announced the detection of a very different kind of event, the collision of two neutron stars, each about 1.4 times the mass of the sun. Neutron stars are one possible end state of the evolution of a star, when its atoms are no longer able to withstand the pressure of the gravity trying to force them together. This was first understood by S Chandrasekhar in the early years of the 20th Century, who realised that there was a limit to the mass of a star held up simply by the quantum-mechanical repulsion of the electrons at the outskirts of the atoms making up the star. When you surpass this mass, known, appropriately enough, as the Chandrasekhar mass, the star will collapse in upon itself, combining the electrons and protons into neutrons and likely releasing a vast amount of energy in the form of a supernova explosion. After the explosion, the remnant is likely to be a dense ball of neutrons, whose properties are actually determined fairly precisely by similar physics to that of the Chandrasekhar limit (discussed for this case by Oppenheimer, Volkoff and Tolman), giving us the magic 1.4 solar mass number.

(Last week also coincidentally would have seen Chandrasekhar’s 107th birthday, and Google chose to illustrate their home page with an animation in his honour for the occasion. I was a graduate student at the University of Chicago, where Chandra, as he was known, spent most of his career. Most of us students were far too intimidated to interact with him, although it was always seen as an auspicious occasion when you spotted him around the halls of the Astronomy and Astrophysics Center.)

This process can therefore make a single 1.4 solar-mass neutron star, and we can imagine that in some rare cases we can end up with two neutron stars orbiting one another. Indeed, the fact that LIGO saw one, but only one, such event during its year-and-a-half run allows the teams to constrain how often that happens, albeit with very large error bars, between 320 and 4740 events per cubic gigaparsec per year; a cubic gigaparsec is about 3 billion light-years on each side, so these are rare events indeed. These results and many other scientific inferences from this single amazing observation are reported in the teams’ overview paper.
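As a rough consistency check (my own back-of-envelope numbers, not from the post): if such mergers are detectable out to roughly the ~40 Mpc distance of this event, the quoted rate range of 320–4740 per cubic gigaparsec per year does indeed predict of order one detection in a year-and-a-half run.

```python
import math

# Assumed detection horizon for binary neutron stars (a hypothetical
# round number, taken from the ~40 Mpc distance of the observed event).
horizon_gpc = 0.04
volume = 4 / 3 * math.pi * horizon_gpc**3    # sensitive volume in Gpc^3

run_years = 1.5
lo = 320 * volume * run_years
hi = 4740 * volume * run_years
print(f"expected detections during the run: {lo:.2f} to {hi:.2f}")
# A single observed event sits comfortably inside this Poisson-mean range.
```

The real sensitivity volume depends on detector orientation and source inclination, so this is only order-of-magnitude arithmetic.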

A series of other papers discuss those results in more detail, covering the physics of neutron stars to limits on departures from Einstein’s theory of gravity (for more on some of these other topics, see this blog, or this story from the NY Times). As a cosmologist, the most exciting of the results was the use of the event as a “standard siren”, an object whose gravitational wave properties are well-enough understood that we can deduce the distance to the object from the LIGO results alone. Although the idea came from Bernard Schutz in 1986, the term “standard siren” was coined somewhat later (by Sean Carroll) in analogy to the (heretofore?) more common cosmological standard candles and standard rulers: objects whose *intrinsic* brightness and distances are known and so whose distances can be measured by observations of their *apparent* brightness or size, just as you can roughly deduce how far away a light bulb is by how bright it appears, or how far away a familiar object or person is by how big it looks.

Gravitational wave events are standard sirens because our understanding of relativity is good enough that an observation of the shape of the gravitational wave pattern as a function of time can tell us the properties of its source. Knowing that, we also then know the amplitude of that pattern when it was released. Over the time since then, as the gravitational waves have travelled across the Universe toward us, the amplitude has gone down (further objects look dimmer and sound quieter); the expansion of the Universe also causes the frequency of the waves to decrease — this is the cosmological redshift that we observe in the spectra of distant objects’ light.

Unlike LIGO’s previous detections of binary-black-hole mergers, this new observation of a binary-neutron-star merger was also seen in photons: first as a gamma-ray burst, and then as a “nova”: a new dot of light in the sky. Indeed, the observation of the afterglow of the merger by teams of literally thousands of astronomers in gamma and x-rays, optical and infrared light, and in the radio, is one of the more amazing pieces of academic teamwork I have seen.

And these observations allowed the teams to identify the host galaxy of the original neutron stars, and to measure the redshift of its light (the lengthening of the light’s wavelength due to the movement of the galaxy away from us). It is most likely a previously unexceptional galaxy called NGC 4993, with a redshift z = 0.009, putting it about 40 megaparsecs away, relatively close on cosmological scales.

But this means that we can measure all of the factors in one of the most celebrated equations in cosmology, Hubble’s law: cz = H₀d, where c is the speed of light, z is the redshift just mentioned, and d is the distance measured from the gravitational wave burst itself. This just leaves H₀, the famous Hubble Constant, giving the current rate of expansion of the Universe, usually measured in kilometres per second per megaparsec. The old-fashioned way to measure this quantity is via the so-called cosmic distance ladder, bootstrapping up from nearby objects of known distance to more distant ones whose properties can only be calibrated by comparison with those more nearby. But errors accumulate in this process and we can be susceptible to the weakest rung on the chain (see recent work by some of my colleagues trying to formalise this process). Alternately, we can use data from cosmic microwave background (CMB) experiments like the *Planck* Satellite (see here for lots of discussion on this blog); the typical size of the CMB pattern on the sky is something very like a standard ruler. Unfortunately, it, too, needs to be calibrated, implicitly by other aspects of the CMB pattern itself, and so ends up being a somewhat indirect measurement. Currently, the best cosmic-distance-ladder measurement gives something like 73.24 ± 1.74 km/sec/Mpc, whereas *Planck* gives 67.81 ± 0.92 km/sec/Mpc; these numbers disagree by “a few sigma”, enough that it is hard to explain as simply a statistical fluctuation.

Unfortunately, the new LIGO results do not solve the problem. Because we cannot observe the inclination of the neutron-star binary (i.e., the orientation of its orbit), this blows up the error on the distance to the object, due to the Bayesian marginalisation over this unknown parameter (just as the *Planck* measurement requires marginalisation over all of the other cosmological parameters to fully calibrate the results). Because the host galaxy is relatively nearby, the teams must also account for the fact that the redshift includes the effect not only of the cosmological expansion but also the movement of galaxies with respect to one another due to the pull of gravity on relatively large scales; this so-called peculiar velocity has to be modelled, which adds further to the errors.

This procedure gives a final measurement of H₀ = 70.0^{+12}_{-8.0} km/sec/Mpc, with the full shape of the probability curve shown in the Figure, taken directly from the paper. Both the *Planck* and distance-ladder results are consistent with these rather large error bars. But this is calculated from a single object; as more of these events are seen these error bars will go down, typically by something like the square root of the number of events, so it might not be too long before this is the best way to measure the Hubble Constant.

[Apologies: too long, too technical, and written late at night while trying to get my wonderful not-quite-three-week-old daughter to sleep through the night.]
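The central arithmetic of the standard-siren measurement is pleasingly simple (a sketch with round numbers from the text, not the paper’s actual marginalised analysis): plugging the measured redshift and a gravitational-wave distance into cz = H₀d already gives a value in the right ballpark.

```python
c = 299_792.458   # speed of light in km/s
z = 0.009         # measured redshift of NGC 4993
d = 40.0          # distance in Mpc (round number from the text)

h0 = c * z / d    # Hubble's law: cz = H0 * d
print(f"H0 ≈ {h0:.1f} km/s/Mpc")
# In the ballpark of both quoted values (73.24 and 67.81); the real
# analysis must marginalise over inclination and peculiar velocity,
# which is where the large asymmetric error bars come from.
```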

## October 17, 2017

### Matt Strassler - Of Particular Significance

Yesterday’s post on the results from the LIGO/VIRGO network of gravitational wave detectors was aimed at getting information out, rather than providing the pedagogical backdrop. Today I’m following up with a post that attempts to answer some of the questions that my readers and my personal friends asked me. Some wanted to understand better how to visualize what had happened, while others wanted more clarity on why the discovery was so important. So I’ve put together a post which (1) explains what neutron stars and black holes are and what their mergers are like, (2) clarifies why yesterday’s announcement was important — and there were many reasons, which is why it’s hard to reduce it all to a single soundbite. And (3) there are some miscellaneous questions at the end.

First, a disclaimer: I am *not* an expert in the very complex subject of neutron star mergers and the resulting explosions, called kilonovas. These are much more complicated than black hole mergers. I am still learning some of the details. Hopefully I’ve avoided errors, but you’ll notice a few places where I don’t know the answers … yet. Perhaps my more expert colleagues will help me fill in the gaps over time.

Please, if you spot any errors, don’t hesitate to comment!! And feel free to ask additional questions whose answers I can add to the list.

**BASIC QUESTIONS ABOUT NEUTRON STARS, BLACK HOLES, AND MERGERS**

**What are neutron stars and black holes, and how are they related?**

Every atom is made from a tiny atomic nucleus, made of neutrons and protons (which are very similar), and loosely surrounded by electrons. Most of an atom is empty space, so it can, under extreme circumstances, be crushed — but only if every electron and proton convert to a neutron (which remains behind) and a neutrino (which heads off into outer space.) When a giant star runs out of fuel, the pressure from its furnace turns off, and it collapses inward under its own weight, creating just those extraordinary conditions in which the matter can be crushed. Thus: a star’s interior, with a mass one to several times the Sun’s mass, is all turned into a several-mile(kilometer)-wide ball of neutrons — the number of neutrons approaching a 1 with 57 zeroes after it.

If the star is big but not too big, the neutron ball stiffens and holds its shape, and the star explodes outward, blowing itself to pieces in what is called a core-collapse supernova. The ball of neutrons remains behind; this is what we call a neutron star. It’s a ball of the densest material that we know can exist in the universe — a pure atomic nucleus many miles(kilometers) across. It has a very hard surface; if you tried to go inside a neutron star, your experience would be a lot worse than running into a closed door at a hundred miles per hour.

If the star is very big indeed, the neutron ball that forms may immediately (or soon) collapse under its own weight, forming a black hole. A supernova may or may not result in this case; the star might just disappear. A black hole is very, very different from a neutron star. Black holes are what’s left when matter collapses irretrievably upon itself under the pull of gravity, shrinking down endlessly. While a neutron star has a surface that you could smash your head on, a black hole has no surface — it has an edge that is simply a point of no return, called a horizon. In Einstein’s theory, you can just go right through, as if passing through an open door. You won’t even notice the moment you go in. *[Note: this is true in Einstein’s theory. But there is a big controversy as to whether the combination of Einstein’s theory with quantum physics changes the horizon into something novel and dangerous to those who enter; this is known as the firewall controversy, and would take us too far afield into speculation.] * But once you pass through that door, you can never return.

Black holes can form in other ways too, but not those that we’re observing with the LIGO/VIRGO detectors.

**Why are their mergers the best sources for gravitational waves?**

One of the easiest and most obvious ways to make gravitational waves is to have two objects orbiting each other. If you put your two fists in a pool of water and move them around each other, you’ll get a pattern of water waves spiraling outward; this is in rough (*very* rough!) analogy to what happens with two orbiting objects, although, since the objects are moving in space, the waves aren’t in a material like water. They are waves in space itself.

To get powerful gravitational waves, you want objects each with a very big mass that are orbiting around each other at very high speed. To get the fast motion, you need the force of gravity between the two objects to be strong; and to get gravity to be as strong as possible, you need the two objects to be as close as possible (since, as Isaac Newton already knew, gravity between two objects grows stronger when the distance between them shrinks.) But if the objects are large, they can’t get too close; they will bump into each other and merge long before their orbit can become fast enough. So to get a really fast orbit, you need **two relatively small objects, each with a relatively big mass** — what scientists refer to as **compact objects**. Neutron stars and black holes are the most compact objects we know about. Fortunately, they do indeed often travel in orbiting pairs, and do sometimes, for a very brief period before they merge, orbit rapidly enough to produce gravitational waves that LIGO and VIRGO can observe.

**Why do we find these objects in pairs in the first place?**

Stars very often travel in pairs… they are called binary stars. They can start their lives in pairs, forming together in large gas clouds, or even if they begin solitary, they can end up pairing up if they live in large densely packed communities of stars where it is common for multiple stars to pass nearby. Perhaps surprisingly, their pairing can survive the collapse and explosion of either star, leaving two black holes, two neutron stars, or one of each in orbit around one another.

**What happens when these objects merge?**

Not surprisingly, there are three classes of mergers which can be detected: two black holes merging, two neutron stars merging, and a neutron star merging with a black hole. The first class was observed in 2015 (and announced in 2016), the second was announced yesterday, and it’s a matter of time before the third class is observed. The two objects may orbit each other for billions of years, very slowly radiating gravitational waves (an effect observed in the 70’s, leading to a Nobel Prize) and gradually coming closer and closer together. Only in the last day of their lives do their orbits really start to speed up. And just before these objects merge, they begin to orbit each other once per second, then ten times per second, then a hundred times per second. Visualize that if you can: objects a few dozen miles (kilometers) across, a few miles (kilometers) apart, each with the mass of the Sun or greater, orbiting each other *100 times each second*. It’s truly mind-boggling — a spinning dumbbell beyond the imagination of even the greatest minds of the 19th century. I don’t know any scientist who isn’t awed by this vision. It all sounds like science fiction. But it’s not.
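That “100 orbits each second” figure follows from Newtonian gravity alone (a rough sketch of mine with assumed round numbers, not a relativistic calculation): for two 1.4-solar-mass neutron stars about 100 km apart, Kepler’s third law gives an orbital frequency near 100 Hz.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
m_sun = 1.989e30   # solar mass, kg

m_total = 2 * 1.4 * m_sun   # two 1.4-solar-mass neutron stars
a = 100e3                   # assumed separation: 100 km, in metres

# Kepler's third law: orbital angular frequency omega = sqrt(G * M / a^3)
f_orbit = math.sqrt(G * m_total / a**3) / (2 * math.pi)
print(f"orbital frequency ≈ {f_orbit:.0f} Hz")
# The emitted gravitational waves oscillate at twice the orbital frequency.
```

Close to merger the orbit is relativistic, so this Newtonian estimate is only meant to show the mind-boggling scale is not an exaggeration.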

**How do we know this isn’t science fiction?**

We know, if we believe Einstein’s theory of gravity (and I’ll give you a very good reason to believe in it in just a moment.) Einstein’s theory predicts that such a rapidly spinning, large-mass dumbbell formed by two orbiting compact objects will produce a telltale pattern of ripples in space itself — gravitational waves. That pattern is both complicated and precisely predicted. In the case of black holes, the predictions go right up to and past the moment of merger, to the ringing of the larger black hole that forms in the merger. In the case of neutron stars, the instants just before, during and after the merger are more complex and we can’t yet be confident we understand them, but during tens of seconds before the merger Einstein’s theory is very precise about what to expect. The theory further predicts how those ripples will cross the vast distances from where they were created to the location of the Earth, and how they will appear in the LIGO/VIRGO network of three gravitational wave detectors. The prediction of what to expect at LIGO/VIRGO thus involves not just one prediction but many: the theory is used to predict the existence and properties of black holes and of neutron stars, the detailed features of their mergers, the precise patterns of the resulting gravitational waves, and how those gravitational waves cross space. That LIGO/VIRGO have detected the telltale patterns of these gravitational waves, and that these wave patterns agree with Einstein’s theory in every detail, is the strongest evidence ever obtained that there is nothing wrong with Einstein’s theory when used in these combined contexts. That then in turn gives us confidence that our interpretation of the LIGO/VIRGO results is correct, confirming that black holes and neutron stars really exist and really merge. 
*(Notice the reasoning is slightly circular… but that’s how scientific knowledge proceeds, as a set of detailed consistency checks that gradually and eventually become so tightly interconnected as to be almost impossible to unwind. Scientific reasoning is not deductive; it is inductive. We do it not because it is logically ironclad but because it works so incredibly well — as witnessed by the computer, and its screen, that I’m using to write this, and the wired and wireless internet and computer disk that will be used to transmit and store it.)*

**THE SIGNIFICANCE(S) OF YESTERDAY’S ANNOUNCEMENT OF A NEUTRON STAR MERGER**

What makes it difficult to explain the significance of yesterday’s announcement is that it consists of many important results piled up together, rather than a simple takeaway that can be reduced to a single soundbite. (That was also true of the black hole mergers announcement back in 2016, which is why I wrote a long post about it.)

So here is a list of important things we learned. No one of them, by itself, is earth-shattering, but each one is profound, and taken together they form a major event in scientific history.

**First confirmed observation of a merger of two neutron stars**: We’ve known these mergers must occur, but there’s nothing like being sure. And since these things are too far away and too small to see in a telescope, *the only way to be sure these mergers occur, and to learn more details about them, is with gravitational waves*. We expect to see many more of these mergers in coming years as gravitational wave astronomy increases in its sensitivity, and we will learn more and more about them.

**New information about the properties of neutron stars:** Neutron stars were proposed almost a hundred years ago and were confirmed to exist in the 60’s and 70’s. But their precise details aren’t known; we believe they are like a giant atomic nucleus, but they’re so vastly larger than ordinary atomic nuclei that we can’t be sure we understand all of their internal properties, and there are debates in the scientific community that can’t be easily answered… until, perhaps, now.

From the detailed pattern of the gravitational waves of this one neutron star merger, scientists have already learned two things. First, we confirm that Einstein’s theory correctly predicts the basic pattern of gravitational waves from orbiting neutron stars, as it does for orbiting and merging black holes. Unlike black holes, however, there are more questions about what happens to neutron stars when they merge. The question of what happened to this pair after they merged is still open — did they form a neutron star, an unstable neutron star that, as its spin slowed, eventually collapsed into a black hole, or a black hole straightaway?

But something important was already learned about the internal properties of neutron stars. The stresses of being whipped around at such incredible speeds would tear you and me apart, and would even tear the Earth apart. We know neutron stars are much tougher than ordinary rock, but how much tougher? If they were too flimsy, they’d have broken apart at some point during LIGO/VIRGO’s observations, and the simple pattern of gravitational waves that was expected would have suddenly become much more complicated. That didn’t happen until perhaps just before the merger. So scientists can use the simplicity of the pattern of gravitational waves to infer some new things about how stiff and strong neutron stars are. More mergers will improve our understanding. Again, *there is no other simple way to obtain this information.*

**First visual observation of an event that produces both immense gravitational waves and bright electromagnetic waves:** Black hole mergers aren’t expected to create a brilliant light display, because, as I mentioned above, they’re more like open doors to an invisible playground than they are like rocks, so they merge rather quietly, without a big bright and hot smash-up. But neutron stars are big balls of stuff, and so the smash-up can indeed create lots of heat and light of all sorts, just as you might naively expect. By “light” I mean not just visible light but all forms of electromagnetic waves, at all wavelengths (and therefore at all frequencies.) Scientists divide up the range of electromagnetic waves into categories. These categories are radio waves, microwaves, infrared light, visible light, ultraviolet light, X-rays, and gamma rays, listed from lowest frequency and largest wavelength to highest frequency and smallest wavelength. (Note that these categories and the dividing lines between them are completely arbitrary, but the divisions are useful for various scientific purposes. The **only** fundamental difference between yellow light, a radio wave, and a gamma ray is the wavelength and frequency; otherwise they’re exactly the same type of thing, a wave in the electric and magnetic fields.)
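Since wavelength and frequency are tied together by the wave relation c = wavelength × frequency, either one fixes the other. Here is a small Python sketch; the wavelengths chosen for each band are rough, conventional values picked for illustration, not official boundaries:

```python
# The only difference between the named bands is wavelength/frequency,
# related by c = wavelength * frequency. The band wavelengths below are
# rough illustrative choices, not physical constants.
C = 299_792_458.0  # speed of light, m/s

def frequency_hz(wavelength_m):
    """Frequency of an electromagnetic wave of the given wavelength."""
    return C / wavelength_m

# Representative wavelengths (meters) for the conventional categories,
# from largest wavelength (lowest frequency) to smallest:
bands = {
    "radio":       1e-1,    # ~10 cm
    "microwave":   1e-3,    # ~1 mm
    "infrared":    1e-5,    # ~10 micrometers
    "visible":     5.5e-7,  # ~550 nm (yellow-green)
    "ultraviolet": 1e-7,    # ~100 nm
    "X-ray":       1e-9,    # ~1 nm
    "gamma ray":   1e-12,   # ~1 picometer
}

for name, wl in bands.items():
    print(f"{name:>12}: wavelength ~{wl:.0e} m, frequency ~{frequency_hz(wl):.1e} Hz")
```

Running this confirms the ordering in the text: the smaller the wavelength, the higher the frequency, with everything in between being the same kind of wave.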

So if and when two neutron stars merge, we expect both gravitational waves and electromagnetic waves, the latter of many different frequencies created by many different effects that can arise when two huge balls of neutrons collide. But just because we expect them doesn’t mean they’re easy to see. These mergers are pretty rare — perhaps one every hundred thousand years in each big galaxy like our own — so the ones we find using LIGO/VIRGO will generally be very far away. If the light show is too dim, none of our telescopes will be able to see it.

But this light show was plenty bright. Gamma ray detectors out in space detected it instantly, confirming that the gravitational waves from the two neutron stars led to a collision and merger that produced very high frequency light. Already, that’s a first. It’s as though one had seen lightning for years but never heard thunder; or as though one had observed the waves from hurricanes for years but never observed one in the sky. Seeing both allows us a whole new set of perspectives; one plus one is often much more than two.

Over time — hours and days — effects were seen in visible light, ultraviolet light, infrared light, X-rays and radio waves. Some were seen earlier than others, which itself is a story, but each one contributes to our understanding of what these mergers are actually like.

**Confirmation of the best guess concerning the origin of “short” gamma ray bursts**: For many years, bursts of gamma rays have been observed in the sky. Among them, there seems to be a class of bursts that are shorter than most, typically lasting just a couple of seconds. They come from all across the sky, indicating that they come from distant intergalactic space, presumably from distant galaxies. Among other explanations, the most popular hypothesis concerning these short gamma-ray bursts has been that they come from merging neutron stars. *The only way to confirm this hypothesis is with the observation of the gravitational waves from such a merger.* That test has now been passed; it appears that the hypothesis is correct. That in turn means that we have, for the first time, both a good explanation of these short gamma ray bursts and, because we know how often we observe these bursts, a good estimate as to how often neutron stars merge in the universe.

**First distance measurement to a source using both a gravitational wave measure and a redshift in electromagnetic waves, allowing a new calibration of the distance scale of the universe and of its expansion rate:** The pattern over time of the gravitational waves from a merger of two black holes or neutron stars is complex enough to reveal many things about the merging objects, including a rough estimate of their masses and the orientation of the spinning pair relative to the Earth. The overall strength of the waves, combined with the knowledge of the masses, reveals how far the pair is from the Earth. That by itself is nice, but the real win comes when the discovery of the object using visible light, or in fact any light with frequency below gamma-rays, can be made. In this case, the galaxy that contains the neutron stars can be determined.

Once we know the host galaxy, we can do something really important. We can, by looking at the starlight, determine how rapidly the galaxy is moving away from us. For distant galaxies, the speed at which the galaxy recedes should be related to its distance because the universe is expanding.

How rapidly the universe is expanding has been recently measured with remarkable precision, but the problem is that there are two different methods for making the measurement, **and they disagree**. This disagreement is one of the most important problems for our understanding of the universe. Maybe one of the measurement methods is flawed, or maybe — and this would be much more interesting — the universe simply doesn’t behave the way we think it does.

What gravitational waves do is give us a third method: the gravitational waves directly provide the distance to the galaxy, and the electromagnetic waves directly provide the speed of recession. *There is no other way to make this type of joint measurement directly for distant galaxies.* The method is not accurate enough to be useful in just one merger, but once dozens of mergers have been observed, the average result will provide important new information about the universe’s expansion. When combined with the other methods, it may help resolve this all-important puzzle.
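For nearby galaxies, this joint measurement boils down to Hubble’s law, v = H0 × d: the redshift gives the recession velocity v, the gravitational waves give the distance d, and their ratio is the expansion rate. A minimal Python sketch, with illustrative (assumed) numbers rather than any published measurement:

```python
# Sketch of the joint measurement: the gravitational-wave signal gives
# the distance d, the electromagnetic redshift gives the recession
# velocity v, and Hubble's law v = H0 * d yields the expansion rate.
# The example numbers below are illustrative assumptions.
C_KM_S = 299_792.458  # speed of light, km/s

def hubble_constant(redshift, distance_mpc):
    """H0 in km/s/Mpc from a small redshift and a GW-inferred distance."""
    velocity = C_KM_S * redshift  # good approximation for z << 1
    return velocity / distance_mpc

# e.g. a redshift of 0.01 and a GW distance of 43 Mpc (hypothetical):
print(hubble_constant(0.01, 43.0))  # roughly 70 km/s/Mpc
```

One merger gives one noisy estimate; averaging over many mergers is what makes the method competitive.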

**Best test so far of Einstein’s prediction that the speed of light and the speed of gravitational waves are identical**: Since gamma rays from the merger and the peak of the gravitational waves arrived within two seconds of one another after traveling 130 million years — that is, about 4 thousand million million seconds — we can say that the speed of light and the speed of gravitational waves are both equal to the cosmic speed limit to within one part in 2 thousand million million. *Such a precise test requires the combination of gravitational wave and gamma ray observations.*
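The arithmetic behind this bound fits in a few lines; the year length and the two-second arrival difference are the rough values quoted above:

```python
# A 2-second arrival difference after a 130-million-year journey bounds
# any fractional difference between the two speeds.
YEAR_S = 3.156e7               # seconds per year (approximate)
travel_time = 130e6 * YEAR_S   # ~4e15 seconds of travel
delta_t = 2.0                  # observed arrival-time difference, seconds

fractional_bound = delta_t / travel_time
print(f"travel time ~{travel_time:.1e} s")
print(f"|v_gw - c| / c < ~{fractional_bound:.0e}")
```

The bound comes out around one part in two thousand million million, as quoted.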

**Efficient production of heavy elements confirmed**: It’s long been said that we are star-stuff, or stardust, and it’s been clear for a long time that it’s true. But there’s been a puzzle when one looks into the details. While it’s known that all the chemical elements from hydrogen up to iron are formed inside of stars, and can be blasted into space in supernova explosions to drift around and eventually form planets, moons, and humans, it hasn’t been quite as clear how the other elements with heavier atoms — atoms such as iodine, cesium, gold, lead, bismuth, uranium and so on — predominantly formed. Yes they can be formed in supernovas, but not so easily; and there seem to be more atoms of heavy elements around the universe than supernovas can explain. There are many supernovas in the history of the universe, but the efficiency for producing heavy chemical elements is just too low.

It was proposed some time ago that the mergers of neutron stars might be a suitable place to produce these heavy elements. Even though these mergers are rare, they might be much more efficient, because the nuclei of heavy elements contain lots of neutrons and, not surprisingly, a collision of two neutron stars would produce lots of neutrons in its debris, suitable perhaps for making these nuclei. A key indication that this is going on would be the following: if a neutron star merger could be identified using gravitational waves, and if its location could be determined using telescopes, then one would observe a pattern of light that would be characteristic of what is now called a “kilonova” explosion. *Warning: I don’t yet know much about kilonovas and I may be leaving out important details.* A kilonova is powered by the process of forming heavy elements; most of the nuclei produced are initially radioactive — i.e., unstable — and they break down by emitting high energy particles, including the particles of light (called photons) which are in the gamma ray and X-ray categories. The resulting characteristic glow would be expected to have a pattern of a certain type: it would be initially bright but would dim rapidly in visible light, with a long afterglow in infrared light. The reasons for this are complex, so let me set them aside for now. The important point is that this pattern was observed, confirming that a kilonova of this type occurred, and thus that, in this neutron star merger, enormous amounts of heavy elements were indeed produced. So we now have a lot of evidence, for the first time, that almost all the heavy chemical elements on and around our planet were formed in neutron star mergers. Again, *we could not know this if we did not know that this was a neutron star merger, and that information comes only from the gravitational wave observation.*

**MISCELLANEOUS QUESTIONS**

**Did the merger of these two neutron stars result in a new black hole, a larger neutron star, or an unstable rapidly spinning neutron star that later collapsed into a black hole?**

We don’t yet know, and maybe we won’t know. Some scientists involved appear to be leaning toward the possibility that a black hole was formed, but others seem to say the jury is out. I’m not sure what additional information can be obtained over time about this.

**If the two neutron stars formed a black hole, why was there a kilonova? Why wasn’t everything sucked into the black hole?**

Black holes aren’t vacuum cleaners; they pull things in via gravity just the same way that the Earth and Sun do, and don’t suck things in some unusual way. The only crucial thing about a black hole is that once you go in you can’t come out. But just as when trying to avoid hitting the Earth or Sun, you can avoid falling in if you orbit fast enough or if you’re flung outward before you reach the edge.

The point in a neutron star merger is that the forces at the moment of merger are so intense that one or both neutron stars are partially ripped apart. The material that is thrown outward in all directions, at an immense speed, somehow creates the bright, hot flash of gamma rays and eventually the kilonova glow from the newly formed atomic nuclei. Those details I don’t yet understand, but I know they have been carefully studied both with approximate equations and in computer simulations such as this one and this one. However, *the accuracy of the simulations can only be confirmed through the detailed studies of a merger, such as the one just announced.* It seems, from the data we’ve seen, that the simulations did a fairly good job. I’m sure they will be improved once they are compared with the recent data.

Filed under: Astronomy, Gravitational Waves Tagged: black holes, Gravitational Waves, LIGO, neutron stars

## October 16, 2017

### Sean Carroll - Preposterous Universe

Everyone is rightly excited about the latest gravitational-wave discovery. The LIGO observatory, recently joined by its European partner VIRGO, had previously seen gravitational waves from coalescing black holes. Which is super-awesome, but also a bit lonely — black holes are black, so we detect the gravitational waves and little else. Since our current gravitational-wave observatories aren’t very good at pinpointing source locations on the sky, we’ve been completely unable to say which galaxy, for example, the events originated in.

This has changed now, as we’ve launched the era of “multi-messenger astronomy,” detecting both gravitational and electromagnetic radiation from a single source. The event was the merger of two neutron stars, rather than black holes, and all that matter coming together in a giant conflagration lit up the sky in a large number of wavelengths simultaneously.

Look at all those different observatories, and all those wavelengths of electromagnetic radiation! Radio, infrared, optical, ultraviolet, X-ray, and gamma-ray — soup to nuts, astronomically speaking.

A lot of cutting-edge science will come out of this, see e.g. this main science paper. Apparently some folks are very excited by the fact that the event produced an amount of gold equal to several times the mass of the Earth. But it’s my blog, so let me highlight the aspect of personal relevance to me: using “standard sirens” to measure the expansion of the universe.

We’re already pretty good at measuring the expansion of the universe, using something called the cosmic distance ladder. You build up distance measures step by step, determining the distance to nearby stars, then to more distant clusters, and so forth. Works well, but of course is subject to accumulated errors along the way. This new kind of gravitational-wave observation is something else entirely, allowing us to completely jump over the distance ladder and obtain an independent measurement of the distance to cosmological objects. See this LIGO explainer.

The simultaneous observation of gravitational and electromagnetic waves is crucial to this idea. You’re trying to compare two things: the distance to an object, and the apparent velocity with which it is moving away from us. Usually velocity is the easy part: you measure the redshift of light, which is easy to do when you have an electromagnetic spectrum of an object. But with gravitational waves alone, you can’t do it — there isn’t enough structure in the spectrum to measure a redshift. That’s why the exploding neutron stars were so crucial; in this event, GW170817, we can for the first time determine the precise redshift of a distant gravitational-wave source.

Measuring the distance is the tricky part, and this is where gravitational waves offer a new technique. The favorite conventional strategy is to identify “standard candles” — objects for which you have a reason to believe you know their intrinsic brightness, so that by comparing to the brightness you actually observe you can figure out the distance. To discover the acceleration of the universe, for example, astronomers used Type Ia supernovae as standard candles.

Gravitational waves don’t quite give you standard candles; every one will generally have a different intrinsic gravitational “luminosity” (the amount of energy emitted). But by looking at the precise way in which the source evolves — the characteristic “chirp” waveform in gravitational waves as the two objects rapidly spiral together — we can work out precisely what that total luminosity actually is. Here’s the chirp for GW170817, compared to the other sources we’ve discovered — much more data, almost a full minute!
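For the curious, the leading-order (“Newtonian”) version of this inference can be sketched: at lowest order, how fast the chirp frequency sweeps upward depends on the two masses only through one combination, the “chirp mass”, so measuring the frequency f and its rate of change df/dt determines it. This is a simplified textbook sketch, not the full waveform analysis the LIGO/VIRGO collaborations actually perform; the 1.19-solar-mass value is roughly the chirp mass reported for GW170817.

```python
import math

# Leading-order chirp relation from textbook gravitational-wave theory:
# df/dt = (96/5) * pi^(8/3) * (G * Mc / c^3)^(5/3) * f^(11/3),
# where Mc is the chirp mass. A simplified sketch, not LIGO's analysis.
G = 6.674e-11    # gravitational constant, SI units
C = 2.998e8      # speed of light, m/s
MSUN = 1.989e30  # solar mass, kg

def fdot(f_hz, chirp_mass_kg):
    """Rate of change of the GW frequency for a given chirp mass."""
    k = (G * chirp_mass_kg / C**3) ** (5.0 / 3.0)
    return (96.0 / 5.0) * math.pi ** (8.0 / 3.0) * k * f_hz ** (11.0 / 3.0)

def chirp_mass(f_hz, fdot_hz_s):
    """Invert the relation: recover the chirp mass from f and df/dt."""
    k = fdot_hz_s * 5.0 / (96.0 * math.pi ** (8.0 / 3.0) * f_hz ** (11.0 / 3.0))
    return k ** (3.0 / 5.0) * C**3 / G

# For a GW170817-like chirp mass of ~1.19 solar masses at f = 100 Hz:
rate = fdot(100.0, 1.19 * MSUN)
print(f"df/dt ~ {rate:.1f} Hz/s at 100 Hz")
print(f"recovered chirp mass: {chirp_mass(100.0, rate) / MSUN:.2f} Msun")
```

Once the chirp mass is pinned down by the sweep rate, the theory fixes the intrinsic gravitational luminosity, which is what turns the source into a standard siren.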

So we have both distance and redshift, without using the conventional distance ladder at all! This is important for all sorts of reasons. An independent way of getting at cosmic distances will allow us to measure properties of the dark energy, for example. You might also have heard that there is a discrepancy between different ways of measuring the Hubble constant, which either means someone is making a tiny mistake or there is something dramatically wrong with the way we think about the universe. Having an independent check will be crucial in sorting this out. Just from this one event, we are able to say that the Hubble constant is 70 kilometers per second per megaparsec, albeit with large error bars (+12, -8 km/s/Mpc). That will get much better as we collect more events.
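As a rough sketch of where those asymmetric error bars come from: H0 = v/d, so the uncertainty range in the GW-inferred distance maps directly onto an H0 range. The numbers below are illustrative assumptions of about the right size, not the published values:

```python
# Why the error bars are asymmetric: H0 = v / d, so the quoted distance
# range maps directly onto an H0 range. Velocity and distances here are
# illustrative assumptions, not the published measurement.
v = 3000.0                            # recession velocity, km/s (assumed)
d_best, d_lo, d_hi = 43.0, 36.0, 49.0 # GW distance and range, Mpc (assumed)

h0_best = v / d_best
h0_hi = v / d_lo   # a smaller distance means a larger H0
h0_lo = v / d_hi
print(f"H0 ~ {h0_best:.0f} +{h0_hi - h0_best:.0f} / -{h0_best - h0_lo:.0f} km/s/Mpc")
```

Because the distance enters in the denominator, a symmetric distance uncertainty produces an asymmetric H0 uncertainty, which is the shape of the quoted result.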

So here is my (infinitesimally tiny) role in this exciting story. The idea of using gravitational-wave sources as standard sirens was put forward by Bernard Schutz all the way back in 1986. But it’s been developed substantially since then, especially by my friends Daniel Holz and Scott Hughes. Years ago Daniel told me about the idea, as he and Scott were writing one of the early papers. My immediate response was “Well, you have to call these things `standard sirens.'” And so a useful label was born.

Sadly for my share of the glory, my Caltech colleague Sterl Phinney also suggested the name simultaneously, as the acknowledgments to the paper testify. That’s okay; when one’s contribution is this extremely small, sharing it doesn’t seem so bad.

By contrast, the glory attaching to the physicists and astronomers who pulled off this observation, and the many others who have contributed to the theoretical understanding behind it, is substantial indeed. Congratulations to all of the hard-working people who have truly opened a new window on how we look at our universe.

### Matt Strassler - Of Particular Significance

Gravitational waves are now the most important new tool in the astronomer’s toolbox. Already they’ve been used to confirm that large black holes — with masses ten or more times that of the Sun — and mergers of these large black holes to form even larger ones, are not uncommon in the universe. Today it goes a big step further.

It’s long been known that neutron stars, remnants of collapsed stars that have exploded as supernovas, are common in the universe. And it’s been known almost as long that sometimes neutron stars travel in pairs. (In fact that’s how gravitational waves were first discovered, indirectly, back in the 1970s.) Stars often form in pairs, and sometimes both stars explode as supernovas, leaving their neutron star relics in orbit around one another. Neutron stars are small — just ten or so kilometers (miles) across. According to Einstein’s theory of gravity, a pair of stars should gradually lose energy by emitting gravitational waves into space, and slowly but surely the two objects should spiral in on one another. Eventually, after many millions or even billions of years, they collide and merge into a larger neutron star, or into a black hole. This collision does two things.

- It makes some kind of brilliant flash of light — electromagnetic waves — whose details are only guessed at. Some of those electromagnetic waves will be in the form of visible light, while much of the light will be in invisible forms, such as gamma rays.
- It makes gravitational waves, whose details are easier to calculate and which are therefore distinctive, but couldn’t have been detected until LIGO and VIRGO started taking data, LIGO over the last couple of years, VIRGO over the last couple of months.

It’s possible that we’ve seen the light from neutron star mergers before, but no one could be sure. Wouldn’t it be great, then, if we could see gravitational waves AND electromagnetic waves from a neutron star merger? It would be a little like seeing the flash and hearing the sound from fireworks — seeing and hearing is better than either one separately, with each one clarifying the other. *(Caution: scientists often speak as if detecting gravitational waves is like “hearing”. This is only an analogy, and a vague one! It’s not at all the same as acoustic waves that we can hear with our ears, for many reasons… so please don’t take it too literally.)* If we could do both, we could learn about neutron stars and their properties in an entirely new way.

Today, we learned that this has happened. LIGO, with the world’s first two gravitational wave observatories, detected the waves from two merging neutron stars, 130 million light years from Earth, on August 17th. (Neutron star mergers last much longer than black hole mergers, so the two are easy to distinguish; and this one was so close, relatively speaking, that it was seen for a long while.) VIRGO, with the third detector, allows scientists to triangulate and determine roughly where mergers have occurred. They saw only a very weak signal, but that was extremely important, because it told the scientists *that the merger must have occurred in a small region of the sky where VIRGO has a relative blind spot*. That told scientists where to look.

The merger was detected for more than a full minute… to be compared with black holes whose mergers can be detected for less than a second. It’s not exactly clear yet what happened at the end, however! Did the merged neutron stars form a black hole or a neutron star? The jury is out.

At almost exactly the moment at which the gravitational waves reached their peak, a blast of gamma rays — electromagnetic waves of very high frequencies — was detected by a different scientific team, the one from FERMI. FERMI detects gamma rays from the distant universe every day, and a two-second gamma-ray-burst is not unusual. And INTEGRAL, another gamma ray experiment, also detected it. The teams communicated within minutes. The FERMI and INTEGRAL gamma ray detectors can only indicate the rough region of the sky from which their gamma rays originate, and LIGO/VIRGO together also only give a rough region. But the scientists saw that those regions overlapped. The evidence was clear. And with that, astronomy entered a new, highly anticipated phase.

Already this was a huge discovery. Brief gamma-ray bursts have been a mystery for years. One of the best guesses as to their origin has been neutron star mergers. Now the mystery is solved; that guess is apparently correct. *(Or is it? Probably, but the gamma-ray burst is surprisingly dim, given how close the source is. So there are still questions to ask.)*

Also confirmed by the fact that these signals arrived within a couple of seconds of one another, after traveling for over 100 million years from the same source, is that, indeed, the speed of light and the speed of gravitational waves are exactly the same — both of them equal to the cosmic speed limit, just as Einstein’s theory of gravity predicts.

Next, these teams quickly told their astronomer friends to train their telescopes in the general area of the source. Dozens of telescopes, from every continent and from space, and looking for electromagnetic waves at a huge range of frequencies, pointed in that rough direction and scanned for anything unusual. *(A big challenge: the object was near the Sun in the sky, so it could be viewed in darkness only for an hour each night!)* Light was detected! At all frequencies! The object was very bright, making it easy to find the galaxy in which the merger took place. The brilliant glow was seen in gamma rays, ultraviolet light, infrared light, X-rays, and radio. (Neutrinos, particles that can serve as another way to observe distant explosions, were not detected this time.)

And with so much information, so much can be learned!

Most important, perhaps, is this: from the pattern of the spectrum of light, the conjecture seems to be confirmed that the mergers of neutron stars are important sources, perhaps the dominant one, for many of the heavy chemical elements — iodine, iridium, cesium, gold, platinum, and so on — that are forged in the intense heat of these collisions. It used to be thought that the same supernovas that form neutron stars in the first place were the most likely source. But now it seems that this second stage of neutron star life — merger, rather than birth — is just as important. That’s fascinating, because neutron star mergers are much more rare than the supernovas that form them. There’s a supernova in our Milky Way galaxy every century or so, but it’s tens of millennia or more between these “kilonovas”, created in neutron star mergers.

If there’s anything disappointing about this news, it’s this: almost everything that was observed by all these different experiments was predicted in advance. Sometimes it’s more important and useful when some of your predictions fail completely, because then you realize how much you have to learn. Apparently our understanding of gravity, of neutron stars, and of their mergers, and of all sorts of sources of electromagnetic radiation that are produced in those mergers, is even better than we might have thought. But fortunately there are a few new puzzles. The X-rays were late; the gamma rays were dim… we’ll hear more about this shortly, as NASA is holding a second news conference.

Some highlights from the second news conference:

- New information about neutron star interiors, which affects how large they are and therefore how exactly they merge, has been obtained
- The first ever visual-light image of a gravitational wave source, from the Swope telescope, at the outskirts of a distant galaxy; the galaxy’s center is the blob of light, and the arrow points to the explosion.

- The theoretical calculations for a kilonova explosion suggest that debris from the blast should rather quickly block the visual light, so the explosion dims quickly in visible light — but infrared light lasts much longer. The observations by the visible and infrared light telescopes confirm this aspect of the theory; and you can see evidence for that in the picture above, where four days later the bright spot is both much dimmer and much redder than when it was discovered.
- Estimate: the total mass of the gold and platinum produced in this explosion is vastly larger than the mass of the Earth.
- Estimate: these neutron stars were formed about 10 or so billion years ago. They’ve been orbiting each other for most of the universe’s history, and ended their lives just 130 million years ago, creating the blast we’ve so recently detected.
- Big Puzzle: all of the previous gamma-ray bursts seen up to now have shone in ultraviolet light and X-rays as well as gamma rays. But X-rays didn’t show up this time, at least not initially. This was a big surprise. It took 9 days for the Chandra telescope to observe X-rays, too faint for any other X-ray telescope. Does this mean that the two neutron stars created a black hole, which then created a jet of matter that points not quite directly at us but off-axis, and shines by illuminating the matter in interstellar space? This had been suggested as a possibility twenty years ago, but this is the first time there’s been any evidence for it.
- One more surprise: it took 16 days for radio waves from the source to be discovered, with the Very Large Array, the most powerful existing radio telescope. The radio emission has been growing brighter since then! As with the X-rays, this seems also to support the idea of an off-axis jet.
- Nothing quite like this gamma-ray burst has been seen — or rather, recognized — before. When a gamma ray burst doesn’t have an X-ray component showing up right away, it simply looks odd and a bit mysterious. It’s harder to observe than most bursts, because without a jet pointing right at us, its afterglow fades quickly. Moreover, a jet pointing at us is bright, so it blinds us to the more detailed and subtle features of the kilonova. But this time, LIGO/VIRGO told scientists that “Yes, this is a neutron star merger”, leading to detailed study from all electromagnetic frequencies, including patient study over many days of the X-rays and radio. In other cases those observations would have stopped after just a short time, and the whole story couldn’t have been properly interpreted.

Filed under: Astronomy, Gravitational Waves

## October 15, 2017

## October 13, 2017

### Sean Carroll - Preposterous Universe

Trying to climb out from underneath a large pile of looming (and missed) deadlines, and in the process I’m hoping to ramp back up the real blogging. In the meantime, here are a couple of videos to tide you over.

First, an appearance a few weeks ago on Joe Rogan’s podcast. Rogan is a professional comedian and mixed-martial arts commentator, but has built a great audience for his wide-ranging podcast series. One of the things that makes him a good interviewer is his sincere delight in the material, as evidenced here by noting repeatedly that his mind had been blown. We talked for over two and a half hours, covering cosmology and quantum mechanics but also some bits about AI and pop culture.

And here’s a more straightforward lecture, this time at King’s College in London. The topic was “Extracting the Universe from the Wave Function,” which I’ve used for a few talks that ended up being pretty different in execution. This one was aimed at undergraduate physics students, some of whom hadn’t even had quantum mechanics. So the first half is a gentle introduction to many-worlds theory and why it’s the best version of quantum mechanics, and the second half tries to explain our recent efforts to emerge space itself out of quantum entanglement.

I was invited to King’s by Eugene Lim, one of my former grad students and now an extremely productive faculty member in his own right. It’s always good to see your kids grow up to do great things!

## October 09, 2017

### Alexey Petrov - Symmetry factor

I wanted to share some ideas about a teaching method I am trying to develop and implement this semester. Please let me know if you’ve heard of someone doing something similar.

This semester I am teaching our undergraduate mechanics class. This is the first time I am teaching it, so I started looking into possibilities to shake things up and maybe apply some new method of teaching. And there are plenty on offer: flipped classroom, peer instruction, Just-in-Time teaching, etc. They all look to “move away from the inefficient old model” where the professor is lecturing and students are taking notes. I have things to say about that, but not in this post. It suffices to say that most of those approaches are essentially trying to make students *work* (both with the lecturer and their peers) in class and outside of it. At the same time those methods attempt to “compartmentalize teaching,” i.e. make large classes “smaller” by bringing up each individual student’s contribution to class activities (by using “clickers”, small discussion groups, etc.). For several reasons those approaches did not fit my goal this semester.

Our Classical Mechanics class is a *gateway class* for our physics majors. It is the first class they take after they are done with general physics lectures. So the students are already familiar with the (simpler version of the) material they are going to be taught. The goal of this class is to start *molding physicists out of students*: they learn to simplify problems so physics methods can be properly applied (that’s how “a Ford Mustang improperly parked at the top of the icy hill slides down…” turns into “a block slides down the incline…”), learn to always derive the final formula before plugging in numbers, look at the asymptotics of their solutions as a way to see if the solution makes sense, and many other wonderful things.

So with all that I started doing something I’d like to call *non-linear teaching*. The gist of it is as follows. I give a lecture (and don’t get me wrong, I do make my students talk and work: I ask questions, we do “duels” (students argue different sides of a question), etc — all of that can be done efficiently in a class of 20 students). But instead of one homework with 3-4 problems per week I have two types of homework assignments for them: *short homeworks* and *projects*.

Short homework assignments are *single-problem assignments* given after each class that must be done by the next class. They are designed such that a student needs to re-derive material that we discussed previously in class, with a small new twist added. For example, in the block-down-the-incline problem discussed in class I ask them to choose the coordinate axes in a different way and prove that the result is independent of the choice of coordinate system. Or I ask them to find at which angle one should throw a stone to get the maximal possible range (including air resistance), etc. This way, instead of doing an assignment at the last minute at the end of the week, students have to work out what they just learned in class every day! More importantly, I get to change *how* I teach. Depending on how they did on the previous short homework, I adjust the material (both speed and volume) discussed in class. I also design examples for future sections in such a way that I can repeat parts of a topic that was hard for the students previously. Hence, instead of a linear progression through the course, we are moving along something akin to helical motion, returning to and spending more time on topics that students find more difficult. That’s why my teaching is “non-linear”.

Project homework assignments are designed to develop understanding of how topics in a given chapter relate to each other. There are as many project assignments as there are chapters. Students get two weeks to complete them.

Overall, students solve exactly the same number of problems they would in a normal lecture class; those problems are simply scheduled in a different way. This way, students are forced to learn by constantly re-working what was just discussed in lecture. And for me, I can quickly react (by adjusting lecture material and speed) using the constant feedback I get from students in the form of short homeworks. Win-win!

I will do benchmarking at the end of the class by comparing my class’s performance to aggregate data from previous years. I’ll report on it later. But for now I would be interested to hear your comments!

## October 05, 2017

### Symmetrybreaking - Fermilab/SLAC

Instead of searching for dark matter particles, a new device will search for dark matter waves.

Researchers are testing a prototype “radio” that could let them listen to the tune of mysterious dark matter particles.

Dark matter is an invisible substance thought to be five times more prevalent in the universe than regular matter. According to theory, billions of dark matter particles pass through the Earth each second. We don’t notice them because they interact with regular matter only very weakly, through gravity.

So far, researchers have mostly been looking for dark matter particles. But with the dark matter radio, they want to look for dark matter waves.

Direct detection experiments for dark matter particles use large underground detectors. Researchers hope to see signals from dark matter particles colliding with the detector material. However, this only works if dark matter particles are heavy enough to deposit a detectable amount of energy in the collision.

“If dark matter particles were very light, we might have a better chance of detecting them as waves rather than particles,” says Peter Graham, a theoretical physicist at the Kavli Institute for Particle Astrophysics and Cosmology, a joint institute of Stanford University and the Department of Energy’s SLAC National Accelerator Laboratory. “Our device will take the search in that direction.”

The dark matter radio makes use of a bizarre concept of quantum mechanics known as wave-particle duality: Every particle can also behave like a wave.

Take, for example, the photon: the massless fundamental particle that carries the electromagnetic force. Streams of them make up electromagnetic radiation, or light, which we typically describe as waves—including radio waves.

The dark matter radio will search for dark matter waves associated with two particular dark matter candidates. It could find hidden photons—hypothetical cousins of photons with a small mass. Or it could find axions, which scientists think can be produced out of light and transform back into it in the presence of a magnetic field.

“The search for hidden photons will be completely unexplored territory,” says Saptarshi Chaudhuri, a Stanford graduate student on the project. “As for axions, the dark matter radio will close gaps in the searches of existing experiments.”

### Intercepting dark matter vibes

A regular radio intercepts radio waves with an antenna and converts them into sound. What you hear depends on the station. A listener chooses a station by adjusting an electric circuit, in which electricity can oscillate with a certain resonant frequency. If the circuit’s resonant frequency matches the station’s frequency, the radio is tuned in and the listener can hear the broadcast.

The dark matter radio works the same way. At its heart is an electric circuit with an adjustable resonant frequency. If the device were tuned to a frequency that matched the frequency of a dark matter particle wave, the circuit would resonate. Scientists could measure the frequency of the resonance, which would reveal the mass of the dark matter particle.
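The tuning step is the standard resonance relation for an LC circuit, f₀ = 1/(2π√(LC)): changing the capacitance (or inductance) moves the resonant frequency. A minimal sketch of that relation — the component values below are illustrative round numbers, not the actual instrument’s:

```python
import math

def resonant_frequency(inductance_h, capacitance_f):
    """Resonant frequency of an ideal LC circuit, in Hz: f0 = 1/(2*pi*sqrt(L*C))."""
    return 1.0 / (2.0 * math.pi * math.sqrt(inductance_h * capacitance_f))

# Illustrative values: a 1 uH inductor with a tunable capacitor.
# Shrinking C from 1 nF to 1 pF sweeps the resonance from about 5 MHz to about 159 MHz.
for c in (1e-9, 1e-10, 1e-12):
    print(f"C = {c:.0e} F  ->  f0 = {resonant_frequency(1e-6, c) / 1e6:.1f} MHz")
```

A frequency sweep, in this picture, is just a slow, continuous version of that loop: step the capacitance, dwell, listen for a resonance, repeat.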

The idea is to do a frequency sweep by slowly moving through the different frequencies, as if tuning a radio from one end of the dial to the other.

The electric signal from dark matter waves is expected to be very weak. Therefore, Graham has partnered with a team led by another KIPAC researcher, Kent Irwin. Irwin’s group is developing highly sensitive magnetometers known as superconducting quantum interference devices, or SQUIDs, which they’ll pair with extremely low-noise amplifiers to hunt for potential signals.

In its final design, the dark matter radio will search for particles in a mass range of trillionths to millionths of an electronvolt. (One electronvolt is about a billionth of the mass of a proton.) This is somewhat problematic because this range includes kilohertz to gigahertz frequencies—frequencies used for over-the-air broadcasting.
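The link between a particle mass and a wave frequency is the rest-energy relation f = mc²/h: a mass quoted in electronvolts maps directly to an oscillation frequency. A quick sketch of that conversion — the constants are exact SI values, and the sample masses are just points in the quoted range:

```python
PLANCK_H = 6.62607015e-34       # Planck constant, J*s (exact SI value)
JOULES_PER_EV = 1.602176634e-19  # energy of one electronvolt, J (exact SI value)

def mass_ev_to_frequency_hz(mass_ev):
    """Frequency of the wave associated with a particle whose rest energy is
    mass_ev electronvolts: f = m*c^2 / h, with the mass given as an energy."""
    return mass_ev * JOULES_PER_EV / PLANCK_H

# Sample masses spanning trillionths to millionths of an electronvolt:
for m in (1e-12, 1e-9, 1e-6):
    print(f"{m:.0e} eV  ->  {mass_ev_to_frequency_hz(m):.3e} Hz")
```

One electronvolt corresponds to roughly 2.4 × 10¹⁴ Hz, so each factor of ten in mass shifts the search frequency by the same factor of ten.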

“Shielding the radio from unwanted radiation is very important and also quite challenging,” Irwin says. “In fact, we would need a several-yards-thick layer of copper to do so. Fortunately we can achieve the same effect with a thin layer of superconducting metal.”

One advantage of the dark matter radio is that it does not need to be shielded from cosmic rays. Whereas direct detection searches for dark matter particles must operate deep underground to block out particles falling from space, the dark matter radio can operate in a university basement.

The researchers are now testing a small-scale prototype at Stanford that will scan a relatively narrow frequency range. They plan on eventually operating two independent, full-size instruments at Stanford and SLAC.

“This is exciting new science,” says Arran Phipps, a KIPAC postdoc on the project. “It’s great that we get to try out a new detection concept with a device that is relatively low-budget and low-risk.”

The dark matter disc jockeys are taking the first steps now and plan to conduct their dark matter searches over the next few years. Stay tuned for future results.

## October 03, 2017

### Jon Butterworth - Life and Physics

At the Guardian.

Filed under: Astrophysics, Particle Physics, Physics, Science Tagged: dark energy, dark matter, Perimeter Institute, Relativity

### Symmetrybreaking - Fermilab/SLAC

Scientists Rainer Weiss, Kip Thorne and Barry Barish won the 2017 Nobel Prize in Physics for their roles in creating the LIGO experiment.

Three scientists who made essential contributions to the LIGO collaboration have been awarded the 2017 Nobel Prize in Physics.

Rainer Weiss will share the prize with Kip Thorne and Barry Barish for their roles in the discovery of gravitational waves, ripples in space-time predicted by Albert Einstein. Weiss and Thorne conceived of LIGO, and Barish is credited with reviving the struggling experiment and making it happen.

“I view this more as a thing that recognizes the work of about 1000 people,” Weiss said during a Q&A after the announcement this morning. “It’s really a dedicated effort that has been going on, I hate to tell you, for as long as 40 years, people trying to make a detection in the early days and then slowly but surely getting the technology together to do it.”

Another founder of LIGO, scientist Ronald Drever, died in March. Nobel Prizes are not awarded posthumously.

According to Einstein’s general theory of relativity, powerful cosmic events release energy in the form of waves traveling through the fabric of existence at the speed of light. LIGO detects these disturbances when they disrupt the symmetry between the passages of identical laser beams traveling identical distances.

The setup for the LIGO experiment looks like a giant L, with each side stretching about 2.5 miles long. Scientists split a laser beam and shine the two halves down the two sides of the L. When each half of the beam reaches the end, it reflects off a mirror and heads back to the place where its journey began.

Normally, the two halves of the beam return at the same time. When there’s a mismatch, scientists know something is going on. Gravitational waves compress space-time in one direction and stretch it in another, giving one half of the beam a shortcut and sending the other on a longer trip. LIGO is sensitive enough to notice a difference between the arms as small as one-thousandth the diameter of an atomic nucleus.
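That sensitivity figure can be checked on the back of an envelope: the arm-length change is the strain times the arm length, ΔL = h × L. A sketch assuming a typical peak strain of about 10⁻²¹ and a rough nuclear diameter — both are order-of-magnitude values assumed here, not figures from the article:

```python
STRAIN = 1e-21              # typical peak strain of a detected merger (order of magnitude)
ARM_LENGTH_M = 4.0e3        # each arm is about 2.5 miles, roughly 4 km
NUCLEUS_DIAMETER_M = 5e-15  # diameter of a light atomic nucleus, roughly

delta_l = STRAIN * ARM_LENGTH_M
print(f"arm-length change: {delta_l:.1e} m")
print(f"fraction of a nuclear diameter: {delta_l / NUCLEUS_DIAMETER_M:.1e}")
# about 4e-18 m, i.e. on the order of a thousandth of a nuclear diameter
```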

Scientists on LIGO and their partner collaboration, called Virgo, reported the first detection of gravitational waves in February 2016. The waves were generated in the collision of two black holes with 29 and 36 times the mass of the sun 1.3 billion years ago. They reached the LIGO experiment as scientists were conducting an engineering test.

“It took us a long time, something like two months, to convince ourselves that we had seen something from outside that was truly a gravitational wave,” Weiss said.

LIGO, which stands for Laser Interferometer Gravitational-Wave Observatory, consists of two of these pieces of equipment, one located in Louisiana and another in Washington state.

The experiment is operated jointly by Weiss’s home institution, MIT, and Barish and Thorne’s home institution, Caltech. The experiment has collaborators from more than 80 institutions from more than 20 countries. A third interferometer, operated by the Virgo collaboration, recently joined LIGO to make the first joint observation of gravitational waves.

## September 28, 2017

### Symmetrybreaking - Fermilab/SLAC

A Fermilab technical specialist recently invented a device that could help alert oncoming trains to large vehicles stuck on the tracks.

Browsing YouTube late at night, Fermilab Technical Specialist Derek Plant stumbled on a series of videos that all begin the same way: a large vehicle—a bus, semi or other low-clearance vehicle—is stuck on a railroad crossing. In the end, the train crashes into the stuck vehicle, destroying it and sometimes even derailing the train. According to the Federal Railroad Administration, hundreds of vehicles are struck this way every year by trains, which can take over a mile to stop.

“I was just surprised at the number of these that I found,” Plant says. “For every accident that’s videotaped, there are probably many more.”

Inspired by a workplace safety class that preached a principle of minimizing the impact of accidents, Plant set about looking for solutions to the problem of trains hitting stuck vehicles.

Railroad tracks are elevated for proper drainage, and the humped profile of many crossings can cause a vehicle to bottom out. “Theoretically, we could lower all the crossings so that they’re no longer a hump. But there are 200,000 crossings in the United States,” Plant says. “Railroads and local governments are trying hard to minimize the number of these crossings by creating overpasses, or elevating roadways. That’s cost-prohibitive, and it’s not going to happen soon.”

Other solutions, such as re-engineering the suspension on vehicles likely to get stuck, seemed equally improbable.

After studying how railroad signaling systems work, Plant came up with an idea: to fake the presence of a train. His invention was developed in his spare time using techniques and principles he learned over his almost two decades at Fermilab. It is currently in the patent application process and being prosecuted by Fermilab’s Office of Technology Transfer.

“If you cross over a railroad track and you look down the tracks, you’ll see red or yellow or green lights,” he says. “Trains have traffic signals too.”

These signals are tied to signal blocks—segments of the tracks that range from a mile to several miles in length. When a train is on the tracks, its metal wheels and axle connect both rails, forming an electric circuit through the tracks to trigger the signals. These signals inform other trains not to proceed while one train occupies a block, avoiding pileups.

Plant thought, “What if other vehicles could trigger the same signal in an emergency?” By faking the presence of a train, a vehicle stuck on the tracks could give advance warning for oncoming trains to stop and stall for time. Hence the name of Plant’s invention: the Ghost Train Generator.

To replicate the train’s presence, Plant knew he had to create a very strong electric current between the rails. The most straightforward way to do this is with massive amounts of metal, as a train does. But for the Ghost Train Generator to be useful in a pinch, it needs to be small, portable and easily applied. The answer to achieving these features lies in strong magnets and special wire.

“Put one magnet on one rail and one magnet on the other and the device itself mimics—electrically—what a train would look like to the signaling system,” he says. “In theory, this could be carried in vehicles that are at high risk for getting stuck on a crossing: semis, tour buses and first-response vehicles,” Plant says. “Keep it just like you would a fire extinguisher—just behind the seat or in an emergency compartment.”

Once the device is deployed, the train would receive the signal that the tracks were obstructed and stop. Then the driver of the stuck vehicle could call for emergency help using the hotline posted on all crossings.

Plant compares the invention to a seatbelt.

“Is it going to save your life 100 percent of the time? Nope, but smart people wear them,” he says. “It’s designed to prevent a collision when a train is more than two minutes from the crossing.”

And like a seatbelt, part of what makes Plant’s invention so appealing is its simplicity.

“The first thing I thought was that this is a clever invention,” says Aaron Sauers from Fermilab’s technology transfer office, who works with lab staff to develop new technologies for market. “It’s an elegant solution to an existing problem. I thought, ‘This technology could have legs.’”

The organizers of the National Innovation Summit seem to agree. In May, Fermilab received an Innovation Award from TechConnect for the Ghost Train Generator. The invention will also be featured as a showcase technology in the upcoming Defense Innovation Summit in October.

The Ghost Train Generator is currently in the pipeline to receive a patent with help from Fermilab, and its prospects are promising, according to Sauers. The filing is a nonprovisional patent application, which has specific claims and can be licensed. If the generator passes muster and is granted a patent, Plant will receive a portion of the royalties that it generates for Fermilab.

Fermilab encourages a culture of scientific innovation and exploration beyond the field of particle physics, according to Sauers, who noted that Plant’s invention is just one of a number of technology transfer initiatives at the lab.

Plant agrees—Fermilab’s environment helped motivate his efforts to find a solution for railroad crossing accidents.

“It’s just a general problem-solving state of mind,” he says. “That’s the philosophy we have here at the lab.”

*Editor's note: A version of this article was originally published by Fermilab.*

### Symmetrybreaking - Fermilab/SLAC

The national laboratory opened usually inaccessible areas of its campus to thousands of visitors to celebrate 50 years of discovery.

Fermi National Accelerator Laboratory’s yearlong 50th anniversary celebration culminated on Saturday with an Open House that drew thousands of visitors despite the unseasonable heat.

On display were areas of the lab not normally open to guests, including neutrino and muon experiments, a portion of the accelerator complex, lab spaces and magnet and accelerator fabrication and testing areas, to name a few. There were also live links to labs around the world, including CERN, a mountaintop observatory in Chile, and the mile-deep Sanford Underground Research Facility that will house the international neutrino experiment, DUNE.

But it wasn’t all physics. In addition to hands-on demos and a STEM fair, visitors could also learn about Fermilab’s art and history, walk the prairie trails or hang out with the ever-popular bison. In all, some 10,000 visitors got to go behind the scenes at Fermilab, shuttled around on 80 buses and welcomed by 900 Fermilab workers eager to explain their roles at the lab. Below, see a few of the photos captured as Fermilab celebrated 50 years of discovery.

## September 27, 2017

### Matt Strassler - Of Particular Significance

Welcome, VIRGO! Another merger of two big black holes has been detected, this time by both of LIGO’s detectors and by VIRGO as well.

Aside from the fact that this means that the VIRGO instrument actually works, which is great news, why is this a big deal? By adding a third gravitational wave detector, built by the VIRGO collaboration, to LIGO’s Washington and Louisiana detectors, the scientists involved in the search for gravitational waves now can determine fairly accurately the direction from which a detected gravitational wave signal is coming. And this allows them to do something new: to tell their astronomer colleagues roughly where to look in the sky, using ordinary telescopes, for some form of electromagnetic waves (perhaps visible light, gamma rays, or radio waves) that might have been produced by whatever created the gravitational waves.

The point is that with three detectors, one can triangulate. The gravitational waves travel for billions of years at the speed of light, and when they pass by, they are detected at both LIGO detectors and at VIRGO. But because it takes light a few thousandths of a second to travel the diameter of the Earth, the waves arrive at slightly different times at the LIGO Washington site, the LIGO Louisiana site, and the VIRGO site in Italy. The precise timing tells the scientists what direction the waves were traveling in, and therefore roughly where they came from. In a similar way, using the fact that sound travels at a known speed, the times that a gunshot is heard at multiple locations can be used by police to determine where the shot was fired.
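The timing argument can be made concrete: the largest possible arrival-time difference between two detectors is their separation divided by the speed of light, reached when the wave travels straight along the line joining them. A sketch with rough, assumed baselines (round numbers, not survey-grade figures):

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s

def max_arrival_delay_s(baseline_m):
    """Largest possible arrival-time difference between two detectors,
    reached when the wave travels along the line joining them."""
    return baseline_m / SPEED_OF_LIGHT

# Assumed approximate straight-line separations between the sites:
print(f"Hanford-Livingston (~3000 km): {max_arrival_delay_s(3.0e6) * 1e3:.1f} ms")
print(f"LIGO-Virgo (~8000 km):         {max_arrival_delay_s(8.0e6) * 1e3:.1f} ms")
```

Measuring those few-millisecond offsets to high precision is what pins the wave’s arrival direction down to a patch of sky.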

You can see the impact in the picture below, which is an image of the sky drawn as a sphere, as if seen from outside the sky looking in. In previous detections of black hole mergers by LIGO’s two detectors, the scientists could only determine a large swath of sky where the observed merger might have occurred; those are the four colored regions that stretch far across the sky. But notice the green splotch at lower left. That’s the region of sky where the black hole merger announced today occurred. The fact that this region is many times smaller than the other four reflects what including VIRGO makes possible. It’s a small enough region that one can search using an appropriate telescope for something that is making visible light, or gamma rays, or radio waves.

While a black hole merger isn’t expected to be observable by other telescopes, and indeed nothing was observed by other telescopes this time, other events that LIGO might detect, such as a merger of two neutron stars, may create an observable effect. We can hope for such exciting news over the next year or two.

Filed under: Astronomy, Gravitational Waves Tagged: black holes, Gravitational Waves, LIGO

## September 26, 2017

### Symmetrybreaking - Fermilab/SLAC

As Jordan-based SESAME nears its first experiments, members are connecting in new ways.

Early in the morning, physicist Roy Beck Barkai boards a bus in Tel Aviv bound for Jordan. By 10:30 a.m., he is on site at SESAME, a new scientific facility where scientists plan to use light to study everything from biology to archaeology. He is back home by 7 p.m., in time to have dinner with his children.

Before SESAME opened, the closest facility like it was in Italy. Beck Barkai often traveled for two days by airplane, train and taxi for a day or two of work—an inefficient and expensive process that limited his ability to work with specialized equipment from his home lab and required him to spend days away from his family.

“For me, having the ability to kiss them goodbye in the morning and just before they went to sleep at night is a miracle,” Beck Barkai says. “It felt like a dream come true. Having SESAME at our doorstep is a big plus.”

SESAME, also known as the International Centre for Synchrotron-Light for Experimental Science and Applications in the Middle East, opened its doors in May and is expected to host its first beams of synchrotron light this year. Scientists from around the world will be able to apply for time to use the facility’s powerful light source for their experiments. It’s the first synchrotron in the region.

Beck Barkai says SESAME provides a welcome dose of convenience, as scientists in the region can now drive to a research center instead of flying with sensitive equipment to another country. It’s also more cost-effective.

Located in Jordan to the northwest of the city of Amman, SESAME was built by a collaboration made up of Cyprus, Egypt, Iran, Israel, Jordan, Pakistan, Turkey and the Palestinian Authority—a partnership members hope will improve relations among the eight neighbors.

“SESAME is a very important step in the region,” says SESAME Scientific Advisory Committee Chair Zehra Sayers. “The language of science is objective. It’s based on curiosity. It doesn’t need to be affected by the differences in cultural and social backgrounds. I hope it is something that we will leave the next generations as a positive step toward stability.”

Protein researcher and a University of Jordan professor Areej Abuhammad says she hopes SESAME will provide an environment that encourages collaboration.

“I think through having the chance to interact, the scientists from around this region will learn to trust and respect each other,” she says. “I don’t think that this will result in solving all the problems in the region from one day to the next, but it will be a big step forward.”

The $100 million center is a state-of-the-art research facility that should provide some relief to scientists seeking time at other, overbooked facilities. SESAME plans to eventually host 100 to 200 users at a time.

SESAME’s first two beamlines will open later this year. About twice per year, SESAME will announce calls for research proposals; the next is expected this fall. Sayers says proposals will be evaluated for originality, preparedness and scientific quality.

Groups of researchers hoping to join the first round of experiments submitted more than 50 applications. Once the lab is at full operation, Sayers says, the selection committee expects to receive four to five times more than that.

Opening up a synchrotron in the Middle East means that more people will learn about these facilities and have a chance to use them. Because some scientists in the region are new to using synchrotrons or writing the style of applications SESAME requires, Sayers asked the selection committee to provide feedback with any rejections.

Abuhammad is excited for the learning opportunity SESAME presents for her students—and for the possibility that experiences at SESAME will spark future careers in science.

She plans to apply for beam time at SESAME to conduct protein crystallography, a field that involves peering inside proteins to learn about their function and aid in pharmaceutical drug discovery.

Another scientist vying for a spot at SESAME is Iranian chemist Maedeh Darzi, who studies the materials of ancient manuscripts and how they degrade. Synchrotrons are of great value to archaeologists because they minimize the damage to irreplaceable artifacts. Instead of cutting them apart, scientists can take a less damaging approach by probing them with particles.

Darzi sees SESAME as a chance to collaborate with scientists from the Middle East and to promote science, peace and friendship. For her and others, SESAME could be a place where particles put things back together.

## September 24, 2017

## September 21, 2017

### Symmetrybreaking - Fermilab/SLAC

A project called A2D2 will explore new applications for compact linear accelerators.

Particle accelerators are the engines of particle physics research at Fermi National Accelerator Laboratory. They generate nearly light-speed, subatomic particles that scientists study to get to the bottom of what makes our universe tick. Fermilab experiments rely on a number of different accelerators, including a powerful, 500-foot-long linear accelerator that kick-starts the process of sending particle beams to various destinations.

But if you’re not doing physics research, what’s an accelerator good for?

It turns out, quite a lot: Electron beams generated by linear accelerators have all kinds of practical uses, such as making the wires used in cars melt-resistant or purifying water.

A project called Accelerator Application Development and Demonstration (A2D2) at Fermilab’s Illinois Accelerator Research Center will help Fermilab and its partners to explore new applications for compact linear accelerators, which are only a few feet long rather than a few hundred. These compact accelerators are of special interest because of their small size—they’re cheaper and more practical to build in an industrial setting than particle physics research accelerators—and they can be more powerful than ever.

“A2D2 has two aspects: One is to investigate new applications of how electron beams might be used to change, modify or process different materials,” says Fermilab’s Tom Kroc, an A2D2 physicist. “The second is to contribute a little more to the understanding of how these processes happen.”

To develop these aspects of accelerator applications, A2D2 will employ a compact linear accelerator that was once used in a hospital to treat tumors with electron beams. With a few upgrades to increase its power, the A2D2 accelerator will be ready to embark on a new venture: exploring and benchmarking other possible uses of electron beams, which will help specify the design of a new, industrial-grade, high-power machine under development by IARC and its partners.

It won’t be just Fermilab scientists using the A2D2 accelerator: As part of IARC, the accelerator will be available for use (typically through a formal CRADA or SPP agreement) by anyone who has a novel idea for electron beam applications. IARC’s purpose is to partner with industry to explore ways to translate basic research and tools, including accelerator research, into commercial applications.

“I already have a lot of people from industry asking me, ‘When can I use A2D2?’” says Charlie Cooper, general manager of IARC. “A2D2 will allow us to directly contribute to industrial applications—it’s something concrete that IARC now offers.”

Speaking of concrete, one of the first applications in mind for compact linear accelerators is creating durable pavement for roads that won’t crack in the cold or spread out in the heat. This could be achieved by replacing traditional asphalt with a material that could be strengthened using an accelerator. The extra strength would come from crosslinking, a process that creates bonds between layers of material, almost like applying glue between sheets of paper. A single sheet of paper tears easily, but when two or more layers are linked by glue, the paper becomes stronger.

“Using accelerators, you could have pavement that lasts longer, is tougher and has a bigger temperature range,” says Bob Kephart, director of IARC. Kephart holds two patents for the process of curing cement through crosslinking. “Basically, you’d put the road down like you do right now, and you’d pass an accelerator over it, and suddenly you’d turn it into really tough stuff—like the bed liner in the back of your pickup truck.”

This process has already caught the eye of the U.S. Army Corps of Engineers, which will be one of A2D2’s first partners. Another partner will be the Chicago Metropolitan Water Reclamation District, which will test the utility of compact accelerators for water purification. Many other potential customers are lining up to use the A2D2 technology platform.

“You can basically drive chemical reactions with electron beams—and in many cases those can be more efficient than conventional technology, so there are a variety of applications,” Kephart says. “Usually what you have to do is make a batch of something and heat it up in order for a reaction to occur. An electron beam can make a reaction happen by breaking a bond with a single electron.”

In other words, instead of having to cook a material for a long time to reach a specific heat that would induce a chemical reaction, you could zap it with an electron beam to get the same effect in a fraction of the time.

In addition to exploring the new electron-beam applications with the A2D2 accelerator, scientists and engineers at IARC are using cutting-edge accelerator technology to design and build a new kind of portable, compact accelerator, one that will take applications uncovered with A2D2 out of the lab and into the field. The A2D2 accelerator is already small compared to most accelerators, but the latest R&D allows IARC experts to shrink the size while increasing the power of their proposed accelerator even further.

“The new, compact accelerator that we’re developing will be high-power and high-energy for industry,” Cooper says. “This will enable some things that weren’t possible in the past. For something such as environmental cleanup, you could take the accelerator directly to the site.”

While the IARC team develops this portable accelerator, which should be able to fit on a standard trailer, the A2D2 accelerator will continue to be a place to experiment with how to use electron beams—and study what happens when you do.

“The point of this facility is more development than research; however, there will be some research on irradiated samples,” says Fermilab’s Mike Geelhoed, one of the A2D2 project leads. “We’re all excited—at least I am. We and our partners have been anticipating this machine for some time now. We all want to see how well it can perform.”

*Editor's note: This article was originally published by Fermilab.*