Particle Physics Planet

February 09, 2016

Emily Lakdawalla - The Planetary Society Blog

First Details of the 2017 NASA Budget Request
There are few surprises in the Obama Administration's final budget proposal for NASA, though some progress is made in critical areas like science funding and NASA's overall funding.

February 09, 2016 06:33 PM

Lubos Motl - string vacua and pheno

The utter insanity of Woit's Rutgers colloquium
I did my PhD at Rutgers University, the State University of New Jersey. Those were 4 interesting years – ending with the PhD defense on 9/11/2001, 9:30 am, some 50 miles from the Twin Towers.

Shortly before I came to Rutgers in Fall 1997 (not counting a visit in Spring 1997), it was a powerful thinking machine, arguably a top 5 place in string theory in the world. (This comment does not say that Rutgers is not good today, it's very good; and it does not imply that a new graduate student like me was the reason why Rutgers ceased to be at the absolute Olympus of theoretical physics, I was too small a player for such big changes. In the mid-to-late 1990s, it was simply natural for the richer universities like Harvard to attract folks from that "hot field" who did much of their recent important work at "slightly less obvious" top places such as Rutgers and Santa Barbara.)

Before the brains were absorbed by some of the "more expected" famous universities in the U.S., the string theory faculty at Rutgers as a group were known – relative to other physics professors at Rutgers – for their unusual contributions to science and also to funding, and they enjoyed some teaching advantages relative to non-string faculty, and so on, a setup designed to further improve the efficiency of their research. I was always imagining how hard such a setup would have been in Czechia, due to jealousy, a feature of the Czech national character.

Fast forward to 2016. Last week, the notorious critic of string theory Peter Voits (yes, this is the right spelling) gave a physics colloquium at Rutgers. Colloquia are held in the round underground building pictured above every Wednesday. The speakers are almost universally active physicists. Another exception occurred a week before Voits' colloquium when David Maiullo talked about his Broadway show.

The Rutgers website suggests that the host – the man who probably had the idea to invite Voits – was Herbert Neuberger, a lattice gauge theory guy. This hypothesis makes some sense; Voits' only papers about physics, those written in the mid 1980s, were about gauge theory, too.

I agreed with a string theorist whom I know very well and who is located in Asia that the string theory faculty at Rutgers were no warriors. And indeed, the reports say that no local string theorist attended the anti-string colloquium, and if one did, he remained completely invisible. If we insist on polite words, Mr Neuberger is quite a jerk. Can you imagine that a string theorist would organize a colloquium by a non-physicist who attacks e.g. lattice gauge theory?
The slides from Voits' colloquium are available as a PDF file. Let me go through them.
Needless to say, the first crazy thing about the talk was the title:
Not Even Wrong, ten years later: a view from mathematics on prospects for fundamental physics without experiment
Ten years after the publication of an anti-physics tirade (one of hundreds of similar tirades by laymen that you may find in the libraries or on the Internet) that no high-energy physicist has ever taken seriously, Voits and his host must think that it was such a big deal that it deserves a colloquium. Now, the following page (2/32) is the outline:
  • Advertisements: old book, blog, coming book
  • What happened to string unification
  • 2x about how mathematics helps to guide physics
  • Representation theory is useful for the Standard Model
Now, this is just plain sick. First, why should a fifth of a colloquium be dedicated to "advertisements", let alone advertisements that don't help the scientific research in any way? Is Prof Neuberger also planning to turn the website to a porn website?

The second point is said to be about the string unification – except that the speaker hasn't written a single paper (or any text that makes any sense or could earn a citation from a scientist) and there are many other ways to see that he is 100% unqualified to talk about these difficult matters, especially when it comes to advances that emerged in the recent decades (let alone recent years).

The remaining three bullets out of five want to convey the idea that both mathematics in general and representation theory are useful in physics and the Standard Model. What? Is this meant to be the topic of a colloquium? I understood the importance of mathematics in physics when I was 4 and the importance of representation theory in physics when I was 10. Every janitor who was allowed to clean my office for grad students had to know these basics, too. You must be joking, Sirs.

Page 3/32 makes the story of the anti-physics book even crazier. We learn that the book wasn't actually written 10 years ago; it was mostly written 15 years ago. Huge developments have taken place in string theory and theoretical physics in the last 15 years. Even if the book had been relevant for scientists back in 2001, and it obviously wasn't, it would be outdated by today. So how can one possibly organize a colloquium in 2016 for which this book is meant to be one of the main pillars?

Page 4/32 shows a screenshot of the "Not Even Wrong" blog. Voits boasts that it has 1,500 blog posts (TRF has 6,600) and 40,000 comments (we have way over 100,000) and most of the 20,000 page views a day are by "robots" (maybe Voits' own robots). Now, why would anyone care? All this Internet traffic is completely negligible relative to the most influential servers on the Internet. Why would someone talk about it at all? Why should the time of Rutgers students, postdocs, and professors be wasted by a mediocre website? Because it claims to have something to do with physics? It has nothing to do with professional, serious physics.

Slide 5/32 promotes Joseph Conlon's book – quite embarrassing for Joseph. Page 6/32 says that Voits is writing a book about quantum mechanics. Given the fact that Voits misunderstands pretty much everything that is more complicated than a certain modest threshold, one can't expect much from that book.

On slides 7-8/32, we learn that Voits liked the years 1975-1979 and one of his achievements was to be an unpaid visitor at Harvard in 1987-1988. Wow. Who could possibly give a damn? I've attended dozens of colloquia by Nobel prize winners but if the speaker or the host had begun to talk about some detailed affiliations, it would have turned me off totally. Now, why should the Rutgers physics community suffer through a talk that lists unpaid visits by a crackpot that took place some 30 years ago?

Pages 9-12/32 include some popular-book-style introduction to string theory as understood in the 1980s, with 2 vague sentences about the 1990s and a purely non-technical comment about the recent years. Is this level of depth enough for a Rutgers physics colloquium these days?

Page 13/32 says that there is "hype about string theory" and uses a 17-year-old New York Times photograph of Lisa Randall as evidence. Now, Lisa's and Raman's finding was important in phenomenology; it wasn't quite string theory, just string-theory-related ideas; the article was rather sensible; it appeared 17 years ago; and physicists shouldn't get their knowledge about their field from the New York Times, anyway. So what the hell is the role that this slide could play in a physics colloquium in 2016?

Page 14/32 says that the multiverse may exist according to string theory and Voits states that "it is not science" and "it is dangerous" without a glimpse of a justification. Page 15/32 claims that there is the "end of science" and mentions Susskind's term "Popperazzi" for the religious cult claiming that some stupidly misinterpreted oversimplified ideas by a random philosopher should be worshiped as the most important thing by all physicists. If only Voits had at least invented something as catchy as "Popperazzi". He hasn't. He's done no physics for 30 years but even when it comes to talking points, he is purely stealing from others – whether it's Wolfgang Pauli, Leonard Susskind, or someone else. Is that enough for a physics colloquium?

Pages 16-17/32 inform us about the shocking thing that mathematics is a non-empirical science. Great to learn something new and deep. He also lists some random buzzwords from mathematics like "Riemannian geometry" but it remains absolutely unclear why he did so. Let me tell you why: all these buzzwords are meant to mask the fact that he is nothing other than an ignorant layman and crackpot.

On pages 19-20/32, we are invited to buy a "different vision" and "radical Platonism". Everyone knows what "Platonism" is but what it means for it to be "radical" remains unclear – but it must be related to Lee Sm*lin's "mysticism", we learn. What? A slide says that the Standard Model works rather well. A janitor would be enough for that, too.

Page 21/32 lists things like lattice gauge theory and some nonperturbative electroweak theory but says nothing about those random phrases. On page 22/32, it's said that "quantum gravity could be much like the Standard Model", but it's not explained how this could be true. He suddenly jumps to the stringy multiverse again and says that it's "circular". Whether string theory implies a multiverse or not, there is obviously nothing circular about it.

Page 23/32 starts to mix the random buzzwords from representation theory such as the Dirac cohomology and categorification. On page 24/32, we're told that the momentum is related to translations, a thing that many high school students know, too. Voits has "nothing to say about the mysterious part, how does classical behavior emerge". Nothing is not too much to say about this foundational issue for someone who claims to be writing a book on quantum mechanics.

Page 25/32 escalates the crackpottery. He works at the level of basic definitions of a linear space or a commutator – the stuff approximately from the first undergraduate lecture on linear algebra – but he pretends that he has found something that could perhaps compete with string theory and maybe supersede it. What? This is just a collection of randomly mixed up elementary buzzwords and super-elementary mathematical expressions from the undergraduate linear algebra courses. A few more slides say some ill-defined things that try to pretend that Voits knows what the Dirac operator or category theory mean – except that it's self-evident that he doesn't actually understand these concepts.

The last page, 32/32, summarizes the talk. Ten years after the "string wars", string theory is failing even more than ever before, the audience was told by the stuttering critic of science. A problem is that this is clearly a totally untrue statement and the talk didn't contain anything at all that could substantiate this statement, especially not something that would be related to the last 15 years in theoretical physics – developments that Mr Voits doesn't have the slightest idea about, not even at the popular-book level.

We "learn" that the Standard Model could be close to a theory of everything – yes, it is surely "somewhat close" (not "too close") but no more details are offered by Voits – and representation theory could be useful.

The second, key bullet of the summary says that if the number of available new experiments is limited, physicists must "look to mathematics for some guidance". Holy cow, but that's exactly what string theorists are doing and that's exactly what Voits and Sm*lin – and the brainwashed sheep who take these crackpots seriously – criticize about string theory at almost all times. And now he wants to recommend this "power of mathematics" as "his" recipe to proceed? Holy cow.

(Emil Martinec made a much better comment on this breathtaking cognitive dissonance of Mr Voits.)

The fact that a colloquium like that has been allowed at Rutgers looks like a serious breakdown of the system. Mr Neuberger should be given a hard time but because I know most of the string theorists who are currently on the Rutgers faculty, I don't believe that anything like that will actually take place. The tolerance for talks with the right "ideological flavor", despite their unbelievably lousy quality, has become a part of the political correctness that has conquered much of the Academia.

by Luboš Motl at February 09, 2016 05:59 PM

Symmetrybreaking - Fermilab/SLAC

Neutrinos on a seesaw

A possible explanation for the lightness of neutrinos could help answer some big questions about the universe.

Mass is a fundamental property of matter, but there’s still a lot about it we don’t understand—especially when it comes to the strangely tiny masses of neutrinos. 

An idea called the seesaw mechanism proposes a way to explain the masses of these curious particles. If shown to be correct, it could help us understand a great deal about the nature of fundamental forces and—maybe—why there’s more matter than antimatter in the universe today.

Wibbly-wobbly massy-wassy stuff

The masses of the smallest bits of matter cover a wide range. Electrons are roughly 1800 times less massive than protons and neutrons, which are one hundred times less massive than the Higgs boson. Other rare beasts like the top quark are heavier still.

Then we have the neutrinos, which don’t fit in at all. 

According to the Standard Model of particles and forces that emerged in the 1970s, neutrinos were massless. Experiments seemed to concur. However, over the next two decades, physicists showed that neutrinos change their flavor, or type.

Neutrinos come in three varieties: electron, muon and tau. Think of them as Neapolitan ice cream: The strawberry is the electron neutrino; the vanilla is the muon neutrino; and the chocolate is the tau neutrino. 

By the late 1980s, physicists were reasonably good at scooping out the strawberry; most experiments were designed to detect electron neutrinos only. But they were seeing far fewer than theory predicted they should. 

By 1998, researchers discovered the missing neutrinos could be explained by oscillation—the particles were changing from one flavor to another. By figuring out how to detect the other flavors, they showed they could account for the remainder of the missing neutrinos. 

This discovery forced them to reconsider the mass of the neutrino, since neutrinos can oscillate only if they have a tiny—but nonzero—mass.

 Today, “just from experimental facts, we know that neutrino masses are way smaller compared to all the other elementary [matter particle] masses,” says Mu-Chun Chen, a theoretical physicist at the University of California, Irvine. 

We don’t yet know exactly how much mass they have, but astronomical observations show they’re likely around a millionth of the mass of an electron, or even less. And this small mass could be a product of the seesaw mechanism.

Seesaw Mechanism Animation
Artwork by Sandbox Studio, Chicago with Ana Kova

I am not left-handed!

To visualize another important property of neutrinos, make a “thumbs-up” gesture with your left hand. Your fingers will curl the way the neutrino rotates, and your thumb will point in the direction it travels. This combination makes for a “left-handed” particle. Antineutrinos, the antimatter version of neutrinos, are right-handed: Take your right hand and make a thumbs-up to show the relation between their spin and motion.

Some particles such as electrons or quarks don’t spin in any particular direction relative to the way they move; they are neither purely right- nor left-handed. So far, scientists have only ever observed left-handed neutrinos. 

But the seesaw mechanism predicts that there are two kinds of neutrinos: the light, left-handed ones we know and—on the other end of the metaphorical seesaw—heavy, right-handed neutrinos that we’ve never seen. The seesaw itself is a ratio: the higher the mass of the right-handed neutrino, the lower the mass of the left-handed neutrinos. Based on experiments, these right-handed neutrinos would be extraordinarily massive, perhaps 10^15 (one quadrillion) times heavier than a proton.
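To make the "ratio" concrete, here is a minimal numerical sketch. It assumes the standard type-I seesaw estimate, in which the light neutrino mass is roughly the square of a Dirac-type mass divided by the heavy right-handed mass. The article does not state this formula, and the Dirac mass values used below are purely illustrative assumptions; only the "10^15 times a proton" figure comes from the text.

```python
# Sketch of the seesaw "ratio", assuming the textbook type-I estimate
#   m_light ~ m_D**2 / M_R
# m_D (the Dirac mass) is NOT given in the article; the values below are
# illustrative assumptions only.

PROTON_MASS_GEV = 0.938  # approximate proton mass in GeV
EV_PER_GEV = 1e9         # unit conversion


def seesaw_light_mass_ev(dirac_mass_gev, heavy_mass_gev):
    """Return the light neutrino mass estimate in eV."""
    return dirac_mass_gev ** 2 / heavy_mass_gev * EV_PER_GEV


# Heavy right-handed neutrino: 10^15 proton masses, as quoted in the text.
heavy_mass_gev = 1e15 * PROTON_MASS_GEV

for dirac_mass_gev in (1.0, 100.0):  # hypothetical Dirac masses
    light_ev = seesaw_light_mass_ev(dirac_mass_gev, heavy_mass_gev)
    print(f"m_D = {dirac_mass_gev:g} GeV -> m_light ~ {light_ev:.2e} eV")
```

Under these assumptions, doubling the heavy mass halves the light mass, which is the seesaw behaviour described above. A Dirac mass near 100 GeV gives a light mass of order a hundredth of an electronvolt, far below a millionth of the electron's mass.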

And there’s more: The seesaw mechanism predicts that if right-handed neutrinos exist, then they would be their own antiparticles. This could give us a clue to how our universe came to be full of matter. 

One idea is that in the first fraction of a second after the big bang, the universe produced just a tiny bit more matter than antimatter. After most particles annihilated with their antimatter counterparts, that imbalance left us with the matter we have today. Most of the laws of physics don’t distinguish between matter and antimatter, so something beyond the Standard Model must explain the asymmetry. 

Particles that are their own antiparticles can produce situations that violate some of the normal rules of physics. If right-handed neutrinos—which are their own antineutrinos—exist, then neutrinos could present the same kind of symmetry violation that might have happened for other types of matter. Exactly how that carries over to matter other than neutrinos, though, is still an area of active research for Chen and other physicists.

Searching for the seesaw

Scientists think they have yet to see these heavy right-handers for two reasons. First, the only force they know to act on neutrinos is the weak force, and the weak force acts only on left-handed particles. Right-handed neutrinos might not interact with any of the known forces.

Second, right-handed neutrinos would be too massive to be stable in our universe, and they would require too much energy to be created in even the most powerful particle accelerator. However, these particles could leave footprints in other experiments.

Today, scientists are studying the light, left-handed neutrinos that we can see to look for signs that could give us a verdict on the seesaw mechanism.

For one, they’re looking to see if neutrinos are their own antiparticles. That wouldn’t necessarily mean that the seesaw mechanism is true, but finding it would be a big point in the seesaw mechanism’s favor.

The seesaw mechanism goes hand-in-hand with grand unified theories: theories that unite the strong, weak and electromagnetic forces into a single force at high energies. If scientists find evidence of the seesaw mechanism, they could learn important things about how the forces are related.

The seesaw mechanism is the most likely way to explain how neutrinos got their mass. However, frustratingly, the nature of the explanation pushes many of its testable consequences out of experimental reach. 

The best hope lies in persistent experimentation, and—as with the discovery of neutrino oscillation in the first place—hunting for anything that doesn’t quite fit expectations.

by Matthew R. Francis at February 09, 2016 03:06 PM

Peter Coles - In the Dark

Fat Tuesday – Eh La Bas!

Today is the day that we in England call Shrove Tuesday. We’re apparently all supposed to get shriven by doing a penance before Lent. Another name for the occasion is Pancake Day, although I’m not sure what sort of penance it is to be forced to eat pancakes.

Further afield the name for this day is a bit more glamorous. Mardi Gras, which I translated using my schoolboy French as Fat Tuesday, doesn’t make me think of pancakes but of carnivals. And being brought up in a house surrounded by Jazz, it makes me think of New Orleans and the wonderful marching bands that played not just during the Mardi Gras parades but at just about every occasion for which they could find an excuse, including funerals.

The Mardi Gras parades gave rise to many of the great tunes of New Orleans Jazz, many of them named after the streets through which the parade would travel, mainly in the famous French Quarter. Basin Street, South Rampart Street, and Bourbon Street are among the names redolent with history for Jazz fans and musicians around the world. I also remember a record by Humphrey Lyttelton’s 1950s band called Fat Tuesday.

The New Orleans Mardi Gras has on recent occasions sometimes got a bit out of hand, and you probably wouldn’t want to take kids into the French Quarter for fear they would see things they shouldn’t. Personally, though, I’d love the chance to savour the atmosphere and watch the parades. Anyway, here’s an infectious little number performed for you by the inestimable Preservation Hall Jazz Band from New Orleans; the Preservation Hall is located in the French Quarter. It’s a traditional song with original lyrics in the local Creole Patois, but often also performed in standard French. The words are all about eating, which makes it somewhat relevant to today, although that’s only their surface meaning. You might recognize the tune from other songs that borrowed the theme, but this one is the Daddy! You don’t often hear it played with as strong a Caribbean influence on the rhythm as this version, and the excellent banjo solo is evocative of the Cajun music of Louisiana, but that blending of cultures and traditions is exactly what made New Orleans such an important place in musical history…


by telescoper at February 09, 2016 02:33 PM

astrobites - astro-ph reader's digest

So Much Hot (Jupiter) Diversity, So Little Time

Title: A continuum from clear to cloudy hot-Jupiter exoplanets without primordial water depletion 

Authors: Sing, D., Fortney, J., Nikolov, N., Wakeford, H., et al.

First Author’s Affiliation: University of Exeter

Paper status: Published in Nature

Almost every astrophysical process we know of was discovered by observing a large census of seemingly identical objects. The famous Hertzsprung-Russell diagram, which shows the relationship between temperature and luminosity in stars and gives insights into stellar evolution, was only uncovered when Hertzsprung and Russell were able to use large-scale photographic spectroscopy surveys to look at and compare several hundred stars. They didn’t fully characterize each individual star. Instead, they looked at two simple and measurable features: apparent magnitude and the strengths of a couple of absorption features as a proxy for temperature. These measurements for a stellar sample of 1 or 2 would not have yielded scientifically interesting results, but when compared to 100 others, patterns started to emerge.

In exoplanet science, we are at what I’ll call the “pre-HR diagram” stage. We have only been able to detect a few molecular absorption features in the atmospheres of just a handful of planets. Water absorption, for example, has been detected in the atmospheres of hot Jupiter exoplanets. The strength of these absorption features, though, has varied from planet to planet and has led various authors to make predictions about why this is: Maybe the planets with low water content were formed in a part of a disk where water had been depleted? Maybe the water is actually there but clouds are muting the absorption features? With so few observations, it’s been hard to answer these questions. The authors of today’s astrobite observed the atmospheres of ten hot Jupiter planets all orbiting different host stars. Of course, ten planets won’t make the modern day exoplanet HR diagram but it does give us ten data points on what has previously been a blank canvas.

Below, Figure 1 shows the transmission spectra of ten different hot Jupiter atmospheres observed with the Hubble Space Telescope. For more info on how we get these spectra, I’d suggest reading this previous bite. These planets range in temperature from 960 – 2510 K, in mass from 0.21 – 1.50 times the mass of Jupiter, and in period from 0.79 – 4.46 days. For perspective, Mercury orbits the Sun in an 88 day period and has an average temperature of 440 K. There is nothing in our Solar System remotely comparable to these ten planets. To showcase similarities and differences between the exoplanets’ atmospheric spectra, the authors have plotted everything on the same figure. The solid colored lines in the figures are the best fit atmospheric models, while the colored dots showcase the actual data. To the untrained eye, these might seem a bit intimidating. But, if you know what you are looking for, you don’t even need complex models to gain insights into these planetary atmospheres:

Transmission spectra of hot Jupiter planets observed with the Hubble Space Telescope and Spitzer. Solid colored lines show the atmospheric models while the colored dots show the observed data.

Absorption Features: 

Absorption features are probably the most striking feature of a planet spectrum because they jump off what we call the “continuum” of the spectrum. Let’s start with WASP-17b in Figure 1. Try for a moment to ignore the solid line and focus on the orange dots. You should notice from the data points alone that there is probably sodium and water absorption in the atmosphere of WASP-17b. The reason I suggested ignoring the solid colored model is that although the model indicates the presence of potassium, no potassium was actually detected. This can get tricky.

Now, glance at the other spectra and try to figure out which other planets also contain sodium, which contain potassium and which contain water. Bear in mind that, like fingerprints, atoms and molecules have unique wavelengths at which they absorb. This means that any feature you see at 0.6 microns will be Na, any feature you see at 0.78 microns will be K and any feature you see at 1.5 microns will be water.
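The fingerprint idea can be sketched as a small lookup. This is only an illustration using the three approximate wavelengths quoted above; the function name and the matching tolerance are assumptions made for this example, not anything taken from the paper.

```python
# Approximate feature wavelengths quoted in the text, in microns.
FEATURES_UM = {"Na": 0.6, "K": 0.78, "H2O": 1.5}


def identify_feature(wavelength_um, tolerance_um=0.05):
    """Return the species whose quoted wavelength is nearest, or None.

    tolerance_um is a hypothetical matching window chosen for this sketch.
    """
    species, center = min(
        FEATURES_UM.items(), key=lambda item: abs(item[1] - wavelength_um)
    )
    if abs(center - wavelength_um) <= tolerance_um:
        return species
    return None


for bump_um in (0.59, 0.77, 1.48, 1.0):
    print(bump_um, "->", identify_feature(bump_um))
```

A bump that is not close to any of the three quoted wavelengths returns `None`, which is the lookup's way of saying that the eye test is inconclusive.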

Sodium was detected in five planetary atmospheres, potassium was detected in four planetary atmospheres, and water was detected in five. How well did you stack up?

Detecting the features is only half the battle. You might’ve noticed that while water was present in the atmosphere of WASP-17b and HD 209458b, their features look incredibly different. HD 209458b’s looks a bit muted. Why is that? And what about those planets that exhibit no features at all, like WASP-12b? What are their atmospheres made of?

Clouds and Hazes 

On Earth, there are few (or no) days that go by when there are blue skies throughout the entire planet. The bottom line is, every planet or moon in our Solar System with an atmosphere has some degree of clouds or hazes. It is, therefore, no surprise that we see indicators of clouds and hazes in the atmospheres of exoplanets. I should pause here and note that there are very different interpretations and definitions of what clouds and hazes are. An Earth scientist might have a different definition than an exoplanet scientist. I should therefore clarify that I am using the definition given by the authors. In the simplest sense, a cloud is a “grey opacity source”. Imagine holding a prism up to a light and making a rainbow on your wall. If you were to take a plain, grey, no-color filter and hold it up between the prism and the wall, the only difference you would observe would be a subtle dimming across your entire rainbow. That homogeneous dimming of light is what a grey opacity source does.

Hazes can operate quite differently because they consist of tiny sub-micron sized particles, all capable of scattering light in various directions (called Rayleigh scattering). Ever wonder why the sky is blue? Rayleigh scattering is more efficient at short wavelengths (blue end of the spectrum), so the sunlight that gets scattered down to the earth is predominantly blue. Going back to our prism analogy, if you now replaced the grey filter with a dense mat of tiny sub-micron sized particles, you’d see the blue end of the rainbow increase in intensity.
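To put a number on "more efficient at short wavelengths", here is a minimal sketch using the standard result that Rayleigh scattering strength scales as the inverse fourth power of wavelength. The post does not quote that scaling, and the two wavelengths chosen for blue and red light are illustrative assumptions.

```python
def rayleigh_relative(wavelength_nm, reference_nm):
    """Scattering strength at wavelength_nm relative to reference_nm,
    using the Rayleigh scaling: strength proportional to wavelength**-4."""
    return (reference_nm / wavelength_nm) ** 4


BLUE_NM = 450.0  # illustrative choice for blue light
RED_NM = 700.0   # illustrative choice for red light

ratio = rayleigh_relative(BLUE_NM, RED_NM)
print(f"Blue light is scattered about {ratio:.1f} times more strongly than red")
```

A grey cloud, by contrast, would give the same factor at every wavelength, which is why it dims a spectrum uniformly instead of tilting it.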

Let’s return to Figure 1. Hazes should present themselves as a systematic increase in intensity toward the blue end of the spectrum and clouds should present themselves as a dimming throughout the entire spectrum. WASP-12b clearly exhibits a large presence of clouds, while WASP-31b all the way down to WASP-6b exhibit some degree of hazes.

What did we learn? 

From just looking comparatively at the spectra of these ten planets, some fundamental questions about planetary systems can be addressed:

  1. A muted water feature does not necessarily mean the atmosphere is depleted of water vapor. Instead, it is more likely an indicator of clouds.
  2. Not ALL hot Jupiters have a massive cloud deck.
  3. Not ALL hot Jupiters have thick hazes.

We have placed ten dots on our exoplanet HR diagram and laid the groundwork for how future missions such as the James Webb Space Telescope can add to the field. In the near future we will be able to double, quadruple or even centuple this sample size and gain a deeper understanding of planet atmospheres, atmospheric chemistry, and planet formation.

by Natasha Batalha at February 09, 2016 03:35 AM

Clifford V. Johnson - Asymptotia

Staring at Stairs…

These stairs probably do not conform to any building code, but I like them anyway, and so they will appear in a paper I'll submit to the arxiv soon.

They're part of a nifty algorithm I thought of on Friday that I like rather a lot.

More later.

-cvj

by Clifford at February 09, 2016 12:03 AM

February 08, 2016

Emily Lakdawalla - The Planetary Society Blog

What Does a 'Good' Budget for Planetary Science Look Like?
NASA's 2017 budget request comes out on Tuesday. Here's how you can evaluate whether the budget for the Planetary Science Division is good or not. It's not just about 2017, but about the next five years.

February 08, 2016 11:31 PM

Christian P. Robert - xi'an's og

covariant priors, Jeffreys and paradoxes

“If no information is available, π(α|M) must not deliver information about α.”

In a recent arXival apparently submitted to Bayesian Analysis, Giovanni Mana and Carlo Palmisano discuss the choice of priors in metrology. Which reminded me of this meeting I attended at the Bureau des Poids et Mesures in Sèvres where similar debates took place, albeit led by ferocious anti-Bayesians! Their reference prior appears to be the Jeffreys prior, because of its reparameterisation invariance.

“The relevance of the Jeffreys rule in metrology and in expressing uncertainties in measurements resides in the metric invariance.”

This, along with a second order approximation to the Kullback-Leibler divergence, is indeed one reason for advocating the use of a Jeffreys prior. I at first found it surprising that the (usually improper) prior is used in a marginal likelihood, as it cannot be normalised. A source of much debate [and of our alternative proposal].

“To make a meaningful posterior distribution and uncertainty assessment, the prior density must be covariant; that is, the prior distributions of different parameterizations must be obtained by transformations of variables. Furthermore, it is necessary that the prior densities are proper.”

The above quote is quite interesting both in that the notion of covariant is used rather than invariant or equivariant. And in that properness is indicated as a requirement. (Even more surprising is the noun associated with covariant, since it clashes with the usual notion of covariance!) They conclude that the marginal associated with an improper prior is null because the normalising constant of the prior is infinite.
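For readers who want the "covariance" property spelled out, here is a short sketch for a scalar parameter. The notation (θ, φ = g(θ), Fisher information I) is mine and not the paper's. It shows why the Jeffreys prior transforms exactly as a density should under a smooth one-to-one change of parameterisation.

```latex
% Jeffreys prior for a scalar parameter \theta with Fisher information I(\theta)
\pi_J(\theta) \propto \sqrt{I(\theta)}, \qquad
I(\theta) = \mathbb{E}_\theta\!\left[\left(\frac{\partial}{\partial\theta}
            \log f(x\mid\theta)\right)^{2}\right]

% Under a smooth one-to-one reparameterisation \varphi = g(\theta),
% the chain rule gives
I(\varphi) = I(\theta)\left(\frac{d\theta}{d\varphi}\right)^{2}
\quad\Longrightarrow\quad
\sqrt{I(\varphi)} = \sqrt{I(\theta)}\,\left|\frac{d\theta}{d\varphi}\right|
```

The right-hand side is the usual change-of-variables rule for a density, so deriving the Jeffreys prior directly in φ agrees with transforming the prior derived in θ.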

“…the posterior probability of a selected model must not be null; therefore, improper priors are not allowed.”

Maybe not so surprisingly given this stance on improper priors, the authors cover a collection of “paradoxes” in their final and longest section, most of which make little sense to me. First, they point out that the reference priors of Berger, Bernardo and Sun (2015) are not invariant, but this should not come as a surprise given that they focus on parameters of interest versus nuisance parameters. The second issue pointed out by the authors is that under Jeffreys’ prior, the posterior distribution of a given normal mean for n observations is a t with n degrees of freedom while it is a t with n-1 degrees of freedom from a frequentist perspective. This is not such a paradox since both distributions work in different spaces. Further, unless I am confused, this is one of the marginalisation paradoxes, whose more straightforward explanation is that marginalisation is not meaningful for improper priors. A third paradox relates to a contingency table with a large number of cells, in that the posterior mean of a cell probability goes as the number of cells goes to infinity. (In this case, Jeffreys’ prior is proper.) Again not much of a bummer, there is simply not enough information in the data when faced with an infinite number of parameters. Paradox #4 is the Stein paradox, when estimating the squared norm of a normal mean. Jeffreys’ prior then leads to a constant bias that increases with the dimension of the vector. Definitely a bad point for Jeffreys’ prior, except that there is no Bayes estimator in such a case, the Bayes risk being infinite. Using a renormalised loss function solves the issue, rather than introducing as in the paper uniform priors on intervals, which require hyperpriors without being particularly compelling. The fifth paradox is the Neyman-Scott problem, with again the Jeffreys prior the culprit since the estimator of the variance is inconsistent. By a multiplicative factor of 2. Another stone in Jeffreys’ garden [of forking paths!]. The authors consider that the prior gives zero weight to any interval not containing zero, as if it was a proper probability distribution. And “solve” the problem by avoiding zero altogether, which of course requires specifying a lower bound on the variance. And then introducing another (improper) Jeffreys prior on that bound… The last and final paradox mentioned in this paper is one of the marginalisation paradoxes, with a bizarre explanation that since the mean and variance μ and σ are not independent a posteriori, “the information delivered by x̄ should not be neglected”.
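For readers unfamiliar with the Neyman-Scott problem, the factor of 2 can be seen in a minimal simulation sketch. This is not taken from the paper under review: it uses the plain maximum likelihood estimator rather than a Jeffreys-prior posterior, and the two-observations-per-mean setup, the function name `neyman_scott_mle` and all numerical values are illustrative assumptions.

```python
import random

def neyman_scott_mle(n_pairs, sigma=1.0, seed=0):
    """MLE of the common variance when every pair has its own unknown mean."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_pairs):
        mu = rng.uniform(-10.0, 10.0)   # nuisance mean, one per pair (arbitrary range)
        x1 = rng.gauss(mu, sigma)
        x2 = rng.gauss(mu, sigma)
        xbar = 0.5 * (x1 + x2)          # MLE of this pair's mean
        total += (x1 - xbar) ** 2 + (x2 - xbar) ** 2
    # The MLE divides by the number of observations (two per pair).
    return total / (2 * n_pairs)

estimate = neyman_scott_mle(20000)
print(estimate)  # settles near sigma**2 / 2 rather than sigma**2
```

Adding more pairs does not help in this sketch, since each new pair brings a new nuisance mean along with it.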

Filed under: Books, Statistics, University life Tagged: evidence, Harold Jeffreys, hierarchical Bayesian modelling, improper priors, inadmissibility, invariance, Jeffreys priors, marginalisation paradoxes, Neyman-Scott problem, noninformative priors, over-interpretation of improper priors, reference priors

by xi'an at February 08, 2016 11:16 PM

Sean Carroll - Preposterous Universe

Guest Post: Grant Remmen on Entropic Gravity

“Understanding quantum gravity” is on every physicist’s short list of Big Issues we would all like to know more about. If there’s been any lesson from the last half-century of serious work on this problem, it’s that the answer is likely to be something more subtle than just “take classical general relativity and quantize it.” Quantum gravity doesn’t seem to be an ordinary quantum field theory.

In that context, it makes sense to take many different approaches and see what shakes out. Alongside old stand-bys such as string theory and loop quantum gravity, there are less head-on approaches that try to understand how quantum gravity can really be so weird, without proposing a specific and complete model of what it might be.

Grant Remmen, a graduate student here at Caltech, has been working with me recently on one such approach, dubbed entropic gravity. We just submitted a paper entitled “What Is the Entropy in Entropic Gravity?” Grant was kind enough to write up this guest blog post to explain what we’re talking about.

Meanwhile, if you’re near Pasadena, Grant and his brother Cole have written a musical, Boldly Go!, which will be performed at Caltech in a few weeks. You won’t want to miss it!

One of the most exciting developments in theoretical physics in the past few years is the growing understanding of the connections between gravity, thermodynamics, and quantum entanglement. Famously, a complete quantum mechanical theory of gravitation is difficult to construct. However, one of the aspects that we are now coming to understand about quantum gravity is that in the final theory, gravitation and even spacetime itself will be closely related to, and maybe even emergent from, the mysterious quantum mechanical property known as entanglement.

This all started several decades ago, when Hawking and others realized that black holes behave with many of the same aspects as garden-variety thermodynamic systems, including temperature, entropy, etc. Most importantly, the black hole’s entropy is equal to its area [divided by (4 times Newton’s constant)]. Attempts to understand the origin of black hole entropy, along with key developments in string theory, led to the formulation of the holographic principle – see, for example, the celebrated AdS/CFT correspondence – in which quantum gravitational physics in some spacetime is found to be completely described by some special non-gravitational physics on the boundary of the spacetime. In a nutshell, one gets a gravitational universe as a “hologram” of a non-gravitational universe.
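In symbols, the area law just quoted is the standard Bekenstein-Hawking formula, written here in units where \(\hbar = c = k_B = 1\). The symbols \(S_{\rm BH}\), \(A\) and \(G_N\) are notation introduced for this note rather than taken from the post:

\[
S_{\rm BH} = \frac{A}{4 G_N}
\]

Restoring the constants gives \(S_{\rm BH} = k_B c^3 A / (4 G_N \hbar)\).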

If gravity can emerge from, or be equivalent to, a set of physical laws without gravity, then something special about that non-gravitational physics has to make it happen. Physicists have now found that that special something is quantum entanglement: the special correlations among quantum mechanical particles that defy classical description. As a result, physicists are very interested in how to get the dynamics describing how spacetime is shaped and moves – Einstein’s equation of general relativity – from various properties of entanglement. In particular, it’s been suggested that the equations of gravity can be shown to come from some notion of entropy. As our universe is quantum mechanical, we should think about the entanglement entropy, a measure of the degree of correlation of quantum subsystems, which for thermal states matches the familiar thermodynamic notion of entropy.

The general idea is as follows: Inspired by black hole thermodynamics, suppose that there’s some more general notion, in which you choose some region of spacetime, compute its area, and find that when its area changes this is associated with a change in entropy. (I’ve been vague here as to what is meant by a “change” in the area and what system we’re computing the area of – this will be clarified soon!) Next, you somehow relate the entropy to an energy (e.g., using thermodynamic relations). Finally, you write the change in area in terms of a change in the spacetime curvature, using differential geometry. Putting all the pieces together, you get a relation between an energy and the curvature of spacetime, which if everything goes well, gives you nothing more or less than Einstein’s equation! This program can be broadly described as entropic gravity and the idea has appeared in numerous forms. With the plethora of entropic gravity theories out there, we realized that there was a need to investigate what categories they fall into and whether their assumptions are justified – this is what we’ve done in our recent work.

In particular, there are two types of theories in which gravity is related to (entanglement) entropy, which we’ve called holographic gravity and thermodynamic gravity in our paper. The difference between the two is in what system you’re considering, how you define the area, and what you mean by a change in that area.

In holographic gravity, you consider a region and define the area as that of its boundary, then consider various alternate configurations and histories of the matter in that region to see how the area would be different. Recent work in AdS/CFT, in which Einstein’s equation at linear order is equivalent to something called the “entanglement first law”, falls into the holographic gravity category. This idea has been extended to apply outside of AdS/CFT by Jacobson (2015). Crucially, Jacobson’s idea is to apply holographic mathematical technology to arbitrary quantum field theories in the bulk of spacetime (rather than specializing to conformal field theories – special physical models – on the boundary as in AdS/CFT) and thereby derive Einstein’s equation. However, in this work, Jacobson needed to make various assumptions about the entanglement structure of quantum field theories. In our paper, we showed how to justify many of those assumptions, applying recent results derived in quantum field theory (for experts, the form of the modular Hamiltonian and vacuum-subtracted entanglement entropy on null surfaces for general quantum field theories). Thus, we are able to show that the holographic gravity approach actually seems to work!

On the other hand, thermodynamic gravity is of a different character. Though it appears in various forms in the literature, we focus on the famous work of Jacobson (1995). In thermodynamic gravity, you don’t consider changing the entire spacetime configuration. Instead, you imagine a bundle of light rays – a lightsheet – in a particular dynamical spacetime background. As the light rays travel along – as you move down the lightsheet – the rays can be focused by curvature of the spacetime. Now, if the bundle of light rays started with a particular cross-sectional area, you’ll find a different area later on. In thermodynamic gravity, this is the change in area that goes into the derivation of Einstein’s equation. Next, one assumes that this change in area is equivalent to an entropy – in the usual black hole way with a factor of 1/(4 times Newton’s constant) – and that this entropy can be interpreted thermodynamically in terms of an energy flow through the lightsheet. The entropy vanishes from the derivation and the Einstein equation almost immediately appears as a thermodynamic equation of state. What we realized, however, is that what the entropy is actually the entropy of was ambiguous in thermodynamic gravity. Surprisingly, we found that there doesn’t seem to be a consistent definition of the entropy in thermodynamic gravity – applying quantum field theory results for the energy and entanglement entropy, we found that thermodynamic gravity could not simultaneously reproduce the correct constant in the Einstein equation and in the entropy/area relation for black holes.

So when all is said and done, we’ve found that holographic gravity, but not thermodynamic gravity, is on the right track. To answer our own question in the title of the paper, we found – in admittedly somewhat technical language – that the vacuum-subtracted von Neumann entropy evaluated on the null boundary of small causal diamonds gives a consistent formulation of holographic gravity. The future looks exciting for finding the connections between gravity and entanglement!

by Sean Carroll at February 08, 2016 09:42 PM

Lubos Motl - string vacua and pheno

Compactified M-theory and LHC predictions
Guest blog by Gordon Kane

I want to thank Luboš for suggesting that I explain the compactified M-theory predictions of the superpartner masses, particularly for the gluino that should be seen at LHC in Run II. I’ll include the earlier Higgs boson mass and decay branching ratio predictions as well. I’ll only give references to a few papers that allow the reader to see more details of derivations and of calculated numbers, plus a few of the original papers that established the basic compactification, usually just with arXiv numbers so the interested reader can look at them and trace the literature, because this is a short explanation only focused on the LHC predictions. I apologize to others who could be referenced. Before a few years ago it was not possible to use compactified string/M-theories to predict superpartner masses. All “predictions” were based on naturalness arguments, and turned out to be wrong.

String/M-theories must be formulated in 10 or 11 dimensions to give consistent quantum theories of gravity. In order to examine their predictions for our 4D world, they obviously must be projected onto 4D, a process called “compactification”. Compactified string/M-theories exhibit gravity, plus many properties that characterize the Standard Model of particle physics. These include Yang-Mills gauge theories of forces (such as \(SU(3)_{\rm color} \times SU(2)_{\rm electroweak}\times U(1)\)); chiral quarks and leptons (so parity violation); supersymmetry derived, not assumed; softly broken supersymmetry; hierarchical quark masses; families; moduli; and more. Thus they are attractive candidates for exploring theories extending the Standard Model.

At the present time which string/M-theory is compactified (Heterotic or Type II or M-theory etc), and to what matter-gauge groups, is not yet determined by derivations or principles. Following a body of work done in the 1995-2004 era [1,2,3,4,5,6,7], my collaborators and I have pursued compactifying M-theory. The 11D M-theory is compactified on a 7D manifold of \(G_2\) holonomy, so 7 curled up small dimensions and 3 large space ones. We assume appropriate \(G_2\) manifolds exist – there has been a lot of progress via mathematical study of such manifolds in recent years, including workshops. For M-theory it is known that gauge matter arises from singular 3-cycles in the 7D manifold [3], and chiral fermions from conical singularities on the 7D manifold [4]. Following Witten [5], we assume compactification to an \(SU(5)\)-MSSM. Other alternatives can be studied later. Having in mind the goal of finding \({\rm TeV}\) physics arising from a Planck-scale compactification, and knowing that fluxes (the generalization of electromagnetic fields to extra dimensional worlds) have dimensions and therefore naturally lead to physics near the Planck scale but not near a \({\rm TeV}\), we compactify in a fluxless sector. With the LHC data coming we focused on moduli stabilization, supersymmetry breaking and electroweak symmetry breaking.

In order to calculate in the compactified theory, we need the superpotential, the Kähler potential and the gauge kinetic function. To learn the features characteristic of the theory, we take the generic Kähler potential and gauge kinetic function. The moduli superpotential is a sum of non-perturbative terms because the complex moduli have an axion imaginary part and it has a shift symmetry [8,9,10]. We do most of the calculations with two superpotential terms, since that is sufficient to guarantee that supergravity approximations work well, and we can find semi-analytic results. When it matters we check with numerical work for more terms in the superpotential. The signs of the superpotential terms are determined by axion stabilization [8,9,10]. We use the known generic Kähler potential [6] and gauge kinetic function [7]. By using the generic theory we find the natural predictions of such a theory, with no free parameters. This is very important – if one introduces extra terms by hand, say in the Kähler potential, predictivity is lost.

In addition to the above assumptions we assume the lack of a solution to the cosmological constant problem does not stop us from making reasonable predictions. Solving the CC problems would not help us learn the gluino or Higgs boson mass, and not solving the CC problems does not prevent us from calculating the gluino or Higgs boson mass. Eventually this will have to be checked.

We showed that the M-theory compactification stabilized all moduli and gave a unique de Sitter vacuum for a given manifold, simultaneously breaking supersymmetry. Moduli vevs and masses are calculable. We calculate the supersymmetry soft-breaking Lagrangian at the compactification scale. Then we have the 4D softly broken supergravity quantum field theory, and can calculate all the predictions of the fully known parameter-free soft-breaking Lagrangian. The theory has many solutions with electroweak symmetry breaking.

We also need to have the \(\mu\) parameter in the theory. That is done following the method of Witten [5] who pointed out a generic discrete symmetry in the compactified M-theory that implied \(\mu=0\). We recognized that stabilizing the moduli broke that symmetry, so \(\mu\approx 0\). Since \(\mu\) would vanish if either supersymmetry were unbroken or moduli not stabilized, its value should be proportional to typical moduli vevs (which we calculated to be about \(1/10\) or \(1/20\) of the Planck scale) times the gravitino mass, so \(\mu\approx 3\TeV\). Combining this with the electroweak symmetry breaking conditions gives \(\tan\beta\approx 5\).

The resulting model (let’s call it a model even though it is a real theory and has no adjustable parameters, since we made the assumptions about compactifying to the \(SU(5)\)-MSSM, using the generic Kähler potential and gauge kinetic function, and estimating \(\mu\)) has a number of additional achievements. The lightest modulus can generate both the matter asymmetry and the dark matter when it decays, and thus their ratio. The moduli dominate the energy density of the universe soon after the end of inflation, so there is a non-thermal cosmological history. Axions are stabilized and there is a solution to the strong CP problem. There are no flavor or CPV problems, and EDMs are predicted to be small, below current limits, since the soft-breaking Lagrangian at the high scale is real at tree level, and the RGE running is known [14]. I mention these aspects to illustrate that the model is broadly relevant, not only to LHC predictions.

The soft-breaking Lagrangian contains the terms for the Higgs potential, \(M_{H,u}\) and \(M_{H,d}\) at the high scale. At the high scale all the scalars are about equal to the gravitino mass, about \(40\TeV\) (see below). All the terms needed for the RGE running are also calculated, so they can be run down to the \({\rm TeV}\) scale. \(M_{H,u}\) runs rapidly, down to about a \({\rm TeV}\) at the \({\rm TeV}\) scale. One can examine all the solutions with electroweak symmetry breaking, and finds they all have the form of the well-known two Higgs doublet “decoupling sector”, with one light Higgs and other physical Higgs bosons whose mass is about equal to the gravitino mass. For the decoupling sector the Higgs decay branching ratios are equal to the Standard Model ones except for small loop corrections, mainly the chargino loop. The light Higgs mass is calculated by the “match and run” technique, using the latest two and three loop contributions for heavy scalars, etc., and the light Higgs mass for all solutions is \(126.4\GeV\). This was done before the LHC data (arXiv:1112.1059 and reports at earlier meetings), though that doesn’t matter since the calculation does not depend on anything that changes with time. The RGE calculation has been confirmed by others.

The value of the gravitino mass follows from gaugino condensation and the associated dimensional transmutation. The M-theory hidden sectors generically have gauge groups (and associated matter) of various ranks. Those with the largest gauge groups will run down fastest, and their gauge coupling will get large, leading to condensates, analogous to how QCD forms the hadron spectrum but at a much higher energy scale. This scale, call it \(\Lambda\), is typically calculated to be about \(10^{14}\GeV\). The superpotential \(W\) has dimensions of mass cubed, so \(W\sim\Lambda^3\). The gravitino mass is \[

M_{3/2}=\frac{e^{K/2}W}{M_{pl}^2}\approx\left(\frac{\Lambda}{M_{pl}}\right)^3\cdot \frac{M_{pl}}{V_3}

\] since \(e^{K/2}\sim 1/V_3\). The factor \((\Lambda/M_{pl})^3\) takes us from the Planck scale down a factor \(10^{-12}\), and including the calculable volume factor gives \(M_{3/2}\approx 50\TeV\). This result is generic and robust for the compactified M-theory. It predicts that scalars (squarks, sleptons, \(M_{H,u}\), \(M_{H,d}\)) are of order \(50\TeV\) at the high scale, before RGE running.

The suppression of the gaugino masses from the gravitino scale to the \({\rm TeV}\) scale is completely general (Acharya et al, hep-th/0606262; Phys.Rev.Lett 97(2006)191601). The supergravity expression for the gaugino masses, \(M_{1/2}\), is a sum of terms each given by an F-term times the derivative of the visible sector gauge kinetic function with respect to each F-term. The visible sector gauge kinetic function does not depend on the chiral fermion F-terms, so the associated derivative vanishes, and \(M_{1/2}\) is proportional to the moduli F term generated by gaugino condensation in the hidden sector 3-cycles. The ratio of the gaugino condensate F-term to the chiral fermion F-term is approximately the ratio of volumes, \(V_3/V_7\), of order 1/40, for appropriate dimensionless units. \(V_7\) determines the gravitino mass but not \(M_{1/2}\). Let’s finally turn to the gaugino masses. The reader should understand now that the prediction is not just a “little above the limits”, but follows from a generic, robust calculation. Semi-quantitatively, the gluino mass is \([(\Lambda/M_{pl})^3/V_7]M_{pl}\).

Then the gaugino masses with the suppression described above are generically about \(1\TeV\). Detailed calculation, using the Higgs boson mass to pin down the gravitino mass more precisely (giving \(M_{3/2}=35\TeV\)) then predicts the gluino mass to be about \(1.5\TeV\), the wino mass \(614\GeV\), and the LSP bino about \(450\GeV\) [12]. These three states can be observed at LHC Run II but none of the other superpartners should be seen in Run II (also an important prediction). The higgsinos and squarks can be seen at an \(\sim 100\TeV\) collider via squark-gluino associated production [12,13].
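As a back-of-envelope check of the “about 1 TeV” scale, one can multiply the gravitino mass by the suppression factor of order 1/40 quoted above. This is only a sketch under that single-factor assumption; the function name `gaugino_mass_tev` is hypothetical, and the detailed spectrum (the gluino, wino and bino masses) comes from the full calculation in [12], which this does not reproduce.

```python
def gaugino_mass_tev(gravitino_mass_tev, suppression=1.0 / 40.0):
    """Order-of-magnitude gaugino mass: gravitino mass times the suppression factor."""
    return gravitino_mass_tev * suppression

# Gravitino masses quoted in the text: the generic estimate and the Higgs-calibrated one.
for m32 in (50.0, 35.0):
    print(f"M_3/2 = {m32:.0f} TeV -> M_1/2 of order {gaugino_mass_tev(m32):.2f} TeV")
```

Both inputs land within a factor of order one of 1 TeV, which is all the generic argument claims.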

The LHC gluino production cross section is \(10\)-\(15\,{\rm fb}\) [12]. Note that for squarks and gluinos having equal masses the squark exchange contribution to gluino production is significant, so the usual cross section claimed for gluino production is larger than our prediction when squarks are heavy. Simplified searches using larger cross sections will overestimate limits. Surprisingly, experimental groups and many phenomenologists have reported highly model dependent limits much larger than the correct ones for the compactified M-theory as if those limits were general. The wino pair production cross section is also of order \(15\,{\rm fb}\). The wino has nearly 100% branching ratio to bino + higgs, which is helpful for detection. Gluinos decay via the usual virtual squarks about 45% into first and second family quarks, 55% into 3rd family quarks, so simplified searches will overestimate limits. Branching ratios and signals are explained in [12]. The LHC t-tbar cross section is about \(4500\,{\rm fb}\), so it gives the main background (diboson production gives the next worst background). Background studies should of course be done by experimenters, using realistic branching ratios so that the results are not misleading. We estimate that to see a \(3\sigma\) signal for a \(1.5\TeV\) gluino will take over \(40\,{\rm fb}^{-1}\) integrated luminosity at LHC, so perhaps it can be seen by or during fall 2016 if the luminosity accumulates sufficiently rapidly.
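To get a feel for these numbers, the expected count of produced gluino pairs is just the cross section times the integrated luminosity. The sketch below uses the values quoted above; the function name `expected_events` is hypothetical, and the count ignores detector acceptance, selection efficiency, branching ratios and backgrounds, so it is an upper bound on what an analysis would see rather than a signal estimate.

```python
def expected_events(cross_section_fb, luminosity_fb_inv):
    """Produced events = cross section (fb) x integrated luminosity (fb^-1)."""
    return cross_section_fb * luminosity_fb_inv

LUMINOSITY_FB_INV = 40.0  # integrated luminosity quoted in the text

for sigma_fb in (10.0, 15.0):
    n = expected_events(sigma_fb, LUMINOSITY_FB_INV)
    print(f"sigma = {sigma_fb:.0f} fb -> about {n:.0f} gluino pairs produced")
```

The same arithmetic applied to the quoted t-tbar cross section shows why the background dominates at the production level.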
  1. E.Witten, hep-th/9503124; Nucl.Phys. B443
  2. G.Papadopoulos and P.Townsend, hep-th/9506150
  3. B.Acharya, hep-th/9812205
  4. B.Acharya and E.Witten, hep-th/0109152
  5. E.Witten, hep-ph/0201018
  6. C.Beasley and E.Witten, hep-th/0203061
  7. A.Lukas and D.Morris, hep-th/0305078
  8. B.Acharya, K.Bobkov, G.Kane, P.Kumar, D.Vaman, hep-th/0606262; Phys.Rev.Lett. 97(2006)191601
  9. B.Acharya, K.Bobkov, G.Kane, P.Kumar, J.Shao, hep-th/0701034
  10. B.Acharya, K.Bobkov, G.Kane, P.Kumar, J.Shao, arXiv:0801.0478
  11. B.Acharya, K.Bobkov, P.Kumar, arXiv:1004.5138
  12. S.Ellis, G.Kane, and B.Zheng, arXiv:1408.1961; JHEP 1507(2015)081
  13. S.Ellis and B.Zheng, arXiv:1506.02644
  14. S.Ellis and G.Kane, arXiv:1405.7719

by Luboš Motl at February 08, 2016 05:20 PM

Peter Coles - In the Dark

The Search for Gravitational Waves

Regardless of what will or will not be announced on Thursday, I thought it would be worth sharing this nice colloquium talk by Dr Alan Weinstein of Caltech about the search for gravitational waves, featuring the Laser Interferometer Gravitational-Wave Observatory (LIGO). I’ve picked this not only because it’s a nice and comprehensive overview, but also because Professor Weinstein doesn’t call them gravity waves!



by telescoper at February 08, 2016 02:17 PM

Christian P. Robert - xi'an's og

métro static

[During a particularly painful métro trip, a man kept talking and talking in a very aggressive and somewhat incoherent manner. It was not until the end of the ride that I realised he was speaking on the phone to a relative… and not to a nonexistent other self!]

“You want to make me play a part, eh, but that doesn’t work with me. You all play parts; when the day is over, you go home, you take off your make-up, you take off your masks. Or rather you have two masks, one for the day and one for the evening, the true one and the false one; the false one is the one for the day.”

Filed under: pictures, Travel Tagged: métro static, Paris, ramblings

by xi'an at February 08, 2016 01:18 PM

Tommaso Dorigo - Scientificblogging

From The Great Wall To The Great Collider
With a long delay, last week I was finally able to have a look at the book "From the Great Wall to the Great Collider - China and the Quest to Uncover the Inner Workings of the Universe", by Steve Nadis and Shing-Tung Yau. And I would like to report about my impressions here.

read more

by Tommaso Dorigo at February 08, 2016 10:55 AM

February 07, 2016

Christian P. Robert - xi'an's og

Bayesian model comparison with intractable constants

Richard Everitt, Adam Johansen (Warwick), Ellen Rowing and Melina Evdemon-Hogan have updated [on arXiv] a survey paper on the computation of Bayes factors in the presence of intractable normalising constants. Apparently destined for Statistics and Computing, judging by the style. A great entry, in particular for those attending the CRiSM workshop Estimating Constants in a few months!

A question that came to me from reading the introduction to the paper is why a method like Møller et al.’s (2006) auxiliary variable trick should be considered more “exact” than the pseudo-marginal approach of Andrieu and Roberts (2009) since the latter can equally be seen as an auxiliary variable approach. The answer was on the next page (!) as it is indeed a special case of Andrieu and Roberts (2009). Murray et al. (2006) also belongs to this group with a product-type importance sampling estimator, based on a sequence of tempered intermediaries… As noted by the authors, there is a whole spectrum of related methods in this area, some of which qualify as exact-approximate, inexact approximate and noisy versions.

Their main argument is to support importance sampling as the method of choice, including sequential Monte Carlo (SMC) for large dimensional parameters. The auxiliary variable of Møller et al. (2006) is then part of the importance scheme. In the first toy example, a Poisson is opposed to a Geometric distribution, as in our ABC model choice papers, for which a multiple auxiliary variable approach dominates both ABC and Simon Wood’s synthetic likelihood for a given computing cost. I did not spot which artificial choice was made for the Z(θ)’s in both models, since the constants are entirely known in those densities. A very interesting section of the paper is when envisioning biased approximations to the intractable density. If only because the importance weights are most often biased due to the renormalisation (possibly by resampling). And because the variance derivations are then intractable as well. However, due to this intractability, the paper can only approach the impact of those approximations via empirical experiments. This however raises the question of how to evaluate the validity of the approximation in settings where truth and even its magnitude are unknown… Cross-validation and bootstrap type evaluations may prove too costly in realistic problems. Using biased solutions thus mostly remains an open problem in my opinion.
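To make the idea of estimating a normalising constant by importance sampling concrete, here is a minimal sketch of my own, not the scheme of the paper. It pretends that the constant Z(λ) of the unnormalised Poisson mass λ^x / x! is unknown and estimates it with a Geometric proposal; the function name, the proposal parameter p = 0.3 and the sample size are arbitrary assumptions. Since Z(λ) = exp(λ) is in fact known here, the estimate can be checked directly.

```python
import math
import random

def estimate_poisson_constant(lam, p=0.3, n_samples=200000, seed=1):
    """Importance-sampling estimate of Z(lam) = sum_x lam**x / x! with a Geometric(p) proposal."""
    rng = random.Random(seed)
    log_p = math.log(p)
    log_1mp = math.log(1.0 - p)
    log_lam = math.log(lam)
    total = 0.0
    for _ in range(n_samples):
        u = 1.0 - rng.random()                       # uniform on (0, 1]
        x = int(math.log(u) / log_1mp)               # Geometric(p) draw on {0, 1, 2, ...}
        log_f = x * log_lam - math.lgamma(x + 1)     # log of unnormalised Poisson mass
        log_q = log_p + x * log_1mp                  # log of proposal mass
        total += math.exp(log_f - log_q)             # importance weight f(x) / q(x)
    return total / n_samples

z_hat = estimate_poisson_constant(2.0)
print(z_hat, math.exp(2.0))  # estimate next to the known constant exp(lam)
```

The plain average of the weights is unbiased for Z(λ); the bias discussed above enters once weights are renormalised or the density itself is approximated.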

The SMC part in the paper is equally interesting if only because it focuses on the data thinning idea studied by Chopin (2002) and many other papers in recent years. This made me wonder why an alternative relying on a sequence of approximations to the target with tractable normalising constants could not be considered. A whole sequence of auxiliary variable completions sounds highly demanding in terms of computing budget and also requires a corresponding sequence of calibrations. (Now, ABC fares no better since it requires heavy simulations and repeated calibrations, while further exhibiting a damning missing link with the target density.) Unfortunately, embarking upon a theoretical exploration of the properties of approximate SMC is quite difficult, as shown by the strong assumptions made in the paper to bound the total variation distance to the true target.

Filed under: Books, Kids, pictures, Statistics, Travel, University life Tagged: ABC, auxiliary variable, bias vs. variance, CRiSM, estimating constants, importance sampling, Monte Carlo Statistical Methods, normalising constant, pseudo-marginal MCMC, SMC, unbiased estimation, University of Warwick

by xi'an at February 07, 2016 11:16 PM

Peter Coles - In the Dark

The Owl

Downhill I came, hungry, and yet not starved,
Cold, yet had heat within me that was proof
Against the north wind; tired, yet so that rest
Had seemed the sweetest thing under a roof. 

Then at the inn I had food, fire, and rest,
Knowing how hungry, cold, and tired was I.
All of the night was quite barred out except
An owl’s cry, a most melancholy cry. 

Shaken out long and clear upon the hill
No merry note, nor cause of merriment,
But one telling me plain what I escaped
And others could not, that night, as in I went. 

And salted was my food, and my repose,
Salted and sobered too, by the bird’s voice
Speaking for all who lay under the stars,
Soldiers and poor, unable to rejoice.

by Edward Thomas (1878-1917)

by telescoper at February 07, 2016 08:21 PM

John Baez - Azimuth

Rumors of Gravitational Waves

The Laser Interferometer Gravitational-Wave Observatory or LIGO is designed to detect gravitational waves: ripples of curvature in spacetime moving at the speed of light. It’s recently been upgraded, and it will either find gravitational waves soon or something really strange is going on.

Rumors are swirling that LIGO has seen gravitational waves produced by two black holes, of 29 and 36 solar masses, spiralling towards each other—and then colliding to form a single 62-solar-mass black hole!

You’ll notice that 29 + 36 is more than 62. So, it’s possible that three solar masses were turned into energy, mostly in the form of gravitational waves!
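Here is a quick sketch of that arithmetic using E = mc². The solar mass value is an approximate figure I am assuming, and the function name is made up for illustration; the masses are the rumored ones, not confirmed measurements.

```python
SOLAR_MASS_KG = 1.989e30          # approximate solar mass (assumed value)
SPEED_OF_LIGHT = 299_792_458.0    # metres per second, exact by definition

def radiated_energy_joules(m1_solar, m2_solar, m_final_solar):
    """Energy equivalent of the mass lost in a merger, via E = m c^2."""
    missing_mass_kg = (m1_solar + m2_solar - m_final_solar) * SOLAR_MASS_KG
    return missing_mass_kg * SPEED_OF_LIGHT ** 2

energy = radiated_energy_joules(29, 36, 62)
print(f"{energy:.2e} J")
```

Three solar masses comes out at roughly 5 × 10^47 joules.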

According to these rumors, the statistical significance of the signal is supposedly very high: better than 5 sigma! That means there’s at most a 0.000057% probability this event is a random fluke – assuming nobody made a mistake.
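The percentage above is what you get from the two-sided tail of a Gaussian at 5 standard deviations. Here is a small sketch of that conversion, assuming Gaussian noise and the two-sided convention (the function name is mine); real LIGO significance estimates are more involved than this.

```python
import math

def two_sided_p_value(n_sigma):
    """Probability that a standard normal variable lies more than n_sigma from zero."""
    return math.erfc(n_sigma / math.sqrt(2.0))

p = two_sided_p_value(5.0)
print(f"{100 * p:.6f}%")  # 0.000057%
```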

If these rumors are correct, we should soon see an official announcement. If the discovery holds up, someone will win a Nobel prize.

The discovery of gravitational waves is completely unsurprising, since they’re predicted by general relativity, a theory that’s passed many tests already. But it would open up a new window to the universe – and we’re likely to see interesting new things, once gravitational wave astronomy becomes a thing.

Here’s the tweet that launched the latest round of rumors:


For background on this story, try this:

Tale of a doomed galaxy, Azimuth, 8 November 2015.

The first four sections of that long post discuss gravitational waves created by black hole collisions—but the last section is about LIGO and an earlier round of rumors, so I’ll quote it here!

LIGO stands for Laser Interferometer Gravitational Wave Observatory. The idea is simple. You shine a laser beam down two very long tubes and let it bounce back and forth between mirrors at the ends. You use this to compare the lengths of these tubes. When a gravitational wave comes by, it stretches space in one direction and squashes it in another direction. So, we can detect it.

Sounds easy, eh? Not when you run the numbers! We’re trying to see gravitational waves that stretch space just a tiny bit: about one part in 10²³. At LIGO, the tubes are 4 kilometers long. So, we need to see their length change by an absurdly small amount: one-thousandth the diameter of a proton!

It’s amazing to me that people can even contemplate doing this, much less succeed. They use lots of tricks:

• They bounce the light back and forth many times, effectively increasing the length of the tubes to 1800 kilometers.

• There’s no air in the tubes—just a very good vacuum.

• They hang the mirrors on quartz fibers, making each mirror part of a pendulum with very little friction. This means it vibrates very well at one particular frequency, and very badly at frequencies far from that. This damps out the shaking of the ground, which is a real problem.

• This pendulum is hung on another pendulum.

• That pendulum is hung on a third pendulum.

• That pendulum is hung on a fourth pendulum.

• The whole chain of pendulums is sitting on a device that detects vibrations and moves in a way to counteract them, sort of like noise-cancelling headphones.

• There are 2 of these facilities, one in Livingston, Louisiana and another in Hanford, Washington. Only if both detect a gravitational wave do we get excited.

I visited the LIGO facility in Louisiana in 2006. It was really cool! Back then, the sensitivity was good enough to see collisions of black holes and neutron stars up to 50 million light years away.

Here I’m not talking about the supermassive black holes that live in the centers of galaxies. I’m talking about the much more common black holes and neutron stars that form when stars go supernova. Sometimes a pair of stars orbiting each other will both blow up, and form two black holes—or two neutron stars, or a black hole and neutron star. And eventually these will spiral into each other and emit lots of gravitational waves right before they collide.

50 million light years is big enough that LIGO could see about half the galaxies in the Virgo Cluster. Unfortunately, with that many galaxies, we only expect to see one neutron star collision every 50 years or so.

They never saw anything. So they kept improving the machines, and now we’ve got Advanced LIGO! This should now be able to see collisions up to 225 million light years away… and after a while, three times further.

They turned it on September 18th. Soon we should see more than one gravitational wave burst each year.
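The jump from "one every 50 years" to "more than one each year" follows from simple volume scaling. As a rough sketch, assume the sources are spread uniformly through space, so the expected rate grows like the cube of the distance the detector can reach; the function name below is just for illustration.

```python
def scaled_rate(base_rate_per_year, base_range, new_range):
    """Expected event rate if sources are uniformly distributed:
    the rate scales with the observable volume, i.e. range cubed."""
    return base_rate_per_year * (new_range / base_range) ** 3


initial_rate = 1.0 / 50.0  # one collision per 50 years at 50 million light years

advanced = scaled_rate(initial_rate, 50.0, 225.0)       # Advanced LIGO at turn-on
later = scaled_rate(initial_rate, 50.0, 3 * 225.0)      # "three times further"

print(f"225 million light years: about {advanced:.1f} per year")
print(f"675 million light years: about {later:.0f} per year")
```

With the numbers quoted in the post this gives a bit under two events per year at 225 million light years, and tripling the range multiplies that by another factor of 27.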

In fact, there’s a rumor that they’ve already seen one! But they’re still testing the device, and there’s a team whose job is to inject fake signals, just to see if they’re detected. Davide Castelvecchi writes:

LIGO is almost unique among physics experiments in practising ‘blind injection’. A team of three collaboration members has the ability to simulate a detection by using actuators to move the mirrors. “Only they know if, and when, a certain type of signal has been injected,” says Laura Cadonati, a physicist at the Georgia Institute of Technology in Atlanta who leads the Advanced LIGO’s data-analysis team.

Two such exercises took place during earlier science runs of LIGO, one in 2007 and one in 2010. Harry Collins, a sociologist of science at Cardiff University, UK, was there to document them (and has written books about it). He says that the exercises can be valuable for rehearsing the analysis techniques that will be needed when a real event occurs. But the practice can also be a drain on the team’s energies. “Analysing one of these events can be enormously time consuming,” he says. “At some point, it damages their home life.”

The original blind-injection exercises took 18 months and 6 months respectively. The first one was discarded, but in the second case, the collaboration wrote a paper and held a vote to decide whether they would make an announcement. Only then did the blind-injection team ‘open the envelope’ and reveal that the events had been staged.

Aargh! The disappointment would be crushing.

But with luck, Advanced LIGO will soon detect real gravitational waves. And I hope life here in the Milky Way thrives for a long time – so that when the gravitational waves from the doomed galaxy PG 1302-102 reach us, hundreds of thousands of years in the future, we can study them in exquisite detail.

For Castelvecchi’s whole story, see:

• Davide Castelvecchi, Has giant LIGO experiment seen gravitational waves?, Nature, 30 September 2015.

For pictures of my visit to LIGO, see:

• John Baez, This week’s finds in mathematical physics (week 241), 20 November 2006.

For how Advanced LIGO works, see:

• The LIGO Scientific Collaboration, Advanced LIGO, 17 November 2014.

by John Baez at February 07, 2016 07:01 PM

February 06, 2016

Christian P. Robert - xi'an's og

exceptional measures for exceptional circumstances, not as a new normality

Since the mass assassinations of November 13 in Paris, France has been in a “legal” state of emergency (état d’urgence), which was once again renewed by the Parliament a few days ago and which could even become part of the Constitution if the planned vote succeeds in a few days! This state of emergency gives more and more powers to the executive branch of the French State, and thus to the police, with fewer controls from the judiciary and parliamentary branches of the State. Beyond this arbitrary reduction of civil liberties and rights, from the prohibition of demonstrations to extra-judicial house arrests, recently denounced by Amnesty International and the UN and many local organisations, the government is pushing for another change in the Constitution that would deprive binational terrorists of their French nationality. Quite absurd (as if those terrorists cared!), discriminatory (what of Égalité?!), schizophrenic (“de-naturalising” someone does not turn him or her into an alien, nor does it erase the fact that she or he was born and raised in France), and anti-republican… Even though I remain sceptical of their real impact, there are several petitions around calling for the end of the state of emergency and the dropping of the de-naturalisation measure.

by xi'an at February 06, 2016 11:16 PM

astrobites - astro-ph reader's digest

Tracing cosmic siblings in the Milky Way

Paper title: Chemical tagging can work — Identification of stellar phase-space structures purely by chemical-abundance similarity

Authors: David W. Hogg, Andrew R. Casey, Melissa Ness, Hans-Walter Rix and Daniel Foreman-Mackey

First author’s institutions: New York University and Max-Planck-Institut für Astronomie

Status: Submitted to The Astrophysical Journal

See that star? Could it be a sibling of the Sun? Credit: ESO/B. Tafreshi

When a dense cloud of cold gas and dust collapses, stars are born. They stick together for some time as stellar clusters, but these eventually break up over the ages, and the sibling stars are scattered away from each other. The best example of a lonely star we have is our good old Sun. Luckily, astronomers have thought of a clever way to trace stellar siblings in the Milky Way. In the future, we may even be looking for the Sun’s sister stars using our telescopes. Not only that, but if we can trace the history of these stars, we can better understand how our Galaxy evolved.

Introducing: chemical tagging

For stars that are born in the same natal cloud, it is not unreasonable to assume that they have the same (or at least very similar) chemical compositions, which astronomers measure through quantities called chemical abundances. Moreover, for stars with a mass similar to the Sun's, these abundances remain practically unchanged at the stellar surface for most of their lives. So astronomers had the thought: if we looked at the chemical composition at the stars’ surfaces, would we be able to identify stellar siblings just with that information? This is the question today’s paper tries to answer.

The process of looking for stellar siblings using the surface composition of stars became known as chemical tagging. What the authors of today’s paper did was a proof of concept: to test whether it was possible to identify stellar clusters (which we know to definitely be sibling stars) using chemical tagging. The idea is that if it works for clusters, it will also work for siblings scattered from each other. To do that, they used The Cannon, a machine learning program capable of creating stellar models using only the spectra of stars. The data fed to The Cannon were obtained in the APOGEE survey.

Sounds crazy, but it works

In this plot, we have abundances of different elements on all axes. We can infer that stars that appear together, forming structures, in the plots for all elements were probably formed together (colored symbols). The light grey symbols are the other stars of the survey, which are unrelated to the colored ones.

With the spectra of the stars and their respective stellar models, they calculated the chemical composition at the surface of the stars (16 elements in total) for almost 100,000 of them. That’s a lot of parameters and stars to search for any kind of structure in the abundance data, so they had to resort to an algorithm that automatically looks for these structures (and a very straightforward one at that, too). Nevertheless, what they found is that, only by looking at the similarities in chemical abundances, they could identify various stellar clusters, such as M15, M92 and M107, the Sagittarius dwarf spheroidal galaxy, and even a collection of old stars that do not seem to be bound to a particular cluster.
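To give a feel for what "automatically looking for structures" in abundance space can mean, here is a minimal sketch using k-means clustering. This is an assumption on our part: the post does not name the algorithm, and k-means simply stands in for "a very straightforward one". The two fake groups of stars, their abundance values, and the two-element abundance space are all invented for illustration (the paper works with 16 elements and real data).

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two abundance vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(points, k, iterations=20):
    """Plain k-means with a deterministic farthest-point initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        # Next seed: the point farthest from every centroid chosen so far.
        centroids.append(
            max(points, key=lambda p: min(squared_distance(p, c) for c in centroids))
        )
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each star joins its nearest centroid.
        labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centroids[j] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return labels, centroids


rng = random.Random(42)


def fake_group(center, n, scatter=0.05):
    """Invented stars scattered around a common abundance pattern."""
    return [tuple(rng.gauss(c, scatter) for c in center) for _ in range(n)]


# Hypothetical (iron, magnesium-to-iron) abundances for two sibling groups.
stars = fake_group((-2.3, 0.4), 30) + fake_group((0.0, 0.05), 30)
labels, centroids = kmeans(stars, k=2)
print(labels)
```

The point of the toy is the one the post makes: if the abundances are measured precisely enough (small scatter within a group), even a very simple algorithm separates the siblings cleanly.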


Position of the stars in galactic coordinates. They could identify that these stars (the colored symbols), which are scattered through the disk of the Milky Way, could have been born together! The light grey dots are, again, stars observed in the survey that are unrelated to the colored ones.

In the past, chemical tagging was criticized for being too difficult to perform and people questioned its reliability. Ultimately, what the authors prove with this study is that identifying stellar siblings just by looking at their surface chemical composition does work, and it doesn’t even need a sophisticated algorithm to produce reliable results. What we need, actually, is precision in abundances, which was made possible with The Cannon. All this means that we are off to a great start in understanding even better how our Galaxy evolved and its star formation history.

by Leonardo dos Santos at February 06, 2016 01:33 AM

Clifford V. Johnson - Asymptotia

On Zero Matter

Over at Marvel, I chatted with actor Reggie Austin (Dr. Jason Wilkes on Agent Carter) some more about the physics I helped embed in the show this season. It was fun. (See an earlier chat here.) This was about Zero Matter itself (which will also be a precursor to things seen in the movie Dr. Strange later this year)... It was one of the first things the writers asked me about when I first met them, and we brainstormed about things like what it should be called (the name "dark force" comes later in Marvel history), and how a scientist who encountered it would contain it. This got me thinking about things like perfect fluids, plasma physics, exotic phases of materials, magnetic fields, and the like (sadly the interview skips a lot of what I said about those)... and to the writers' and show-runners' enormous credit, lots of these concepts were allowed to appear in the show in various ways, including (versions of) two containment designs that I sketched out. Anyway, have a look in the embed below.

Oh! The name. We did not settle on a name after the first meeting, but one of [...]

by Clifford at February 06, 2016 01:07 AM

February 05, 2016

Clifford V. Johnson - Asymptotia

Suited Up!

Yes, I was in battle again. A persistent skunk that wants to take up residence in the crawl space. I got rid of it last week, having found one place it broke in. This involved a lot of crawling around on my belly armed with a headlamp (not pictured - this is an old picture) and curses. I've done this before... It left. Then yesterday I found a new place it had broken in through and the battle was rejoined. Interestingly, this time it decided to hide after some of the back and forth and I lost track of it for a good while and was about to give up and hope it would feel unsafe with all the lights I'd put on down there (and/or encourage it further to leave by deploying nuclear weapons to match the ones it comes armed with*).

In preparation for this I left open the large access hatch and sprinkled a layer [...]

by Clifford at February 05, 2016 11:09 PM

Tommaso Dorigo - Scientificblogging

Top Secret: On Confidentiality On Scientific Issues, Across The Ring And Across The Bedroom

The following text, a short excerpt from the book "Anomaly!", recounts the time when the top quark was about to be discovered, in 1994-95. After the "evidence" paper that CDF had published in 1994, the CDF and DZERO experiments were both running for the first prize - a discovery of the last quark.



by Tommaso Dorigo at February 05, 2016 10:08 PM

Emily Lakdawalla - The Planetary Society Blog

HiRISE image coverage of the Curiosity field site on Mars, Version 3.0
There have been tons and tons of HiRISE images of the Curiosity landing region, and it has taken quite a lot of work for me to find, locate, and catalogue them. This post is a summary of what I've found; after five revisions and updates, it's now version 3.0 of the list.

February 05, 2016 06:53 PM

Axel Maas - Looking Inside the Standard Model

More than one Higgs means more structure
We have once again published a new paper, and I would again like to outline what we did (and why).

The motivation for this investigation came from another paper of mine. As described earlier, back then I took a rather formal stand on proposals for new physics. It was based on the idea that there is some kind of self-similar substructure of what we usually call the Higgs and the W and Z bosons. In this paper, I speculated that this self-similarity may be rather exclusive to the standard model. As a consequence, this may alter the predictions for new physics models.

Of course, speculating is easy. Making something out of it requires doing real calculations. Thus, I have started two projects to test these speculations. One is on the unification of forces, and is still ongoing. Some first results are there, but not yet anything conclusive. It is the second project which yielded new results.

In this second project we had a look at a theory where more Higgs particles are added to the standard model, a so-called 2-Higgs-doublet model, or 2HDM for short. I had speculated that, besides the additional Higgs particles, further additional particles may arise as bound states, i.e., as states which are made from two or more other particles. These are not accounted for by ordinary methods.

In the end, it now appears that this idea is not correct, at least not in its simplest form. There are still some very special cases left, where this may still be true, but by and large not. However, we have understood why the original idea is wrong, and why it may still be correct in other cases. The answer is symmetry.

When adding additional Higgs particles, one is not entirely free. It is necessary that we do not alter the standard model where we have already tested it. In particular, we cannot easily modify the symmetries of the standard model. However, the symmetries of the standard model then induce a remarkable effect. The additional Higgs particles in 2HDMs are not entirely different from the ones we know. Rather, they mix with them as a quantum effect. In quantum theories, particles can change into each other under certain conditions. And the symmetries of the standard model entail that this is possible for the new and the old Higgses.

If the particles mix, the possibilities to distinguish them diminish. As a consequence, the simplest additional states can no longer be distinguished from the states already accounted for by ordinary methods. Thus, they are not additional states. Hence, the simplest possible deviation I speculated about is not realized. There may still be more complicated ones, but to figure this out is much more complicated, and has yet to be done. Thus, this work showed that the simple idea was not right.

So what about the other project still in progress? Should I now also expect this to just reproduce what is known? Actually no. The thing we learned in this project was why everything fell into its ordinary places. The reason is the mixing between the normal and the additional Higgs particles. This possibility is precluded in the other project, as there the additional particles are very different from the original ones. It may still be that my original idea is wrong. But it has to be wrong in a different way than in the case we investigated now. And thus we have also learned something more about a wider class of theories.

This shows that even disproving your ideas is important. From the reasons why they fail you learn more than just from a confirmation of them - you learn something new.

by Axel Maas at February 05, 2016 05:32 PM

Emily Lakdawalla - The Planetary Society Blog

In Pictures: Orion Assembled and Shipped to Kennedy Space Center
The shell of NASA's next Orion spacecraft has been welded together and shipped to Kennedy Space Center, Florida. Here's a photo recap of the assembly and transport process.

February 05, 2016 04:33 PM

ZapperZ - Physics and Physicists

The Physics of Mirrors Falls Slightly Short
This is a nice layman's article on the physics behind mirrors.

While they did a nice job of explaining the metal surface and the smoothness effect, I wish articles like this would also dive into the materials science of why light, in this case visible light, is reflected better off a metal surface than off a non-metallic surface. In other words, let's include some solid state/condensed matter physics in this. That is truly the physics behind the workings of a mirror.


by ZapperZ at February 05, 2016 03:50 PM

ZapperZ - Physics and Physicists

Wendelstein 7-X Comes Online
ITER should look over its shoulder, because Germany's nuclear fusion reactor research facility is coming online. It is considerably smaller, significantly cheaper, but more importantly, it is built and ready to run!

Construction has already begun in southern France on ITER, a huge international research reactor that uses a strong electric current to trap plasma inside a doughnut-shaped device long enough for fusion to take place. The device, known as a tokamak, was conceived by Soviet physicists in the 1950s and is considered fairly easy to build, but extremely difficult to operate.

The team in Greifswald, a port city on Germany's Baltic coast, is focused on a rival technology invented by the American physicist Lyman Spitzer in 1950. Called a stellarator, the device has the same doughnut shape as a tokamak but uses a complicated system of magnetic coils instead of a current to achieve the same result.

Let the games begin!


by ZapperZ at February 05, 2016 03:45 PM

Peter Coles - In the Dark

LIGO Newsflash

This morning I heard the same rumour from two distinct (and possibly independent) sources. That’s not enough to prove that the rumour is true, but perhaps enough to make it worth repeating here.

The rumour is that, on Thursday 11th February in Washington DC at 10.30am local time (15.30 GMT), the Laser Interferometer Gravitational-Wave Observatory (LIGO) will announce the direct experimental detection of gravitational waves.

If true this is immensely exciting, but I reiterate that it is, for the time being at least, only a rumour.

I will add more as soon as I get it. Please feel free to provide updates through the comments. Likewise if you have information to the contrary…


UPDATE: 9th February 2016. An official announcement of the forthcoming announcement has now been announced. It will take place at 10.30 local time in Washington (15.30 GMT), although it is believed the first ten minutes will involve a couple of songs by the popular vocal artist Beyoncé.


by telescoper at February 05, 2016 01:51 PM

Emily Lakdawalla - The Planetary Society Blog

Mars Exploration Rovers Update: Opportunity Turns 12! Embarks on Electric Slide
On January 24th, the veteran Mars Exploration Rover (MER) wrapped the last day of her 12th year of surface operations on Mars, marking an extraordinary, historic achievement for the Mars Exploration Rovers (MER) mission.

February 05, 2016 01:12 AM

John Baez - Azimuth

Aggressively Expanding Civilizations

Ever since I became an environmentalist, the potential destruction wrought by aggressively expanding civilizations has been haunting my thoughts. Not just here and now, where it’s easy to see, but in the future.

In October 2006, I wrote this in my online diary:

A long time ago on this diary, I mentioned my friend Bruce Smith’s nightmare scenario. In the quest for ever faster growth, corporations evolve toward ever faster exploitation of natural resources. The Earth is not enough. So, ultimately, they send out self-replicating von Neumann probes that eat up solar systems as they go, turning the planets into more probes. Different brands of probes will compete among each other, evolving toward ever faster expansion. Eventually, the winners will form a wave expanding outwards at nearly the speed of light—demolishing everything behind them, leaving only wreckage.

The scary part is that even if we don’t let this happen, some other civilization might.

The last point is the key one. Even if something is unlikely, in a sufficiently large universe it will happen, as long as it’s possible. And then it will perpetuate itself, as long as it’s evolutionarily fit. Our universe seems pretty darn big. So, even if a given strategy is hard to find, if it’s a winning strategy it will get played somewhere.

So, even in this nightmare scenario of "spheres of von Neumann probes expanding at near lightspeed", we don’t need to worry about a bleak future for the universe as a whole—any more than we need to worry that viruses will completely kill off all higher life forms. Some fraction of civilizations will probably develop defenses in time to repel the onslaught of these expanding spheres.

It’s not something I stay awake worrying about, but it’s a depressingly plausible possibility. As you can see, I was trying to reassure myself that everything would be okay, or at least acceptable, in the long run.

Even earlier, S. Jay Olson and I wrote a paper together on the limitations in accurately measuring distances caused by quantum gravity. If you try to measure a distance too accurately, you’ll need to concentrate so much energy in such a small space that you’ll create a black hole!

That was in 2002. Later I lost touch with him. But now I’m happy to discover that he’s doing interesting work on quantum gravity and quantum information processing! He is now at Boise State University in Idaho, his home state.

But here’s the cool part: he’s also studying aggressively expanding civilizations.

Expanding bubbles

What will happen if some civilizations start aggressively expanding through the Universe at a reasonable fraction of the speed of light? We don’t have to assume most of them do. Indeed, there can’t be too many, or they’d already be here! More precisely, the density of such civilizations must be low at the present time. The number of them could be infinite, since space is apparently infinite. But none have reached us. We may eventually become such a civilization, but we’re not one yet.

Each such civilization will form a growing ‘bubble’: an expanding sphere of influence. And occasionally, these bubbles will collide!

Here are some pictures from a simulation he did:

As he notes, the math of these bubbles has already been studied by researchers interested in inflationary cosmology, like Alan Guth. These folks have considered the possibility that in the very early Universe, most of space was filled with a ‘false vacuum’: a state of matter that resembles the actual vacuum, but has higher energy density.

A false vacuum could turn into the true vacuum, liberating energy in the form of particle-antiparticle pairs. However, it might not do this instantly! It might be ‘metastable’, like ball number 1 in this picture:

It might need a nudge to ‘roll over the hill’ (metaphorically) and down into the lower-energy state corresponding to the true vacuum, shown as ball number 3. Or, thanks to quantum mechanics, it might ‘tunnel’ through this hill.

The balls and the hill are just an analogy. What I mean is that the false vacuum might need to go through a stage of having even higher energy density before it could turn into the true vacuum. Random fluctuations, either quantum-mechanical or thermal, could make this happen. Such a random fluctuation could happen in one location, forming a ‘bubble’ of true vacuum that—under certain conditions—would rapidly expand.

It’s actually not very different from bubbles of steam forming in superheated water!

But here’s the really interesting thing Jay Olson noted in his first paper on this subject. Research on bubbles in inflationary cosmology could actually be relevant to aggressively expanding civilizations!

Why? Just as a bubble of expanding true vacuum has different pressure than the false vacuum surrounding it, the same might be true for an aggressively expanding civilization. If they are serious about expanding rapidly, they may convert a lot of matter into radiation to power their expansion. And while energy is conserved in this process, the pressure of radiation in space is a lot bigger than the pressure of matter, which is almost zero.

General relativity says that energy density slows the expansion of the Universe. But also—and this is probably less well-known among nonphysicists—it says that pressure has a similar effect. Also, as the Universe expands, the energy density and pressure of radiation drops at a different rate than the energy density of matter.

So, the expansion of the Universe itself, on a very large scale, could be affected by aggressively expanding civilizations!

The fun part is that Jay Olson actually studies this in a quantitative way, making some guesses about the numbers involved. Of course there’s a huge amount of uncertainty in all matters concerning aggressively expanding high-tech civilizations, so he actually considers a wide range of possible numbers. But if we assume a civilization turns a large fraction of matter into radiation, the effects could be significant!

The effect of the extra pressure due to radiation would be to temporarily slow the expansion of the Universe. But the expansion would not be stopped. The radiation will gradually thin out. So eventually, dark energy—which has negative pressure, and does not thin out as the Universe expands—will win. Then the Universe will expand exponentially, as it is already beginning to do now.

(Here I am ignoring speculative theories where dark energy has properties that change dramatically over time.)
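The argument of the last few paragraphs can be condensed into the standard acceleration equation of general relativity for a homogeneous universe. What follows is a sketch in textbook notation, not taken from Olson's papers: a is the scale factor, ρ is mass density (energy density divided by c²), p is pressure, and dark energy is assumed to be a plain cosmological constant.

```latex
\begin{align}
  \frac{\ddot a}{a} &= -\frac{4\pi G}{3}\left(\rho + \frac{3p}{c^{2}}\right), \\
  p_m \approx 0, \qquad p_r &= \tfrac{1}{3}\rho_r c^{2}, \qquad p_\Lambda = -\rho_\Lambda c^{2}, \\
  \rho + \frac{3p}{c^{2}} &= \rho_m + 2\rho_r - 2\rho_\Lambda, \\
  \rho_m \propto a^{-3}, \qquad \rho_r &\propto a^{-4}, \qquad \rho_\Lambda = \text{const}.
\end{align}
```

Radiation counts twice in the braking term because of its pressure. So if a fraction f of the matter is converted into radiation at fixed total energy, the braking term rises from ρ_m to (1 + f)ρ_m at that moment, which is the temporary slowdown described above. Since ρ_r falls off faster than ρ_m while ρ_Λ stays constant, the negative dark energy term eventually dominates and the expansion accelerates.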

Jay Olson’s work

Here are his papers on this subject. The abstracts sketch his results, but you have to look at the papers to see how nice they are. He’s thought quite carefully about these things.

• S. Jay Olson, Homogeneous cosmology with aggressively expanding civilizations, Classical and Quantum Gravity 32 (2015) 215025.

Abstract. In the context of a homogeneous universe, we note that the appearance of aggressively expanding advanced life is geometrically similar to the process of nucleation and bubble growth in a first-order cosmological phase transition. We exploit this similarity to describe the dynamics of life saturating the universe on a cosmic scale, adapting the phase transition model to incorporate probability distributions of expansion and resource consumption strategies. Through a series of numerical solutions spanning several orders of magnitude in the input assumption parameters, the resulting cosmological model is used to address basic questions related to the intergalactic spreading of life, dealing with issues such as timescales, observability, competition between strategies, and first-mover advantage. Finally, we examine physical effects on the universe itself, such as reheating and the backreaction on the evolution of the scale factor, if such life is able to control and convert a significant fraction of the available pressureless matter into radiation. We conclude that the existence of life, if certain advanced technologies are practical, could have a significant influence on the future large-scale evolution of the universe.

• S. Jay Olson, Estimates for the number of visible galaxy-spanning civilizations and the cosmological expansion of life.

Abstract. If advanced civilizations appear in the universe with a desire to expand, the entire universe can become saturated with life on a short timescale, even if such expanders appear but rarely. Our presence in an untouched Milky Way thus constrains the appearance rate of galaxy-spanning Kardashev type III (K3) civilizations, if it is assumed that some fraction of K3 civilizations will continue their expansion at intergalactic distances. We use this constraint to estimate the appearance rate of K3 civilizations for 81 cosmological scenarios by specifying the extent to which humanity could be a statistical outlier. We find that in nearly all plausible scenarios, the distance to the nearest visible K3 is cosmological. In searches where the observable range is limited, we also find that the most likely detections tend to be expanding civilizations who have entered the observable range from farther away. An observation of K3 clusters is thus more likely than isolated K3 galaxies.

• S. Jay Olson, On the visible size and geometry of aggressively expanding civilizations at cosmological distances.

Abstract. If a subset of advanced civilizations in the universe choose to rapidly expand into unoccupied space, these civilizations would have the opportunity to grow to a cosmological scale over the course of billions of years. If such life also makes observable changes to the galaxies they inhabit, then it is possible that vast domains of life-saturated galaxies could be visible from the Earth. Here, we describe the shape and angular size of these domains as viewed from the Earth, and calculate median visible sizes for a variety of scenarios. We also calculate the total fraction of the sky that should be covered by at least one domain. In each of the 27 scenarios we examine, the median angular size of the nearest domain is within an order of magnitude of a percent of the whole celestial sphere. Observing such a domain would likely require an analysis of galaxies on the order of a giga-lightyear from the Earth.

Here are the main assumptions in his first paper:

1. At early times (relative to the appearance of life), the universe is described by the standard cosmology – a benchmark Friedmann-Robertson-Walker (FRW) solution.

2. The limits of technology will allow for self-reproducing spacecraft, sustained relativistic travel over cosmological distances, and an efficient process to convert baryonic matter into radiation.

3. Control of resources in the universe will tend to be dominated by civilizations that adopt a strategy of aggressive expansion (defined as a frontier which expands at a large fraction of the speed of the individual spacecraft involved), rather than those expanding diffusively due to the conventional pressures of population dynamics.

4. The appearance of aggressively expanding life in the universe is a spatially random event and occurs at some specified, model-dependent rate.

5. Aggressive expanders will tend to expand in all directions unless constrained by the presence of other civilizations, will attempt to gain control of as much matter as is locally available for their use, and once established in a region of space, will consume mass as an energy source (converting it to radiation) at some specified, model-dependent rate.
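Assumptions 4 and 5 are what make the analogy with bubble nucleation work. As a toy illustration, and not Olson's actual model, suppose space is static, civilizations appear at a constant rate per unit volume, and each one expands as a sphere at constant speed. Then the fraction of space not yet inside any bubble at time t is exp(−(π/3) Γ v³ t⁴), the classic result for nucleation and growth. The rate and speed below are hypothetical numbers chosen only to produce output; the real papers include cosmic expansion and a range of strategies.

```python
import math


def unfilled_fraction(t, nucleation_rate, speed):
    """Fraction of static space not yet inside any expanding bubble.

    nucleation_rate: appearances per unit volume per unit time (constant)
    speed: expansion speed of each bubble's frontier
    Units must be consistent, e.g. Gly and Gyr with speed as a fraction of c.
    """
    return math.exp(-(math.pi / 3.0) * nucleation_rate * speed ** 3 * t ** 4)


def half_filled_time(nucleation_rate, speed):
    """Time at which half of space has been swallowed by bubbles."""
    return (3.0 * math.log(2.0) / (math.pi * nucleation_rate * speed ** 3)) ** 0.25


RATE = 1e-3   # appearances per Gly^3 per Gyr (hypothetical)
SPEED = 0.5   # frontier speed as a fraction of c (hypothetical)

t_half = half_filled_time(RATE, SPEED)
print(f"half of space is saturated after about {t_half:.1f} Gyr")
for t in (0.5 * t_half, t_half, 2.0 * t_half):
    print(f"t = {t:5.1f} Gyr: unfilled fraction {unfilled_fraction(t, RATE, SPEED):.4f}")
```

Because of the t⁴ in the exponent, saturation is abrupt: space goes from mostly empty to almost entirely occupied within a factor of two or so in time, even when expanders appear only rarely.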

by John Baez at February 05, 2016 01:00 AM

astrobites - astro-ph reader's digest

Living the ‘Magellan’ life : Observing at a Chilean masterpiece



You are driving up a mountain. In the middle of nowhere. In a part of the globe that’s too remote to imagine. On a road that could give way anytime. There’s an alpaca chilling by the hills, looking at you with interest. Well, almost.

You’ve seen textbooks of the Atacama Desert as a kid, but never actually believed that people live here, let alone work for science. I welcome you to Las Campanas, a mountain in one of the driest areas of the world – 2500m above sea level – and home to one of the world’s modern wonders, the Magellan Telescopes.

When I was asked about my aspirations growing up, it was becoming obvious that I loved the cosmos. After all, what could be more exciting than studying the galaxies and attempting to understand the physics of the universe! Unfortunately, in India, where I am from, astrophysicists are still thought of as ‘just stargazers’, and not being an ‘encyclopedia of constellations’ means that you are as far from an astronomer as astrology is from real science. What most people don’t realize is that the idea of actual observing at a telescope (that is classically called, well, ‘classical observing’) is almost a dying art in astrophysics, with the advent of modern day telescopes and algorithmic pipelines that let you observe and reduce data while sitting in the comfort of a coffee shop. You submit a queue of astrophysical objects (and hence the name ‘queue-observing’) that you wish to observe, and voila, thy job will be done! Space telescopes like Hubble and Spitzer, and even modern-day ground based observatories like ALMA, work like this. Hence, it is almost ironic that I root for more astronomers on telescopes today. This article shall tell you why.


The massive 6.5m mirror at the Clay telescope that I used.

The Magellan telescopes are run by the Carnegie Institution of Washington in collaboration with several major universities across the US. At the University of Chicago, I work with the South Pole Telescope collaboration on high-redshift galaxy clusters that were detected by the Sunyaev-Zeldovich effect. To pin down exactly how far away these clusters are and how massive they are, spectroscopic data and, subsequently, Doppler shifts of spectral lines are needed. This is why I left my cozy office chair in Hyde Park for a 20-hour long journey to another cozy chair in the control room of the Clay 6.5m telescope. This majestic structure houses LDSS3, one of the world’s best spectroscopic and imaging instruments in the optical and near-infrared wavelength bands (read Ian Czekala’s post on the same instrument, but with different science goals!). I was co-observing with Mike McDonald, an assistant professor at MIT and a boss at observing and handling data.

With photo opportunities like this, a wide grin is justified.


Everything about the Magellan telescopes is classical, and almost poetic. The first sight of the metallic domes bathed in sunlight gets you pumped up for the nights to come. The staff at the lodge is very friendly, and the cooks make sure that you have the best Chilean food that you can possibly get. You meet other astronomers working on other telescopes or instruments, and you see them recognizing each other from previous runs. It is this ‘inner circle’ that I hope to be a part of someday.


The control room at Baade, Clay’s twin 6.5m telescope. This is where the real action is.

We were allotted the nights of 31st January and 1st February, and were rewarded with the best weather we could possibly hope for. With The Beatles playing in the control room and delicious empanadas ready to be devoured, we started the night by calibrating our spectral slits – taking images of the grism, measuring the ambient light in the dome, and measuring the sky background for that night. We planned to use the 4000 Angstrom break (a super cool tracker to measure Doppler shifts in galaxies), which falls in the red part of the electromagnetic spectrum because of the distance of these clusters.  Once twilight was past, we attacked the first cluster with 3-hour long exposures – yes, it takes that long for us to get a sizeable signal-to-noise ratio for a cluster that far! By the end of two nights, all five cluster targets were spectrally conquered.
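As a rough illustration of why the break lands in the red: a spectral feature emitted at a given rest wavelength is observed stretched by a factor of (1 + z). The sketch below assumes a redshift of 1.2 (the value mentioned further down for one of these clusters) and a rest wavelength of 4000 Angstroms; the function name is only illustrative and is not part of any observing pipeline.

```python
def observed_wavelength(rest_angstrom: float, z: float) -> float:
    """Wavelength at which a rest-frame feature is observed at redshift z."""
    return rest_angstrom * (1.0 + z)


# Assumed values: 4000 Angstrom rest-frame break, cluster at redshift 1.2
break_obs = observed_wavelength(4000.0, 1.2)
print(f"Observed break near {break_obs:.0f} Angstrom")
```

Under these assumptions the break shifts from the blue-violet edge of the visible range to the far red end, close to the near-infrared.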

What did I learn from this past week? That it is computationally challenging to detect faint emission lines from a redshift 1.2 galaxy cluster hidden behind the glowing sky background. That it is important to consider all potential sources of error while dealing with data reduction of clusters. But what did I learn that I wouldn’t have sitting in my cozy office chair in Chicago?

Magellan against the backdrop of the Milky Way. The word 'surreal' comes to mind.


That there is something amazingly profound about looking up in the sky and seeing the Milky Way with its stellar population and diffuse gas structures. The Magellanic Clouds give you good company, and the alignment of four planets in the sky (go figure!) adds the right touch to my story – that Mike and I upheld the spirit of astronomy and astrophysics by being there and looking at those clusters with our own eyes, while they were being studied on the grand canvas of the universe. We utilized our intuition, not an algorithm, to determine how long is good enough to see a cluster, or whether our time was better spent studying another cluster. I believe that part of what makes us astronomers is this very intuition. It is in the same spirit of the field that we should consider what we miss by moving to an automated version of astrophysics.

If and when astronomy moves in that direction, I, for one, will miss the empanadas, the alpacas and this great opportunity to take some clicks of my own. It is beautiful up there.

by Gourav Khullar at February 05, 2016 12:45 AM

February 04, 2016

Symmetrybreaking - Fermilab/SLAC

Weighing the lightest particle

Physicists are using one of the oldest laws of nature to find the mass of the elusive neutrino.

Neutrinos are everywhere. Every second, 100 trillion of them pass through your body unnoticed, hardly ever interacting. Though exceedingly abundant, they are the lightest particles of matter, and physicists around the world are attempting the difficult challenge of measuring their mass.   

For a long time, physicists thought neutrinos were massless. This belief was overturned by the discovery that neutrinos oscillate between three flavors: electron, muon and tau. This happens because each flavor contains a mixture of three mass types, neutrino-1, neutrino-2 and neutrino-3, which travel at slightly different speeds.

According to the measurements taken so far, neutrinos must weigh less than 2 electronvolts (a minute fraction of the mass of the tiny electron, which weighs 511,000 electronvolts). A new generation of experiments is attempting to lower this limit—and possibly even identify the actual mass of this elusive particle.

Where did the energy go?

Neutrinos were first proposed by the Austrian-born theoretical physicist Wolfgang Pauli to resolve a problem with beta decay. In the process of beta decay, a neutron in an unstable nucleus transforms into a proton while emitting an electron. Something about this process was especially puzzling to scientists. During the decay, some energy seemed to go missing, breaking the well-established law of energy conservation.

Pauli suggested that the disappearing energy was slipping away in the form of another particle. This particle was later dubbed the neutrino, or “little neutral one,” by the Italian physicist Enrico Fermi.

Scientists are now applying the principle of energy conservation to direct neutrino mass experiments. By very precisely measuring the energy of electrons released during the decay of unstable atoms, physicists can deduce the mass of neutrinos.

“The heavier the neutrino is, the less energy is left over to be carried by the electron,” says Boris Kayser, a theoretical physicist at Fermilab. “So there is a maximum energy that an electron can have when a neutrino is emitted.”
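Kayser's statement can be written as a simple energy balance. This is a simplified sketch, not a formula from the article: it neglects the recoil of the daughter nucleus, and the symbols Q (total energy released in the decay) and m_ν (neutrino mass) are introduced here for illustration.

```latex
% Energy released in the decay is shared by the electron and the neutrino:
Q = E_e + E_\nu , \qquad E_\nu \ge m_\nu c^2
% so the electron's kinetic energy can never exceed
E_e^{\max} = Q - m_\nu c^2
```

A massless neutrino would let the electron spectrum reach all the way to Q; a massive one cuts it off slightly earlier, and the experiments look for that small shortfall at the very end of the spectrum.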

These experiments are considered direct because they rely on fewer assumptions than other neutrino mass investigations. For example, physicists measure mass indirectly by observing neutrinos’ imprints on other visible things such as galaxy clustering.

Detecting the kinks

Of the direct neutrino mass experiments, KATRIN, which is based at the Karlsruhe Institute of Technology in Germany, is the closest to beginning its search.

“If everything works as planned, I think we'll have very beautiful results in 2017,” says Guido Drexlin, a physicist at KIT and co-spokesperson for KATRIN.

Cleanliness is key inside the main spectrometer.

The KATRIN collaboration

KATRIN plans to measure the energy of the electrons released from the decay of the radioactive isotope tritium. It will do so by using a giant tank tuned to a precise voltage that allows only electrons above a specific energy to pass through to the detector at the other side. Physicists can use this information to plot the rate of decays at any given energy.

The mass of a neutrino will cause a disturbance in the shape of this graph. Each neutrino mass type should create its own kink. KATRIN, with a peak sensitivity of 0.2 electronvolts (a factor of 10 better than the current limit of 2 electronvolts), will look for a “broad kink” that physicists can use to calculate average neutrino mass.

Another tritium experiment, Project 8, is attempting a completely different method to measure neutrino mass. The experimenters plan to detect the energy of each individual electron ejected from a beta decay by measuring the frequency of its spiraling motion in a magnetic field. Though still in the early stages, it has the potential to go beyond KATRIN’s sensitivity, giving physicists high hopes for its future.
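To see how a frequency can stand in for an energy, here is a minimal sketch using the standard relativistic cyclotron formula, f = eB / (2π γ m_e). The 1 tesla field and the electron energies near 18.6 keV (roughly where the tritium spectrum ends) are assumptions chosen for illustration; they are not taken from the article or from Project 8's actual setup.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge, coulombs
M_E_KG = 9.1093837015e-31    # electron mass, kg
M_E_EV = 510998.95           # electron rest energy, eV


def cyclotron_frequency_hz(kinetic_ev: float, b_tesla: float) -> float:
    """Frequency of an electron's spiral in a uniform magnetic field."""
    gamma = 1.0 + kinetic_ev / M_E_EV  # relativistic factor from kinetic energy
    return E_CHARGE * b_tesla / (2.0 * math.pi * gamma * M_E_KG)


# Assumed values: 1 T field, electrons at 18.6 keV and 18.0 keV
f_endpoint = cyclotron_frequency_hz(18_600.0, 1.0)
f_lower = cyclotron_frequency_hz(18_000.0, 1.0)
print(f"{f_endpoint / 1e9:.3f} GHz at 18.6 keV")
print(f"{f_lower / 1e9:.3f} GHz at 18.0 keV")
```

In this sketch the frequency comes out at about 27 GHz, and the more energetic electron circles slightly more slowly, so a precise frequency measurement translates into a precise energy measurement.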

“KATRIN is the furthest along—it will come out with guns blazing,” says Joseph Formaggio, a physicist at MIT and Project 8 co-spokesperson. “But if they see a signal, the first thing people are going to want to know is whether the kink they see is real. And we can come in and do another experiment with a completely different method.”

Cold capture

Others are looking for these telltale kinks using a completely different element, holmium, which decays through a process called electron capture. In these events, an electron in an unstable atom combines with a proton, turning it into a neutron while releasing a neutrino.

Physicists are measuring the very small amount of energy released in this decay by enclosing the holmium source in microscopic detectors that are operated at very low temperatures (typically below minus 459.2 degrees Fahrenheit). Each holmium decay leads to a tiny increase of the detector’s temperature (about 1/1000 degrees Fahrenheit).
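For readers who think in kelvin, the quoted Fahrenheit figures can be converted with the standard formulas. This is only a unit conversion of the numbers given above; the function names are illustrative.

```python
def fahrenheit_to_kelvin(temp_f: float) -> float:
    """Convert an absolute temperature from degrees Fahrenheit to kelvin."""
    return (temp_f + 459.67) * 5.0 / 9.0


def fahrenheit_step_to_kelvin(delta_f: float) -> float:
    """Convert a temperature difference from degrees Fahrenheit to kelvin."""
    return delta_f * 5.0 / 9.0


operating_k = fahrenheit_to_kelvin(-459.2)       # quoted operating temperature
rise_k = fahrenheit_step_to_kelvin(1.0 / 1000)   # quoted rise per decay
print(f"Operating temperature: about {operating_k:.2f} K")
print(f"Rise per decay: about {rise_k * 1000:.2f} mK")
```

So the detectors sit within a fraction of a kelvin of absolute zero, and each decay warms them by roughly half a millikelvin.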

“To lower the limit on the electron neutrino mass, you need a good thermometer that can measure these very small changes of temperature with high precision,” says Loredana Gastaldo, a Heidelberg University physicist and spokesperson for the ECHo experiment.  

There are currently three holmium experiments, ECHo and HOLMES in Europe and NuMECS in the US, which are in various stages of testing their detectors and producing isotopes of holmium.

The holmium and tritium experiments will help lower the limit on how heavy neutrinos can be, but it may be that none will be able to definitively determine their mass. It will likely require a combination of both direct and indirect neutrino mass experiments to provide scientists with the answers they seek—or, physicists might even find completely unexpected results.

“Don't bet on neutrinos,” Formaggio says. “They’re kind of unpredictable.”

by Diana Kwon at February 04, 2016 04:40 PM

Quantum Diaries

Frenzy on the theory side

Since December 15, I have counted 200 new theoretical papers, each offering one or more possible explanations for the nature of a new particle that has not yet been discovered. This frenzy began when the CMS and ATLAS experiments both reported finding a few events that could reveal the presence of a new particle decaying into two photons. Its mass would be around 750 GeV, five times that of the Higgs boson.

No one knows whether such enthusiasm is justified, but it illustrates how much physicists are hoping for a major discovery in the coming years. Will it go the way it did for the Higgs boson, which was officially discovered in July 2012 although a few early signs had appeared a year before? It is still far too early to say. And as I wrote in July 2011, it is as if we were trying to guess whether the train is coming by scanning the horizon on a dreary winter day. Only a little patience will tell us whether the indistinct shape barely visible in the distance is the long-awaited train or just an illusion. More data will be needed to settle the matter, but in the meantime everyone is keeping their eyes fixed on that spot.

Le train de midi (The Noon Train), Jean-Paul Lemieux, National Gallery of Canada

Because of the difficulties inherent in restarting the LHC at higher energy, the amount of data collected at 13 TeV in 2015 by ATLAS and CMS was very limited. Such small data samples are always subject to large statistical fluctuations, and the observed effect could well evaporate with more data. That is why both experiments were so reserved when presenting these results, stating clearly that it was far too early to jump for joy.

But theorists, who have been searching in vain for decades for any sign of new phenomena, jumped at the chance. In a single month, including the year-end holiday period, 170 theoretical papers had already been published suggesting as many different possible interpretations for this new particle, even though it has not yet been discovered.

No new data will arrive for a few months because of the annual maintenance. The Large Hadron Collider will restart on March 21 and should deliver the first collisions to the experiments on April 18. A data sample of 30 fb-1 is hoped for in 2016, whereas only 4 fb-1 were produced in 2015. When these new data become available this summer, we will know whether this new particle exists or not.

Such a possibility would be a true revolution. The current theoretical model of particle physics, the Standard Model, predicts no such particle. All the particles predicted by the model have already been found. But since this model still leaves several questions unanswered, theorists are convinced that a broader theory must exist to explain the few anomalies observed. The discovery of a new particle, or the measurement of a value different from the one predicted by theory, would at last reveal the nature of this new physics beyond the Standard Model.

No one yet knows what form this new physics will take. That is why so many different theoretical explanations for this new particle have been proposed. I have compiled some of them in the table below. Several of these papers simply describe the properties a new boson would need in order to reproduce the observed data. The proposed solutions are incredibly diverse, the most recurrent being various versions of dark matter or supersymmetric models, Hidden Valley, Grand Unified Theory, extra or composite Higgs bosons, or hidden dimensions. There is something for every taste: from axizillas to dilatons, by way of dark pion cousins, technipions and trinification.

So the situation could not be clearer: anything is possible, including nothing at all. But let us not forget that every time an accelerator has gone up in energy, new discoveries have followed. The summer could therefore be very hot.

Pauline Gagnon

To learn more about particle physics and what is at stake at the LHC, see my book: « Qu’est-ce que le boson de Higgs mange en hiver et autres détails essentiels ».

To be notified when new blog posts appear, follow me on Twitter: @GagnonPauline or by e-mail by adding your name to this mailing list.


A partial summary of the number of papers published so far and the types of solutions proposed to explain the nature of the new particle, if there is a new particle. Practically every known theoretical model can be adapted to accommodate a new particle compatible with the few events observed. This table is only indicative and in no way strictly exact, since several papers were rather hard to classify. Will one of these ideas turn out to be right?

by Pauline Gagnon at February 04, 2016 04:21 PM

Quantum Diaries

Frenzy among theorists

Since December 15, I have counted 200 new theoretical papers, each one suggesting one or several possible explanations for a new particle not yet discovered. This flurry of activity started when the CMS and ATLAS Collaborations both reported having found a few events that could possibly reveal the presence of a new particle decaying to two photons. Its mass would be around 750 GeV, that is, five times the mass of the Higgs boson.

No one knows yet if all this excitement is warranted but it clearly illustrates how much physicists are hoping for a huge discovery in the coming years. Will it be like with the Higgs boson, which was officially discovered in July 2012 but had already given some faint signs of its presence a year earlier? Right now, there is not enough data. And just as I wrote in July 2011, it is as if we were trying to guess if the train is coming by looking in the far distance on a grey winter day. Only time will tell if the indistinct shape barely visible above the horizon is the long awaited train or just an illusion. But until more data become available, everybody will keep their eyes on that spot.


The noon train, Jean-Paul Lemieux, National Gallery of Canada

Due to the difficulties inherent to the restart of the LHC at higher energy, the amount of data collected at 13 TeV in 2015 by ATLAS and CMS was very limited. Given that small data samples are always prone to large statistical fluctuations, the experimentalists exerted much caution when they presented these results, clearly stating that any claim was premature.

But theorists, who have been craving signs of something new for decades, jumped on it. Within a single month, including the end-of-the-year holiday period, 170 theoretical papers were published to suggest just as many possible different interpretations for this yet undiscovered new particle.

No new data will come for a few more months due to annual maintenance. The Large Hadron Collider is due to restart on March 21 and should deliver the first collisions to the experiments around April 18. The hope is to collect a data sample of 30 fb-1 in 2016, compared with about 4 fb-1 in 2015. Later this summer, when more data are available, we will know if this new particle exists or not.
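A back-of-the-envelope way to see why the larger sample matters: in a simple counting experiment where signal and background both grow in proportion to the amount of data, the statistical significance grows roughly as the square root of the sample size. The sketch below applies that rule of thumb to the two luminosity figures quoted above. It is a naive estimate under that stated assumption, not how ATLAS or CMS evaluate their results, and the function name is illustrative.

```python
import math


def significance_gain(lumi_old_fb: float, lumi_new_fb: float) -> float:
    """Growth factor of S/sqrt(B) when the dataset grows from old to new size.

    Assumes signal and background yields both scale linearly with luminosity.
    """
    return math.sqrt(lumi_new_fb / lumi_old_fb)


# 2016 sample alone compared with the 2015 sample
gain_2016_only = significance_gain(4.0, 30.0)
# 2015 and 2016 samples combined compared with 2015 alone
gain_combined = significance_gain(4.0, 4.0 + 30.0)
print(f"2016 alone: x{gain_2016_only:.2f}, combined: x{gain_combined:.2f}")
```

Under this rule of thumb, a real effect would stand out almost three times more clearly with the 2016 data, while a statistical fluctuation would tend to fade.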

This possibility is however extremely exciting since the Standard Model of particle physics is now complete. All expected particles have been found. But since this model leaves many open questions, theorists are convinced that there ought to be a more encompassing theory. Hence, discovering a new particle or measuring anything with a value different from its predicted value would reveal at long last what the new physics beyond the Standard Model could be.

No one knows yet what form this new physics will take. This is why so many different theoretical explanations have been proposed for this possible new particle. I have compiled some of them in the table below. Many of these papers described the properties needed by a new boson to fit the actual data. The solutions proposed are incredibly diversified, the most recurrent ones being various versions of dark matter or supersymmetric models, new gauge symmetries, Hidden Valley, Grand Unified Theory, extra or composite Higgs bosons and extra dimensions. There is enough to suit every taste: axizillas, dilatons, dark pion cousins of a G-parity odd WIMP, one-family walking technipion or trinification.

It is therefore crystal clear: it could be anything or nothing at all… But every time accelerators have gone up in energy, new discoveries have been made. So we could be in for a hot summer.

Pauline Gagnon

To learn more about particle physics, don’t miss my book, which will come out in English in July.

To be alerted of new postings, follow me on Twitter: @GagnonPauline  or sign-up on this mailing list to receive an e-mail notification.


A partial summary of the number of papers published so far with the type of solutions they proposed to explain the nature of the new particle, if there is a new particle. Just about all known theoretical models can be adapted to produce a new particle with characteristics compatible with the few events observed. This is just indicative and by no means strictly exact, since many proposals were rather hard to categorize. Will one of these ideas be the right one?

by Pauline Gagnon at February 04, 2016 04:13 PM

Lubos Motl - string vacua and pheno

Why string theory, by Joseph Conlon
I have received a free copy of "Why String Theory" by Joseph Conlon, a young Oxford string theorist who has done successful specialized work related either to the moduli stabilization of the flux vacua, or to the axions in string theory. (He's been behind the website, too.)

The 250-page-long paperback looks modern and tries to be more technical than popular books but less technical than string theory textbooks. Unfortunately, I often feel that "more technical than a popular book" mostly means that the book uses some kind of an intellectual jargon – but the nontrivial physics ideas aren't actually described more accurately than in the popular books.

From the beginning, one may see that the book differs from the typical books that are intensely focusing on the search for a theory of everything. Well, the dedication as well as the introduction to each chapter at the beginning of the book (and others) sort of shocked me.

The dedication remains the biggest shock for me: the book is dedicated to the U.K. taxpayers.

It's not just the dedication, however. In the preface, Conlon explains that he wants the "wonderful fellow citizens who support scientific research through their taxes" (no kidding!) to be the readers. He is very grateful for the money.

The preface has only reinforced my feeling that he is "in it for the money". And the theme has continued to reappear in the following chapters, too. It became a distraction I couldn't get rid of. In at least two sections, he mentions that the financial resources going to string theory are much smaller than those in the medical research and the latter funds are still a tiny portion of the budgets.

Great. But why would you repeat this thing twice in a book that is supposed to be about physics? The money going to pure science is modest because most taxpayers are simply not interested in pure science at all. They are interested in practical things. A minority of the people is interested in our pure knowledge of Nature and those would pay a much higher percentage of the budgets to string theory, too. The actual amounts (perhaps a billion dollars in the U.S. every year?) are a compromise of a sort.

The idea that all taxpayers will be interested in such a book is silly (almost equivalently, it's silly to think that someone will read the book because he is a taxpayer whose money is partly spent for the research; most people don't read books about cheaper ways to hire janitors although this decides about billions of dollars a year, too) and it's hard for me to get rid of the feeling that Conlon's formulations are shaped by the gratitude to the taxpayers for the money – so he's sort of bribed which is incompatible with the scientific integrity. You may imagine that a sensitive reader such as myself reads the text and sees the impact of the "bribes" on various formulations (for example, Conlon's outrageous lie that all string critics are basically honest people is probably shaped by the financial considerations – because many of the string critics are taxpayers) but quite suddenly, the book counts the string theorists by the number of mortgages that people have because of some work that is linked to string theory. Is that serious? And does a majority of string theorists have a mortgage? Whether it's right or not, why should such things matter?

The obsession with the financial aspects of Conlon's job has distracted me way too often. It's totally OK when some people are considering string theory research to be just another job – but it is just not too interesting to read about it. We don't read books about the dependence of other occupations on wages, either. And for a person who is interested in physics sufficiently to buy the book, the money circulating in string theory research is surely a negligible part of the story.

And this financial theme kept on penetrating way too many things. The first regular chapter, "The Long Wait", starts in June 1973. I honestly wouldn't know what event deserving to start the book occurred in June 1973. It turned out that it was the date when the papers on QCD were submitted and in the book, that event is "special" from today's viewpoint because these were the most recent theoretical physics papers, as of today, to be awarded the Nobel prize. It seems technically true – Veltman and 't Hooft did their Nobel-prize-winning work in 1971, Kobayashi and Maskawa earlier in 1973, and so on.

But is this factoid important enough to be given the full first chapter of a book on string theory? I don't think so. The fact that no Nobel prizes came to theoretical physicists for their more recent discoveries isn't really important – except for those who are only doing physics because of the money, of course. But even when it comes to the money, numerous people (especially around string theory and inflation) got greater prizes for much newer insights. There are various reasons why the Nobel prizes aren't being given to theoretical physicists for more recent discoveries but these reasons don't imply that breathtakingly important discoveries haven't been made. This focus on June 1973 is just a totally flawed way to think about the importance in theoretical physics – an unfortunate way to start a semipopular book on theoretical physics.

I knew that the following chapter was about scales in physics which is why I was like "WTF" when I saw the first words of that chapter: "As-Salaam-Alaikum". What? ;-) This Arabic greeting means "peace be upon you". What does it have to do with scales in physics? Even when you add the following exchange from the desert that Conlon added, "where are you coming from and where are you going?", this exchange has still nothing to do with scales in physics. At most, the exchange describes a world line in an Arab region of the spacetime. But it has nothing to do with the renormalization group. Perhaps both situations involve diagrams with oriented lines – but that's too small an amount of common ancestry.

Again, one can't avoid thinking: this awkward beginning was probably a not-so-hidden message to the Muslim British taxpayers. Sorry, I have a problem with that. And I think that so do the Muslim Britons who actually care about physics. And no British Muslim will buy a book about string theory because it contains an Arabic greeting so this kind of bootlicking is ineffective, anyway. The bulk of the chapter dedicates many pages to describing the size of many objects. I think that what makes it boring is that Conlon doesn't seem to communicate any deeper and nontrivial – or, almost equivalently, a priori controversial – ideas (something that books like Wilczek's book on beauty are full of). It seems to me that the book is addressed to some moderately intelligent people with superficial ideas about physics and it encourages them to think that they're not really missing anything important. The logic of the renormalization group, "integrating out", or its relationships with reductionism etc. aren't really discussed.

The following, third chapter wants to cover the pillars of 20th century and pre-stringy physics. It starts by talking about special relativity. Conlon argues that the words "In the beginning..." in the Bible (as well as the whole subject of history etc.) contradict relativity. Sorry, there isn't any contradiction like that. Even in relativity, one may sort events chronologically. Different observers may do so differently but it's still possible. And in the history of events on the Earth, the spatial distances are so short relative to the time intervals multiplied by \(c\) (and the reasonable velocities to consider are so much smaller than the speed of light) that all the observers' choices of the coordinates end up being basically equivalent, anyway. So the reference to the Bible has nothing to do with special relativity, just like the Arabic greeting that had started the previous chapter has nothing to do with scales. Perhaps it was a message to the Christian taxpayers. Or the violent atheist taxpayers – because the comment about the Bible was a negative one.

Now, a page is dedicated to special relativity and less than a page to foundations of general relativity. It's really too little and nothing is really explained there. Moreover, general relativity is framed as a "replacement" of special relativity. That's not correct. Einstein would describe it as a generalization, not replacement, of special relativity (look at the name), relevant for situations in which gravity matters. In the modern setup, we view general relativity as the unique framework that results from the combination of special relativity and spin-two massless fields (which are needed to incorporate gravity). In this sense, general relativity is an application of special relativity – and in some sense a subset of special relativity.

Quantum mechanics is given several pages and Conlon says that it absolutely works which is good news. But aside from a few sentences about the quantum entanglement, the pages are mostly spent with repeating that quantum mechanics is needed for chemistry. There are several more sections about the pre-stringy pillars of physics – some cosmology, something about symmetries.

The fourth chapter wants to argue that something beyond the Standard Model and GR is needed. So it's the chapter mentioning the non-renormalizability of gravity etc. Some important points are made, including the point that quantum mechanics must hold universally (Conlon surely is pro-QM). But I can't see what kind of readers (with what background) will understand the explanations at this level. The explanations vaguely depend on some quasi-expert's jargon but they don't say enough for you to reconstruct any actual arguments. I've done lots of this semi-expert writing and it seems absolutely obvious to me that you need to extend the semi-technical explanations at least by an order of magnitude relatively to Conlon's short summaries to actually convey some helpful, verifiable, usable, nontrivial ideas.

If I try to characterize the people who are waiting for this genre that is linguistically heavy but lacking the actual arguments, I think it's right to say that they're "intellectuals who are ready to parrot sentences, even complicated sentences with the jargon similar to the experts' jargon, to defend their intellectual credentials (i.e. impress other people with intelligently sounding sentences)" but who don't really understand anything properly. And I think it's not right to increase the number of such people.

Thankfully, things get better from the fifth chapter that begins with string theory proper. The first event is Veneziano's work in 1968. Conlon describes about 10 non-physics events in the year 1968. It's not clear to me why there are so many events like that. But in such a long list, I think it is crazy not to mention the Prague Spring in Czechoslovakia (and the student riots in Paris should be featured more prominently, too). It ended on August 21st, 1968 when 750,000 Warsaw Pact troops occupied my country. To say the least, it was the largest military operation since the war which, I believe, is more important than the cancellation of the last steam engines on British railways etc.

The world sheet duality that was deduced from Veneziano's formula hides some cool mathematics but the book unfortunately avoids equations so none of this content is communicated. The beauty and power of all these things may only be understood along with the mathematical relationships etc. which is why I am afraid that this "more detailed" but purely verbal story about the discoveries doesn't bring much to a thinking reader. It's like a book praising a beautiful painting – which doesn't actually show you the painting.

There are various stories, e.g. about Claude Lovelace who realized that bosonic string theory requires 26 dimensions. Lovelace has never completed his PhD but he was hired as a professor at Rutgers, anyway – I was meeting him for many years when I was a PhD student (he died in 2012). A quote in Conlon's book suggests that Lovelace's promotion was insufficient relatively to his contributions. I wouldn't agree with that. At Rutgers, I also knew Joel Shapiro, another early father of string theory. He's a fun guy – he taught group theory to us. A very good course. At a colloquium I gave later, he suggested that the term "first string revolution" should indeed be used for the 1968-1973 era, as Conlon indicates. Whatever is the right name (I called it the zeroth string revolution), it wasn't a superstring revolution because there was no supersymmetry yet!

What seems problematic to me is that the exact chronology of the historical events became the heart of Conlon's prose. But string theory isn't a collection of random historical events, like the Second World War. It's primarily a unifying theory of everything that we don't understand perfectly but the current incomplete understanding is much more accurate and makes much more sense than what people knew in the late 1960s, or in the following 4 decades. A book that is really about physics just can't put all historical events on the same level. The history was just about the "Columbus' journey to the New World" but it's the New World itself, and not details of the journey, that should be the point of a book "Why the New World".

Various discoveries and dualities etc. are mentioned in one or two sentences per discovery. I think it's just too little information about each of them. It may be OK for people who read dozens of redundant books a year and who don't feel the urge to think about every idea that is being pumped into them. For certain reasons, I think that it's counterproductive when people learn too many facts about string theory (or another theory) without really understanding their relationship and inevitability. If they learn many things, they must feel that string theorists are just inventing random garbage. It feels like science could live equally well without those things (if the events were replaced by totally different events with a different outcome) – just like mankind could have survived without the Second World War. But it isn't the case. They're deducing something that can't be otherwise – a point you may only verify if you actually know the technology. Well, at some moment, you may start to trust claims about exact dualities etc. by certain authors. But you must see strong evidence for, or a derivation of, at least one such amazing result (or several) to see that it's not just a pile of fairy-tales.

While e.g. Richard Dawid exaggerates (to say the least) the changes in thinking during the string theory era, he does correctly capture the importance of uniqueness (only game in town) and unexpected explanatory interconnections for the string theorists' focus on string theory. Conlon, while a string theorist, seems to completely overlook if not explicitly reject these facts and principles. But they're essential for the understanding where theoretical physicists will look for new insights in the future and how they use the accumulated knowledge to find the new one. So the vision or motivation for the future is therefore basically absent in Conlon's book, too.

Another chapter is about AdS/CFT and the landscape. AdS/CFT was revealed in 1997 – and just like for 1968, Conlon lists many events in 1997. Tony Blair won some elections in a landslide. Holy cow. Every year, there are hundreds of elections in the world and someone wins them. Even the elections in the U.K. are rather frequent. Moreover, I don't understand the logic by which a book like this one should be preferably read by the Britons only. The scientific curiosity is transnational. The book may be "dedicated" to U.K. taxpayers but if it's about science, then it must be equally interesting for the Canadian and other taxpayers, right?

But there are more serious bugs with the content, I think. We're told that after AdS/CFT, almost no one would view string theory primarily as a unifying fundamental theory of Nature. Sorry but that's rubbish. Virtually all top string theorists do. The fact that there are lots of articles that use AdS/CFT methods outside "fundamental physics" doesn't imply that the links of string theory with fundamental physics have been weakened. You may find millions of T-shirts with \(E=mc^2\) which doesn't mean that it's the most important insight made by Einstein.

Similarly, it's wrong to say that the AdS/CFT made string theory "less special". The AdS/CFT correspondence has found a new powerful equivalence between string theory and quantum field theory – but the two sides operate in different spacetimes or world volumes. This holographic duality has made string theory more "inseparable" from the established physics and it became less conceivable that string theory could be "cut" away from physics again – it's because dynamics of string theory inevitably emerges if you study the important established theories, quantum field theories, carefully enough (especially in certain limits).

But if you use a consistent description, it's true (just like it was true before AdS/CFT was found) that in any spacetime where you can see the effects of quantum gravity, you may also see something like strings or M2-branes and the extra dimensions (for the total to be 10 or 11) and other things that come with them. AdS/CFT doesn't allow you to circumvent this fact in any way. It only gives you a new description of this physics of strings or membranes in terms of a theory on a different space, the boundary of the AdS space – a new QFT-based tool to directly prove that there are strings, branes, and various other stringy effects in the bulk. This theory on the boundary happens to be a quantum field theory. But the importance of QFTs in string theory wasn't new, either. Perturbative string theory was always described in terms of a QFT, namely the two-dimensional conformal field theory. The essential point is that this 2D CFT lives on a different space, the world sheet, than the actual spacetime where we observe the gravity. Aside from the world sheet CFT, AdS/CFT has also told us to use the boundary CFT – another QFT-style way to describe stringy physics. But the physics of quantum gravity in the same spacetime as the spacetime of quantum gravity is as stringy as it was before. AdS/CFT has allowed us to explicitly construct many phenomenologically unrealistic sets of equations for quantum gravity (by constructing some boundary CFTs) but it hasn't made the problem of combining particular non-gravitational matter contents with quantum gravity less constraining. An ordinary generic QFT used as a boundary CFT produces a "heavily curved gravitating AdS spacetime" and those may have become "easy" in some way. But the actual, low-curvature theories of quantum gravity are as rare as before.

At least, I found Conlon's discussion of the landscape OK. The large number of solutions is neither new nor a problem. The anthropic principle is non-vacuous but it may easily degenerate into explanations that may look sufficient to someone but that are demonstrably not the right ones.

In another chapter, Conlon starts to talk about the "problem of strong coupling". I am afraid that the basic idea that "something is easy at weak coupling, hard at strong coupling" etc. is very easy, much like the usage of the buzzword "nonperturbative". But people who don't really understand and who misinterpret what "easy" and "hard" and "nonperturbative" mean will do so after reading these pages by Conlon, too. Conlon continues with the discussion of the high number of citations of AdS/CFT and reasons why it's exact and correct. An exact agreement about a complicated polynomial-in-zeta-function formula for the dimension of the Konishi operator for many colors makes a lesson clear.

Many pages talk about the application of AdS/CFT correspondence to heavy ion physics; the next section similarly talks about AdS/condensed matter physics. There are many true facts and factoids there. I disagree with Conlon's conclusion in the heavy ion section that adding corrections on top of simplified models is the universal "modus operandi" of science. He uses this thesis to explain that the "exact AdS theory" of the heavy ion physics has to be supplemented with corrections for it to work. That's true and it's normal in much of physics but 1) it is always preferred in physics when adjustments don't have to be added, and 2) it is not how string theory in the strict sense works. String theory does not allow one to add any continuous corrections to its physics, ever. Everything is completely determined by discrete data (identifying the vacuum solution) and the fact that the adjustments are possible in AdS/heavy ion physics shows that those methods are just string-inspired, not examples of full-fledged string theory.

The next chapter talks about the interactions between physics and mathematics. It starts with the pride of physicists. Physics is the deepest science and physicists are Sheldon. No one else can match them – perhaps with the exception of mathematicians. Some insights and facts about mathematics are picked (perhaps a bit randomly) but the main point to be discussed is the flow of ideas in between mathematics and physics.

Monstrous moonshine and mirror symmetry are discussed as the two big examples of string theory's importance in mathematics (excellent topics except that one can't see the beauty without the mathematical "details") while the next section argues against "cults" and chooses Feynman and Witten as the two "cults" that should be avoided. (I think that Witten's cult is basically non-existent, at least outside 3-4 buildings in the world, and I would say "unfortunately".) Progress since the 1980s wouldn't have taken place if everyone were like Feynman; or everyone were like Witten, Conlon says. I actually disagree with both statements. Diversity is way too overrated here. If you had 1,000 Feynmen in physics in the 1980s, I am pretty sure that they would have found the things in the first superstring revolution, too, aside from many other discoveries. One can approach all these things in Feynman's intuitive way. And Conlon overstates how "just intuitive" Feynman papers were. He could have made discoveries with easier formulae because he was normally making fundamental discoveries. But he was the first guy who systematically calculated the Feynman diagrams – from the path integrals to the Feynman parameterization, \(bc\) ghosts that Feynman de facto invented, and beyond. This is in no way "just heuristic/intuitive science".

The disadvantage of Feynman in the real world was that there was only one. Things are even clearer with Witten. I don't agree with Conlon that Witten is only good at things that are "at the intersection of mathematics and physics". Witten has done lots of phenomenology, too, including things like cosmic strings, SUSY breaking patterns, detailed calculations on \(G_2\) holonomy manifolds. 100 Wittens and no one else since 1968 would have been enough to find basically everything we know today. People are different and may have different strengths but that doesn't mean that most of these idiosyncrasies are irreplaceable. It can take more effort for someone to find something – than it takes someone else – but science ultimately works for everyone who is sufficiently intelligent and hard-working. To say otherwise means to believe that science depends on some magic unreproducible skills.

Chapter 10 is meant to focus on Conlon's characteristic research topics – stabilization of the moduli in compactifications and axions. You may imagine that one needs to know quite a lot to follow what e.g. his papers could have contributed. I think it's basically impossible to convey the information in a semipopular book but he tries. The following Chapter 11 is about quantum gravity in string theory – Strominger and Vafa etc. It doesn't get to recent, post-2009 advances, as far as I can see.

Another chapter argues that all styles of doing physics – revolutionaries and hard workers etc. etc. – are important. It may sound OK but in reality, it's not really possible to classify most physicists by their styles into these boxes at all. Whether someone makes a revolution is ultimately not about his styles and emotions, anyway. And as I said, good physicists may "emulate" what others are doing, despite their having different methods and styles.

The following chapter does a pretty good job in replying to some common criticisms of string theory. Then there is another chapter where it's discussed e.g. why loop quantum gravity has remained unsuccessful. I don't think that Conlon describes the status of that proposed theory accurately.

There's a lot of facts and ideas to be found in this book and I obviously agree with a large portion of it. But because of the combination of the "difficult language" and "shortage of actual explanations with the beef", the target audience isn't clear to me, the text seems to be driven by financial and career-wise considerations at too many places (and many of us find these sociological etc. things to be too distracting), and it doesn't go into the sufficient depth for the reader to actually understand that string theory isn't a conglomerate of randomly invented ideas that people are adding arbitrarily (even though Conlon knows very well and explicitly writes that string theory cannot be described in this way). It is not really a book that explains something hard enough (for the layman or the non-expert scientist) and I think that Conlon isn't really an "explainer" in this sense. And I even think that the book reinforces some misconceptions spread by some critics of string theory (e.g. about the impact of AdS/CFT on the status of string theory as a TOE).

You may want to buy the book anyway, to see that it's perhaps not as bad as this text makes it sound.

by Luboš Motl at February 04, 2016 02:44 PM

Peter Coles - In the Dark

Measuring the lack of impact of journal papers

I’ve been involved in a depressing discussion on the Astronomers facebook page, part of which was about the widespread use of Journal Impact factors by appointments panels, grant agencies, promotion committees, and so on. It is argued (by some) that younger researchers should be discouraged from publishing in, e.g., the Open Journal of Astrophysics, because it doesn’t have an impact factor and they would therefore be jeopardising their research career. In fact it takes two years for a new journal to acquire an impact factor so if you take this advice seriously nobody should ever publish in any new journal.

For the record, I will state that no promotion committee, grant panel or appointment process I’ve ever been involved in has even mentioned impact factors. However, it appears that some do, despite the fact that they are demonstrably worse than useless at measuring the quality of publications. You can find comprehensive debunking of impact factors and exposure of their flaws all over the internet if you care to look: a good place to start is Stephen Curry’s article here. I’d make an additional point here, which is that the impact factor uses citation information for the journal as a whole as a sort of proxy measure of the research quality of papers published in it. But why on Earth should one do this when citation information for each paper is freely available? Why use a proxy when it’s trivial to measure the real thing?

The basic statistical flaw behind impact factors is that they are based on the arithmetic mean number of citations per paper. Since the distribution of citations in all journals is very skewed, this number is dragged upwards by a few papers with extremely large numbers of citations. In fact, most papers published have many fewer citations than the impact factor of a journal. It’s all very misleading, especially when used as a marketing tool by cynical academic publishers.

Thinking about this on the bus on my way into work this morning I decided to suggest a couple of bibliometric indices that should help put impact factors into context. I urge relevant people to calculate these for their favourite journals:

  • The Dead Paper Fraction (DPF). This is defined to be the fraction of papers published in the journal that receive no citations at all in the census period.  For journals with an impact factor of a few, this is probably a majority of the papers published.
  • The Unreliability of Impact Factor Factor (UIFF). This is defined to be the fraction of papers with fewer citations than the Impact Factor. For many journals this is most of their papers, and the larger this fraction is the more unreliable their Impact Factor is.

Another useful measure for individual papers is

  • The Corrected Impact Factor. If a paper with a number N of actual citations is published in a journal with impact factor I then the corrected impact factor is C=N-I. For a deeply uninteresting paper published in a flashily hyped journal this will be large and negative, and should be viewed accordingly by relevant panels.
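The three indices above are simple enough to sketch in a few lines of Python. This is only an illustration: it treats the impact factor as the plain arithmetic mean of citations per paper, as described above, and the function names and the list of citation counts are made up for the example rather than taken from any real journal.

```python
from statistics import mean


def impact_factor(citations):
    """Arithmetic mean of citations per paper (simplified impact factor)."""
    return mean(citations)


def dead_paper_fraction(citations):
    """DPF: fraction of papers with no citations at all."""
    return sum(1 for c in citations if c == 0) / len(citations)


def unreliability_of_impact_factor_factor(citations):
    """UIFF: fraction of papers with fewer citations than the impact factor."""
    journal_if = impact_factor(citations)
    return sum(1 for c in citations if c < journal_if) / len(citations)


def corrected_impact_factor(paper_citations, journal_if):
    """C = N - I for a single paper with N citations in a journal with impact factor I."""
    return paper_citations - journal_if


# Hypothetical, deliberately skewed citation counts for ten papers.
example = [0, 0, 0, 0, 1, 1, 2, 3, 5, 88]

if __name__ == "__main__":
    print("Impact factor:", impact_factor(example))
    print("DPF:", dead_paper_fraction(example))
    print("UIFF:", unreliability_of_impact_factor_factor(example))
    print("Corrected IF for a paper with 2 citations:",
          corrected_impact_factor(2, impact_factor(example)))
```

In this invented sample a single heavily cited paper pulls the mean up to 10, although nine of the ten papers sit below that value and four are never cited at all.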

Other suggestions for citation metrics less stupid than the impact factor are welcome through the comments box…


by telescoper at February 04, 2016 11:27 AM

astrobites - astro-ph reader's digest

After Super-Earth (Formation)

Title: Super-Earth Atmospheres: Self-Consistent Accretion and Retention

Authors: Sivan Ginzburg, Hilke Schlichting, Re’em Sari

First author’s affiliation: The Hebrew University of Jerusalem

A Super-Earth has formed. The threats the Super-Earth will be facing are real. Everything in this system (the dissipating disk, the cooling planet, and the radiating star) has evolved to kill the atmosphere of the planet. Every single decision the planet makes will be life or death. If the atmosphere is going to survive this, the planet must realize that with too much mass, it could keep growing into a gas giant like Jupiter. With not enough mass, it may not accumulate much of an atmosphere. And if it gets too hot, it could lose its atmosphere completely.

Super-Earths can have Heavy Atmospheres

Do you know where we are? This is a Super-Earth – a rocky planet more massive than Earth, but much less massive than Neptune. The upper size limit of a Super-Earth is about 1.6 Earth radii, above which most planets are not rocky.

Figure 1. Size comparison of Super-Earth COROT 7-b with Earth and Neptune. Image Credit.

Super-Earths can have SUPER atmospheres! While the Earth’s atmosphere comprises just a millionth – or 0.0001% – of the Earth’s total mass, some Super-Earths are not as dense as Earth. To explain their measured masses and radii, some of these planets must have heavier atmospheres that constitute between 1 and 10% of their total mass.

While previous studies show how the planet can build up such a massive atmosphere, the authors in this work take the atmospheric evolution a step further by also modeling atmospheric mass loss to see how low-density Super-Earths close to their stars are capable of retaining their heavy atmospheres.

Building the Atmosphere by Cooling Down

Before studying the processes that conspire to strip Super-Earths of their atmospheres, the authors first model how their atmospheres accumulate to begin with.

Massive enough planets can build up an atmosphere in the first few million years of a planetary system by accreting gas from the surrounding protoplanetary disk before it dissipates away. To end up in the atmosphere, the gas must fall into the region around the planet called the Hill sphere – where the planet’s gravity dominates over the star’s gravity – or else the gas will never be bound to the planet. Additionally, the gas must fall further into the smaller region called the Bondi sphere – where it is moving slower than the planet’s escape velocity – or else the gas will be moving too fast to accrete.
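To put rough numbers on the two regions described above, here is a minimal Python sketch using the standard textbook expressions for the Hill radius, a (M_p / 3 M_star)^(1/3), and one common convention for the Bondi radius, G M_p / c_s^2, with c_s the isothermal sound speed of the disk gas. These formulas, the mean molecular weight of 2.3 and the example planet are assumptions for illustration and are not taken from the paper. Taking the smaller of the two radii as the region the planet can actually fill is likewise a simplifying choice.

```python
import math

# Physical constants in SI units.
G = 6.674e-11          # gravitational constant
K_B = 1.380649e-23     # Boltzmann constant
M_H = 1.6735e-27       # mass of a hydrogen atom
M_EARTH = 5.972e24
M_SUN = 1.989e30
AU = 1.496e11


def hill_radius(planet_mass, star_mass, orbital_distance):
    """Radius within which the planet's gravity dominates over the star's."""
    return orbital_distance * (planet_mass / (3.0 * star_mass)) ** (1.0 / 3.0)


def sound_speed(temperature, mean_molecular_weight=2.3):
    """Isothermal sound speed of the disk gas (assumed composition)."""
    return math.sqrt(K_B * temperature / (mean_molecular_weight * M_H))


def bondi_radius(planet_mass, temperature, mean_molecular_weight=2.3):
    """Radius within which gas at this temperature is bound to the planet."""
    return G * planet_mass / sound_speed(temperature, mean_molecular_weight) ** 2


def accretion_radius(planet_mass, star_mass, orbital_distance, temperature):
    """Gas must lie inside both spheres, so take the smaller radius."""
    return min(hill_radius(planet_mass, star_mass, orbital_distance),
               bondi_radius(planet_mass, temperature))


if __name__ == "__main__":
    # Hypothetical planet: 1 Earth mass at 0.1 AU in a 1000 K disk.
    r_h = hill_radius(M_EARTH, M_SUN, 0.1 * AU)
    r_b = bondi_radius(M_EARTH, 1000.0)
    print(f"Hill radius:  {r_h:.3e} m")
    print(f"Bondi radius: {r_b:.3e} m")
```

The Hill radius grows only as the cube root of the planet's mass while the Bondi radius grows linearly, so which sphere is the tighter constraint depends on the planet's mass and the disk temperature.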

A fully formed planet can initially accrete a small amount of gas (just 0.1% of the planet’s mass) to fill up its Bondi sphere very quickly. With this region filled up, the planet can no longer build its atmosphere without some help.

Fortunately, the atmosphere is not static and it helps itself by cooling down over time. Since the atmosphere is hotter than the surrounding disk, the upper atmosphere will slowly cool by radiating energy away. Once it cools, the atmosphere must become denser in order to maintain its total energy. This allows it to accrete more gas. The authors find that a cooling atmosphere can accrete about 1 to 10% of the planet’s mass in gas, enough to match observations of low-density Super-Earths – that is, if the planet can hold onto that gas.

Keeping the Atmosphere by Cooling Down Quickly

The authors then continue their model by incorporating the effects of the disk, the planet, and the star that are trying to deplete the atmosphere.

Even after the gaseous protoplanetary disk dissipates, the atmosphere continues to cool down. Only now with no more gas to accrete, the atmosphere must shrink to become denser, causing a thicker atmosphere to hurt itself by cooling. As the heavier lower atmosphere is cooling down, it releases enough energy to expel the lighter outer layers that are no longer supported by the pressure from the disk. This prevents any close-in Super-Earth from having a thick atmosphere and begins its mass loss.

Once the atmosphere shrinks to a thin size, it will continue to lose mass as it continues to cool. The atmosphere’s fate depends on whether it is heavy or light:

  • Light atmospheres that make up less than 5% of the planet’s mass lose mass faster than the time they would take to completely cool down. As a result, none of these light atmospheres can survive.
  • Heavy atmospheres fortunately cool down much quicker. After no more than a billion years, the atmosphere will stop cooling and in turn, stop losing mass. The heavy atmospheres have survived!

As the planet tries to destroy its own atmosphere, the star also joins in on the fun by heating the atmosphere with high-energy UV photons from its blackbody spectrum. This prolongs the time it takes for the atmosphere to cool and further constrains which heavy atmospheres will survive.

Matching Up With Observations

The planet’s mass and temperature play key roles in determining whether it becomes Super-Earth sized and whether the atmosphere is appropriately heavy to survive. The authors use their model to determine the ranges of masses and temperatures a planet needs to form a Super-Earth with a surviving, substantial atmosphere, which are shown in Figure 2.

Figure 2. The allowed mass range at a given temperature for a Super-Earth to form with a retainable atmosphere. (For example, if a Super-Earth has a surface temperature of 1000 K, it should be capable of retaining its atmosphere if it has a mass between 9 and 14 times the mass of the Earth.) The triangles and squares mark Super-Earths with heavy atmospheres. The circles indicate Super-Earths with lighter atmospheres.

Their results match up rather well with the known Super-Earths. Hardly any Super-Earths below the minimum mass curve have a substantial atmosphere. As expected, the planets with relatively heavy atmospheres (triangles and squares) fall between the two dashed lines. Interestingly, not all of the planets between the lines have substantial atmospheres. This may be due to the effects of giant impacts also conspiring to deplete these atmospheres.

All in all, the authors hope that the simplicity of their model will make it a good starting point for future work to incorporate additional effects from more detailed disk models. Despite the agreement with observations, they acknowledge that more rigorous models are needed to understand how Super-Earths can accrete and retain their atmospheres.

by Michael Hammer at February 04, 2016 07:41 AM

February 03, 2016

astrobites - astro-ph reader's digest

Calling STEM Grad Students: Apply now for ComSciCon 2016!


Since 2013, ComSciCon has helped over 300 graduate students improve their science communication skills through workshops across the country. ComSciCon 2016 will be the 4th annual National Workshop, hosted in Cambridge, MA, from June 9-11.

Applications are now open for the Communicating Science 2016 workshop, to be held in Cambridge, MA on June 9-11, 2016!

Graduate students at US institutions in all fields relating to science and engineering are encouraged to apply. The application will close on March 1st.

Click here to apply!

Since the first ComSciCon national convention in 2013, we’ve received well over 3000 applications from graduate students across the country, and we’ve welcomed about 300 of them to three national and local workshops held in Cambridge, MA. You can read about last year’s workshop to get a sense for the activities and participants at ComSciCon events.

While acceptance to the workshop is competitive, attendance of the workshop is free of charge and travel support will be provided to accepted applicants.

Participants will build the communication skills that scientists and other technical professionals need to express complex ideas to their peers, experts in other fields, and the general public. There will be panel discussions on the following topics:

  • Communicating with Non-Scientific Audiences through Media Outlets
  • Communicating through Policy and Advocacy
  • Communicating through Creative Outlets and Storytelling
  • Communicating through Education and Outreach
  • Communicating with Diverse Audiences

In addition to these discussions, ample time is allotted for interacting with the experts and with attendees from throughout the country to discuss science communication and develop science outreach collaborations. Workshop participants will produce an original piece of science writing and receive feedback from workshop attendees and professional science communicators, including journalists, authors, public policy advocates, educators, and more.

ComSciCon attendees have founded new science communication organizations in collaboration with other students at the event, published more than 40 articles written at the conference in popular publications with national impact, and formed lasting networks with our student alumni and invited experts. Visit the ComSciCon website to learn more about our past workshop programs and participants.

Attendees, organizers, and panelists gather for a group photo after ComSciCon 2015.

This workshop is sponsored by Harvard University, the Massachusetts Institute of Technology, University of Colorado Boulder, the American Astronomical Society, the American Association for the Advancement of Science, the American Chemical Society, and Microsoft Research.

by Ben Cook at February 03, 2016 01:31 PM

February 02, 2016

Clifford V. Johnson - Asymptotia

It Came from Elsewhere…

This just in. Marvel has posted a video of a chat I did with Agent Carter's Reggie Austin (Dr. Jason Wilkes) about some of the science I dreamed up to underpin some of the things in the show. In particular, we talk about his intangibility and how it connects to other properties of the Zero Matter that we'd already established in earlier episodes. You can see it embedded below [...] Click to continue reading this post

The post It Came from Elsewhere… appeared first on Asymptotia.

by Clifford at February 02, 2016 11:49 PM

Tommaso Dorigo - Scientificblogging

Choose the next topic
Being back in a blogging mood, I decided to run a poll among the most affectionate readers of this column - those who come here to read "blog" pieces and not only the "articles" which are sponsored on the relevant spots in the main web page of the Science20 site.
The idea is that I have a few topics to offer for the next few posts, and I would like you to choose which one you are interested in reading about. Of course, you could also suggest that I write about something different from my proposed topics - but I do not guarantee that I will comply, as I might feel unfit for the requested task. We'll see, though.

Here is a short list of a few things I can spend my time talking about in a post here.

- recent CMS results
- recent ATLAS results

read more

by Tommaso Dorigo at February 02, 2016 08:34 PM

Tommaso Dorigo - Scientificblogging

A Workshop On Applied Statistics

A Sino-Italian workshop on Applied Statistics was held today at the Department of Statistical Sciences of the University of Padova. The organizers were Alessandra Brazzale and Alessandra Salvan from the Department of Statistical Sciences, and Giorgio Picci from the "Confucius Institute". 

read more

by Tommaso Dorigo at February 02, 2016 08:19 PM

Symmetrybreaking - Fermilab/SLAC

This radioactive life

Radiation is everywhere. The question is: How much?

An overly plump atomic nucleus just can’t keep itself together. 

When an atom has too many protons or neutrons, it’s inherently unstable. Although it might sit tight for a while, eventually it can’t hold itself together any longer and it spontaneously decays, spitting out energy in the form of waves or particles.

The end result is a smaller, more stable nucleus. The spit-out waves and particles are known as radiation, and the process of nuclear decay that produces them is called radioactivity. 

Radiation is a part of life. There are radioactive elements in most of the materials we encounter on a daily basis, which constantly spray us with radiation. For the average American, this adds up to a dose of about 620 millirem of radiation every year. That’s roughly equivalent to 10 abdominal X-rays. 

Scientists use the millirem unit to express how much a radiation dose damages the human body. A person receives 1 millirem during an airline flight from one U.S. coast to the other. 

But where exactly does our annual dose of radiation come from? Looking at sources, we can split the dosage in two nearly equal parts: About half comes from natural background radiation and half comes from manmade sources.

Infographic by Sandbox Studio, Chicago with Ana Kova


Natural background radiation originates from outer space, the atmosphere, the ground, and our own bodies. There’s radon in the air we breathe, radium in the water we drink and miscellaneous radioactive elements in the food we eat. Some of these pass through our bodies without much ado, but some get incorporated into our molecules. When the nuclei eventually decay, our own bodies expose us to tiny doses of radiation. 

“We’re exposed to background radiation whether we like it or not,” says Sayed Rokni, radiation safety officer and radiation protection department head at SLAC National Accelerator Laboratory. “That exists no matter what we do. I wouldn’t advise it, but we could choose not to have dental X-rays. But we can’t choose not to be exposed to terrestrial radiation—radiation that is in the crust of the earth, or from cosmic radiation.”

It’s no reason to panic, though. 

“The human species, and everything around us, has evolved over the ages while receiving radiation from natural sources. It has formed us. So clearly there is an acceptable level of radiation,” Rokni says. 

Any radiation not considered background comes from manmade sources, primarily through diagnostic or therapeutic medical procedures. In the early 1980s, medical procedures accounted for 15 percent of an American’s yearly radiation exposure—they now account for 48 percent. 

“The amount of natural background radiation has stayed the same,” says Don Cossairt, Fermilab radiation protection manager. “But radiation from medical procedures has blossomed, perhaps with corresponding dramatic improvements in treating many diseases and ailments.” 

Growth in the use of medical imaging has raised the average American’s yearly exposure from its 1980s average of 360 millirem to 620 millirem. Today’s annual average is not regarded as harmful to health by any regulatory authority. 
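For readers who want to check the arithmetic, here is a minimal Python sketch of what those percentages mean in millirem. It assumes the 15 percent share from the early 1980s applies to the 360-millirem average and the 48 percent share applies to today's 620 millirem; the function and constant names are made up for illustration.

```python
# Figures quoted in the text: average annual dose (millirem) and medical share (percent).
# Assumption: the early-1980s share (15%) is paired with the 1980s average (360 millirem).
DOSE_1980S_MREM = 360
DOSE_TODAY_MREM = 620
MEDICAL_SHARE_1980S_PCT = 15
MEDICAL_SHARE_TODAY_PCT = 48


def medical_dose_mrem(total_mrem, share_pct):
    """Portion of the annual dose attributed to medical procedures, in millirem."""
    return total_mrem * share_pct / 100


medical_1980s = medical_dose_mrem(DOSE_1980S_MREM, MEDICAL_SHARE_1980S_PCT)
medical_today = medical_dose_mrem(DOSE_TODAY_MREM, MEDICAL_SHARE_TODAY_PCT)

print(f"1980s: about {medical_1980s:.0f} millirem per year from medical procedures")
print(f"Today: about {medical_today:.0f} millirem per year from medical procedures")
```

Under that assumption, the medical contribution grows from roughly 54 to roughly 298 millirem per year, which accounts for most of the increase in the total.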

While medical procedures make up most of the manmade radiation we receive, about 2 percent of the overall annual dose comes from radiation emitted by some consumer products. Most of these products are probably in your home right now. Simply examining the average kitchen, one finds a cornucopia of items that emit enough radiation to be detected with a Geiger counter, in both manmade consumer products and natural foods. 

Are there Brazil nuts in your pantry? They’re the most radioactive food there is. A Brazil nut tree’s roots reach deep underground, where there’s more radium, absorb this radioactive element and pass it on to the nuts. Brazil nuts also contain potassium, which occurs in tandem with potassium-40, a naturally occurring radioactive isotope. 

Potassium-40 is the most prevalent radioactive isotope in the food we eat. Potassium-packed bananas are well known for their radioactivity, so much so that a banana’s worth of radioactivity is used as an informal measurement of radiation. It’s called the Banana Equivalent Dose. One BED is equal to 0.01 millirem. A typical chest X-ray is somewhere around 200 to 1000 BED. A fatal dose of radiation is about 50 million BED in one sitting. 

Some other potassium-40-containing munchies that emit radiation include carrots, potatoes, lima and kidney beans and red meat. From food and water alone, the average person receives an annual internal dose of about 30 millirem. That’s 3000 bananas!
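The banana arithmetic is easy to reproduce. The sketch below is only an illustration of the unit conversion, using the figure quoted above (1 BED = 0.01 millirem, so 100 BED per millirem); the helper names are hypothetical.

```python
# 1 BED = 0.01 millirem, so there are 100 BED in one millirem.
# Working with the integer factor avoids floating-point rounding surprises.
BED_PER_MILLIREM = 100


def millirem_to_bed(millirem):
    """Convert a dose in millirem to Banana Equivalent Doses."""
    return millirem * BED_PER_MILLIREM


def bed_to_millirem(bed):
    """Convert a dose in Banana Equivalent Doses to millirem."""
    return bed / BED_PER_MILLIREM


print(millirem_to_bed(30))          # annual internal dose from food and water, in BED
print(millirem_to_bed(620))         # average annual dose, in BED
print(bed_to_millirem(200))         # low end of the chest X-ray range, in millirem
print(bed_to_millirem(1000))        # high end of the chest X-ray range, in millirem
print(bed_to_millirem(50_000_000))  # the quoted fatal dose, in millirem
```

With these figures, the 30 millirem from food and water comes to 3000 bananas, a chest X-ray to roughly 2 to 10 millirem, and the fatal dose to 500,000 millirem.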

Even the dish off of which you’re eating may be giving you a slight dose of radiation. The glaze of some older ceramics contains uranium, thorium or good ol’ potassium-40 to make it a certain color, especially red-orange pottery made pre-1960s. Likewise, some yellowish and greenish antique glassware contains uranium as a colorant. Though this dinnerware might make a Geiger counter click, it’s still safe to eat with. 

Your smoke detector, which usually hangs silently on the ceiling until its batteries go dead, is radioactive too. That’s how it can save you from a burning building: A small amount of americium-241 in the device allows it to detect when there’s smoke in the air. 

“It’s not dangerous unless you take it out in the garage and beat it up with a hammer to release the radioactivity,” Cossairt says. The World Nuclear Association notes that the americium dioxide found in smoke detectors is insoluble and would “pass through the digestive tract without delivering a significant radiation dose.”

Granite countertops also contain uranium and thorium, which decay into radon gas. Most of the gas gets trapped in the countertop, but some can be released and add a small amount to the radon level in a home, which primarily comes from the soil a structure sits on. 

Granite doesn’t just emit radiation inside the home. People living in areas with more granite rock receive an extra boost of radiation per year. 

Yearly radiation exposure varies significantly depending on where you live. People at higher altitudes receive a greater dose of radiation showered from space per year. 

But not to worry if you live in a locale with lots of altitude and granite, like Denver, Colorado. “No health effect due to radiation exposure has ever been correlated with people living at higher altitudes,” Cossairt says. Similarly, no one has noted a correlation between health and the increased dose of radiation from environmental granite rock. 

It doesn’t matter if you’re living at altitude or sea level, in the Rocky Mountains or on Maryland’s Eastern Shore—radiation is everywhere. But annual doses from background and manmade sources aren’t enough to worry about. So enjoy your banana and feel free to grab another handful of Brazil nuts.

Check out our printable poster about radioactivity.

Artwork by Sandbox Studio, Chicago with Ana Kova

by Chris Patrick at February 02, 2016 04:04 PM

CERN Bulletin

LabVIEW workshops 2016: a free and fun way to learn a new programming language

We are organising about five workshops (one day per week, two hours after work) at CERN in the coming months, aimed particularly at CERN people (especially technical students). 



The courses will start with the basics of LabVIEW. During the course, which is based on official National Instruments (NI) training materials, we'll learn together how to program in LabVIEW and how to interface with NI hardware. Depending on the participants’ needs and requests, the topics of FPGA and Real-Time could also be explored. The course ends with the CLAD certificate exam. The course and materials are in English.

What is LabVIEW? A highly productive development environment for creating custom applications, allowing users to code in a single language for devices ranging from FPGA, through RT systems to PCs. The software is used at CERN, but not everybody has had the opportunity to work with it. Now could be a good time for you to start.

Target audience: For students and anyone else interested.

Prerequisites: No experience required, but a bit of programming awareness is recommended. 

If you are interested:
Register here.
More info here.

Organisers: Patryk Oleniuk, LabVIEW Student Ambassador (CERN, TE-EPC) assisted by Izabela Horvath (CERN, TE-MSC), Michał Maciejewski (CERN, TE-MPE) and CERN’s LabVIEW support team.

All courses are free – we offer them because we're LabVIEW fans…

Note: These workshops are given by volunteers. We like LabVIEW and want to share our knowledge of it. The course and the exam are free of charge and the workshops should not be considered as professional NI training. Please refer to the Technical Training catalogue for all formal LabVIEW training courses available.

February 02, 2016 10:02 AM

The n-Category Cafe

Integral Octonions (Part 12)

guest post by Tim Silverman

“Everything is simpler mod <semantics>p<annotation encoding="application/x-tex">p</annotation></semantics>.”

That is the philosophy of the Mod People; and of all <semantics>p<annotation encoding="application/x-tex">p</annotation></semantics>, the simplest is 2. Washed in a bath of mod 2, that exotic object, the <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> lattice, dissolves into a modest orthogonal space, its Weyl group into an orthogonal group, its “large” <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> sublattices into some particularly nice subspaces, and the very Leech lattice itself shrinks into a few arrangements of points and lines that would not disgrace the pages of Euclid’s Elements. And when we have sufficiently examined these few bones that have fallen out of their matrix, we can lift them back up to Euclidean space in the most naive manner imaginable, and the full Leech springs out in all its glory like instant mashed potato.

What is this about? In earlier posts in this series, JB and Greg Egan have been calculating and exploring a lot of beautiful Euclidean geometry involving <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> and the Leech lattice. Lately, a lot of Fano planes have been popping up in the constructions. Examining these, I thought I caught some glimpses of a more extensive <semantics>𝔽 2<annotation encoding="application/x-tex">\mathbb{F}_2</annotation></semantics> geometry; I made a little progress in the comments, but then got completely lost. But there is indeed an extensive <semantics>𝔽 2<annotation encoding="application/x-tex">\mathbb{F}_2</annotation></semantics> world in here, parallel to the Euclidean one. I have finally found the key to it in the following fact:

Large <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> lattices mod <semantics>2<annotation encoding="application/x-tex">2</annotation></semantics> are just maximal flats in a <semantics>7<annotation encoding="application/x-tex">7</annotation></semantics>-dimensional quadric over <semantics>𝔽 2<annotation encoding="application/x-tex">\mathbb{F}_2</annotation></semantics>.

I’ll spend the first half of the post explaining what that means, and the second half showing how everything else flows from it. We unfortunately bypass (or simply assume in passing) most of the pretty Euclidean geometry; but in exchange we get a smaller, simpler picture which makes a lot of calculations easier, and the <semantics>𝔽 2<annotation encoding="application/x-tex">\mathbb{F}_2</annotation></semantics> world seems to lift very cleanly to the Euclidean world, though I haven’t actually proved this or explained why — maybe I shall leave that as an exercise for you, dear readers.

N.B. Just a quick note on scaling conventions before we start. There are two scaling conventions we could use. In one, a ‘shrunken’ <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> made of integral octonions, with shortest vectors of length <semantics>1<annotation encoding="application/x-tex">1</annotation></semantics>, contains ‘standard’ sized <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> lattices with vectors of minimal length <semantics>2<annotation encoding="application/x-tex">\sqrt{2}</annotation></semantics>, and Wilson’s Leech lattice construction comes out the right size. The other is <semantics>2<annotation encoding="application/x-tex">\sqrt{2}</annotation></semantics> times larger: a ‘standard’ <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> lattice contains “large” <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> lattices of minimal length <semantics>2<annotation encoding="application/x-tex">2</annotation></semantics>, but Wilson’s Leech lattice construction gives something <semantics>2<annotation encoding="application/x-tex">\sqrt{2}</annotation></semantics> times too big. I’ve chosen the latter convention because I find it less confusing: reducing the standard <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> mod <semantics>2<annotation encoding="application/x-tex">2</annotation></semantics> is a well-known thing that people do, and all the Euclidean dot products come out as integers. But it’s as well to bear this in mind when relating this post to the earlier ones.

Projective and polar spaces

I’ll work with projective spaces over <semantics>𝔽 q<annotation encoding="application/x-tex">\mathbb{F}_q</annotation></semantics> and try not to suddenly start jumping back and forth between projective spaces and the underlying vector spaces as is my wont, at least not unless it really makes things clearer.

So we have an <semantics>n<annotation encoding="application/x-tex">n</annotation></semantics>-dimensional projective space over <semantics>𝔽 q<annotation encoding="application/x-tex">\mathbb{F}_q</annotation></semantics>. We’ll denote this by <semantics>PG(n,q)<annotation encoding="application/x-tex">\mathrm{PG}(n,q)</annotation></semantics>.

The full symmetry group of <semantics>PG(n,q)<annotation encoding="application/x-tex">\mathrm{PG}(n,q)</annotation></semantics> is <semantics>GL n+1(q)<annotation encoding="application/x-tex">\mathrm{GL}_{n+1}(q)</annotation></semantics>, and from that we get subgroups and quotients <semantics>SL n+1(q)<annotation encoding="application/x-tex">SL_{n+1}(q)</annotation></semantics> (with unit determinant), <semantics>PGL n+1(q)<annotation encoding="application/x-tex">\mathrm{PGL}_{n+1}(q)</annotation></semantics> (quotient by the centre) and <semantics>PSL n+1(q)<annotation encoding="application/x-tex">\mathrm{PSL}_{n+1}(q)</annotation></semantics> (both). Over <semantics>𝔽 2<annotation encoding="application/x-tex">\mathbb{F}_2</annotation></semantics>, the determinant is always <semantics>1<annotation encoding="application/x-tex">1</annotation></semantics> (since that’s the only non-zero scalar) and the centre is trivial, so these groups are all the same.

In projective spaces over <semantics>𝔽 2<annotation encoding="application/x-tex">\mathbb{F}_2</annotation></semantics>, there are <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics> points on every line, so we can ‘add’ any two distinct points and get the third point on the line through them. (This is just a projection of the underlying vector space addition.)
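To make the ‘addition’ concrete, here is a minimal Python sketch. It assumes one particular encoding that the post does not specify: a point of the projective space is a non-zero vector of the underlying vector space over the two-element field, stored as the bits of a positive integer, so that vector addition is bitwise XOR. The names `third_point`, `points` and `lines` are illustrative only.

```python
from itertools import combinations


def third_point(a, b):
    """Third point on the line through two distinct points a and b.

    Points are non-zero bit vectors stored as integers; addition over the
    two-element field is bitwise XOR.
    """
    return a ^ b


n = 3  # projective dimension; the underlying vector space has dimension n + 1
points = range(1, 2 ** (n + 1))

# Every pair of distinct points spans a line with exactly three points.
lines = {frozenset((a, b, third_point(a, b))) for a, b in combinations(points, 2)}

lines_per_point = {p: sum(1 for line in lines if p in line) for p in points}

print(len(points), "points,", len(lines), "lines")
print("lines through each point:", set(lines_per_point.values()))
```

Because the sum of two distinct non-zero vectors is again non-zero and different from both, each line really does have three distinct points, and adding any two of them gives the third.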

In odd characteristic, we get two other families of Lie type by preserving two types of non-degenerate bilinear form: symmetric and skew-symmetric, corresponding to orthogonal and symplectic structures respectively. (Non-degenerate Hermitian forms, defined over <semantics>𝔽 q 2<annotation encoding="application/x-tex">\mathbb{F}_{q^2}</annotation></semantics>, also exist and behave similarly.)

Denote the form by <semantics>B(x,y)<annotation encoding="application/x-tex">B(x,y)</annotation></semantics>. Points <semantics>x<annotation encoding="application/x-tex">x</annotation></semantics> for which <semantics>B(x,x)=0<annotation encoding="application/x-tex">B(x, x)=0</annotation></semantics> are isotropic. For a symplectic structure all points are isotropic. A form <semantics>B<annotation encoding="application/x-tex">B</annotation></semantics> such that <semantics>B(x,x)=0<annotation encoding="application/x-tex">B(x,x)=0</annotation></semantics> for all <semantics>x<annotation encoding="application/x-tex">x</annotation></semantics> is called alternating, and in odd characteristic, but not characteristic <semantics>2<annotation encoding="application/x-tex">2</annotation></semantics>, skew-symmetric and alternating forms are the same thing.

A line spanned by two isotropic points, <semantics>x<annotation encoding="application/x-tex">x</annotation></semantics> and <semantics>y<annotation encoding="application/x-tex">y</annotation></semantics>, such that <semantics>B(x,y)=1<annotation encoding="application/x-tex">B(x,y)=1</annotation></semantics> is a hyperbolic line. Any space with a non-degenerate bilinear (or Hermitian) form can be decomposed as the orthogonal sum of hyperbolic lines (i.e. as a vector space, decomposed as an orthogonal sum of hyperbolic planes), possibly together with an anisotropic space containing no isotropic points at all. There are no non-empty symplectic anisotropic spaces, so all symplectic spaces are odd-dimensional (projectively — the corresponding vector spaces are even-dimensional).

There are anisotropic orthogonal points and lines (over any finite field including in even characteristic), but all the orthogonal spaces we consider here will be a sum of hyperbolic lines — we say they are of plus type. (The odd-dimensional projective spaces with a residual anisotropic line are of minus type.)

A quadratic form <semantics>Q(x)<annotation encoding="application/x-tex">Q(x)</annotation></semantics> is defined by the conditions

i) <semantics>Q(x+y)=Q(x)+Q(y)+B(x,y)<annotation encoding="application/x-tex">Q(x+y)=Q(x)+Q(y)+B(x,y)</annotation></semantics>, where <semantics>B<annotation encoding="application/x-tex">B</annotation></semantics> is a symmetric bilinear form.

ii) <semantics>Q(λx)=λ 2Q(x)<annotation encoding="application/x-tex">Q(\lambda x)=\lambda^2Q(x)</annotation></semantics> for any scalar <semantics>λ<annotation encoding="application/x-tex">\lambda</annotation></semantics>.

There are some non-degeneracy conditions I won’t go into.

Obviously, a quadratic form implies a particular symmetric bilinear form, by <semantics>B(x,y)=Q(x+y)−Q(x)−Q(y)<annotation encoding="application/x-tex">B(x,y)=Q(x+y)-Q(x)-Q(y)</annotation></semantics>. In odd characteristic, we can go the other way: <semantics>Q(x)=(1/2)B(x,x)<annotation encoding="application/x-tex">Q(x)=\frac{1}{2}B(x,x)</annotation></semantics>.

We denote the group preserving an orthogonal structure of plus type on an <semantics>n<annotation encoding="application/x-tex">n</annotation></semantics>-dimensional projective space over <semantics>𝔽 q<annotation encoding="application/x-tex">\mathbb{F}_q</annotation></semantics> by <semantics>GO n+1 +(q)<annotation encoding="application/x-tex">\mathrm{GO}_{n+1}^+(q)</annotation></semantics>, by analogy with <semantics>GL n+1(q)<annotation encoding="application/x-tex">\mathrm{GL}_{n+1}(q)</annotation></semantics>. Similarly we have <semantics>SO n+1 +(q)<annotation encoding="application/x-tex">\mathrm{SO}_{n+1}^+(q)</annotation></semantics>, <semantics>PGO n+1 +(q)<annotation encoding="application/x-tex">\mathrm{PGO}_{n+1}^+(q)</annotation></semantics> and <semantics>PSO n+1 +(q)<annotation encoding="application/x-tex">\mathrm{PSO}_{n+1}^+(q)</annotation></semantics>. However, whereas <semantics>PSL n(q)<annotation encoding="application/x-tex">\mathrm{PSL}_n(q)</annotation></semantics> is simple apart from <semantics>2<annotation encoding="application/x-tex">2</annotation></semantics> exceptions, we usually have an index <semantics>2<annotation encoding="application/x-tex">2</annotation></semantics> subgroup of <semantics>SO n+1 +(q)<annotation encoding="application/x-tex">\mathrm{SO}_{n+1}^+(q)</annotation></semantics>, called <semantics>Ω n+1 +(q)<annotation encoding="application/x-tex">\Omega_{n+1}^+(q)</annotation></semantics>, and a corresponding index <semantics>2<annotation encoding="application/x-tex">2</annotation></semantics> subgroup of <semantics>PSO n+1 +(q)<annotation encoding="application/x-tex">\mathrm{PSO}_{n+1}^+(q)</annotation></semantics>, called <semantics>PΩ n+1 +(q)<annotation encoding="application/x-tex">\mathrm{P}\Omega_{n+1}^+(q)</annotation></semantics>, and it is the latter that is simple. 
(There is an infinite family of exceptions, where <semantics>PSO n+1 +(q)<annotation encoding="application/x-tex">\mathrm{PSO}_{n+1}^+(q)</annotation></semantics> is simple.)

Symplectic structures are easier — the determinant is automatically <semantics>1<annotation encoding="application/x-tex">1</annotation></semantics>, so we just have <semantics>Sp n+1(q)<annotation encoding="application/x-tex">\mathrm{Sp}_{n+1}(q)</annotation></semantics> and <semantics>PSp n+1(q)<annotation encoding="application/x-tex">\mathrm{PSp}_{n+1}(q)</annotation></semantics>, with the latter being simple except for <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics> exceptions.

Just as a point with <semantics>B(x,x)=0<annotation encoding="application/x-tex">B(x,x)=0</annotation></semantics> is an isotropic point, so any subspace with <semantics>B<annotation encoding="application/x-tex">B</annotation></semantics> identically <semantics>0<annotation encoding="application/x-tex">0</annotation></semantics> on it is an isotropic subspace.

And just as the linear groups act on incidence geometries given by the (‘classical’) projective spaces, so the symplectic and orthogonal act on polar spaces, whose points, lines, planes, etc, are just the isotropic points, isotropic lines, isotropic planes, etc given by the bilinear (or Hermitian) form. We denote an orthogonal polar space of plus type on an <semantics>n<annotation encoding="application/x-tex">n</annotation></semantics>-dimensional projective space over <semantics>𝔽 q<annotation encoding="application/x-tex">\mathbb{F}_q</annotation></semantics> by <semantics>Q n +(q)<annotation encoding="application/x-tex">\mathrm{Q}_n^+(q)</annotation></semantics>.

In characteristic <semantics>2<annotation encoding="application/x-tex">2</annotation></semantics>, a lot of this goes wrong, but in a way that can be fixed and mostly turns out the same.

1) Symmetric and skew-symmetric forms are the same thing! There are still distinct orthogonal and symplectic structures and groups, but we can’t use this as the distinction.

2) Alternating and skew-symmetric forms are not the same thing! Alternating forms are all skew-symmetric (aka symmetric) but not vice versa. A symplectic structure is given by an alternating form — and of course this definition works in odd characteristic too.

3) Symmetric bilinear forms are no longer in bijection with quadratic forms: every quadratic form gives a unique symmetric (aka skew-symmetric, and indeed alternating) bilinear form, but an alternating form is compatible with multiple quadratic forms. We use non-degenerate quadratic forms to define orthogonal structures, rather than symmetric bilinear forms — which of course works in odd characteristic too. (Note also from the above that in characteristic <semantics>2<annotation encoding="application/x-tex">2</annotation></semantics> an orthogonal structure has an associated symplectic structure, which it shares with other orthogonal structures.)
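Point 3 can be checked by brute force on a small example. The sketch below is an illustration under stated assumptions, not something taken from the post: it picks two specific quadratic forms on a 4-dimensional vector space over the two-element field (`Q_plus` and `Q_minus`, names invented here), derives the bilinear form of each from the rule B(x,y)=Q(x+y)−Q(x)−Q(y), and confirms that the two bilinear forms coincide and are alternating even though the quadratic forms have different numbers of singular points.

```python
from itertools import product


def add(x, y):
    """Vector addition over the two-element field."""
    return tuple((a + b) % 2 for a, b in zip(x, y))


def Q_plus(x):
    """x0*x1 + x2*x3: an orthogonal sum of two hyperbolic planes (plus type)."""
    return (x[0] * x[1] + x[2] * x[3]) % 2


def Q_minus(x):
    """x0*x1 + x2^2 + x2*x3 + x3^2: hyperbolic plane plus an anisotropic plane."""
    return (x[0] * x[1] + x[2] * x[2] + x[2] * x[3] + x[3] * x[3]) % 2


def polar(Q):
    """Bilinear form determined by Q via B(x, y) = Q(x + y) - Q(x) - Q(y)."""
    return lambda x, y: (Q(add(x, y)) - Q(x) - Q(y)) % 2


V = list(product((0, 1), repeat=4))
B_plus, B_minus = polar(Q_plus), polar(Q_minus)

same_bilinear_form = all(B_plus(x, y) == B_minus(x, y) for x in V for y in V)
alternating = all(B_plus(x, x) == 0 for x in V)
symmetric = all(B_plus(x, y) == B_plus(y, x) for x in V for y in V)

# Singular points: non-zero vectors on which the quadratic form vanishes.
singular_plus = sum(1 for x in V if any(x) and Q_plus(x) == 0)
singular_minus = sum(1 for x in V if any(x) and Q_minus(x) == 0)

print(same_bilinear_form, alternating, symmetric)
print(singular_plus, singular_minus)
```

The squared terms in `Q_minus` are additive in characteristic 2, so they drop out of the bilinear form; that is why one alternating form can sit underneath several quadratic forms.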

We now have both isotropic subspaces on which the bilinear form is identically <semantics>0<annotation encoding="application/x-tex">0</annotation></semantics>, and singular subspaces on which the quadratic form is identically <semantics>0<annotation encoding="application/x-tex">0</annotation></semantics>, with the latter being a subset of the former. It is the singular spaces which go to make up the polar space for the orthogonal structure.

To cover both cases, we’ll refer to these isotropic/singular projective spaces inside the polar spaces as flats.

Everything else is still the same — decomposition into hyperbolic lines and an anisotropic space, plus and minus types, <semantics>Ω n+1 +(q)<annotation encoding="application/x-tex">\Omega_{n+1}^+(q)</annotation></semantics> inside <semantics>SO n+1 +(q)<annotation encoding="application/x-tex">\mathrm{SO}_{n+1}^+(q)</annotation></semantics>, polar spaces, etc.

Over <semantics>𝔽 2<annotation encoding="application/x-tex">\mathbb{F}_2</annotation></semantics>, we have that <semantics>GO n+1 +(q)<annotation encoding="application/x-tex">\mathrm{GO}_{n+1}^+(q)</annotation></semantics>, <semantics>SO n+1 +(q)<annotation encoding="application/x-tex">\mathrm{SO}_{n+1}^+(q)</annotation></semantics>, <semantics>PGO n+1 +(q)<annotation encoding="application/x-tex">\mathrm{PGO}_{n+1}^+(q)</annotation></semantics> and <semantics>PSO n+1 +(q)<annotation encoding="application/x-tex">\mathrm{PSO}_{n+1}^+(q)</annotation></semantics> are all the same group, as are <semantics>Ω n+1 +(q)<annotation encoding="application/x-tex">\Omega_{n+1}^+(q)</annotation></semantics> and <semantics>PΩ n+1 +(q)<annotation encoding="application/x-tex">\mathrm{P}\Omega_{n+1}^+(q)</annotation></semantics>.

The vector space dimension of the maximal flats in a polar space is the polar rank of the space, one of its most important invariants — it’s the number of hyperbolic lines in its orthogonal decomposition.

<semantics>Q 2m−1 +(q)<annotation encoding="application/x-tex">\mathrm{Q}_{2m-1}^+(q)</annotation></semantics> has rank <semantics>m<annotation encoding="application/x-tex">m</annotation></semantics>. The maximal flats fall into two classes. In odd characteristic, the classes are preserved by <semantics>SO 2m +(q)<annotation encoding="application/x-tex">\mathrm{SO}_{2m}^+(q)</annotation></semantics> but interchanged by the elements of <semantics>GO 2m +(q)<annotation encoding="application/x-tex">\mathrm{GO}_{2m}^+(q)</annotation></semantics> with determinant <semantics>−1<annotation encoding="application/x-tex">-1</annotation></semantics>. In even characteristic, the classes are preserved by <semantics>Ω 2m +(q)<annotation encoding="application/x-tex">\Omega_{2m}^+(q)</annotation></semantics>, but interchanged by the elements of <semantics>GO 2m +(q)<annotation encoding="application/x-tex">\mathrm{GO}_{2m}^+(q)</annotation></semantics> that lie outside it.

Finally, I’ll refer to the value of the quadratic form at a point, <semantics>Q(x)<annotation encoding="application/x-tex">Q(x)</annotation></semantics>, as the norm of <semantics>x<annotation encoding="application/x-tex">x</annotation></semantics>, even though in Euclidean space we’d call it “half the norm-squared”.

Here are some useful facts about <semantics>Q 2m−1 +(q)<annotation encoding="application/x-tex">\mathrm{Q}_{2m-1}^+(q)</annotation></semantics>:

1a. The number of points is <semantics>(q^m−1)(q^(m−1)+1)/(q−1)<annotation encoding="application/x-tex">\displaystyle\frac{\left(q^m-1\right)\left(q^{m-1}+1\right)}{q-1}</annotation></semantics>.

1b. The number of maximal flats is <semantics>∏_{i=0}^{m−1}(1+q^i)<annotation encoding="application/x-tex">\prod_{i=0}^{m-1}\left(1+q^i\right)</annotation></semantics>.

1c. Two maximal flats of different types must intersect in a flat of odd codimension; two maximal flats of the same type must intersect in a flat of even codimension.

Here are two more general facts.

1d. Pick a projective space <semantics>Π<annotation encoding="application/x-tex">\Pi</annotation></semantics> of dimension <semantics>n<annotation encoding="application/x-tex">n</annotation></semantics>. Pick a point <semantics>p<annotation encoding="application/x-tex">p</annotation></semantics> in it. The space whose points are lines through <semantics>p<annotation encoding="application/x-tex">p</annotation></semantics>, whose lines are planes through <semantics>p<annotation encoding="application/x-tex">p</annotation></semantics>, etc, with incidence inherited from <semantics>Π<annotation encoding="application/x-tex">\Pi</annotation></semantics>, is a projective space of dimension <semantics>n−1<annotation encoding="application/x-tex">n-1</annotation></semantics>.

1e. Pick a polar space <semantics>Σ<annotation encoding="application/x-tex">\Sigma</annotation></semantics> of rank <semantics>m<annotation encoding="application/x-tex">m</annotation></semantics>. Pick a point <semantics>p<annotation encoding="application/x-tex">p</annotation></semantics> in it. The space whose points are lines (i.e. <semantics>1<annotation encoding="application/x-tex">1</annotation></semantics>-flats) through <semantics>p<annotation encoding="application/x-tex">p</annotation></semantics>, whose lines are planes (i.e. <semantics>2<annotation encoding="application/x-tex">2</annotation></semantics>-flats) through <semantics>p<annotation encoding="application/x-tex">p</annotation></semantics>, etc, with incidence inherited from <semantics>Σ<annotation encoding="application/x-tex">\Sigma</annotation></semantics>, is a polar space of the same type, of rank <semantics>m−1<annotation encoding="application/x-tex">m-1</annotation></semantics>.
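The counting formulas in facts 1a and 1b above are easy to evaluate. This is a small Python sketch with illustrative function names (`num_points`, `num_maximal_flats`), taking the rank m and the field size q as arguments; it simply transcribes the two formulas.

```python
def num_points(m, q):
    """Fact 1a: number of points of the plus-type polar space of rank m over F_q."""
    # (q^m - 1) / (q - 1) is an integer, so integer division is exact here.
    return (q**m - 1) * (q**(m - 1) + 1) // (q - 1)


def num_maximal_flats(m, q):
    """Fact 1b: number of maximal flats, the product of (1 + q^i) for i = 0 .. m-1."""
    result = 1
    for i in range(m):
        result *= 1 + q**i
    return result


for m in (3, 4):
    print(f"rank {m}, q = 2:", num_points(m, 2), "points,",
          num_maximal_flats(m, 2), "maximal flats")
```

For q = 2 this gives 35 points and 30 maximal flats in rank 3, and 135 points and 270 maximal flats in rank 4.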

The Klein correspondence at breakneck speed

The bivectors of a <semantics>4<annotation encoding="application/x-tex">4</annotation></semantics>-dimensional vector space constitute a <semantics>6<annotation encoding="application/x-tex">6</annotation></semantics>-dimensional vector space. Apart from the zero bivector, these fall into two types: degenerate ones, which can be decomposed as the wedge product of two vectors and therefore correspond to planes (or, projectively, lines); and non-degenerate ones, which, by wedging with vectors on each side, give rise to symplectic forms. Wedging two bivectors gives an element of the <semantics>1<annotation encoding="application/x-tex">1</annotation></semantics>-dimensional space of <semantics>4<annotation encoding="application/x-tex">4</annotation></semantics>-vectors, and, picking a basis, the single component of this wedge product gives a non-degenerate symmetric bilinear form on the <semantics>6<annotation encoding="application/x-tex">6</annotation></semantics>-dimensional vector space of bivectors, and hence, in odd characteristic, an orthogonal space, which turns out to be of plus type. It also turns out that this can be carried over to characteristic <semantics>2<annotation encoding="application/x-tex">2</annotation></semantics> as well, and gives a correspondence between <semantics>PG(3,q)<annotation encoding="application/x-tex">\mathrm{PG}(3,q)</annotation></semantics> and <semantics>Q 5 +(q)<annotation encoding="application/x-tex">\mathrm{Q}_5^+(q)</annotation></semantics>, and isomorphisms between their symmetry groups. It is precisely the degenerate bivectors that are the ones of norm <semantics>0<annotation encoding="application/x-tex">0</annotation></semantics>, and we get the following correspondence:

<semantics>Q 5 +(q)<annotation encoding="application/x-tex">\mathrm{Q}_5^+(q)</annotation></semantics> | <semantics>PG(3,q)<annotation encoding="application/x-tex">\mathrm{PG}(3,q)</annotation></semantics>
point | line
orthogonal points | intersecting lines
line | plane pencil
plane (type 1) | point
plane (type 2) | plane

Here, “plane pencil” is all the lines that both go through a particular point and lie in a particular plane: effectively a point on a plane. The two types of plane in <semantics>Q 5 +(q)<annotation encoding="application/x-tex">\mathrm{Q}_5^+(q)</annotation></semantics> are two families of maximal flats, and they correspond, in <semantics>PG(3,q)<annotation encoding="application/x-tex">\mathrm{PG}(3,q)</annotation></semantics>, to “all the lines through a particular point” and “all the lines in a particular plane”.

From fact 1c above, in <semantics>Q 5 +(q)<annotation encoding="application/x-tex">\mathrm{Q}_5^+(q)</annotation></semantics> we have that two maximal flats of different type must either intersect in a line or not intersect at all, corresponding to the fact in <semantics>PG(3,q)<annotation encoding="application/x-tex">\mathrm{PG}(3,q)</annotation></semantics> that a point and a plane are either incident or not; while two distinct maximal flats of the same type must intersect in a point, corresponding to the fact in <semantics>PG(3,q)<annotation encoding="application/x-tex">\mathrm{PG}(3,q)</annotation></semantics> that any two points lie in a line, and any two planes intersect in a line.
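The first two rows of the correspondence can be verified by brute force over the two-element field. The following Python sketch is an illustration under assumptions that go beyond the text: points of the projective 3-space are encoded as non-zero 4-bit integers, a line is sent to its Plücker coordinates (the six 2×2 minors of any two of its points), and the quadratic form on the 6-dimensional space is taken to be p01·p23 + p02·p13 + p03·p12. All function names are invented for the example.

```python
from itertools import combinations, product

PAIRS = list(combinations(range(4), 2))  # (0,1), (0,2), (0,3), (1,2), (1,3), (2,3)


def bits(v, n=4):
    """Coordinates of an integer-encoded vector over the two-element field."""
    return [(v >> i) & 1 for i in range(n)]


def plucker(a, b):
    """Pluecker coordinates p_ij = a_i*b_j + a_j*b_i (mod 2) of the line through a, b."""
    x, y = bits(a), bits(b)
    return tuple((x[i] * y[j] + x[j] * y[i]) % 2 for i, j in PAIRS)


def Q(p):
    """Quadratic form on bivectors; signs do not matter mod 2."""
    p01, p02, p03, p12, p13, p23 = p
    return (p01 * p23 + p02 * p13 + p03 * p12) % 2


def B(p, r):
    """Bilinear form associated with Q: B(p, r) = Q(p + r) - Q(p) - Q(r)."""
    s = tuple((u + v) % 2 for u, v in zip(p, r))
    return (Q(s) - Q(p) - Q(r)) % 2


points = range(1, 16)
lines = {frozenset((a, b, a ^ b)) for a, b in combinations(points, 2)}

# Any two points of a line give the same Pluecker vector over the two-element field.
image = {line: plucker(*sorted(line)[:2]) for line in lines}

singular = {p for p in product((0, 1), repeat=6) if any(p) and Q(p) == 0}

lines_map_onto_singular_points = set(image.values()) == singular
injective = len(set(image.values())) == len(lines)
meet_iff_orthogonal = all(
    bool(l1 & l2) == (B(image[l1], image[l2]) == 0)
    for l1, l2 in combinations(lines, 2)
)

print(len(lines), len(singular))
print(lines_map_onto_singular_points, injective, meet_iff_orthogonal)
```

In this encoding the lines land exactly on the norm-0 points of the quadric, one line per point, and two distinct lines meet precisely when their images are orthogonal.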

Triality zips past your window

In <semantics>Q 7 +(q)<annotation encoding="application/x-tex">\mathrm{Q}_7^+(q)</annotation></semantics>, you may observe from facts 1a and 1b that the following three things are equal in number: points; maximal flats of one type; maximal flats of the other type. This is because these three things are cycled by the triality symmetry.
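Spelled out, the count goes as follows. This is just facts 1a and 1b evaluated at rank m = 4, with the symbols as defined above:

```latex
\begin{align*}
\#\text{points} &= \frac{(q^4-1)(q^3+1)}{q-1} = (q+1)(q^2+1)(q^3+1), \\
\#\text{maximal flats} &= \prod_{i=0}^{3}\left(1+q^i\right) = 2\,(1+q)(1+q^2)(1+q^3).
\end{align*}
```

The maximal flats split evenly between the two types, so each type contains (1+q)(1+q^2)(1+q^3) flats, the same as the number of points.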

Counting things over <semantics>𝔽 2<annotation encoding="application/x-tex">\mathbb{F}_2</annotation></semantics>

Over <semantics>𝔽 2<annotation encoding="application/x-tex">\mathbb{F}_2</annotation></semantics>, we have the following things:

2a. <semantics>PG(3,2)<annotation encoding="application/x-tex">\mathrm{PG}(3,2)</annotation></semantics> has <semantics>15<annotation encoding="application/x-tex">15</annotation></semantics> planes, each containing <semantics>7<annotation encoding="application/x-tex">7</annotation></semantics> points and <semantics>7<annotation encoding="application/x-tex">7</annotation></semantics> lines. It has (dually) <semantics>15<annotation encoding="application/x-tex">15</annotation></semantics> points, each contained in <semantics>7<annotation encoding="application/x-tex">7</annotation></semantics> lines and <semantics>7<annotation encoding="application/x-tex">7</annotation></semantics> planes. It has <semantics>35<annotation encoding="application/x-tex">35</annotation></semantics> lines, each containing <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics> points and contained in <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics> planes.

2b. <semantics>Q 5 +(2)<annotation encoding="application/x-tex">\mathrm{Q}_5^+(2)</annotation></semantics> has <semantics>35<annotation encoding="application/x-tex">35</annotation></semantics> points, corresponding to the <semantics>35<annotation encoding="application/x-tex">35</annotation></semantics> lines of <semantics>PG(3,2)<annotation encoding="application/x-tex">\mathrm{PG}(3,2)</annotation></semantics>, and <semantics>30<annotation encoding="application/x-tex">30</annotation></semantics> planes, corresponding to the <semantics>15<annotation encoding="application/x-tex">15</annotation></semantics> points and <semantics>15<annotation encoding="application/x-tex">15</annotation></semantics> planes of <semantics>PG(3,2)<annotation encoding="application/x-tex">\mathrm{PG}(3, 2)</annotation></semantics>. There’s lots and lots of other interesting stuff, but we will ignore it.

2c. <semantics>Q 7 +(2)<annotation encoding="application/x-tex">\mathrm{Q}_7^+(2)</annotation></semantics> has <semantics>135<annotation encoding="application/x-tex">135</annotation></semantics> points and <semantics>270<annotation encoding="application/x-tex">270</annotation></semantics> <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics>-spaces, i.e. two families of maximal flats containing <semantics>135<annotation encoding="application/x-tex">135</annotation></semantics> elements each. A projective <semantics>7<annotation encoding="application/x-tex">7</annotation></semantics>-space has <semantics>255<annotation encoding="application/x-tex">255</annotation></semantics> points, so if we give it an orthogonal structure of plus type, it will have <semantics>255−135=120<annotation encoding="application/x-tex">255-135=120</annotation></semantics> points of norm <semantics>1<annotation encoding="application/x-tex">1</annotation></semantics>.
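The point counts in 2b and 2c can be checked by brute force. The sketch below is only an illustration: it assumes the standard hyperbolic form x0·x1 + x2·x3 + ... as a representative plus-type quadratic form over 𝔽₂, and the helper names `hyperbolic_q` and `count_points` are made up for this example.

```python
from itertools import product

def hyperbolic_q(x):
    """Q(x) = x0*x1 + x2*x3 + ... over F_2 (assumed plus-type form)."""
    return sum(x[i] * x[i + 1] for i in range(0, len(x), 2)) % 2

def count_points(dim):
    """Return (norm-0, norm-1) counts of projective points of F_2^dim.

    Over F_2 every projective point is a single non-zero vector.
    """
    vectors = [x for x in product((0, 1), repeat=dim) if any(x)]
    singular = sum(1 for x in vectors if hyperbolic_q(x) == 0)
    return singular, len(vectors) - singular

print(count_points(6))  # points on and off Q_5^+(2)
print(count_points(8))  # points on and off Q_7^+(2)
```

The first component of each pair is the number of points on the quadric; the second is the number of norm-1 points of the ambient projective space.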

<semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> mod <semantics>2<annotation encoding="application/x-tex">2</annotation></semantics>

Now we move onto the second part.

We’ll coordinatise the <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> lattice so that the coordinates of its points are of the following types:

a) All integer, summing to an even number

b) All integer+<semantics>12<annotation encoding="application/x-tex">\frac{1}{2}</annotation></semantics>, summing to an odd number.

Then the roots are of the following types:

a) All permutations of <semantics>(±1,±1,0,0,0,0,0,0)<annotation encoding="application/x-tex">\left(\pm1,\pm1,0,0,0,0,0,0\right)</annotation></semantics>

b) All points like <semantics>(±12,±12,±12,±12,±12,±12,±12,±12)<annotation encoding="application/x-tex">\left(\pm\frac{1}{2},\pm\frac{1}{2},\pm\frac{1}{2},\pm\frac{1}{2}, \pm\frac{1}{2},\pm\frac{1}{2},\pm\frac{1}{2},\pm\frac{1}{2}\right)</annotation></semantics> with an odd number of minus signs.

We now quotient <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> by <semantics>2E 8<annotation encoding="application/x-tex">2\mathrm{E}_8</annotation></semantics>. The elements of the quotient can be represented by the following:

a) All coordinates are <semantics>1<annotation encoding="application/x-tex">1</annotation></semantics> or <semantics>0<annotation encoding="application/x-tex">0</annotation></semantics>, an even number of each.

b) All coordinates are <semantics>±12<annotation encoding="application/x-tex">\pm\frac{1}{2}</annotation></semantics> with either <semantics>1<annotation encoding="application/x-tex">1</annotation></semantics> or <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics> minus signs.

c) Take an element of type b and put a star after it. The meaning of this is: you can take any coordinate <semantics>1/2<annotation encoding="application/x-tex">\frac{1}{2}</annotation></semantics> and replace it with <semantics>−3/2<annotation encoding="application/x-tex">-\frac{3}{2}</annotation></semantics>, or any coordinate <semantics>−1/2<annotation encoding="application/x-tex">-\frac{1}{2}</annotation></semantics> and replace it with <semantics>3/2<annotation encoding="application/x-tex">\frac{3}{2}</annotation></semantics>, to get an <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> lattice element representing this element of <semantics>E 8/2E 8<annotation encoding="application/x-tex">\mathrm{E}_8/2\mathrm{E}_8</annotation></semantics>.

This is an <semantics>8<annotation encoding="application/x-tex">8</annotation></semantics>-dimensional vector space over <semantics>𝔽 2<annotation encoding="application/x-tex">\mathbb{F}_2</annotation></semantics>.

Now we put the following quadratic form on this space: <semantics>Q(x)<annotation encoding="application/x-tex">Q(x)</annotation></semantics> is half the Euclidean norm-squared, mod <semantics>2<annotation encoding="application/x-tex">2</annotation></semantics>. This gives rise to the following bilinear form: the Euclidean dot product mod <semantics>2<annotation encoding="application/x-tex">2</annotation></semantics>. This turns out to be a perfectly good non-degenerate quadratic form of plus type over <semantics>𝔽 2<annotation encoding="application/x-tex">\mathbb{F}_2</annotation></semantics>.

There are <semantics>120<annotation encoding="application/x-tex">120</annotation></semantics> elements of norm <semantics>1<annotation encoding="application/x-tex">1</annotation></semantics>, and these correspond to roots of <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> , with <semantics>2<annotation encoding="application/x-tex">2</annotation></semantics> roots per element (related by switching the sign of all coordinates).

a) Elements of shape <semantics>(1,1,0,0,0,0,0,0)<annotation encoding="application/x-tex">\left(1,1,0,0,0,0,0,0\right)</annotation></semantics> are already roots in this form.

b) Elements of shape <semantics>(0,0,1,1,1,1,1,1)<annotation encoding="application/x-tex">\left(0,0,1,1,1,1,1,1\right)</annotation></semantics> correspond to the roots obtained by taking the complement (replacing all <semantics>1<annotation encoding="application/x-tex">1</annotation></semantics>s by <semantics>0<annotation encoding="application/x-tex">0</annotation></semantics> and vice versa) and then changing the sign of one of the <semantics>1<annotation encoding="application/x-tex">1</annotation></semantics>s.

c) Elements in which all coordinates are <semantics>±12<annotation encoding="application/x-tex">\pm\frac{1}{2}</annotation></semantics> with either <semantics>1<annotation encoding="application/x-tex">1</annotation></semantics> or <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics> minus signs are already roots, and by switching all the signs we get the half-integer roots with <semantics>5<annotation encoding="application/x-tex">5</annotation></semantics> or <semantics>7<annotation encoding="application/x-tex">7</annotation></semantics> minus signs.

There are <semantics>135<annotation encoding="application/x-tex">135</annotation></semantics> non-zero elements of norm <semantics>0<annotation encoding="application/x-tex">0</annotation></semantics>, and these all correspond to lattice points in shell <semantics>2<annotation encoding="application/x-tex">2</annotation></semantics>, with <semantics>16<annotation encoding="application/x-tex">16</annotation></semantics> lattice points per element of the vector space.

a) There are <semantics>70<annotation encoding="application/x-tex">70</annotation></semantics> elements of shape <semantics>(1,1,1,1,0,0,0,0)<annotation encoding="application/x-tex">\left(1,1,1,1,0,0,0,0\right)</annotation></semantics>. We get <semantics>8<annotation encoding="application/x-tex">8</annotation></semantics> lattice points by changing an even number of signs (including <semantics>0<annotation encoding="application/x-tex">0</annotation></semantics>). We get another <semantics>8<annotation encoding="application/x-tex">8</annotation></semantics> lattice points by taking the complement and then changing an odd number of signs.

b) There is <semantics>1<annotation encoding="application/x-tex">1</annotation></semantics> element of shape <semantics>(1,1,1,1,1,1,1,1)<annotation encoding="application/x-tex">\left(1,1,1,1,1,1,1,1\right)</annotation></semantics>. This corresponds to the <semantics>16<annotation encoding="application/x-tex">16</annotation></semantics> lattice points of shape <semantics>(±2,0,0,0,0,0,0,0)<annotation encoding="application/x-tex">\left(\pm2,0,0,0,0,0,0,0\right)</annotation></semantics>.

c) There are <semantics>64<annotation encoding="application/x-tex">64</annotation></semantics> elements like <semantics>(±12,±12,±12,±12,±12,±12,±12,±12) *<annotation encoding="application/x-tex">\left(\pm\frac{1}{2},\pm\frac{1}{2},\pm\frac{1}{2},\pm\frac{1}{2},\pm\frac {1}{2},\pm\frac{1}{2},\pm\frac{1}{2},\pm\frac{1}{2}\right)^*</annotation></semantics>, with <semantics>1<annotation encoding="application/x-tex">1</annotation></semantics> or <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics> minus signs. We get <semantics>8<annotation encoding="application/x-tex">8</annotation></semantics> actual lattice points by replacing <semantics>±12<annotation encoding="application/x-tex">\pm\frac{1}{2}</annotation></semantics> by <semantics>32<annotation encoding="application/x-tex">\mp\frac{3}{2}</annotation></semantics> in one coordinate, and another <semantics>8<annotation encoding="application/x-tex">8</annotation></semantics> by changing the signs of all coordinates.

This accounts for all <semantics>16⋅135=2160<annotation encoding="application/x-tex">16\cdot135=2160</annotation></semantics> points in shell <semantics>2<annotation encoding="application/x-tex">2</annotation></semantics>.


<semantics>shape number (1,1,1,1,1,1,1,1) 1 (1,1,1,1,0,0,0,0) 70 (±12,±12,±12,±12,±12,±12,±12,±12) * 64 total 135<annotation encoding="application/x-tex">\array{\arrayopts{\collayout{left}\rowlines{solid}\collines{solid}\frame{solid}} \mathbf{shape}&\mathbf{number}\\ \left(1,1,1,1,1,1,1,1\right)&1\\ \left(1,1,1,1,0,0,0,0\right)&70\\ \left(\pm\tfrac{1}{2},\pm\tfrac{1}{2},\pm\tfrac{1}{2},\pm\tfrac{1}{2},\pm\tfrac{ 1}{2},\pm\tfrac{1}{2},\pm\tfrac{1}{2},\pm\tfrac{1}{2}\right)^*&64\\ \mathbf{total}&\mathbf{135} }</annotation></semantics>


<semantics>shape number (1,1,1,1,1,1,0,0) 28 (1,1,0,0,0,0,0,0) 28 (±12,±12,±12,±12,±12,±12,±12,±12) 64 total 120<annotation encoding="application/x-tex">\array{\arrayopts{\collayout{left}\rowlines{solid}\collines{solid}\frame{solid}} \mathbf{shape}&\mathbf{number}\\ \left(1,1,1,1,1,1,0,0\right)&28\\ \left(1,1,0,0,0,0,0,0\right)&28\\ \left(\pm\tfrac{1}{2},\pm\tfrac{1}{2},\pm\tfrac{1}{2},\pm\tfrac{1}{2},\pm\tfrac{ 1}{2},\pm\tfrac{1}{2},\pm\tfrac{1}{2},\pm\tfrac{1}{2}\right)&64\\ \mathbf{total}&\mathbf{120} }</annotation></semantics>
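The totals in the two tables can be reproduced by enumerating lattice points and grouping them into cosets of 2E₈. This is a rough sketch under the coordinatisation given above. It stores doubled coordinates so that everything stays an integer, and the helper names `in_e8`, `shell` and `classes_mod_2e8` are invented for the illustration.

```python
from itertools import product

def in_e8(d):
    """Membership test for E8, where d holds DOUBLED coordinates (integers)."""
    if all(c % 2 == 0 for c in d):      # all-integer point
        return sum(d) % 4 == 0          # true coordinates sum to an even number
    if all(c % 2 != 0 for c in d):      # all coordinates in Z + 1/2
        return sum(d) % 4 == 2          # true coordinates sum to an odd number
    return False

def shell(norm_sq):
    """All lattice points with Euclidean norm-squared norm_sq (2 or 4 here)."""
    target = 4 * norm_sq                # doubling coordinates quadruples the norm
    even = product((-4, -2, 0, 2, 4), repeat=8)
    odd = product((-3, -1, 1, 3), repeat=8)
    return [d for cands in (even, odd) for d in cands
            if sum(c * c for c in d) == target and in_e8(d)]

def classes_mod_2e8(points):
    """Sizes of the cosets of 2*E8 met by points: v ~ w iff (v - w)/2 is in E8."""
    reps, sizes = [], []
    for v in points:
        for i, r in enumerate(reps):
            diff = tuple(a - b for a, b in zip(v, r))
            if all(c % 2 == 0 for c in diff) and in_e8(tuple(c // 2 for c in diff)):
                sizes[i] += 1
                break
        else:                           # no existing coset matched: start a new one
            reps.append(v)
            sizes.append(1)
    return sizes

root_classes = classes_mod_2e8(shell(2))    # the roots
shell2_classes = classes_mod_2e8(shell(4))  # the second shell
print(len(root_classes), set(root_classes))
print(len(shell2_classes), set(shell2_classes))
```

Each printed line shows the number of cosets followed by the set of coset sizes, so it can be compared directly with the "2 roots per element" and "16 lattice points per element" claims.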

Since the quadratic form in <semantics>𝔽 2<annotation encoding="application/x-tex">\mathbb{F}_2</annotation></semantics> comes from the quadratic form in Euclidean space, it is preserved by the Weyl group <semantics>W(E 8)<annotation encoding="application/x-tex">W(\mathrm{E}_8)</annotation></semantics>. In fact the homomorphism <semantics>W(E 8)→GO 8 +(2)<annotation encoding="application/x-tex">W(\mathrm{E}_8)\rightarrow \mathrm{GO}_8^+(2)</annotation></semantics> is onto, although (contrary to what I said in an earlier comment) it is a double cover: the element of <semantics>W(E 8)<annotation encoding="application/x-tex">W(\mathrm{E}_8)</annotation></semantics> that reverses the sign of all coordinates is a (in fact, the) non-trivial element of the kernel.
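The double-cover statement is consistent with the group orders. The check below is a sketch that relies on two facts not stated in this post: the usual order formula for the full orthogonal group of plus type, and the known order of the Weyl group of E₈. The function name `order_go_plus` is hypothetical.

```python
from math import prod

def order_go_plus(m, q):
    """Order of GO_{2m}^+(q), assuming the standard formula
    2 * q^(m(m-1)) * (q^m - 1) * prod_{i=1}^{m-1} (q^(2i) - 1)."""
    return 2 * q ** (m * (m - 1)) * (q ** m - 1) * prod(
        q ** (2 * i) - 1 for i in range(1, m))

order_weyl_e8 = 2 ** 14 * 3 ** 5 * 5 ** 2 * 7   # known order of W(E8)

print(order_go_plus(4, 2))                      # order of GO_8^+(2)
print(order_weyl_e8 // order_go_plus(4, 2))     # size of the kernel
```

A quotient of exactly 2 matches a kernel generated by the sign reversal of all coordinates.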

Large <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> lattices

Pick a Fano plane structure on a set of seven points.

Here is a large <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> containing <semantics>(2,0,0,0,0,0,0,0)<annotation encoding="application/x-tex">\left(2,0,0,0,0,0,0,0\right)</annotation></semantics>:

(where <semantics>1≤i,j,k,p,q,r,s≤7<annotation encoding="application/x-tex">1\le i,j,k,p,q,r,s\le7</annotation></semantics>)

<semantics>±2e i<annotation encoding="application/x-tex">\pm2e_i</annotation></semantics>

<semantics>±e 0±e i±e j±e k<annotation encoding="application/x-tex">\pm e_0\pm e_i\pm e_j\pm e_k</annotation></semantics> where <semantics>i<annotation encoding="application/x-tex">i</annotation></semantics>, <semantics>j<annotation encoding="application/x-tex">j</annotation></semantics>, <semantics>k<annotation encoding="application/x-tex">k</annotation></semantics> lie on a line in the Fano plane

<semantics>±e p±e q±e r±e s<annotation encoding="application/x-tex">\pm e_p\pm e_q\pm e_r\pm e_s</annotation></semantics> where <semantics>p<annotation encoding="application/x-tex">p</annotation></semantics>, <semantics>q<annotation encoding="application/x-tex">q</annotation></semantics>, <semantics>r<annotation encoding="application/x-tex">r</annotation></semantics> , <semantics>s<annotation encoding="application/x-tex">s</annotation></semantics> lie off a line in the Fano plane.

Reduced to <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> mod <semantics>2<annotation encoding="application/x-tex">2</annotation></semantics>, these come to

i) <semantics>(1,1,1,1,1,1,1,1)<annotation encoding="application/x-tex">\left(1,1,1,1,1,1,1,1\right)</annotation></semantics>

ii) <semantics>e 0+e i+e j+e k<annotation encoding="application/x-tex">e_0+e_i+e_j+e_k</annotation></semantics> where <semantics>i<annotation encoding="application/x-tex">i</annotation></semantics>, <semantics>j<annotation encoding="application/x-tex">j</annotation></semantics>, <semantics>k<annotation encoding="application/x-tex">k</annotation></semantics> lie on a line in the Fano plane. E.g. <semantics>(1,1,1,0,1,0,0,0)<annotation encoding="application/x-tex">\left(1,1,1,0,1,0,0,0\right)</annotation></semantics>.

iii) <semantics>e p+e q+e r+e s<annotation encoding="application/x-tex">e_p+e_q+e_r+e_s</annotation></semantics> where <semantics>p<annotation encoding="application/x-tex">p</annotation></semantics>, <semantics>q<annotation encoding="application/x-tex">q</annotation></semantics>, <semantics>r<annotation encoding="application/x-tex">r</annotation></semantics>, <semantics>s<annotation encoding="application/x-tex">s</annotation></semantics> lie off a line in the Fano plane. E.g. <semantics>(0,0,0,1,0,1,1,1)<annotation encoding="application/x-tex">\left(0,0,0,1,0,1,1,1\right)</annotation></semantics>.

Each of these corresponds to <semantics>16<annotation encoding="application/x-tex">16</annotation></semantics> elements of the large <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> roots.

Some notes on these points:

1) They’re all isotropic, since they have a multiple of <semantics>4<annotation encoding="application/x-tex">4</annotation></semantics> non-zero entries.

2) They’re mutually orthogonal.

  a) Elements of types ii and iii are all orthogonal to <semantics>(1,1,1,1,1,1,1,1)<annotation encoding="application/x-tex">\left(1,1,1,1,1,1,1,1\right)</annotation></semantics> because they have an even number of ones (like all all-integer elements).

  b) Two elements of type ii overlap in two places: <semantics>e 0<annotation encoding="application/x-tex">e_0</annotation></semantics> and the point of the Fano plane that they share.

  c) If an element <semantics>x<annotation encoding="application/x-tex">x</annotation></semantics> of type ii and an element <semantics>y<annotation encoding="application/x-tex">y</annotation></semantics> of type iii are mutual complements, obviously they have no overlap. Otherwise, the complement of <semantics>y<annotation encoding="application/x-tex">y</annotation></semantics> is an element of type ii, so <semantics>x<annotation encoding="application/x-tex">x</annotation></semantics> overlaps with it in exactly two places; hence <semantics>x<annotation encoding="application/x-tex">x</annotation></semantics> overlaps with <semantics>y<annotation encoding="application/x-tex">y</annotation></semantics> itself in the other two non-zero places of <semantics>x<annotation encoding="application/x-tex">x</annotation></semantics>.

  d) From <semantics>c<annotation encoding="application/x-tex">c</annotation></semantics>, given two elements of type iii, one will overlap with the complement of the other in two places, hence (by the argument of c) will overlap with the other element itself in two places.

3) Adjoining the zero vector, they give a set closed under addition.

The rule for addition of all-integer elements is reasonably straightforward: if they are orthogonal, then treat the <semantics>1<annotation encoding="application/x-tex">1</annotation></semantics>s and <semantics>0<annotation encoding="application/x-tex">0</annotation></semantics>s as bits and add mod <semantics>2<annotation encoding="application/x-tex">2</annotation></semantics>. If they aren’t orthogonal, then do the same, then take the complement of the answer.

  a) Adding <semantics>(1,1,1,1,1,1,1,1)<annotation encoding="application/x-tex">\left(1,1,1,1,1,1,1,1\right)</annotation></semantics> to any of the others just gives the complement, which is a member of the set.

  b) Adding two elements of type ii, we set to <semantics>0<annotation encoding="application/x-tex">0</annotation></semantics> the <semantics>e 0<annotation encoding="application/x-tex">e_0</annotation></semantics> component and the component corresponding to the point of intersection in the Fano plane. This leaves the <semantics>4<annotation encoding="application/x-tex">4</annotation></semantics> components where they don’t overlap, which form the complement of the third line of the Fano plane through their point of intersection, so the sum is a member of the set.

  c) Each element of type iii is the sum of the element of type i and an element of type ii, hence is covered implicitly by cases a and b.

4) There are <semantics>15<annotation encoding="application/x-tex">15</annotation></semantics> elements of the set.

  a) There is <semantics>(1,1,1,1,1,1,1,1)<annotation encoding="application/x-tex">\left(1,1,1,1,1,1,1,1\right)</annotation></semantics>.

  b) There are <semantics>7<annotation encoding="application/x-tex">7</annotation></semantics> corresponding to lines of the Fano plane.

  c) There are <semantics>7<annotation encoding="application/x-tex">7</annotation></semantics> corresponding to the complements of lines of the Fano plane.

From the above, these <semantics>15<annotation encoding="application/x-tex">15</annotation></semantics> elements form a maximal flat of <semantics>Q 7 +(2)<annotation encoding="application/x-tex">\mathrm{Q}_7^+(2)</annotation></semantics>. (That is, <semantics>15<annotation encoding="application/x-tex">15</annotation></semantics> points projectively, forming a projective <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics>-space in a projective <semantics>7<annotation encoding="application/x-tex">7</annotation></semantics>-space.)
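Notes 1 to 4 can be verified mechanically. The following sketch assumes one particular labelling of the Fano plane, with lines {i, i+1, i+3} mod 7 on the points 1 to 7. That labelling is an assumption, chosen because it contains the line {1, 2, 4} behind the example (1,1,1,0,1,0,0,0). Addition is taken to be bitwise, which is the rule quoted above for orthogonal all-integer elements.

```python
from itertools import combinations

# Assumed labelling: lines {i, i+1, i+3} mod 7 on the points 1..7.
FANO_LINES = [frozenset(((i + s) % 7) + 1 for s in (0, 1, 3)) for i in range(7)]

def indicator(support):
    """0/1 vector of length 8 with ones exactly at the given positions."""
    return tuple(1 if i in support else 0 for i in range(8))

all_ones = indicator(range(8))                                        # type i
type_ii = [indicator({0} | line) for line in FANO_LINES]              # e_0 plus a line
type_iii = [indicator(set(range(1, 8)) - line) for line in FANO_LINES]  # off a line
flat = [all_ones] + type_ii + type_iii

def norm(x):
    """Half the Euclidean norm-squared, mod 2."""
    return (sum(x) // 2) % 2

def dot(x, y):
    """Euclidean dot product mod 2."""
    return sum(a * b for a, b in zip(x, y)) % 2

def add(x, y):
    """Sum in E8/2E8 of two ORTHOGONAL all-integer elements: add bits mod 2."""
    return tuple(a ^ b for a, b in zip(x, y))

print(len(set(flat)))                                                 # note 4
print(all(norm(x) == 0 for x in flat))                                # note 1
print(all(dot(x, y) == 0 for x, y in combinations(flat, 2)))          # note 2
print(all(add(x, y) in set(flat) for x, y in combinations(flat, 2)))  # note 3
```

Together with the zero vector, the 15 elements form a 4-dimensional subspace of 𝔽₂⁸, which is the projective 3-space described above.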

That a large <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> lattice projects to a flat is straightforward:

First, as a lattice it’s closed under addition over <semantics>ℤ<annotation encoding="application/x-tex">\mathbb{Z}</annotation></semantics>, so should project to a subspace over <semantics>𝔽 2<annotation encoding="application/x-tex">\mathbb{F}_2</annotation></semantics>.

Second, since the cosine of the angle between two roots of <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> is always a multiple of <semantics>12<annotation encoding="application/x-tex">\frac{1}{2}</annotation></semantics>, and the points in the second shell have Euclidean length <semantics>2<annotation encoding="application/x-tex">2</annotation></semantics>, the dot product of two large <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> roots must always be an even integer. Also, the large <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> roots project to norm <semantics>0<annotation encoding="application/x-tex">0</annotation></semantics> points. So all points of the large <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> should project to norm <semantics>0<annotation encoding="application/x-tex">0</annotation></semantics> points.

It’s not instantly obvious to me that large <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> should project to a maximal flat, but it clearly does.

So I’ll assume each <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> corresponds to a maximal flat, and generally that everything that I’m going to talk about over <semantics>𝔽 2<annotation encoding="application/x-tex">\mathbb{F}_2</annotation></semantics> lifts faithfully to Euclidean space, which seems plausible (and works)! But I haven’t proved it. Anyway, assuming this, a bunch of stuff follows.

Total number of large <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> lattices

We immediately know there are <semantics>270<annotation encoding="application/x-tex">270</annotation></semantics> large <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> lattices, because there are <semantics>270<annotation encoding="application/x-tex">270</annotation></semantics> maximal flats in <semantics>Q 7 +(2)<annotation encoding="application/x-tex">\mathrm{Q}_7^+(2)</annotation></semantics>, either from the formula <semantics>∏_{i=0}^{m−1}(1+q^i)<annotation encoding="application/x-tex">\prod_{i=0}^{m-1}\left(1+q^i\right)</annotation></semantics> with <semantics>m=4<annotation encoding="application/x-tex">m=4</annotation></semantics> and <semantics>q=2<annotation encoding="application/x-tex">q=2</annotation></semantics>, or immediately from triality and the fact that there are <semantics>135<annotation encoding="application/x-tex">135</annotation></semantics> points in <semantics>Q 7 +(2)<annotation encoding="application/x-tex">\mathrm{Q}_7^+(2)</annotation></semantics>.
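As a quick arithmetic check, the product formula can be evaluated directly. This is a minimal sketch, assuming the rank of Q₇⁺(2) is m = 4 and that of Q₅⁺(2) is m = 3; the function name `count_maximal_flats` is made up here.

```python
from math import prod

def count_maximal_flats(m, q):
    """Evaluate prod_{i=0}^{m-1} (1 + q^i) for a plus-type quadric of rank m."""
    return prod(1 + q ** i for i in range(m))

print(count_maximal_flats(4, 2))  # Q_7^+(2): 2 * 3 * 5 * 9
print(count_maximal_flats(3, 2))  # Q_5^+(2): 2 * 3 * 5
```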

Number of large <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> root systems sharing a given point

We can now bring to bear some more general theory. How many large <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> root-sets share a point? Let us project this down and instead ask, How many maximal flats share a given point?

Recall fact 1e:

1e. Pick a polar space <semantics>Σ<annotation encoding="application/x-tex">\Sigma</annotation></semantics> of rank <semantics>m<annotation encoding="application/x-tex">m</annotation></semantics>. Pick a point <semantics>p<annotation encoding="application/x-tex">p</annotation></semantics> in it. The space whose points are lines (i.e. <semantics>1<annotation encoding="application/x-tex">1</annotation></semantics>-flats) through <semantics>p<annotation encoding="application/x-tex">p</annotation></semantics>, whose lines are planes (i.e. <semantics>2<annotation encoding="application/x-tex">2</annotation></semantics>-flats) through <semantics>p<annotation encoding="application/x-tex">p</annotation></semantics>, etc, with incidence inherited from <semantics>Σ<annotation encoding="application/x-tex">\Sigma</annotation></semantics>, forms a polar space of the same type, of rank <semantics>m−1<annotation encoding="application/x-tex">m-1</annotation></semantics>.

So pick a point <semantics>p<annotation encoding="application/x-tex">p</annotation></semantics> in <semantics>Q 7 +(2)<annotation encoding="application/x-tex">\mathrm{Q}_7^+(2)</annotation></semantics>. The space of all flats containing <semantics>p<annotation encoding="application/x-tex">p</annotation></semantics> is isomorphic to <semantics>Q 5 +(2)<annotation encoding="application/x-tex">\mathrm{Q}_5^+(2)</annotation></semantics>. The maximal flats containing <semantics>p<annotation encoding="application/x-tex">p</annotation></semantics> in <semantics>Q 7 +(2)<annotation encoding="application/x-tex">\mathrm{Q}_7^+(2)</annotation></semantics> correspond to all maximal flats of <semantics>Q 5 +(2)<annotation encoding="application/x-tex">\mathrm{Q}_5^+(2)</annotation></semantics>, of which there are <semantics>30<annotation encoding="application/x-tex">30</annotation></semantics>. So there are <semantics>30<annotation encoding="application/x-tex">30</annotation></semantics> maximal flats of <semantics>Q 7 +(2)<annotation encoding="application/x-tex">\mathrm{Q}_7^+(2)</annotation></semantics> containing <semantics>p<annotation encoding="application/x-tex">p</annotation></semantics>, and hence <semantics>30<annotation encoding="application/x-tex">30</annotation></semantics> large <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> lattices containing a given point.

We see this if we fix <semantics>(1,1,1,1,1,1,1,1)<annotation encoding="application/x-tex">\left(1,1,1,1,1,1,1,1\right)</annotation></semantics>, and the maximal flats correspond to the <semantics>30<annotation encoding="application/x-tex">30</annotation></semantics> ways of putting a Fano plane structure on <semantics>7<annotation encoding="application/x-tex">7</annotation></semantics> points. Via the Klein correspondence, I guess this is a way to show that the <semantics>30<annotation encoding="application/x-tex">30</annotation></semantics> Fano plane structures correspond to the points and planes of <semantics>PG(3,2)<annotation encoding="application/x-tex">\mathrm{PG}(3,2)</annotation></semantics>.
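The count of 30 Fano plane structures on 7 labelled points can be confirmed by relabelling a single Fano plane in all possible ways and collecting the distinct results. The base labelling (lines {i, i+1, i+3} mod 7) and the helper name `relabel` are assumptions made for this sketch.

```python
from itertools import permutations

# Assumed base Fano plane: lines {i, i+1, i+3} mod 7 on the points 1..7.
BASE_LINES = [frozenset(((i + s) % 7) + 1 for s in (0, 1, 3)) for i in range(7)]

def relabel(lines, perm):
    """Apply a permutation of the points 1..7 (given as a tuple) to every line."""
    return frozenset(frozenset(perm[p - 1] for p in line) for line in lines)

# Every Fano plane on these 7 points is a relabelling of the base one.
structures = {relabel(BASE_LINES, perm) for perm in permutations(range(1, 8))}
print(len(structures))
```

The 5040 permutations collapse to 7!/168 distinct structures, since the Fano plane has 168 automorphisms.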

Number of large <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> root systems disjoint from a given large <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> root system

Now assume that large <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> lattices with non-intersecting sets of roots correspond to non-intersecting maximal flats. The intersections of maximal flats obey rule 1c:

1c. Two maximal flats of different types must intersect in a flat of odd codimension; two maximal flats of the same type must intersect in a flat of even codimension.

So two <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics>-flats of opposite type must intersect in a plane or a point; if they are of the same type, they must intersect in a line or not at all (the empty set having dimension <semantics>−1<annotation encoding="application/x-tex">-1</annotation></semantics>).

We want to count the dimension <semantics>−1<annotation encoding="application/x-tex">-1</annotation></semantics> intersections, but it’s easier to count the dimension <semantics>1<annotation encoding="application/x-tex">1</annotation></semantics> intersections and subtract from the total.

So, given a <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics>-flat, how many other <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics>-flats intersect it in a line?

Pick a point <semantics>p<annotation encoding="application/x-tex">p</annotation></semantics> in <semantics>Q 7 +(2)<annotation encoding="application/x-tex">\mathrm{Q}_7^+(2)</annotation></semantics>. The <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics>-flats sharing that point correspond to the planes of <semantics>Q 5 +(2)<annotation encoding="application/x-tex">\mathrm{Q}_5^+(2)</annotation></semantics>. Then the set of <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics>-flats sharing just a line through <semantics>p<annotation encoding="application/x-tex">p</annotation></semantics> with our given <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics>-flat correspond to the set of planes of <semantics>Q 5 +(2)<annotation encoding="application/x-tex">\mathrm{Q}_5^+(2)</annotation></semantics> sharing a single point with a given plane. By what was said above, this is all the other planes of the same type (there’s no other dimension these intersections can have). There are <semantics>14<annotation encoding="application/x-tex">14</annotation></semantics> of these (<semantics>15<annotation encoding="application/x-tex">15</annotation></semantics> planes minus the given one).

So, given a point in the <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics>-flat, there are <semantics>14<annotation encoding="application/x-tex">14</annotation></semantics> other <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics>-flats sharing a line (and no more) which passes through the point. There are <semantics>15<annotation encoding="application/x-tex">15</annotation></semantics> points in the <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics>-flat, but each shared line contains <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics> points and so is counted <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics> times, giving <semantics>14⋅15/3=70<annotation encoding="application/x-tex">\frac{14\cdot15}{3}=70</annotation></semantics> <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics>-flats sharing a line (and no more) with a given <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics>-flat.

But there are a total of <semantics>135<annotation encoding="application/x-tex">135</annotation></semantics> <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics>-flats of a given type. If <semantics>1<annotation encoding="application/x-tex">1</annotation></semantics> of them is a given <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics>-flat, and <semantics>70<annotation encoding="application/x-tex">70</annotation></semantics> of them intersect that <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics>-flat in a line, then <semantics>135−1−70=64<annotation encoding="application/x-tex">135-1-70=64</annotation></semantics> don’t intersect the <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics>-flat at all. So there should be <semantics>64<annotation encoding="application/x-tex">64</annotation></semantics> large <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> lattices whose roots don’t meet the roots of a given large <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> lattice.

Other numbers of intersecting root systems

We can also look at the intersections of large <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> root systems with large <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> root systems of opposite type. What about the intersections of two <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics>-flats in a plane? If we focus just on planes passing through a particular point, this corresponds, in <semantics>Q 5 +(2)<annotation encoding="application/x-tex">\mathrm{Q}_5^+(2)</annotation></semantics>, to planes intersecting in a line. There are <semantics>7<annotation encoding="application/x-tex">7</annotation></semantics> planes intersecting a given plane in a line (from the Klein correspondence: they correspond to the seven points in a plane or the seven planes containing a point of <semantics>PG(3,2)<annotation encoding="application/x-tex">\mathrm{PG}(3,2)</annotation></semantics>). So there are <semantics>7<annotation encoding="application/x-tex">7</annotation></semantics> <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics>-flats of <semantics>Q 7 +(2)<annotation encoding="application/x-tex">\mathrm{Q}_7^+(2)</annotation></semantics> which intersect a given <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics>-flat in a plane containing a given point. There are <semantics>15<annotation encoding="application/x-tex">15</annotation></semantics> points to choose from, but <semantics>7<annotation encoding="application/x-tex">7</annotation></semantics> points in a plane, meaning that there are <semantics>7⋅15/7=15<annotation encoding="application/x-tex">\frac{7\cdot15}{7}=15</annotation></semantics> <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics>-flats intersecting a given <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics>-flat in a plane. A plane has <semantics>7<annotation encoding="application/x-tex">7</annotation></semantics> points, so translating that to <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> lattices should give <semantics>7⋅16=112<annotation encoding="application/x-tex">7\cdot16=112</annotation></semantics> shared roots.

That leaves <semantics>135−15=120<annotation encoding="application/x-tex">135-15=120</annotation></semantics> <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics>-flats intersecting a given <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics>-flat in a single point, corresponding to <semantics>16<annotation encoding="application/x-tex">16</annotation></semantics> shared roots.

<semantics>intersection dim. number same type 2 15 No 1 70 Yes 0 120 No −1 64 Yes<annotation encoding="application/x-tex">\array{\arrayopts{\collayout{left}\collines{solid}\rowlines{solid}\frame{solid}} \mathbf{\text{intersection dim.}}&\mathbf{\text{number}}&\mathbf{\text{same type}}\\ 2&15&No\\ 1&70&Yes\\ 0&120&No\\ -1&64&Yes }</annotation></semantics>
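
The counts in this table can be checked by brute force. What follows is only a minimal sketch: it assumes the standard hyperbolic quadratic form x0·x1 + x2·x3 + x4·x5 + x6·x7 on the 8-dimensional vector space over the field with 2 elements as a model of the quadric, and the function names are made up for illustration. It builds every totally singular subspace up to vector dimension 4 (the projective 3-flats) and tallies how the others meet one fixed 3-flat.

```python
from collections import Counter

def q(x):
    """Assumed form x0*x1 + x2*x3 + x4*x5 + x6*x7 over F_2; x is an 8-bit integer."""
    return sum((x >> i) & (x >> (i + 1)) & 1 for i in range(0, 8, 2)) % 2

Q = [q(x) for x in range(256)]                 # lookup table for the form
singular = [v for v in range(1, 256) if Q[v] == 0]

def orthogonal(x, y):
    """True when the alternating form B(x, y) = Q(x + y) + Q(x) + Q(y) vanishes."""
    return Q[x ^ y] ^ Q[x] ^ Q[y] == 0

def extend(subspaces):
    """All totally singular subspaces one dimension larger than the given ones."""
    bigger = set()
    for S in subspaces:
        for v in singular:
            if v not in S and all(orthogonal(v, s) for s in S):
                bigger.add(S | frozenset(s ^ v for s in S))
    return bigger

flats = {frozenset({0})}
for _ in range(4):                             # vector dimension 4 = projective 3-flats
    flats = extend(flats)
flats = sorted(flats, key=sorted)

given = flats[0]
# An intersection with 2^k vectors has projective dimension k - 1.
counts = Counter(len(given & other).bit_length() - 2
                 for other in flats if other != given)
print(len(singular), len(flats), dict(counts))
```

Under these assumptions the script finds 135 singular points and 270 maximal flats, and the tally for the fixed flat reproduces the four rows above (15, 70, 120 and 64).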

A couple of points here related to triality. Under triality, one type of maximal flat gets sent to the other type, and the other type gets sent to singular points (<semantics>0<annotation encoding="application/x-tex">0</annotation></semantics>-flats). The incidence relation of “intersecting in a plane” gets sent to ordinary incidence of a point with a flat. So the fact that there are <semantics>15<annotation encoding="application/x-tex">15</annotation></semantics> maximal flats that intersect a given maximal flat in a plane is a reflection of the fact that there are <semantics>15<annotation encoding="application/x-tex">15</annotation></semantics> points in a maximal flat (or, dually, <semantics>15<annotation encoding="application/x-tex">15</annotation></semantics> maximal flats of a given type containing a given point).

The intersection of two maximal flats of the same type translates into a relation between two singular points. Just from the numbers, we’d expect “intersection in a line” to translate into “orthogonal to”, and “disjoint” to translate into “not orthogonal to”.

In that case, a pair of maximal flats intersecting in a (flat) line translates to <semantics>2<annotation encoding="application/x-tex">2</annotation></semantics> mutually orthogonal flat points — whose span is a flat line. Which makes sense, because under triality, <semantics>1<annotation encoding="application/x-tex">1</annotation></semantics>-flats transform to <semantics>1<annotation encoding="application/x-tex">1</annotation></semantics>-flats, reflecting the fact that the central point of the <semantics>D 4<annotation encoding="application/x-tex">D_4</annotation></semantics> diagram (representing lines) is sent to itself under triality.

Likewise, two disjoint maximal flats translate to a pair of non-orthogonal singular points, defining a hyperbolic line.

Fixing a hyperbolic line (pointwise) obviously reduces the rank of the polar space by <semantics>1<annotation encoding="application/x-tex">1</annotation></semantics>, picking out a <semantics>GO 6 +(2)<annotation encoding="application/x-tex">\mathrm{GO}_6^+(2)</annotation></semantics> subgroup of <semantics>GO 8 +(2)<annotation encoding="application/x-tex">\mathrm{GO}_8^+(2)</annotation></semantics>. By the Klein correspondence, <semantics>GO 6 +(2)<annotation encoding="application/x-tex">\mathrm{GO}_6^+(2)</annotation></semantics> is isomorphic to <semantics>PSL 4(2)<annotation encoding="application/x-tex">\mathrm{PSL}_4(2)</annotation></semantics>, which is just the automorphism group of <semantics>PG(3,2)<annotation encoding="application/x-tex">\mathrm{PG}(3, 2)</annotation></semantics> — i.e., here, the automorphism group of a maximal flat. So the joint stabiliser of two disjoint maximal flats is just automorphisms of one of them, which forces corresponding automorphisms of the other. This group is also isomorphic to the symmetric group <semantics>S 8<annotation encoding="application/x-tex">S_8</annotation></semantics>, giving all permutations of the coordinates (of the <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> lattice).

(My guess would be that the actions of <semantics>GL 4(2)<annotation encoding="application/x-tex">\mathrm{GL}_4(2)</annotation></semantics> on the two maximal flats would be related by an outer automorphism of <semantics>GL 4(2)<annotation encoding="application/x-tex">\mathrm{GL}_4(2)</annotation></semantics>, in which the action on the points of one flat would match an action on the planes of the other, and vice versa, preserving the orthogonality relations coming from the symplectic structure implied by the orthogonal structure, i.e. the alternating form implied by the quadratic form.)

Nearest neighbours

We see this “non-orthogonal singular points” <semantics><annotation encoding="application/x-tex">\leftrightarrow</annotation></semantics> “disjoint maximal flats” echoed when we look at nearest neighbours.

Nearest neighbours in the second shell of the <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> lattice are separated from each other by an angle of <semantics>cos⁻¹(3/4)<annotation encoding="application/x-tex">\cos^{-1}\frac{3}{4}</annotation></semantics>, so, since second-shell vectors have squared norm 4, they have a mutual dot product of <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics>, hence are non-orthogonal over <semantics>𝔽 2<annotation encoding="application/x-tex">\mathbb{F}_2</annotation></semantics>.

Let us choose a fixed point <semantics>(2,0,0,0,0,0,0,0)<annotation encoding="application/x-tex">\left(2,0,0,0,0,0,0,0\right)</annotation></semantics> in the second shell of <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics>. This has as our chosen representative <semantics>(1,1,1,1,1,1,1,1)<annotation encoding="application/x-tex">\left(1,1,1,1,1,1,1,1\right)</annotation></semantics> in our version of <semantics>PG(7,2)<annotation encoding="application/x-tex">\mathrm{PG}(7,2)</annotation></semantics>, which has the convenient property that it is orthogonal to the all-integer points, and non-orthogonal to the half-integer points. The half-integer points in the second shell are just those that we write as <semantics>(±1/2,±1/2,±1/2,±1/2,±1/2,±1/2,±1/2,±1/2)*<annotation encoding="application/x-tex">\left(\pm\frac{1}{2},\pm\frac{1}{2},\pm\frac{1}{2},\pm\frac{1}{2},\pm\frac{1}{2}, \pm\frac{1}{2},\pm\frac{1}{2},\pm\frac{1}{2}\right)^\star</annotation></semantics> in our notation, where the <semantics>*<annotation encoding="application/x-tex">*</annotation></semantics> means that we should replace any <semantics>1/2<annotation encoding="application/x-tex">\frac{1}{2}</annotation></semantics> by <semantics>−3/2<annotation encoding="application/x-tex">-\frac{3}{2}</annotation></semantics> or replace any <semantics>−1/2<annotation encoding="application/x-tex">-\frac{1}{2}</annotation></semantics> by <semantics>3/2<annotation encoding="application/x-tex">\frac{3}{2}</annotation></semantics> to get a corresponding element in the second shell of the <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> lattice, and where we require <semantics>1<annotation encoding="application/x-tex">1</annotation></semantics> or <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics> minus signs in the notation, to correspond to two points in the lattice with opposite signs in all coordinates.

Now, since each reduced isotropic point represents <semantics>16<annotation encoding="application/x-tex">16</annotation></semantics> points of the second shell, merely saying that two reduced points have dot product of <semantics>1<annotation encoding="application/x-tex">1</annotation></semantics> is not enough to pin down actual nearest neighbours.

But very conveniently, the sets of <semantics>16<annotation encoding="application/x-tex">16</annotation></semantics> are formed in parallel ways for the particular setup we have chosen. Namely, lifting <semantics>(1,1,1,1,1,1,1,1)<annotation encoding="application/x-tex">\left(1,1,1,1,1,1,1,1\right)</annotation></semantics> to a second-shell element, we can choose to put the <semantics>±2<annotation encoding="application/x-tex">\pm2</annotation></semantics> in any one of the <semantics>8<annotation encoding="application/x-tex">8</annotation></semantics> coordinates, with positive or negative sign, and lifting an element of the form <semantics>(±1/2,±1/2,±1/2,±1/2,±1/2,±1/2,±1/2,±1/2)*<annotation encoding="application/x-tex">\left(\pm\frac{1}{2},\pm\frac{1}{2},\pm\frac{1}{2},\pm\frac{1}{2},\pm\frac{1}{2}, \pm\frac{1}{2},\pm\frac{1}{2},\pm\frac{1}{2}\right)^*</annotation></semantics> to a second-shell element, we can choose to put the <semantics>±3/2<annotation encoding="application/x-tex">\pm\frac{3}{2}</annotation></semantics> in any one of the <semantics>8<annotation encoding="application/x-tex">8</annotation></semantics> coordinates, with positive or negative sign.

So we can line up our conventions, and choose, e.g., specifically <semantics>(+2,0,0,0,0,0,0,0)<annotation encoding="application/x-tex">\left(+2,0,0, 0,0,0,0,0\right)</annotation></semantics>, and choose neighbours of the form <semantics>(+3/2,±1/2,±1/2,±1/2,±1/2,±1/2,±1/2,±1/2)<annotation encoding="application/x-tex">\left(+\frac{3}{2},\pm\frac{1}{2},\pm\frac{1}{2},\pm\frac{1}{2},\pm\frac{1}{2}, \pm\frac{1}{2},\pm\frac{1}{2},\pm\frac{1}{2}\right)</annotation></semantics>, with an even number of minus signs.

This tells us we have <semantics>64<annotation encoding="application/x-tex">64</annotation></semantics> nearest neighbours, corresponding to the <semantics>64<annotation encoding="application/x-tex">64</annotation></semantics> isotropic points of half-integer form. Let us call this set of points <semantics>T<annotation encoding="application/x-tex">T</annotation></semantics>.
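
As a sanity check on the number 64, here is a small sketch that lists the second shell directly and counts the neighbours of the chosen point. It works in doubled coordinates y = 2x so that everything stays an integer. The parity convention for half-integer vectors (odd coordinate sum) is an assumption inferred from the "even number of minus signs" rule above, and all names are illustrative.

```python
from itertools import product

def second_shell():
    """Squared-norm-4 vectors of the lattice in doubled coordinates y = 2x (|y|^2 = 16)."""
    shell = []
    # All-integer x: even entries of y, sum of x even, i.e. sum(y) divisible by 4.
    for y in product((-4, -2, 0, 2, 4), repeat=8):
        if sum(c * c for c in y) == 16 and sum(y) % 4 == 0:
            shell.append(y)
    # Half-integer x: odd entries of y. Assumed convention: the sum of x is odd.
    for y in product((-3, -1, 1, 3), repeat=8):
        if sum(c * c for c in y) == 16 and sum(y) % 4 == 2:
            shell.append(y)
    return shell

def dot(a, b):
    return sum(s * t for s, t in zip(a, b))

shell = second_shell()
centre = (4, 0, 0, 0, 0, 0, 0, 0)              # the point (2,0,0,0,0,0,0,0), doubled

# A dot product of 3 between lattice vectors is 12 in doubled coordinates.
neighbours = [y for y in shell if dot(centre, y) == 12]
best = max(dot(centre, y) for y in shell if y != centre)
print(len(shell), len(neighbours), best / 16)  # best / 16 is the cosine of the angle
```

With this convention the shell has 2160 vectors, 64 of them have dot product 3 with the chosen point, and no other second-shell vector comes closer, so the cosine of the smallest angle is 3/4.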

Now pick one of those <semantics>64<annotation encoding="application/x-tex">64</annotation></semantics> isotropic points, call it <semantics>p<annotation encoding="application/x-tex">p</annotation></semantics>. It lies, as we showed earlier, in <semantics>30<annotation encoding="application/x-tex">30</annotation></semantics> maximal flats, corresponding to the <semantics>30<annotation encoding="application/x-tex">30</annotation></semantics> plane flats of <semantics>Q 5 +(2)<annotation encoding="application/x-tex">\mathrm{Q}_5^+(2)</annotation></semantics>, and we would like to understand the intersections of these flats with <semantics>T<annotation encoding="application/x-tex">T</annotation></semantics>: that is, those nearest neighbours which belong to each large <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> lattice.

In any maximal flat, i.e. any <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics>-flat, containing <semantics>p<annotation encoding="application/x-tex">p</annotation></semantics>, there will be <semantics>7<annotation encoding="application/x-tex">7</annotation></semantics> lines passing through <semantics>p<annotation encoding="application/x-tex">p</annotation></semantics>, each with <semantics>2<annotation encoding="application/x-tex">2</annotation></semantics> other points on it, totalling <semantics>14<annotation encoding="application/x-tex">14</annotation></semantics> which, together with <semantics>p<annotation encoding="application/x-tex">p</annotation></semantics> itself form the <semantics>15<annotation encoding="application/x-tex">15</annotation></semantics> points of a copy of <semantics>PG(3,2)<annotation encoding="application/x-tex">\mathrm{PG}(3,2)</annotation></semantics>.

Now, the sum of two all-integer points is an all-integer point, but the sum of two half-integer points is also an all-integer point. So of the two other points on each of those lines, one will be half-integer and one all-integer. So there will be <semantics>7<annotation encoding="application/x-tex">7</annotation></semantics> half-integer points in addition to <semantics>p<annotation encoding="application/x-tex">p</annotation></semantics> itself; i.e. the maximal flat will meet <semantics>T<annotation encoding="application/x-tex">T</annotation></semantics> in <semantics>8<annotation encoding="application/x-tex">8</annotation></semantics> points; hence the corresponding large <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> lattice will contain <semantics>8<annotation encoding="application/x-tex">8</annotation></semantics> of the nearest neighbours of <semantics>(2,0,0,0,0,0,0,0)<annotation encoding="application/x-tex">\left(2,0,0,0,0,0,0,0\right)</annotation></semantics>.

Also, because the sum of two half-integer points is not a half-integer point, no <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics> of those <semantics>8<annotation encoding="application/x-tex">8</annotation></semantics> points will lie on a line.

But the only way that you can get <semantics>8<annotation encoding="application/x-tex">8</annotation></semantics> points in a <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics>-space such that no <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics> of them lie on a line of the space is if they are the <semantics>8<annotation encoding="application/x-tex">8</annotation></semantics> points that do not lie on a plane of the space. Hence the other <semantics>7<annotation encoding="application/x-tex">7</annotation></semantics> points — the ones lying in the all-integer subspace — must form a Fano plane.

So we have the following: inside the projective <semantics>7<annotation encoding="application/x-tex">7</annotation></semantics>-space of lattice elements mod <semantics>2<annotation encoding="application/x-tex">2</annotation></semantics>, we have the projective <semantics>6<annotation encoding="application/x-tex">6</annotation></semantics>-space of all-integer elements, and inside there we have the <semantics>5<annotation encoding="application/x-tex">5</annotation></semantics>-space of all-integer elements orthogonal to <semantics>p<annotation encoding="application/x-tex">p</annotation></semantics>, and inside there we have a polar space isomorphic to <semantics>Q 5 +(2)<annotation encoding="application/x-tex">\mathrm{Q}_5^+(2)</annotation></semantics>, and in there we have <semantics>30<annotation encoding="application/x-tex">30</annotation></semantics> planes. And adding <semantics>p<annotation encoding="application/x-tex">p</annotation></semantics> to each element of one of those planes gives the <semantics>7<annotation encoding="application/x-tex">7</annotation></semantics> elements which accompany <semantics>p<annotation encoding="application/x-tex">p</annotation></semantics> in the intersection of the isotropic half-integer points with the corresponding <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics>-flat, which lift to the nearest neighbours of <semantics>(2,0,0,0,0,0,0,0)<annotation encoding="application/x-tex">\left(2,0,0,0,0,0,0,0\right)</annotation></semantics> lying in the corresponding large <semantics>E 8<annotation encoding="application/x-tex">\mathrm{E}_8</annotation></semantics> lattice.

by john at February 02, 2016 04:27 AM

John Baez - Azimuth

Corelations in Network Theory

Category theory reduces a large chunk of math to the clever manipulation of arrows. One of the fun things about this is that you can often take a familiar mathematical construction, think of it category-theoretically, and just turn around all the arrows to get something new and interesting!

In math we love functions. If we have a function

f: X \to Y

we can formally turn around the arrow to think of f as something going back from Y to X. But this something is usually not a function: it’s called a ‘cofunction’. A cofunction from Y to X is simply a function from X to Y.

Cofunctions are somewhat interesting, but they’re really just functions viewed through a looking glass, so they don’t give much new—at least, not by themselves.

The game gets more interesting if we think of functions and cofunctions as special sorts of relations. A relation from X to Y is a subset

R \subseteq X \times Y

It’s a function when for each x \in X there’s a unique y \in Y with (x,y) \in R. It’s a cofunction when for each y \in Y there’s a unique x \in X with (x,y) \in R.

Just as we can compose functions, we can compose relations. Relations have certain advantages over functions: for example, we can ‘turn around’ any relation R from X to Y and get a relation R^\dagger from Y to X:

R^\dagger = \{(y,x) : \; (x,y) \in R \}

If we turn around a function we get a cofunction, and vice versa. But we can also do other fun things: for example, since both functions and cofunctions are relations, we can compose a function and a cofunction and get a relation.
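
To make this concrete, here is a minimal sketch with relations stored as sets of pairs. The helper names (dagger, is_function, is_cofunction, compose) and the example sets are made up for illustration.

```python
def dagger(R):
    """Turn a relation around: R from X to Y becomes a relation from Y to X."""
    return {(y, x) for (x, y) in R}

def is_function(R, X, Y):
    """R is a subset of X x Y with exactly one y for each x in X."""
    return (all(x in X and y in Y for (x, y) in R)
            and all(sum(1 for (a, _) in R if a == x) == 1 for x in X))

def is_cofunction(R, X, Y):
    """R is a subset of X x Y with exactly one x for each y in Y."""
    return is_function(dagger(R), Y, X)

def compose(R, S):
    """Relation from X to Z: R (from X to Y) followed by S (from Y to Z)."""
    return {(x, z) for (x, y) in R for (y2, z) in S if y == y2}

X, Y = {1, 2, 3}, {"a", "b"}
f = {(1, "a"), (2, "a"), (3, "b")}             # a function from X to Y

print(is_function(f, X, Y))                     # f is a function
print(is_cofunction(dagger(f), Y, X))           # turned around, it is a cofunction
print(is_function(dagger(f), Y, X))             # but not a function: "a" has two partners
kernel = compose(f, dagger(f))                  # function followed by cofunction
print(sorted(kernel))
```

Composing f with its turned-around version gives a relation from X to X that is neither a function nor a cofunction: it relates two elements exactly when f sends them to the same place.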

Of course, relations also have certain disadvantages compared to functions. But it’s utterly clear by now that the category \mathrm{FinRel}, where the objects are finite sets and the morphisms are relations, is very important.

So far, so good. But what happens if we take the definition of ‘relation’ and turn all the arrows around?

There are actually several things I could mean by this question, some more interesting than others. But one of them gives a very interesting new concept: the concept of ‘corelation’. And two of my students have just written a very nice paper on corelations:

• Brandon Coya and Brendan Fong, Corelations are the prop for extraspecial commutative Frobenius monoids.

Here’s why this paper is important for network theory: corelations between finite sets are exactly what we need to describe electrical circuits made of ideal conductive wires! A corelation from a finite set X to a finite set Y can be drawn this way:

I have drawn more wires than strictly necessary: I’ve drawn a wire between two points whenever I want current to be able to flow between them. But there’s a reason I did this: a corelation from X to Y simply tells us when current can flow from one point in either of these sets to any other point in these sets.

Of course circuits made solely of conductive wires are not very exciting for electrical engineers. But in an earlier paper, Brendan introduced corelations as an important stepping-stone toward more general circuits:

• John Baez and Brendan Fong, A compositional framework for passive linear circuits. (Blog article here.)

The key point is simply that you use conductive wires to connect resistors, inductors, capacitors, batteries and the like and build interesting circuits—so if you don’t fully understand the math of conductive wires, you’re limited in your ability to understand circuits in general!

In their new paper, Brendan teamed up with Brandon Coya, and they figured out all the rules obeyed by the category \mathrm{FinCorel}, where the objects are finite sets and the morphisms are corelations. I’ll explain these rules later.

This sort of analysis had previously been done for \mathrm{FinRel}, and it turns out there’s a beautiful analogy between the two cases! Here is a chart displaying the analogy:

Spans | Cospans
extra bicommutative bimonoids | special commutative Frobenius monoids
Relations | Corelations
extraspecial bicommutative bimonoids | extraspecial commutative Frobenius monoids

I’m sure this will be cryptic to the nonmathematicians reading this, and even many mathematicians—but the paper explains what’s going on here.

I’ll actually say what an ‘extraspecial commutative Frobenius monoid’ is later in this post. This is a terse way of listing all the rules obeyed by corelations between finite sets—and thus, all the rules obeyed by conductive wires.

But first, let’s talk about something simpler.

What is a corelation?

Just as we can define functions as relations of a special sort, we can also define relations in terms of functions. A relation from X to Y is a subset

R \subseteq X \times Y

but we can think of this as an equivalence class of one-to-one functions

i: R \to X \times Y

Why an equivalence class? The image of i is our desired subset of X \times Y. The set R here could be replaced by any isomorphic set; its only role is to provide ‘names’ for the elements of X \times Y that are in the image of i.

Now we have a relation described as an arrow, or really an equivalence class of arrows. Next, let’s turn the arrow around!

There are different things I might mean by that, but we want to do it cleverly. When we turn arrows around, the concept of product (for example, cartesian product X \times Y of sets) turns into the concept of sum (for example, disjoint union X + Y of sets). Similarly, the concept of monomorphism (such as a one-to-one function) turns into the concept of epimorphism (such as an onto function). If you don’t believe me, click on the links!

So, we should define a corelation from a set X to a set Y to be an equivalence class of onto functions

p: X + Y \to C

Why an equivalence class? The set C here could be replaced by any isomorphic set; its only role is to provide ‘names’ for the sets of elements of X + Y that get mapped to the same thing via p.

In simpler terms, a corelation from a set X to a set Y is just a partition of the disjoint union X + Y. So, it looks like this:

If we like, we can then draw a line connecting any two points that lie in the same part of the partition:

These lines determine the corelation, so we can also draw a corelation this way:

This is why corelations describe circuits made solely of wires!
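
Here is a small sketch of the definition. It tags elements of X with "in" and elements of Y with "out" to form the disjoint union, and turns an onto function p into the partition it induces. The tagging scheme, the function name and the example data are all illustrative choices.

```python
def corelation(X, Y, p):
    """Partition of the disjoint union X + Y induced by a function p defined on it.

    Elements of X are tagged ("in", x) and elements of Y are tagged ("out", y),
    so the union is disjoint even when X and Y overlap. p is a dict on tagged elements.
    """
    blocks = {}
    for tag, side in (("in", X), ("out", Y)):
        for s in side:
            blocks.setdefault(p[(tag, s)], set()).add((tag, s))
    return {frozenset(block) for block in blocks.values()}

X, Y = {1, 2, 3}, {1, 2}
p = {("in", 1): "A", ("in", 2): "A", ("in", 3): "B",
     ("out", 1): "A", ("out", 2): "C"}
# The same wiring with the elements of C renamed.
p2 = {k: {"A": 10, "B": 20, "C": 30}[v] for k, v in p.items()}

c1 = corelation(X, Y, p)
c2 = corelation(X, Y, p2)
print(c1 == c2)
print(sorted(sorted(block) for block in c1))
```

The two functions p and p2 land in different sets C but give the same partition, which is the sense in which a corelation is an equivalence class of onto functions.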

The rules governing corelations

The main result in Brandon and Brendan’s paper is that \mathrm{FinCorel} is equivalent to the PROP for extraspecial commutative Frobenius monoids. That’s a terse way of stating the laws governing \mathrm{FinCorel}.

Let me just show you the most important laws. In each of these laws I’ll draw two circuits made of wires, and write an equals sign asserting that they give the same corelation from a set X to a set Y. The inputs X of each circuit are on top, and the outputs Y are at the bottom. I’ll draw 3-way junctions as little triangles, but don’t worry about that. When we compose two corelations we may get a wire left in mid-air, not connected to the inputs or outputs. We draw the end of the wire as a little circle.

There are some laws called the ‘commutative monoid’ laws:

and an upside-down version called the ‘cocommutative comonoid’ laws:

Then we have ‘Frobenius laws’:

and finally we have the ‘special’ and ‘extra’ laws:

All other laws can be derived from these in some systematic ways.
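
The ‘special’ and ‘extra’ laws can be checked by composing partitions directly. The sketch below assumes the usual composition rule, which this post does not spell out: glue the two partitions along the shared middle set, take connected components, then forget the middle points, so that any component touching only middle points (a wire left in mid-air) disappears. The tagging scheme and all names are illustrative.

```python
def compose(R, S):
    """Compose corelations R (X to Y) and S (Y to Z), each a set of frozenset blocks
    whose elements are tagged ("in", a) or ("out", b)."""
    parent = {}

    def find(a):
        parent.setdefault(a, a)
        while parent[a] != a:
            parent[a] = parent[parent[a]]      # path halving
            a = parent[a]
        return a

    def union(a, b):
        parent[find(a)] = find(b)

    # Outputs of R and inputs of S are the same middle points.
    for block in R:
        nodes = [("in", v) if t == "in" else ("mid", v) for t, v in block]
        for n in nodes:
            union(nodes[0], n)
    for block in S:
        nodes = [("mid", v) if t == "in" else ("out", v) for t, v in block]
        for n in nodes:
            union(nodes[0], n)

    # Keep only inputs and outputs; components made solely of middle points vanish.
    parts = {}
    for n in list(parent):
        if n[0] != "mid":
            parts.setdefault(find(n), set()).add(n)
    return {frozenset(part) for part in parts.values()}

split = {frozenset({("in", 0), ("out", 0), ("out", 1)})}   # one wire splitting in two
merge = {frozenset({("in", 0), ("in", 1), ("out", 0)})}    # two wires merging into one
start = {frozenset({("out", 0)})}                          # wire end with no input
stop = {frozenset({("in", 0)})}                            # wire end with no output
identity = {frozenset({("in", 0), ("out", 0)})}

print(compose(split, merge) == identity)    # 'special' law: split then merge is a plain wire
print(compose(start, stop) == set())        # 'extra' law: a loose piece of wire is nothing
```

Both comparisons come out true under this composition rule.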

Commutative Frobenius monoids obey the commutative monoid laws, the cocommutative comonoid laws and the Frobenius laws. They play a fundamental role in 2d topological quantum field theory. Special Frobenius monoids are also well-known. But the ‘extra’ law, which says that a little piece of wire not connected to anything can be thrown away with no effect, is less well studied. Jason Erbele and I gave it this name in our work on control theory:

• John Baez and Jason Erbele, Categories in control. (Blog article here.)

For more

David Ellerman has spent a lot of time studying what would happen to mathematics if we turned around a lot of arrows in a certain systematic way. In particular, just as the concept of relation would be replaced by the concept of corelation, the concept of subset would be replaced by the concept of partition. You can see how it fits together: just as a relation from X to Y is a subset of X \times Y, a corelation from X to Y is a partition of X + Y.

There’s a lattice of subsets of a set:

In logic these subsets correspond to propositions, and the lattice operations are the logical operations ‘and’ and ‘or’. But there’s also a lattice of partitions of a set:

In Ellerman’s vision, this lattice of partitions gives a new kind of logic. You can read about it here:

• David Ellerman, Introduction to partition logic, Logic Journal of the Interest Group in Pure and Applied Logic 22 (2014), 94–125.

As mentioned, the main result in Brandon and Brendan’s paper is that \mathrm{FinCorel} is equivalent to the PROP for extraspecial commutative Frobenius monoids. After they proved this, they noticed that the result has also been stated in other language and proved in other ways by two other authors:

• Fabio Zanasi, Interacting Hopf Algebras—the Theory of Linear Systems, PhD thesis, École Normale Supérieure de Lyon, 2015.

• K. Došen and Z. Petrić, Syntax for split preorders, Annals of Pure and Applied Logic 164 (2013), 443–481.

Unsurprisingly, I prefer Brendan and Brandon’s approach to deriving the result. But it’s nice to see different perspectives!

by John Baez at February 02, 2016 02:00 AM

February 01, 2016

Tommaso Dorigo - Scientificblogging

Status Of "Anomaly!"
I believe it is appropriate if I restart this column today, after a two-month period of semi-inactivity, with a description of what has been going on in my private - well, semi-private - life.

by Tommaso Dorigo at February 01, 2016 07:12 PM

Quantum Diaries

Spun out of proportion: The Proton Spin Crisis

In the late 1980s, as particle colliders probed deeper into the building blocks of nature, there were hints of a strange and paradoxical behaviour in the heart of atoms. Fundamental particles have a curious quantum mechanical property known as “spin”, which the electron carries with magnitude ½. While the description of the electron’s spin is fairly simple, the proton is made up of many particles whose “spins” can add together in complicated ways, and yet, remarkably, its total spin turns out to be the same as the electron’s: ½. This led to one of the great mysteries of modern physics: how do all the particles inside the proton conspire together to give it a ½ spin? And what might this mean for our understanding of hadrons, the particles that make up most of the visible universe?

[This article is largely intended for a lay audience and contains an introduction to foundational ideas such as spin. If you’ve had a basic introduction to Quantum Mechanics before, you may wish to skip to the section marked —— ]

We’ve known about the proton’s existence for nearly a hundred years, so you’d be forgiven for thinking that we knew all there was to know about it. For many of us, our last exposure to the word “proton” was in high school chemistry, where they were described as a little sphere of positive charge that clumps with neutrons to make atomic nuclei, around which negatively charged electrons orbit to create all the atoms, which make up Life, the Universe and Everything1.


The simple, three-quark model of a proton (each coloured circle is a type of “quark”). 

Like many ideas in science, this is a simplified model that serves as a good introduction to a topic, but skips over the gory details and the bizarre, underlying reality of nature. In this article, we’ll focus on one particular aspect, the quantum mechanical “spin” of the proton. The quest to measure its origin has sparked discovery, controversy and speculation that has lasted 30 years, the answer to which is currently being sought at a unique particle collider in New York.

The first thing to note is that protons, unlike electrons2, are composite particles, made up from lots of other particles. The usual description is that the proton is made up of three smaller “quarks” which, as far as we know, can’t be broken down any further. This picture works remarkably well at low energies, but at very high energies, like those being reached at the LHC, it turns out to be inadequate. At that point, we have to get into the nitty-gritty and consider things like quark-antiquark pairs that live inside the proton, interacting dynamically with other quarks without changing its overall charge. Furthermore, there are particles called gluons that are exchanged between quarks, making them “stick” together in the proton and playing a crucial role in providing an accurate description for particle physics experiments.

So on closer inspection, our little sphere of positive charge turns out to be a buzzing hive of activity, with quarks and gluons all shuffling about, conspiring to create what we call the proton. It is by inferring the nature of these particles within the proton that a successful model of the strong nuclear force, known as Quantum Chromodynamics (QCD), was developed. The gluons were predicted and verified to be the carriers of this force between quarks. More on them later.

Proton structure

A more detailed model of the proton. The golden chains between the quarks (the coloured spheres) are representations of gluons, transferred between them. Quark anti-quark pairs are also visible with arrows representing spins.

That’s the proton, but what exactly is spin? It’s often compared to angular momentum, like the objects in our everyday experience might have. Everyone who’s ever messed around on an office chair knows that once you get spun around in one, it often takes you a bit of effort to stop because the angular momentum you’ve built up keeps you going. If you did this a lot, you might have noticed that if you started spinning with your legs/arms outstretched and brought them inwards while you were spinning, you’d begin to spin faster! This is because angular momentum (L) is proportional to the radial (r) distribution of matter (i.e. how far out things are from the axis of rotation) multiplied by the speed of rotation3 (v). To put it mathematically L = m × v × r where m is just your constant mass. Since L is constant, as you decrease r (by bringing your arms/legs inwards), v (the speed at which you’re spinning) increases to compensate. All fairly simple stuff.
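
As a quick numerical illustration of the office-chair argument, here is a tiny sketch that treats the spinning mass as a single point at distance r from the axis, an idealisation. The function name and the numbers are made up.

```python
def speed_after_pulling_in(v_initial, r_initial, r_final):
    """Conserve L = m * v * r at constant mass m, so v_final = v_initial * r_initial / r_final."""
    return v_initial * r_initial / r_final

v1, r1, r2 = 2.0, 0.8, 0.4      # metres per second, metres, metres (made-up values)
v2 = speed_after_pulling_in(v1, r1, r2)
print(v2)                        # halving r doubles v
```

Because the mass cancels out, only the ratio of the two radii matters.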

So clearly, for something to have angular momentum it needs to be distributed radially. Surely r has to be greater than 0 for L to be greater than 0. This is true, but it turns out that’s not all there is to the story. A full description of angular momentum at the quantum (atomic) level is given by something we denote as “J”. I’ll skip the details, but it turns out J = L + S, where L is orbital angular momentum, in a fashion similar to what we’ve discussed, and S? S is a slightly different beast.

Both L and S can only take on discrete values at the microscopic level, that is, they have quantised values. But whereas a point-like particle cannot have L > 0 in its rest frame (since if it isn’t moving around and v = 0, then L = 0), S will have a non-zero value even when the particle isn’t moving. S is what we call Spin. For the electron and quarks, it takes on the value of ½ in natural units.

Spin has a lot of very strange properties. You can think of it like a little arrow pointing in a direction in space but it’s not something we can truly visualise. One is tempted to think of the electron like the Earth, a sphere spinning about some kind of axis, but the electron is not a sphere, it’s a point-like particle with no “structure” in space. While an electron can have many different values of L depending on its energy (and atomic structure depends on these values), it only has one intrinsic magnitude of spin: ½. However, since spin can be thought of as an arrow, we have some flexibility. Loosely speaking, spin can point in many different directions but we’ll consider it as pointing “up” (+½) or “down” (- ½). If we try to measure it along a particular axis, we’re bound to find it in one of these states relative to our direction of measurement.


Focus on one of the red faces. With every 360-degree rotation of the cube, the red ribbon appears to go above and below the cube alternately! Because the cube is coupled to its environment, it takes 720 degrees to return it to its original orientation.

One of the peculiar things about spin-½ is that it causes the wave-function of the electron to exhibit some mind bending properties. For example, you’d think rotating any object by 360 degrees would put it back into exactly the same state as it was, but it turns out that doesn’t hold true for electrons. For electrons, rotating them by 360 degrees introduces a negative sign into their wave-function! You have to spin it another 360 degrees to get it back into the same state! There are ways to visualise systems with similar behaviour (see right) but that’s just a sort of “metaphor” for what really happens to the electron. This links into the famous conclusion of Pauli’s that no two identical particles with spin-½ (or any other half-integer spin) can share the same quantum mechanical state.
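The sign flip can be checked with a few lines of arithmetic. The sketch below assumes the standard textbook form of a spin-½ rotation about the z axis, exp(-i·θ·σ_z/2), which multiplies the “up” component by exp(-iθ/2) and the “down” component by exp(+iθ/2). The function name and the example state are made up for illustration.

```python
import cmath
import math


def rotate_spin_half_about_z(state, angle):
    """Apply exp(-i * angle * sigma_z / 2) to a two-component spinor (up, down)."""
    up, down = state
    return (cmath.exp(-1j * angle / 2) * up,
            cmath.exp(+1j * angle / 2) * down)


# An arbitrary normalised spin state (illustrative values only).
state = (0.6 + 0j, 0.8 + 0j)

after_360 = rotate_spin_half_about_z(state, 2 * math.pi)
after_720 = rotate_spin_half_about_z(state, 4 * math.pi)

# After 360 degrees every component has picked up a factor of (approximately) -1;
# after 720 degrees the original state is recovered, up to rounding error.
print("360 degrees:", after_360)
print("720 degrees:", after_720)
```

The overall minus sign cannot be seen in any single measurement on one electron, which is part of why the ribbon picture is only a metaphor.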


Spin is an important property of matter that only really manifests on the quantum scale, and while we can’t visualise it, it ends up being important for the structure of atoms and how all solid objects obtain the properties they do. The other important property it has is that the spin of a free particle likes to align with magnetic fields[4] (and the bigger the spin, the greater the magnetic coupling to the field). By using this property, it was discovered that the proton also had angular momentum J = ½. Since the proton is a stable particle, it was modelled to be in a low energy state with L = 0 and hence J = S = ½ (that is to say, the orbital angular momentum is assumed to be zero and hence we may simply call J the “spin”). The fact that the proton has spin, and that spin aligns with magnetic fields, is a crucial element of what makes MRI machines work.

Once we got a firm handle on quarks in the late 1960s, the spin structure of the proton was thought to be fairly simple. The proton has spin-½. Quarks, from scattering experiments and symmetry considerations, were also inferred to have spin-½. Therefore, if the three quarks that make up the proton were in an “up-down-up” configuration, the spin of the proton naturally comes out as ½ – ½ + ½ = ½. Not only does this add up to the measured spin, but it also gives a pleasant symmetry to the quantum description of the proton, consistent with the Pauli exclusion principle (it doesn’t matter which of the three quarks is the “down” quark). But hang on, didn’t I say that the three-quarks story was incomplete? At high energies, there should be a lot more quark-antiquark pairs (sea quarks) involved, messing everything up! Even so, theorists predicted that these quark-antiquark pairs would tend not to be polarised, that is, have a preferred direction, and hence would not contribute to the total spin of the proton.

If you can get the entirety of the proton spinning in a particular direction (i.e. polarising it), it turns out the scattering of an electron against its constituent quarks should be sensitive to their spin! Thus, by scattering electrons at high energy, one could check the predictions of theorists about how the quarks’ spin contributes to the proton.

In a series of perfectly conducted experiments, the theory was found to be absolutely spot on with no discrepancy whatsoever. Several Nobel prizes were handed out and the entire incident was considered resolved, now just a footnote in history. OK, not really.

In truth, the total opposite happened. Although the experiments had a reasonable amount of uncertainty due to the inherent difficulty of polarising protons, a landmark paper by the European Muon Collaboration found results consistent with the quarks contributing absolutely no overall spin to the proton whatsoever! The measurements could be interpreted with the overall spin from the quarks being zero[5]. This was a complete shock to most physicists, who were expecting verification from what was supposed to be a fairly straightforward measurement. Credit where it is due, there were theorists who had pointed out that the assumption about orbital angular momentum (L = 0) was rather ad hoc and that L > 0 could account for some of the missing spin. Scarcely anyone would have expected, however, that the quarks would carry so little of the spin. Although the nuclear strong force, which governs how quarks and gluons combine to form the proton, has been tested to remarkable accuracy, the nature of its self-interaction makes it incredibly difficult to draw predictions from.

The Feynman diagram for Deep Inelastic Scattering (electron line at the top, proton on the bottom, with a photon exchanged between them). This type of scattering is sensitive to quark spin.

Subsequent experiments (led by father and son rivals, Vernon and Emlyn Hughes[6] of CERN and SLAC respectively) managed to bring this to a marginally less shocking proposal. The more accurate measurements from these collaborations found that the total spin contribution from the quarks was actually closer to ~30%. An important discovery was that the sea quarks, thought not to be important, were actually found to have measurable polarisation. Although it cleared up some of the discrepancy, it still left 60-70% of the spin unaccounted for. Today, following much more experimental activity in Deep Inelastic Scattering and precision low-energy elastic scattering, the situation has not changed in terms of the raw numbers. The best estimates still peg the quarks’ spin as constituting only about 30% of the total.

Remarkably, there are theoretical proposals to resolve the problem that were hinted at long before experiments were even conducted. As mentioned previously, although currently impossible to test experimentally, the quarks may carry orbital angular momentum (L) that could compensate for some of the missing spin. Furthermore, we have failed to mention the contribution of gluons to the proton spin. Gluons are spin-1 particles, and were thought to arrange themselves such that their total contribution to the proton spin was nearly non-existent.


The Brookhaven National Laboratory where RHIC is based (seen as the circle, top right).

The Relativistic Heavy Ion Collider (RHIC) in New York is currently the only spin-polarised proton collider in the world. This gives it a unique sensitivity to the spin structure of the proton. In 2014, an analysis of the data collected at RHIC indicated that the gluons (whose spin contribution can be inferred from polarised proton-proton collisions) could potentially account for up to 30 percentage points of the missing 70% of the proton spin! That is about the same as the quarks. This would bring the “missing” amount down to about 40%, which could be accounted for by the unmeasurable orbital angular momentum of both quarks and gluons.
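For readers who like to see the accountancy written out, here is a minimal sketch of the spin budget using only the rough percentages quoted above. The variable names are invented for illustration, and the gluon figure is the upper end of the estimate, so the orbital remainder should be read as a lower bound rather than a measurement.

```python
# Rough fractions of the proton's total spin, as quoted in the text.
quark_spin_fraction = 0.30   # best estimates from scattering experiments
gluon_spin_fraction = 0.30   # upper end of the 2014 RHIC-based estimate

# What was "missing" once only the quark spin had been measured.
missing_after_quarks = 1.0 - quark_spin_fraction

# What is left for orbital angular momentum of quarks and gluons
# if the gluons really contribute their full 30 percentage points.
orbital_fraction = 1.0 - quark_spin_fraction - gluon_spin_fraction

# Convert the fractions into contributions to J = 1/2 (natural units).
proton_spin = 0.5
contributions = {
    "quark spin": quark_spin_fraction * proton_spin,
    "gluon spin": gluon_spin_fraction * proton_spin,
    "orbital (inferred)": orbital_fraction * proton_spin,
}

print(f"missing after quarks: {missing_after_quarks:.0%}")
print(f"left for orbital angular momentum: {orbital_fraction:.0%}")
for name, value in contributions.items():
    print(f"{name}: {value:.2f}")
```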

As 2016 kicks into gear, RHIC will be collecting data at a much faster rate than ever after a recent technical upgrade that should double its luminosity (loosely speaking, the rate at which proton collisions occur). With the increased statistics, we should be able to get an even greater handle on the exact origin of proton spin.

The astute reader, provided they have not already wandered off, dizzy from all this talk of spinning protons, may be tempted to ask “Why on earth does it matter where the total spin comes from? Isn’t this just abstract accountancy?” This is a fair question and I think the answer is a good one. Protons, like all other hadrons (similar, composite particles made of quarks and gluons) are not very well understood at all. A peculiar feature of QCD called confinement binds individual quarks together so that they are never observed in isolation, only bound up in particles such as the proton. Understanding the spin structure of the proton can inform our theoretical models for understanding this phenomenon.

This has important implications, one being that 98% of the mass of all visible matter does not come from the Higgs Boson. It comes from the binding energy of protons! And the exact nature of confinement and precise properties of QCD have implications for the cosmology of the early universe. Finally, scattering experiments with protons have already revealed so much to fundamental physics, such as the comprehension of one of the fundamental forces of nature. As one of our most reliable probes of nature, currently in use at the LHC, understanding them better will almost certainly aid our attempts to unearth future discoveries.

Kind regards to Sebastian Bending (UCL) for several suggestions (all mistakes are unreservedly my own).


[1] …excluding dark matter and dark energy which constitute the dark ~95% of the universe.

[2] To the best of our knowledge.

[3] Strictly speaking the component of velocity perpendicular to the radial direction.

[4] Sometimes, spins in a medium like water like to align against magnetic fields, causing an opposite magnetic moment (known as diamagnetism). Since frogs are mostly water, this effect can and has been used to levitate frogs.

[5] A lot of the information here has been summarised from this excellent article by Robert Jaffe, whose collaboration with John Ellis on the Ellis-Jaffe rule led to many of the predictions discussed here.

[6] Emlyn was actually the spokesperson for SLAC, though he is listed as one of the primary authors on the SLAC papers regarding the spin structure of the proton.

by Ricky Nathvani at February 01, 2016 06:30 PM

CERN Bulletin

Registration of vehicles at the Gex sous-préfecture: now by appointment only
The Gex sous-préfecture has informed CERN that it has taken the following steps in order to reduce waiting times at its counters for the issue of carte grise vehicle registration certificates. As of 1 February 2016, you must book an appointment via the website for all services relating to the registration of vehicles, in particular the: change of the holder of a registration certificate, issue of a certificat de situation administrative (administrative status certificate required for the sale of a vehicle), change of marital status (or company name in the case of legal entities), change of address, change in the technical specification of the vehicle, corrections to registration certificates, requests for duplicates (loss or theft of registration certificates), registration of a diplomatic vehicle (CERN), registration of a new vehicle, registration of vehicles purchased tax-free in the Pays de Gex free zone (formerly TTW series), and import of vehicles (from within the EU, from Switzerland, from outside the EU). Further information about these services can be obtained by sending an e-mail to or by calling +33 4 50 41 51 51 on Mondays and Tuesdays between 2 p.m. and 4 p.m. and on Wednesdays between 9 a.m. and 12 noon. Please note that appointments cannot be booked by telephone.

February 01, 2016 04:56 PM

Clifford V. Johnson - Asymptotia

Hello world!

Welcome to WordPress. This is your first post. Edit or delete it, then start writing!

by admin at February 01, 2016 04:19 PM

CERN Bulletin

Fourth Thematic CERN School of Computing
The Fourth Thematic School of Computing (tCSC2016) takes place this year in Split, Croatia, from 22 to 28 May 2016.   The theme is "Efficient and Parallel Processing of Scientific Data", looking at: The challenge of scientific data processing: commonalities, analogies and the main differences between different sciences. Size of scientific software projects. Parallelism and asynchronism: computation and I/O. The School is open to postgraduate students and research workers with a few years' experience in elementary particle physics, computing, engineering or related fields.  All applicants are welcome, including former and future participants in the main CSC summer school. Registration will close on 15 February and participation is limited to 24 students. To register, please go here. About: The Thematic Schools are part of the annual series of CERN Schools of Computing, to promote advanced learning and knowledge exchange on the subject of scientific computing between young scientists and engineers involved in particle physics or other sciences.  They are shorter and more focused than the main summer CERN School of Computing, but still maintain the same guiding principles: an academic dimension covering advanced topics; theory and practice; networking and socialisation. Applications will be accepted until 15 February 2016. For more information on the CSC, see: For registration and more information on the tCSC2016, see:

by Alberto Pace, CSC Director at February 01, 2016 03:49 PM

CERN Bulletin

Meet the winner artists of Accelerate@CERN Taiwan | 3 February
The winners of Accelerate@CERN Taiwan are WenChi Su (left) and Pei-Ying Lin (right). Accelerate@CERN is the country-specific, one-month research award for artists who have never been in a science laboratory before. Accelerate@CERN Taiwan is funded by the Ministry of Culture for Taiwan. From among thirty outstanding applicants, the winners of Accelerate@CERN Taiwan are WenChi Su, dancer and choreographer, and Pei-Ying Lin, digital artist. This is the first opportunity for two talented artists to work and research together on the joint creation of a new dance project which engages with the digital realm and is inspired by the world of particle physics. In the past month they have been exploring CERN together, and now they are working on their project. Meet the artists on Wednesday 3 February at 4:30 p.m. in Restaurant 1. For more information on Accelerate@CERN, see here. Follow the artists' blog to find out what they have been doing for the past month at CERN.

February 01, 2016 01:32 PM

Jester - Resonaances

750 ways to leave your lover
A new paper last week straightens out the story of the diphoton background in ATLAS. Some confusion was created because theorists misinterpreted the procedures described in the ATLAS conference note, which could lead to a different estimate of the significance of the 750 GeV excess. However, once the correct phenomenological and statistical approach is adopted, the significance quoted by ATLAS can be reproduced, up to small differences due to incomplete information available in public documents. Anyway, now that this is all behind us, we can safely continue being excited at least until summer. Today I want to discuss different interpretations of the diphoton bump observed by ATLAS. I will take a purely phenomenological point of view, leaving for the next time the question of a bigger picture that the resonance may fit into.

Phenomenologically, the most straightforward interpretation is the so-called everyone's model: a 750 GeV singlet scalar particle produced in gluon fusion and decaying to photons via loops of new vector-like quarks. This simple construction perfectly explains all publicly available data, and can be easily embedded in more sophisticated models. Nevertheless, many more possibilities were pointed out in the 750 papers so far, and here I review a few that I find most interesting.

Spin Zero or More?  
For a particle decaying to two photons, there are not that many possibilities: the resonance has to be a boson and, according to young Landau's theorem, it cannot have spin 1. This leaves on the table spin 0, 2, or higher. Spin-2 is an interesting hypothesis, as this kind of excitation is predicted in popular models like the Randall-Sundrum one. Higher-than-two spins are disfavored theoretically. When more data is collected, the spin of the 750 GeV resonance can be tested by looking at the angular distribution of the photons. The rumor is that the data so far somewhat favor spin-2 over spin-0, although the statistics are certainly insufficient for any serious conclusions. Concerning the parity, it is practically impossible to determine it by studying the diphoton final state, and both the scalar and the pseudoscalar option are equally viable at present. Discrimination may be possible in the future, but only if multi-body decay modes of the resonance are discovered. If the true final state is more complicated than two photons (see below), then the 750 GeV resonance may have any spin, including spin-1 and spin-1/2.

Narrow or Wide? 
The total width is the inverse of the particle's lifetime (in our funny units). From the experimental point of view, a width larger than the detector's energy resolution will show up as a smearing of the resonance due to the uncertainty principle. Currently, the ATLAS run-2 data prefer a width 10 times larger than the experimental resolution (which is about 5 GeV in this energy ballpark), although the preference is not very strong in the statistical sense. On the other hand, from the theoretical point of view, it is much easier to construct models where the 750 GeV resonance is a narrow particle. Therefore, confirmation of the large width would have profound consequences, as it would significantly narrow down the scope of viable models. The most exciting interpretation would then be that the resonance is a portal to a dark sector containing new light particles very weakly coupled to ordinary matter.
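To put the "funny units" into ordinary ones, here is a minimal sketch of the width-to-lifetime conversion, τ = ħ/Γ. The width is simply taken as 10 times the 5 GeV resolution quoted above, which is an illustrative reading of the text rather than the measured ATLAS value; the constant and variable names are chosen for this example.

```python
# Reduced Planck constant in GeV * seconds (CODATA value).
HBAR_GEV_S = 6.582119569e-25

# Numbers taken loosely from the text: resolution of about 5 GeV,
# and a preferred width about 10 times larger than that.
resolution_gev = 5.0
width_gev = 10 * resolution_gev

# In natural units the lifetime is 1 / width; restoring hbar gives seconds.
lifetime_s = HBAR_GEV_S / width_gev

print(f"width    = {width_gev:.0f} GeV")
print(f"lifetime = {lifetime_s:.2e} s")
```

The result is of order 1e-26 seconds, far too short to be seen as a displaced decay, which is why the width can only be inferred from the shape of the bump.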

How many resonances?  
One resonance is enough, but a family of resonances tightly packed around 750 GeV may also explain the data. As a bonus, this could explain the seemingly large width without opening new dangerous decay channels. It is quite natural for particles to come in multiplets with similar masses: our pion is an example where the small mass splitting between π± and π0 arises due to electromagnetic quantum corrections. For Higgs-like multiplets the small splitting may naturally arise after electroweak symmetry breaking, and the familiar 2-Higgs doublet model offers a simple realization. If the mass splitting of the multiplet is larger than the experimental resolution, this possibility can be tested by precisely measuring the profile of the resonance and searching for a departure from the Breit-Wigner shape. On the other side of the spectrum is the idea that there is no resonance at all at 750 GeV, but rather at another mass, and the bump at 750 GeV appears due to some kinematical accidents.
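As a toy illustration of how two nearby narrow states can mimic one wider bump, the sketch below compares a single non-relativistic Breit-Wigner (Lorentzian) line shape with the average of two such shapes split by 2 GeV. All masses, widths and the splitting are hypothetical numbers chosen for the example, and detector resolution is ignored entirely.

```python
import math


def breit_wigner(m, mass, width):
    """Non-relativistic Breit-Wigner (Lorentzian) line shape, normalised to unit area."""
    return (width / (2 * math.pi)) / ((m - mass) ** 2 + width ** 2 / 4)


def full_width_at_half_max(shape, lo, hi, steps=20000):
    """Numerically estimate the FWHM of a single-peaked shape on the interval [lo, hi]."""
    xs = [lo + (hi - lo) * i / steps for i in range(steps + 1)]
    ys = [shape(x) for x in xs]
    half = max(ys) / 2
    above = [x for x, y in zip(xs, ys) if y >= half]
    return above[-1] - above[0]


def single(m):
    # Hypothetical single resonance: mass 750 GeV, width 5 GeV.
    return breit_wigner(m, 750.0, 5.0)


def pair(m):
    # Hypothetical doublet: two 5 GeV-wide states at 749 and 751 GeV, equal weights.
    return 0.5 * (breit_wigner(m, 749.0, 5.0) + breit_wigner(m, 751.0, 5.0))


fwhm_single = full_width_at_half_max(single, 700.0, 800.0)
fwhm_pair = full_width_at_half_max(pair, 700.0, 800.0)

print(f"single resonance FWHM: {fwhm_single:.2f} GeV")
print(f"doublet FWHM:          {fwhm_pair:.2f} GeV")
```

The doublet comes out broader than either of its members, and its profile is no longer an exact Breit-Wigner, which is the departure a precise measurement would look for.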
Who made it? 
The most plausible production process is definitely the gluon-gluon fusion. Production in collisions of light quarks and antiquarks is also theoretically sound, however it leads to a more acute tension between run-2 and run-1 data. Indeed, even for the gluon fusion, the production cross section of a 750 GeV resonance in 13 TeV proton collisions is only 5 times larger than at 8 TeV. Given the larger amount of data collected in run-1, we would expect a similar excess there, contrary to observations. For a resonance produced from u-ubar or d-dbar the analogous ratio is only 2.5 (see the table), leading to much more tension. The ratio climbs back to 5 if the initial state contains the heavier quarks: strange, charm, or bottom (which can also be found sometimes inside a proton), however I haven't yet seen a neat model that makes use of that. Another possibility is to produce the resonance via photon-photon collisions. This way one could cook up a truly minimal and very predictive model where the resonance couples only to photons of all the Standard Model particles. However, in this case, the ratio between 13 and 8 TeV cross section is very unfavorable, merely a factor of 2, and the run-1 vs run-2 tension comes back with more force. More options open up when associated production (e.g. with t-tbar, or in vector boson fusion) is considered. The problem with these ideas is that, according to what was revealed during the talk last December, there aren't any additional energetic particles in the diphoton events. Similar problems face models where the 750 GeV resonance appears as a decay product of a heavier resonance, although in this case some clever engineering or fine-tuning may help to hide the additional particles from experimentalists' eyes.
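The run-1 vs run-2 tension can be made concrete with a back-of-the-envelope count. The sketch below uses the cross-section ratios quoted above, but the integrated luminosities are my assumption (roughly 20 fb^-1 at 8 TeV and 3.2 fb^-1 at 13 TeV, approximate ATLAS figures that are not stated in the post), and it ignores differences in backgrounds and selection efficiency.

```python
# ASSUMED integrated luminosities in fb^-1 (approximate, not from the post).
RUN1_LUMI_FB = 20.0   # 8 TeV
RUN2_LUMI_FB = 3.2    # 13 TeV


def run1_over_run2_signal(xsec_ratio_13_over_8):
    """Expected run-1 signal events per run-2 signal event, for equal efficiency."""
    return (RUN1_LUMI_FB / RUN2_LUMI_FB) / xsec_ratio_13_over_8


# Ratios of the 13 TeV to 8 TeV production cross section, as quoted in the text.
xsec_ratios = {
    "gluon-gluon": 5.0,
    "u-ubar or d-dbar": 2.5,
    "photon-photon": 2.0,
}

expected = {mode: run1_over_run2_signal(r) for mode, r in xsec_ratios.items()}

for mode, value in expected.items():
    print(f"{mode}: run-1 should contain about {value:.2f}x the run-2 signal")
```

Under these assumptions even gluon fusion predicts slightly more signal events in run-1 than in run-2, and the smaller the cross-section ratio, the harder it is to explain why run-1 saw no comparable excess.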

Two-body or more?
While a simple two-body decay of the resonance into two photons is a perfectly plausible explanation of all existing data, a number of interesting alternatives have been suggested. For example, the decay could be 3-body, with another soft visible or invisible  particle accompanying two photons. If the masses of all particles involved are chosen appropriately, the invariant mass spectrum of the diphoton remains sharply peaked. At the same time, a broadening of the diphoton energy due to the 3-body kinematics may explain why the resonance appears wide in ATLAS. Another possibility is a cascade decay into 4 photons. If the  intermediate particles are very light, then the pairs of photons from their decay are very collimated and may look like a single photon in the detector.
 ♬ The problem is all inside your head   and the possibilities are endless. The situation is completely different from the process of discovering the Higgs boson, where one strongly favored hypothesis was tested against more exotic ideas. Of course, the first and foremost question is whether the excess is really new physics, or just a nasty statistical fluctuation. But if new physics is confirmed, the next crucial task for experimentalists will be to establish the nature of the resonance and get model builders on the right track.  The answer is easy if you take it logically ♬ 

All ideas discussed above appeared in recent articles by various authors addressing the 750 GeV excess. If I were to include all references the post would be just one giant hyperlink, so you need to browse the literature yourself to find the original references.

by Jester at February 01, 2016 08:14 AM

January 31, 2016

Lubos Motl - string vacua and pheno

Transparency, public arguments: a wrong recipe for arXiv rejections
OneDrive: off-topic: tomorrow, Microsoft will reduce the free 15 GB space by 10 GB and abolish the free 15 GB camera roll space. Old users may click here and after two more clicks, they will avoid this reduction if they act on Sunday!
Crackpot blog Backreaction and its flavor appendix called Nature believe that it was wrong for the website of scientific preprints (100k papers a year, 1.1 million in total) to reject two submissions by students of quantum information who attempted to rebrand themselves as general relativists and argue that you can't ever fall into a black hole.

Thankfully, Ms Hossenfelder and others agree that the papers were wrong. But they still protest against the fact that the papers were rejected. Or to say the least, there should have been some "transparency" in the rejection – in other words, some details about the decision which should be followed by some arguments in the public.

I totally disagree with those comments.

The arXiv (the website) was established in the early 1990s as a tool for researchers to share their findings more quickly, before they got published in the paper journals that mattered at that time. Paul Ginsparg created the software and primarily fathered the hep-ph and hep-th (high energy physics phenomenology and theory) archives – he also invented the funny, friendly yet mocking philosophy-like nickname "phenomenologists" for the people who were not formal (mainly string) theorists but who actually cared what happens with the particles in the muddy world of germs and worms.

The hep-th and hep-ph archives were meant to serve rather particular communities of experts. They pretty much knew who belonged to those sets and who didn't. The set got expanded when a member trained a new student. Much like the whole web (a server at CERN serving Tim Berners-Lee and a few pals), the arXiv got more "global" and potentially accessible to all of mankind.

This evolution has posed some new challenges. The website had to get shielded from the thousands of potentially worthless submissions by laymen. There existed various ways to deal with the challenge but an endorsement system was introduced for hep-th, hep-ph, and other experts' archives. It is much easier to send papers to "less professional" archives within the arXiv, but it's still harder than sending them to the crackpot-dominated archive of Philip Gibbs.

The submissions are still filtered by moderators who are volunteers. One of them, Daniel Gottesman of the Perimeter Institute, has made an important response to those who try to criticize the arXiv moderators when they manage to submit their paper previously rejected by the arXiv to a classic journal:
“If a paper is rejected by arXiv and accepted by a journal, that does not mean that arXiv is the one that made a mistake.”
Exactly. The arXiv's filtering process isn't terribly cruel – authors above a certain quality level can be pretty sure that their paper gets accepted to the arXiv if they obey some sensible conditions so it's not like the "bloody struggle for some place under the Sun" in some printed journals considered prestigious.

But the arXiv's filters are still nontrivial and independent and it may happen that the arXiv-rejected paper gets to a printed journal – which often means that the printed journal has poor quality standards. There is no "right" and there cannot be any "right" to have any paper accepted to the arXiv. There is no "guarantee" that the arXiv is always more inclusive than all journals in the world. The arXiv's filtering policies are different which doesn't mean "softer and sloppier than everyone else's".

In this case, the rejected papers were written by students of Dr Nicolas Gisin – a senior quantum information expert in Geneva. But these students didn't write about something they're expected to be good at because they're Gisin's students.

Instead, they wrote about black holes and it was bullšit. You can't ever fall into a black hole, a layman often thinks before he starts to understand general relativity at a bit deeper level. They made the typical beginners' mistakes. Then they realized they were mistakes and made some smaller but still embarrassing mistakes that allowed them to say that "you can't ever fall into a Hawking-evaporating black hole" which is still demonstrably nonsense.

My understanding is that these preprints about black holes should not be allowed in the quantum information archive where they would be off-topic; and these students should not have the "automatic" right to post to high-energy physics or general relativity archives because they're not experts and they're not in a close enough contact with an expert. So I think it's a matter of common sense that papers from such authors about this topic are likely to be rejected – and if everyone seems to agree that the papers are wrong, what are we really talking about?

The reason why some people talk about this self-evidently justified rejection is that there are some people who would love to flood the arXiv with garbage and dramatically reduce its quality requirements. These people want it simply because they can't achieve the minimum quality threshold that is required in hep-th, hep-ph, and elsewhere – but they want to be considered as experts of the same quality, anyway. So they fight against any moderation. If there is any moderation at all, they scream, at least, they should get some complete justification why their submission was rejected. It's clear what would be done with such an explanation. The rejected authors would show it to friends, post it on blogs, and look for some political support that would push the moderators in a direction and these moderators could ultimately give up and accept the garbage, anyway.

The louder and more well-connected you would be, the more likely it would be for the garbage you wrote to be accepted to the arXiv at the end.

In fact, this Dr Nicolas Gisin already shows us where this "transparency" would lead. The actually relevant comment that should be said in this context is that Dr Gisin has partially failed as an adviser. He failed to stop his students from embarrassing themselves by (nearly) publishing a preprint about a part of physics that they clearly don't understand at the expert level. It's really this Dr Gisin, and not the arXiv moderators, who should have been the greatest obstacle that his students should have faced while submitting wrong papers on general relativity.

Instead, he became a defender of the "students' right to submit these wrong papers". Why should this right exist? Once you try to demand such non-existent "rights" and scream things that make it clear that you don't give a damn whether the papers have elementary errors or not, you are a problem for the arXiv. You are a potentially unstoppable source of junk that may get multiplied and that the experts would have to go through every day. It doesn't matter that you have published lots of good preprints in another part of the arXiv. You just don't have credentials to flood every sub-archive at

We see that Dr Gisin tried to inflate his ego by co-organizing an article in Nature that tries to pretend that it's a scandal that two young people who have the honor to be students of Dr Gisin himself were treated in this disrespectful way by the archives dedicated to general relativity or particle physics. With this screaming in the public, lots of people could join Dr Gisin and send the message to the arXiv moderators: How dare you? Those were students of our great Dr Gisin. You must treat them as prophets.

Sorry but they're not prophets. They were just students who tried to send wrong papers to professionals' archives about disciplines at which they are clearly not too good and unsurprisingly, they have failed. Even if Dr Gisin himself had sent the papers about the "impossibility to fall into a black hole", these papers should have been rejected.

The rejection may depend on some personal opinions or biases of a particular moderator – but there's nothing wrong about it. In the end, science has to be evaluated by some individual people. Gisin's students' papers could have been rejected for numerous simple reasons. If you demanded that the moderators publish some detailed explanations, it wouldn't help anybody. Any suggestion that some "arguments" between the rejected authors and moderators should follow means that
  • someone believes that there is a "right" for everyone to submit preprints anywhere to, but there's none
  • the moderators must be ready to sacrifice any amount of time and energy, but they don't have to
  • the interactions between the moderators and the would-be authors are discussions between two equal peers.
But the last point is simply not true, either. The rejected authors are primarily meant to be – and it's almost always the case – people who just don't know the basics or don't require their own papers to pass even the most modest tests of quality. One may say that they're crackpots or marginal crackpots. You just don't want the moderators to spend much time communicating with these people – because saving the time of actual researchers is the main reason for the rejection in the first place. So if you forced the moderators to spend an hour with every rejected crackpot paper, you could just as well "open the gates" and force every researcher to waste a few seconds looking over the abstract of the bullšit paper instead. If the gates were opened in this way, the number of junk papers would obviously start to grow.

The main problem of this "transparency" is that the meritocratic decision – one that ultimately must be done by someone who knows something about the subject, or a group of such people – would be replaced by a fight in the public arena.

Let me give you a simple example. It's just an example; there could be many other examples that are much less connected with the content of this weblog in the past but whose issues are very analogous, anyway. I believe – or hope – that loop quantum gravity papers aren't allowed at hep-th (just at gr-qc) because these people are acknowledged to be crackpots at the quality threshold expected in high energy physics. Every expert knows that even if there were something OK about loop quantum gravity (and there's nothing), there's no way it could tell us something meaningful about particle physics.

Now, whenever a moderator would reject a loop quantum gravity paper at hep-th, the "transparency" regime would force him to explain the steps. In one way or another, more explicitly or less explicitly, he would have to reveal that he considers all the loop quantum gravity people to be cranks. Pretty much everyone in high-energy physics does. But almost no one says those things on a regular basis because people are "nice" and they want to avoid mud. OK, so the loop quantum gravity author would get this reply. What would he do with it? Would he learn a lesson? No, loop quantum gravity folks can never learn any lesson – that's a big part of the reason why they're crackpots.

Instead, this rejected author would send the explanation by the arXiv moderator to his friends, for example clueless inkspillers in Nature (e.g. Zeeya Merali who wrote this Nature rant about the "high-profile physicist" whose students were "outrageously" rejected), who would try to turn the explanation by the arXiv moderator into a scandal. Could anything good come out of it? Not at all. At most, the loop quantum gravity crackpot could assure himself that just like him or Sabine Hossenfelder, way over 99.9% of the public doesn't have the slightest idea about issues related to quantum gravity.

But the arXiv must still keep on working – it has become an important venue for the professionals in high energy physics. It's serving the relatively small community whose knowledge – and therefore also opinions – dramatically differ from the knowledge and opinions of the average member of the public. Clearly, if the hep-th arXiv were conquered by the community of the loop quantum gravity crackpots or the broader public that has been persuaded that loop quantum gravity is an OK science, the actual experts in quantum gravity would have to start a new website because hep-th would become unusable very soon, after it would be flooded by many more junk submissions. But hep-th is supposed to be their archive. That's how and why it was founded. The definition of "they" isn't quite sharp and clear but it's not completely ill-defined, either.

If a paper is rejected, it means that there is a significant disagreement between a moderator and the author of the preprint. The author must think that the preprint is good enough or great (that's why the paper was submitted) while the moderator doesn't share this opinion. If the author is smart or has really found something unusual, he may be right and the moderator may be wrong. The probability of that is clearly nonzero. It's just arguably small. But you simply can't improve the quality of the rejection process by turning the process into a potentially never-ending public argument. It's essential to realize that the expertise needed to evaluate submissions to the professional archives is not "omnipresent", which is why the broader "publication" of the details of the rejection transfers the influence and interest to a wrongly, too inclusively defined subgroup of mankind.

Those are the reasons why I think that the calls for transparency, however fashionable these calls have become, are misplaced and potentially threatening for the remainder of meritocracy in science.

by Luboš Motl at January 31, 2016 08:17 AM

January 30, 2016

John Baez - Azimuth

Among the Bone Eaters

Anthropologists sometimes talk about the virtues and dangers of ‘going native’: doing the same things as the people they’re studying, adopting their habits and lifestyles—and perhaps even their beliefs and values. The same applies to field biologists: you sometimes see it happen to people who study gorillas or chimpanzees.

It’s more impressive to see someone go native with a pack of hyenas:

• Marcus Baynes-Rock, Among the Bone Eaters: Encounters with Hyenas in Harar, Penn State University Press, 2015.

I’ve always been scared of hyenas, perhaps because they look ill-favored and ‘mean’ to me, or perhaps because their jaws have incredible bone-crushing force:

This is a spotted hyena, the species of hyena that Marcus Baynes-Rock befriended in the Ethiopian city of Harar. Their bite force has been estimated at 220 pounds!

(As a scientist I should say 985 newtons, but I have trouble imagining what it’s like to have teeth pressing into my flesh with a force of 985 newtons. If you don’t have a feeling for ‘pounds’, just imagine a 100-kilogram man standing on a hyena tooth that is pressing into your leg.)

So, you don’t want to annoy a hyena, or look too edible. However, the society of hyenas is founded on friendship! It’s the bonds of friendship that will make one hyena rush in to save another from an attacking lion. So, if you can figure out how to make hyenas befriend you, you’ve got some heavy-duty pals who will watch your back.

In Harar, people have been associating with spotted hyenas for a long time. At first they served as ‘trash collectors’, but later the association deepened. According to Wikipedia:

Written records indicate that spotted hyenas have been present in the walled Ethiopian city of Harar for at least 500 years, where they sanitise the city by feeding on its organic refuse.

The practice of regularly feeding them did not begin until the 1960s. The first to put it into practice was a farmer who began to feed hyenas in order to stop them attacking his livestock, with his descendants having continued the practice. Some of the hyena men give each hyena a name they respond to, and call to them using a “hyena dialect”, a mixture of English and Oromo. The hyena men feed the hyenas by mouth, using pieces of raw meat provided by spectators. Tourists usually organize to watch the spectacle through a guide for a negotiable rate. As of 2002, the practice is considered to be on the decline, with only two practicing hyena men left in Harar.

Hyena man — picture by Gusjer

According to local folklore, the feeding of hyenas in Harar originated during a 19th-century famine, during which the starving hyenas began to attack livestock and humans. In one version of the story, a pure-hearted man dreamed of how the Hararis could placate the hyenas by feeding them porridge, and successfully put it into practice, while another credits the revelation to the town’s Muslim saints convening on a mountaintop. The anniversary of this pact is celebrated every year on the Day of Ashura, when the hyenas are provided with porridge prepared with pure butter. It is believed that during this occasion, the hyenas’ clan leaders taste the porridge before the others. Should the porridge not be to the lead hyenas’ liking, the other hyenas will not eat it, and those in charge of feeding them make the requested improvements. The manner in which the hyenas eat the porridge on this occasion are believed to have oracular significance; if the hyena eats more than half the porridge, then it is seen as portending a prosperous new year. Should the hyena refuse to eat the porridge or eat all of it, then the people will gather in shrines to pray, in order to avert famine or pestilence.

Marcus Baynes-Rock went to Harar to learn about this. He wound up becoming friends with a pack of hyenas:

He would play with them and run with them through the city streets at night. In the end he ‘went native’: he would even be startled, like the hyenas, when they came across a human being!

To get a feeling for this, I think you have to either read his book or listen to this:

In a city that welcomes hyenas, an anthropologist makes friends, Here and Now, National Public Radio, 18 January 2016.

Nearer the beginning of this quest, he wrote this:

The Old Town of Harar in eastern Ethiopia is enclosed by a wall built 500 years ago to protect the town’s inhabitants from hostile neighbours after a religious conflict that destabilised the region. Historically, the gates would be opened every morning to admit outsiders into the town to buy and sell goods and perhaps worship at one of the dozens of mosques in the Muslim city. Only Muslims were allowed to enter. And each night, non-Hararis would be evicted from the town and the gates locked. So it is somewhat surprising that this endogamous, culturally exclusive society incorporated holes into its defensive wall, through which spotted hyenas from the surrounding hills could access the town at night.

Spotted hyenas could be considered the most hated mammal in Africa. Decried as ugly and awkward, associated with witches and sorcerers and seen as contaminating, spotted hyenas are a public relations challenge of the highest order. Yet in Harar, hyenas are not only allowed into the town to clean the streets of food scraps, they are deeply embedded in the traditions and beliefs of the townspeople. Sufism predominates in Harar and at last count there were 121 shrines in and near the town dedicated to the town’s saints. These saints are said to meet on Mt Hakim every Thursday to discuss any pressing issues facing the town and it is the hyenas who pass the information from the saints on to the townspeople via intermediaries who can understand hyena language. Etymologically, the Harari word for hyena, ‘waraba’ comes from ‘werabba’ which translates literally as ‘news man’. Hyenas are also believed to clear the streets of jinn, the unseen entities that are a constant presence for people in the town, and hyenas’ spirits are said to be like angels who fight with bad spirits to defend the souls of spiritually vulnerable people.


My current research in Harar is concerned with both sides of the relationship. First is the collection of stories, traditions, songs and proverbs of which there are many and trying to understand how the most hated mammal in Africa can be accommodated in an urban environment; to understand how a society can tolerate the presence of a potentially dangerous species. Second is to understand the hyenas themselves and their participation in the relationship. In other parts of Ethiopia, and even within walking distance of Harar, hyenas are dangerous animals and attacks on people are common. Yet, in the old town of Harar, attacks are unheard of and it is not unusual to see hyenas, in search of food scraps, wandering past perfectly edible people sleeping in the streets. This localised immunity from attack is reassuring for a researcher spending nights alone with the hyenas in Harar’s narrow streets and alleys.

But this sounds like it was written before he went native!

Social networks

By the way: people have even applied network theory to friendships among spotted hyenas:

• Amiyaal Ilany, Andrew S. Booms and Kay E. Holekamp, Topological effects of network structure on long-term social network dynamics in a wild mammal, Ecology Letters, 18 (2015), 687–695.

The paper is not open-access, but there’s an easy-to-read summary here:

Scientists puzzled by ‘social network’ of spotted hyenas, 18 May 2015.

The scientists collected more than 55,000 observations of social interactions of spotted hyenas (also known as laughing hyenas) over a 20 year period in Kenya, making this one of the largest to date of social network dynamics in any non-human species.

They found that cohesive clustering of the kind where an individual bonds with friends of friends, something scientists call ‘triadic closure,’ was the most consistent factor influencing the long-term dynamics of the social structure of these mammals.

Individual traits, such as sex and social rank, and environmental effects, such as the amount of rainfall and the abundance of prey, also matter, but the ability of individuals to form and maintain social bonds in triads was key.

“Cohesive clusters can facilitate efficient cooperation and hence maximize fitness, and so our study shows that hyenas exploit this advantage. Interestingly, clustering is something done in human societies, from hunter-gatherers to Facebook users,” said Dr Ilany, who is the lead author on the study published in the journal Ecology Letters.

Hyenas, which can live up to 22 years, typically live in large, stable groups known as clans, which can comprise more than 100 individuals.

According to the scientists, hyenas can discriminate maternal and paternal kin from unrelated hyenas and are selective in their social choices, tending to not form bonds with every hyena in the clan, rather preferring the friends of their friends.

They found that hyenas follow a complex set of rules when making social decisions. Males follow rigid rules in forming bonds, whereas females tend to change their preferences over time. For example, a female might care about social rank at one time, but then later choose based on rainfall amounts.

“In spotted hyenas, females are the dominant sex and so they can be very flexible in their social preferences. Females also remain in the same clan all their lives, so they may know the social environment better,” said study co-author Dr Kay Holekamp of Michigan State University.

“In contrast, males disperse to new clans after reaching puberty, and after they disperse they have virtually no social control because they are the lowest ranking individuals in the new clan, so we can speculate that perhaps this is why they are obliged to follow stricter social rules.”

If you like math, you might like this way of measuring ‘triadic closure’:

Triadic closure, Wikipedia.
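To make this concrete, here is a minimal sketch of one common way to quantify triadic closure: the global clustering coefficient, i.e. the fraction of "connected triples" (a hyena with two friends) in which the two friends are also friends with each other. This is only an illustration, not the method used in the Ecology Letters paper; the function name `transitivity` and the tiny friendship list are made up for the example.

```python
from itertools import combinations


def transitivity(edges):
    """Global clustering coefficient of an undirected graph.

    Counts every pair of neighbours of every node (a "connected triple")
    and returns the fraction of those pairs that are themselves linked.
    """
    # Build adjacency sets, ignoring self-loops.
    adj = {}
    for a, b in edges:
        if a == b:
            continue
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)

    triples = 0  # pairs of neighbours sharing a common friend
    closed = 0   # those pairs that are also directly connected
    for nbrs in adj.values():
        for u, v in combinations(nbrs, 2):
            triples += 1
            if v in adj[u]:
                closed += 1
    return closed / triples if triples else 0.0


# Hypothetical friendships: a, b, c form a closed triad; d only knows c.
friendships = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d")]
score = transitivity(friendships)
print(score)
```

In this toy network there are five connected triples and three of them are closed, so the score is 3/5. A clan where friends of friends always became friends would score 1.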

For a program to measure triadic closure, click on the picture:

by John Baez at January 30, 2016 01:00 AM

January 29, 2016

Lubos Motl - string vacua and pheno

Munich: Kane vs Gross
Kane's attitude is the more scientific one

Yesterday, I mentioned Gordon Kane's paper based on his talk in Munich. Today, I noticed that lots of the talk videos are available on their website. The available speakers include Rovelli, Dawid, Pigliucci, Dardashti, Kragh, Achinstein, Schäffer, Smeenk, Kane, Quevedo, Wüthrich, Mukhanov, Ellis, Castellani, Lüst, Hossenfelder, Thebault, and Dvali while others may be added soon.

Among those, I was most interested in Gordon Kane's talk partly because I've read about some fiery exchange with David Gross. And yes, there was one. In the 45-minute talk, it starts around 30:00.

The main claim that ignited the battle was Gordy's assertion that M-theory on \(G_2\) holonomy manifolds with certain assumptions had predicted the mass \(m\approx 126\GeV\) of the Higgs boson before it was discovered; see e.g. Gordon Kane's blog post in December 2011. David Gross responded angrily. I've tried to understand the calculation "completely" with all the details and so far, I have failed. I feel that Gordon would have been able to compress the calculation or its logic if it were a clean one.

On the other hand, I partly do understand how the calculation works, what the assumptions are, and I just find it plausible that it's entirely accurate to say that with those assumptions including some notion of genericity, M-theory on 7D manifolds does produce the prediction of a \(126\GeV\) Higgs without fine-tuning. This statement surely isn't ludicrously wrong like many of the claims that I often criticize on this blog and some very careful researchers (importantly for me, Bobby Acharya) have pretty much joined Gordy in his research and in the summary of their conclusions, too.

Gross' and Kane's attitudes to the exchange were dramatically different. Gordon was focusing on the assumptions, calculations, and predictions; David was all about polls and the mob. "Gordon, you will agree that most of us wouldn't agree that M-theory has predicted the Higgs mass." And so on. Yes, no, what of it? If there's some important enough prediction and you have missed it or you don't understand it, it's your deficiency, David. If a majority of the community doesn't get it, it's the mistake of the members of the community. None of these votes can settle the question whether it's right for Gordon to say that M-theory has made the Higgs mass prediction, especially if most of these people know very well that they haven't even read any of these papers.

(By the way, Gordon phrases his predictions for the superpartner masses as predictions that have gone beyond the stage of "naive naturalness" which is how the people were estimating the masses decades ago. These days, they can work without this philosophy or strategy – as David Gross often categorizes naturalness.)

I think that David was acting like an inquisitor of a sort. The mob doesn't know or doesn't like that you have made that prediction, so you couldn't have done so. Well, that's a very lame criticism, David. With this approach of a bully, I sort of understand why you have sometimes endorsed the climate hysteria, too.

Also, I disagree with one particular claim by Gross, namely his assertion that the situation was in no way analogous to the prediction of Mercury's perihelion precession by general relativity. That was a prediction that would have killed general relativity if it had been falsified. Nothing like that is true in the case of Kane's M-theory predictions, Gross says.

Now, this claim is just rubbish, David. First of all, just like in the case of many of the string/M-theoretical predictions, the precession of Mercury's perihelion wasn't a full-fledged prediction but a postdiction. The precession anomaly had been known for a very long time before general relativity was completed. Einstein only used this postdiction as a confirmation that increased his psychological certainty that he was on the right track (his heart stopped for a second, we have heard) – and Gordon and his collaborators have arguably gone through totally analogous confirmations that have strengthened their belief that their class of compactifications is right (and string theorists – like reportedly Witten – have surely gone through the very same feeling when they learned that string theory postdicted gravity, and perhaps other things). At least, I don't see a glimpse of a real difference between the two situations.

Second, on top of this problem with David's argumentation, it's simply not true that any of these predictions or postdictions would have killed general relativity to the extent that they would have convinced Einstein to abandon it. One could be afraid that we need speculations about Einstein's thinking to know what would have happened if the confirmation hadn't taken place. Fortunately, we know what Einstein would have thought in that case – because someone asked him:
When asked by his assistant what his reaction would have been if general relativity had not been confirmed by Eddington and Dyson in 1919, Einstein famously made the quip: "Then I would feel sorry for the dear Lord. The theory is correct anyway." [15]
Famously enough, in the first edition of the Czech Elegant Universe by Brian Greene, your humble correspondent translated "dear Lord" with the Czech word "lord" indicating Eddington. The quote makes sense in this way as well, doesn't it? ;-) I wasn't too aware of God, the other guy whom Einstein may have had in mind.

But back to the main topic.

Einstein would have definitely not abandoned general relativity. If the bending of light hadn't been observed, he would have looked for other explanations why it wasn't – abandoning GR wouldn't have been among his top choices simply because the theory was beautiful and theoretically robust and was still able to pass some tests of agreement with the well-known physics (the Newtonian limit etc.). Today, many string theorists are actually more eager to abandon string theory for possibly inconclusive reasons than Einstein was ever willing to abandon relativity.

The only possible kind of a difference between the two theories' predictions (GR and M-theory on \(G_2\) manifolds) is the fact that we think that GR is sort of a unique theory while M-theory on \(G_2\) manifolds, even as the "class of generic compactifications on 7D manifolds that Gordon has in mind", is not quite as unique. Even within string theory, there exist other classes of vacua, and even the \(G_2\) compactifications could be studied with somewhat different assumptions about the effective field theory we should get (not MSSM but a different model, and so on).

However, this difference isn't a function of purely intrinsic characteristics of the two theories. GR seems unique today because no one who is sensible and important enough is pushing any real "alternatives" to GR anymore. But these alternatives used to be considered and even Einstein himself wrote papers proposing alternatives or "not quite corrected" versions of GR, especially before 1915.

My point is that in a couple of years, perhaps already in 2020, the accumulated knowledge may be such that it will be absolutely right to say that the situation of GR in 1919 and the situation of M-theory on \(G_2\) manifolds in 2016 were absolutely analogous. By 2020, it may become clear for most of the string theorists that the M-theory compactifications are the only way to go, some predictions – e.g. Gordon's predictions about the SUSY spectrum and cross sections – will have been validated, and all the reasonable people will simply return to the view that M-theory is at least as natural and important as GR and it has made analogous – and in fact, much more striking – predictions as GR.

In other words, the extra hindsight that we have in the case of GR – the fact that GR is an older theory (and has therefore passed a longer sequence of tests) – is the only indisputable qualitative difference between the two situations. I think that every other statement about differences (except for possible statements pointing out some particular bugs in the derivations in the papers by Gordon et al., but Gross has been doing nothing of the sort) is just delusional or demagogic.

Sadly, the amount of energy that average members of the scientific, physics, or string community dedicate to the honest reading of other people's papers has decreased in recent years or a decade or so. But whenever it's true, people should be aware of this limitation of theirs and they should never try to market their laziness as no-go theorems. The fact that you or most of the people in your room don't understand something doesn't mean that it's wrong. And the greater amount of technical developments you have ignored, the greater is the probability that the problem is on your side.

by Luboš Motl at January 29, 2016 04:35 PM

January 28, 2016

Symmetrybreaking - Fermilab/SLAC

A mile-deep campus

Forget wide-open spaces—this campus is in a former gold mine.

Twice a week, when junior Arthur Turner heads to class at Black Hills State University in Spearfish, South Dakota, he takes an elevator to what is possibly the first nearly mile-deep educational campus in the world.

Groundwater sprinkles on his head as he travels 10 minutes and 4850 feet into a gold-mine-turned-research-facility. His goal is to help physicists there search for the origins of dark matter and the properties of neutrinos.

Sanford Underground Research Facility opened in 2007, five years after the closure of the Homestake Gold Mine. The mile of bedrock above acts as a natural shield, blocking most of the radiation that can interfere with sensitive physics experiments.

“On the surface, there are one or two cosmic rays going through your hand every second,” says Jaret Heise, the science director at Sanford Lab. But if you head underground, you reduce that flux by a factor of up to 10 million, to just one or two cosmic rays every month, he says.

Not only do these experiments need to be safeguarded from space radiation, they also need to be safeguarded from their own low levels of radiation.

“Every screw, every piece of material, has to be screened,” says BHSU Underground Campus Lab Director Brianna Mount.

BHSU offered to help Sanford Lab with this in 2014 by funding a cleanroom to maintain the background-radiation-counting detectors used to check incoming materials. Once the materials have been cleared, they can help with current experiments or build the next generation of sensitive instruments.

Heise is particularly excited for the capability to build a new generation of dark matter and neutrino detectors.

“As physics experiments become more and more sensitive, the materials from which they're made need to be that much cleaner,” Heise says. “And that's where these counters come into play, helping us to get the best materials, to fabricate these next-generation experiments."

In return, Sanford Lab offered to host an underground campus for BHSU. Two cleanrooms—one dedicated to physics and the other dedicated to biology—allow students and faculty to conduct a variety of experiments.

The lab finished outfitting the space in September 2015. Even though it’s a mile underground, the counters require their own shielding because the local rock and any nearby ductwork or concrete will give off a small amount of radiation.

Once the lab was fully shielded, a group of students, including Turner, moved in a microscope and two low-background counters. After exiting the freight elevator, also known as the cage, the students walked into an old mine shaft. Then they hiked roughly half a mile to the cleanrooms, meandering through old tunnels with floors that sparkle with mica, a common grain in the bedrock.

“It's just been one of the coolest things that I've ever been a part of … to actually see what physics researchers do,” Turner says.

All three of the instruments the students installed were quickly put to use. Heise expects that they will triple that number this year with the addition of six more detectors from labs and universities across the US.

With the opening of the underground campus, physics students can now work on low-background counting experiments in the mine. And biology students go to sites in the far regions of the mine (the facility extends as far as 8000 feet underground but is mostly buried in water below about 5800 feet) and sample water in order to study the fungi and bacteria that live there with no light and low oxygen. These critters might exist in similar crevasses on Mars or Jupiter’s moons, or they might hold the key to developing new types of antibiotics here on Earth. The students can now bring samples back to the underground laboratory (instead of having to haul them to BHSU’s main campus while packed in dry ice).

Students with non-science majors are using the new campus to their advantage too. “We've also had education majors and even a photography major underground,” Mount says. But that’s not all. Mount welcomes research ideas from students across the US—from different universities down to the littlest scientists, as young as kindergarteners. 

Although lab benches are installed in the cleanroom, it can’t easily accommodate 30 students, a typical class size, and students under the age of 18 legally cannot enter the underground lab. But BHSU has found ways to engage the students who can’t make the trek.

A professor can perform an experiment underground while a Swivl—a small robot that supports an iPad—follows him or her around the lab, streaming video back to a classroom. And the cleanroom microscope is hooked up to the Internet, allowing students to view slides in real time, something they will eventually be able to do from several states away.

by Shannon Hall at January 28, 2016 06:52 PM

January 27, 2016

ZapperZ - Physics and Physicists

Will You Be Doing This Physics Demo For Your Students?
I like my students, and I love physics demos, but I don't think I'll be doing THIS physics demo anytime soon, thankyouverymuch!

It is a neat effect, and if someone else performed this, the media would have proclaimed this as "defying the laws of physics".

Maybe I can do a demo on this on a smaller scale, perhaps using a Barbie doll. And if you ask me how in the world I have a Barbie doll in my possession, I'll send my GI Joe to capture you!


by ZapperZ at January 27, 2016 01:31 PM

John Baez - Azimuth

The Internal Model Principle

“Every good key must be a model of the lock it opens.”

That sentence states an obvious fact, but perhaps also a profound insight if we interpret it generally enough.

That sentence is also the title of a paper:

• Daniel L. Scholten, Every good key must be a model of the lock it opens (the Conant & Ashby Theorem revisited), 2010.

Scholten gives a lot of examples, including these:

• A key is a model of a lock’s keyhole.

• A city street map is a model of the actual city streets

• A restaurant menu is a model of the food the restaurant prepares and sells.

• Honey bees use a kind of dance to model the location of a source of nectar.

• An understanding of some phenomenon (for example a physicist’s understanding of lightning) is a mental model of the actual phenomenon.

This line of thought has an interesting application to control theory. It suggests that to do the best job of regulating some system, a control apparatus should include a model of that system.

Indeed, much earlier, Conant and Ashby tried to turn this idea into a theorem, the ‘good regulator theorem’:

• Roger C. Conant and W. Ross Ashby, Every good regulator of a system must be a model of that system, International Journal of Systems Science 1 (1970), 89–97.

Scholten’s paper is heavily based on this earlier paper. He summarizes it as follows:

What all of this means, more or less, is that the pursuit of a goal by some dynamic agent (Regulator) in the face of a source of obstacles (System) places at least one particular and unavoidable demand on that agent, which is that the agent’s behaviors must be executed in such a reliable and predictable way that they can serve as a representation (Model) of that source of obstacles.

It’s not clear that this is true, but it’s an appealing thought.

A particularly self-referential example arises when the regulator is some organism and the System is the world it lives in, including itself. In this case, it seems the regulator should include a model of itself! This would lead, ultimately, to self-awareness.

It all sounds great. But Scholten raises an obvious question: if Conant and Ashby’s theorem is so great, why isn’t it more well-known? Scholten puts it quite vividly:

Given the preponderance of control-models that are used by humans (the evidence for this preponderance will be surveyed in the latter part of the paper), and especially given the obvious need to regulate that system, one might guess that the C&A theorem would be at least as famous as, say, the Pythagorean Theorem (a^2 + b^2 = c^2), the Einstein mass-energy equivalence (E = mc^2, which can be seen on T-shirts and bumper stickers), or the DNA double helix (which actually shows up in TV crime dramas and movies about super heroes). And yet, it would appear that relatively few lay-persons have ever even heard of C&A’s important prerequisite to successful regulation.

There could be various explanations. But here’s mine: when I tried to read Conant and Ashby’s paper, I got stuck. They use some very basic mathematical notation in nonstandard ways, and they don’t clearly state the hypotheses and conclusion of their theorem.

Luckily, the paper is short, and the argument, while mysterious, seems simple. So, I immediately felt I should be able to dream up the hypotheses, conclusion, and argument based on the hints given.

Scholten’s paper didn’t help much, since he says:

Throughout the following discussion I will assume that the reader has studied Conant & Ashby’s original paper, possesses the level of technical competence required to understand their proof, and is familiar with the components of the basic model that they used to prove their theorem [….]

However, I have a guess about the essential core of Conant and Ashby’s theorem. So, I’ll state that, and then say more about their setup.

Needless to say, I looked around to see if someone else had already done the work of figuring out what Conant and Ashby were saying. The best thing I found was this:

• B. A. Francis and W. M. Wonham, The internal model principle of control theory, Automatica 12 (1976) 457–465.

This paper works in a more specialized context: linear control theory. They’ve got a linear system or ‘plant’ responding to some input, a regulator or ‘compensator’ that is trying to make the plant behave in a desired way, and a ‘disturbance’ that affects the plant in some unwanted way. They prove that to perfectly correct for the disturbance, the compensator must contain an ‘internal model’ of the disturbance.

I’m probably stating this a bit incorrectly. This paper is much more technical, but it seems to be more careful in stating assumptions and conclusions. In particular, they seem to give a precise definition of an ‘internal model’. And I read elsewhere that the ‘internal model principle’ proved here has become a classic result in control theory!

This paper says that Conant and Ashby’s paper provided “plausibility arguments in favor of the internal model idea”. So, perhaps Conant and Ashby inspired Francis and Wonham, and were then largely forgotten.

My guess

My guess is that Conant and Ashby’s theorem boils down to this:

Theorem. Let R and S be finite sets, and fix a probability distribution p on S. Suppose q is any probability distribution on R \times S such that

\displaystyle{ p(s) = \sum_{r \in R} q(r,s)  \; \textrm{for all} \; s \in S}

Let H(p) be the Shannon entropy of p and let H(q) be the Shannon entropy of q. Then

H(q) \ge H(p)

and equality is achieved if there is a function

h: S \to R

such that

q(r,s) = \left\{\begin{array}{cc} p(s)  &  \textrm{if} \; r = h(s) \\                                             0  & \textrm{otherwise}  \end{array} \right.       █

Note that this is not an ‘if and only if’.

The proof of this is pretty easy for anyone who knows a bit about probability theory and entropy. I can restate it using a bit of standard jargon, which may make it more obvious to experts. We’ve got an S-valued random variable, say \textbf{s}. We want to extend it to an R \times S-valued random variable (\textbf{r}, \textbf{s}) whose entropy is as small as possible. Then we can achieve this by choosing a function h: S \to R, and letting \textbf{r} = h(\textbf{s}).

Here’s the point: if we make \textbf{r} be a function of \textbf{s}, we aren’t adding any extra randomness, so the entropy doesn’t go up.
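To see the inequality at work, here is a small numerical sketch in Python. The sets, the distribution p and the function h are toy choices of mine, not anything from Conant and Ashby. It builds two joint distributions q with the same marginal p: one concentrated on the graph of h, and one where the R-component is an independent fair coin.

```python
import math

def entropy(dist):
    """Shannon entropy in bits of a distribution given as {outcome: probability}."""
    return -sum(prob * math.log2(prob) for prob in dist.values() if prob > 0)

def marginal_on_S(q):
    """Sum q(r, s) over r to recover the distribution on S."""
    p = {}
    for (r, s), prob in q.items():
        p[s] = p.get(s, 0.0) + prob
    return p

# Toy data (assumed for illustration): S has three states, R has two.
p = {"a": 0.5, "b": 0.25, "c": 0.25}
h = {"a": 0, "b": 1, "c": 0}  # a function h: S -> R

# q supported on the graph of h: q(r, s) = p(s) if r = h(s), else 0.
q_deterministic = {(h[s], s): prob for s, prob in p.items()}

# q where r is an independent fair coin: adds one full bit of randomness.
q_independent = {(r, s): 0.5 * prob for s, prob in p.items() for r in (0, 1)}

print("H(p)               =", entropy(p))
print("H(q), r = h(s)     =", entropy(q_deterministic))
print("H(q), r independent =", entropy(q_independent))
```

Both joint distributions have p as their marginal on S, but only the first one achieves H(q) = H(p).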

What in the world does this have to do with a good regulator containing a model of the system it’s regulating?

Well, I can’t explain that as well as I’d like—sorry. But the rough idea seems to be this. Suppose that S is a system with a given random behavior, and R is another system, the regulator. If we want the combination of the system and regulator to behave as ‘nonrandomly’ as possible, we can let the state of the regulator be a function of the state of the system.

This theorem is actually a ‘lemma’ in Conant and Ashby’s paper. Let’s look at their setup, and the ‘good regulator theorem’ as they actually state it.

Their setup

Conant and Ashby consider five sets and three functions. In a picture:

The sets are these:

• A set Z of possible outcomes.

• A goal: some subset G \subseteq Z of good outcomes.

• A set D of disturbances, which I might prefer to call ‘inputs’.

• A set S of states of some system that is affected by the disturbances.

• A set R of states of some regulator that is also affected by the disturbances.

The functions are these:

• A function \phi : D \to S saying how a disturbance determines a state of the system.

• A function \rho: D \to R saying how a disturbance determines a state of the regulator.

• A function \psi: S \times R \to Z saying how a state of the system and a state of the regulator determines an outcome.

Of course we want some conditions on these maps. What we want, I guess, is for the outcome to be good regardless of the disturbance. I might say that as follows: for every d \in D we have

\psi(\phi(d), \rho(d)) \in G
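Here is how I would check my version of the condition on a finite example. This is only a sketch: the sets and the maps phi, rho and psi below are toy choices of mine, and is_good_regulator is a hypothetical helper name, not something from the paper.

```python
def is_good_regulator(D, phi, rho, psi, G):
    """True if psi(phi(d), rho(d)) lies in G for every disturbance d in D."""
    return all(psi(phi(d), rho(d)) in G for d in D)

# Toy setup (assumed): disturbances push the system up or down,
# and the outcome is the sum of the system state and the regulator state.
D = {-1, 0, 1}
G = {0}                      # the only good outcome is 0

def phi(d):                  # disturbance -> system state
    return d

def psi(s, r):               # (system state, regulator state) -> outcome
    return s + r

def rho_cancel(d):           # a regulator that exactly cancels the disturbance
    return -d

def rho_idle(d):             # a regulator that does nothing
    return 0

print(is_good_regulator(D, phi, rho_cancel, psi, G))  # cancelling regulator
print(is_good_regulator(D, phi, rho_idle, psi, G))    # idle regulator
```

The cancelling regulator passes the test for every disturbance, while the idle one fails whenever the disturbance is nonzero.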

Unfortunately Conant and Ashby say they want this:

\rho \subset  [\psi^{-1}(G)]\phi

I can’t parse this: they’re using math notation in ways I don’t recognize. Can you figure out what they mean, and whether it matches my guess above?

Then, after a lot of examples and stuff, they state their theorem:

Theorem. The simplest optimal regulator R of a reguland S produces events R which are related to events S by a mapping h: S \to R.

Clearly I’ve skipped over too much! This barely makes any sense at all.

Unfortunately, looking at the text before the theorem, I don’t see these terms being explained. Furthermore, their ‘proof’ introduces extra assumptions that were not mentioned in the statement of the theorem. It begins:

The sets R, S, and Z and the mapping \psi: R \times S \to Z are presumed given. We will assume that over the set S there exists a probability distribution p(S) which gives the relative frequencies of the events in S. We will further assume that the behaviour of any particular regulator R is specified by a conditional distribution p(R|S) giving, for each event in S, a distribution on the regulatory events in R.

Get it? Now they’re saying the state of the regulator R depends on the state of the system S via a conditional probability distribution p(r|s) where r \in R and s \in S. It’s odd that they didn’t mention this earlier! Their picture made it look like the state of the regulator is determined by the ‘disturbance’ via the function \rho: D \to R. But okay.

They’re also assuming there’s a probability distribution on S. They use this and the above conditional probability distribution to get a probability distribution on R.

In fact, the set D and the functions out of this set seem to play no role in their proof!

It’s unclear to me exactly what we’re given, what we get to choose, and what we’re trying to optimize. They do try to explain this. Here’s what they say:

Now p(S) and p(R|S) jointly determine p(R,S) and hence p(Z) and H(Z), the entropy in the set of outcomes:

\displaystyle{ H(Z) = - \sum_{z \in Z} p(z) \log (p(z)) }

With p(S) fixed, the class of optimal regulators therefore corresponds to the class of optimal distributions p(R|S) for which H(Z) is minimal. We will call this class of optimal distributions \pi.

I could write a little essay on why this makes me unhappy, but never mind. I’m used to the habit of using the same letter p to stand for probability distributions on lots of different sets: folks let the argument of p say which set they have in mind at any moment. So, they’re starting with a probability distribution on S and a conditional probability distribution on r \in R given s \in S. They’re using these to determine a probability distribution on R \times S. Then, presumably using the map \psi: S \times R \to Z, they get a probability distribution on Z. H(Z) is the entropy of the probability distribution on Z, and for some reason they are trying to minimize this.

(Where did the subset G \subseteq Z of ‘good’ outcomes go? Shouldn’t that play a role? Oh well.)
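To make sure I understand this pipeline, here is a minimal Python sketch of it: start with p(S) and p(R|S), push the joint distribution forward along \psi to get p(Z), and compute H(Z). The two-state system, the two regulators and the choice of \psi (exclusive or) are toy assumptions of mine, not taken from the paper.

```python
import math
from collections import defaultdict

def outcome_entropy(p_S, p_R_given_S, psi):
    """Entropy in bits of Z = psi(s, r), where s ~ p_S and r ~ p_R_given_S[s]."""
    p_Z = defaultdict(float)
    for s, prob_s in p_S.items():
        for r, prob_r in p_R_given_S[s].items():
            p_Z[psi(s, r)] += prob_s * prob_r   # joint p(r, s), pushed forward to Z
    return -sum(prob * math.log2(prob) for prob in p_Z.values() if prob > 0)

# Toy setup (assumed): S = R = {0, 1}, outcome is s XOR r.
p_S = {0: 0.5, 1: 0.5}

def psi(s, r):
    return s ^ r

# Regulator whose state is a function of the system state: r = h(s) = s.
deterministic = {0: {0: 1.0}, 1: {1: 1.0}}

# Regulator that ignores the system and flips a fair coin.
coin_flip = {0: {0: 0.5, 1: 0.5}, 1: {0: 0.5, 1: 0.5}}

print("H(Z), r = h(s):   ", outcome_entropy(p_S, deterministic, psi))
print("H(Z), coin flip:  ", outcome_entropy(p_S, coin_flip, psi))
```

In this toy case the deterministic regulator makes the outcome constant, so H(Z) vanishes, while the coin-flipping regulator leaves a full bit of entropy in Z.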

I believe the claim is that when this entropy is minimized, there’s a function h : S \to R such that

p(r|s) = 1 \; \textrm{if} \; r = h(s)

This says that the state of the regulator should be completely determined by the state of the system. And this, I believe, is what they mean by

Every good regulator of a system must be a model of that system.

I hope you understand: I’m not worrying about whether the setup is a good one, e.g. sufficiently general for real-world applications. I’m just trying to figure out what the setup actually is, what Conant and Ashby’s theorem actually says, and whether it’s true.

I think I’ve just made a lot of progress. Surely this was no fun to read. But I found it useful to write it.

by John Baez at January 27, 2016 01:00 AM

January 26, 2016

Symmetrybreaking - Fermilab/SLAC

Our imperfect vacuum

The emptiest parts of the universe aren’t so empty after all.

In the Large Hadron Collider, two beams of protons race around a 17-mile ring more than 1 million times before slamming into each other inside the massive particle detectors.

But rogue particles inside the beam pipes can pick off protons with premature collisions, reducing the intensity of the beam. As a result, the teams behind the LHC and other physics experiments around the world take great care to scrub their experimental spaces of as many unwanted particles as possible.

Ideally, they would conduct their experiments in a perfect vacuum. The problem is that there’s no such thing.

Even after a thorough evacuating, the LHC’s beam pipes contain about 3 million molecules per cubic centimeter. That density of particles is similar to what you would find 620 miles above Earth.

“In the real world [a perfect vacuum] doesn’t happen,” says Linda Valerio, a mechanical engineer who works on the vacuum system at Fermi National Accelerator Laboratory. “Scientists are able to determine the acceptable level of vacuum required for each experiment, and engineers design the vacuum system to that level. The better the vacuum must be, the more cost and effort associated with achieving and maintaining it.”

Humans have been thinking about vacuums for thousands of years. Ancient philosophers called atomists argued that the world was made up of two elements: atoms and the void. Aristotle argued that nature would not allow a void to exist; particles around a vacuum would always move to fill it. And in the early 1700s, Isaac Newton argued that what seemed to be empty space was actually filled with an element called aether, a medium through which light could travel.

Physicists now know that even what appears to be empty space contains particles. Spaces between galaxies are known to contain a few hydrogen atoms per cubic meter.

No space is ever truly empty. Virtual particles pop in and out of existence everywhere. Virtual particles appear in matter-antimatter pairs and annihilate one another almost instantly. But they can interact with actual particles, which is how scientists find evidence of their existence.

Another inhabitant of the void is the faint thermal radiation left over from the big bang. It exists as a pattern of photons called the cosmic microwave background.

“When we think about vacuums, we generally think about [the absence of] particles with mass,” says Seth Digel, a SLAC experimental physicist who works with the Kavli Institute for Particle Astrophysics and Cosmology. “But if you expand the definition to include photons, to include the microwave background, then there isn’t any part of space that’s really empty.”

It turns out the universe is a little less lonely than previously thought.

by Signe Brewster at January 26, 2016 03:44 PM

January 24, 2016

Geraint Lewis - Cosmic Horizons

Journey to the Far-Side of the Sun
There was a movie, in the old days, Journey to the Far-Side of the Sun (also known as Doppelgänger) which (spoiler alert) posits that there is a mirror version of the Earth hidden on the other side of the Sun, sharing the orbit with our Earth. The idea is that this planet would always be hidden behind the Sun, and so we would not know it is there.

This idea comes up a lot, over and over again. In fact, it came up again last week on twitter. But there's a problem. It assumes the Earth is on a circular orbit.

I won't go into the details here, but one of the greatest insights in astronomy was the discovery of Kepler's laws of planetary motion, telling us that planets move on elliptical orbits. With this, there was the realisation that planets can't move at uniform speeds, but travel quickly when closer to the Sun, while slowing down as their orbits carry them to larger distances.
There has been a lot of work examining orbits in the Solar System, and you can simply locate the position of a planet along its orbit. So it is similarly simple to consider two planets sharing the same orbit, but starting at different locations, one at the closest approach to the Sun, one at the farthest.

Let's start with a simple circular orbit with two planets. Everything here is scaled to the Earth's orbit, and the circles in the figures coming up are not to scale. But here's an instance in the orbit.

It should be obvious that at all points in the orbit, the planets remain exactly on opposite sides of the Sun, and so would not be visible to each other.

So, here's a way of conveying this. The x-axis is the point in the orbit (in Earth Years) while the y-axis is the distance a light ray between the two planets passes from the centre of the Sun (blue line). The red line is the radius of the Sun (in Astronomical Units).
The blue line, as expected, is at zero. The planets remain hidden from each other.

Let's take a more eccentric orbit, with an eccentricity of 0.1. Here is the orbit.
This doesn't look too different to the circular case above. The red circle in there is the location of the closest approach of each line of sight to the centre of the Sun, which is no longer a point. Let's take a look at the separation plot as before. Again, the red is the radius of the Sun.
Wow! For small segments of the planets' orbits, they are hidden from one another, but for most of the orbit, the light between the planets passes at large distances from the Sun. Now, it might be tricky to see each other directly due to the glare of the Sun, but opportunities such as eclipses would mean the planets should be visible to one another.

But an eccentricity of 0.1 is much more than that of the Earth, whose orbit is much closer to a circle with an eccentricity of 0.0167086. Here's the orbit plot again.
So, the lines of sight between the planets pass closer to the centre of the Sun than in the more eccentric case, as you would expect. What about the separation plot?
Excellent! As we saw before, for a large part of the orbits, the light paths between the planets pass outside the Sun! If the Earth did have a twin in the same orbit, it would be visible (modulo the glare of the Sun) for most of the year! We have never seen our Doppelgänger planet!
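If you want to play with this yourself, here is a rough Python sketch of the calculation. It is not the code behind the plots above: it simply solves Kepler's equation for two planets half an orbital period apart on the same ellipse and measures how far the straight line between them passes from the centre of the Sun. The solar radius of 0.00465 AU and all the function names are my own assumptions for illustration.

```python
import math

R_SUN_AU = 0.00465  # approximate solar radius in astronomical units (assumed value)

def eccentric_anomaly(M, e, tol=1e-12):
    """Solve Kepler's equation M = E - e*sin(E) for E by Newton iteration."""
    E = M
    for _ in range(50):
        dE = (E - e * math.sin(E) - M) / (1.0 - e * math.cos(E))
        E -= dE
        if abs(dE) < tol:
            break
    return E

def position(M, e, a=1.0):
    """Planet position (x, y) in AU, with the Sun at the origin (a focus of the ellipse)."""
    E = eccentric_anomaly(M, e)
    return a * (math.cos(E) - e), a * math.sqrt(1.0 - e * e) * math.sin(E)

def sightline_distance(M, e):
    """Distance from the Sun's centre to the line joining two planets half a period apart."""
    x1, y1 = position(M, e)
    x2, y2 = position(M + math.pi, e)
    return abs(x1 * y2 - x2 * y1) / math.hypot(x2 - x1, y2 - y1)

def hidden_fraction(e, steps=3600):
    """Fraction of the orbital period during which the sightline passes through the Sun."""
    hidden = sum(
        sightline_distance(2.0 * math.pi * k / steps, e) < R_SUN_AU
        for k in range(steps)
    )
    return hidden / steps

for e in (0.0, 0.1, 0.0167086):
    print(f"e = {e}: hidden for {100 * hidden_fraction(e):.1f}% of the orbit")
```

The sketch ignores the glare of the Sun and any perturbations from other planets; it only asks whether the geometric line of sight clears the solar disc.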

Now, you might complain that maybe the other Earth is on the same elliptical orbit but flipped so we are both at closest approach at the same time, always being exactly on the other side of the Sun from one another. Maybe, but orbital mechanics is a little more complex than that, especially with a planet like Jupiter in the Solar System. Its tugs would be different on the Earth and its (evil?) twin, and so the orbits would subtly differ over time.

It is pretty hard to hide a planet in the inner Solar System!

by Cusp at January 24, 2016 11:20 PM

Jester - Resonaances

Gunpowder Plot: foiled
Just a week ago I hailed the new king, and already there was an assassination attempt. A new paper claims that the statistical significance of the 750 GeV diphoton excess is merely 2 sigma local. The  story is being widely discussed in the corridors and comment sections because we all like to watch things die...  The assassins used this plot:

The Standard Model prediction for the diphoton background at the LHC is difficult to calculate from first principles. Therefore,  the ATLAS collaboration assumes a theoretically motivated functional form for this background as a function of the diphoton invariant mass. The ansatz contains a number of free parameters, which are then fitted using the data in the entire analyzed range of invariant masses. This procedure leads to the prediction represented by the dashed line in the plot (but see later). The new paper assumes a slightly more complicated functional form with more free parameters, such that the slope of the background is allowed to change.  The authors argue that their more general  ansatz provides a better fit to the entire diphoton spectrum, and moreover predicts a larger background for the large invariant masses.  As a result, the significance of the 750 GeV excess decreases to an insignificant value of 2 sigma.
There are several problems with this claim.  First, I'm confused why the blue line is described as the ATLAS fit, since it is clearly different than the background curve in the money-plot provided by ATLAS (Fig. 1 in ATLAS-CONF-2015-081). The true ATLAS background is above the blue line, and much closer to the black line in the peak region (edit: it seems now that the background curve plotted by ATLAS corresponds to a1=0  and one more free parameter for an overall normalization, while the paper assumes fixed normalization). Second, I cannot reproduce the significance quoted in the paper. Taking the two ATLAS bins around 750 GeV, I find 3.2 sigma excess using the true ATLAS background, and 2.6 sigma using the black line (edit: this is because my  estimate is too simplistic, and the paper also takes into account the uncertainty on the background curve). Third, the postulated change of slope is difficult to justify theoretically. It would mean there is a new background component kicking in at ~500 GeV, but this does not seem to be the case in this analysis.

Finally, the problem with the black line is that it grossly overshoots the high mass tail, which is visible even to the naked eye. To be more quantitative, in the range 790-1590 GeV there are 17 diphoton events observed by ATLAS, the true ATLAS background predicts 19 events, and the black line predicts 33 events. Therefore, the background shape proposed in the paper is inconsistent with the tail at the 3 sigma level! While the alternative background choice decreases the significance at the 750 GeV peak, it simply moves (and amplifies) the tension to another place.
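For the curious, the size of that tension can be estimated with a back-of-the-envelope Poisson calculation. The Python sketch below is a simplified stand-in, not necessarily the estimate used in this post: it takes the event counts quoted above, treats the background prediction as exact (no uncertainty on the curve), and converts the one-sided probability of seeing so few events into Gaussian standard deviations. The function names are made up for illustration.

```python
import math
from statistics import NormalDist

def poisson_cdf(n, mu):
    """P(N <= n) for a Poisson distribution with mean mu."""
    term = math.exp(-mu)      # P(N = 0)
    total = term
    for k in range(1, n + 1):
        term *= mu / k        # P(N = k) from P(N = k - 1)
        total += term
    return total

def deficit_significance(observed, expected):
    """One-sided p-value for a deficit, and the equivalent number of sigmas."""
    p_value = poisson_cdf(observed, expected)
    return p_value, -NormalDist().inv_cdf(p_value)

for label, expected in (("ATLAS background", 19), ("black line", 33)):
    p_value, sigmas = deficit_significance(17, expected)
    print(f"{label}: p = {p_value:.4f}, about {sigmas:.1f} sigma")
```

With these inputs the black line comes out close to 3 sigma away from the observed count, while the ATLAS background is comfortably compatible with it.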

So, I think the plot is foiled and the  claim does not stand scrutiny.  The 750 GeV peak may well be just a statistical fluctuation that will go away when more data is collected, but it's unlikely to be a stupid error on the part of ATLAS. The king will live at least until summer.

by Jester at January 24, 2016 01:33 PM

January 23, 2016

Jester - Resonaances

Higgs force awakens
The Higgs boson couples to particles that constitute matter around us, such as electrons, protons, and neutrons. Its virtual quanta are constantly being exchanged between these particles. In other words, it gives rise to a force - the Higgs force. I'm surprised that this PR-cool aspect is not explored in our outreach efforts. Higgs bosons mediate the Higgs force in the same fashion as gravitons, gluons, photons, W and Z bosons mediate the gravitational, strong, electromagnetic, and weak forces. Just like gravity, the Higgs force is always attractive and its strength is proportional, in the first approximation, to the particle's mass. It is a force in a common sense; for example, if we bombarded a detector long enough with a beam of particles interacting only via the Higgs force, they would eventually knock off atoms in the detector.

There is of course a reason why the Higgs force is less discussed: it has never been detected directly. Indeed, in the absence of midi-chlorians it is extremely weak. First, it shares the feature of the weak interactions of being short-ranged: since the mediator is massive, the interaction strength is exponentially suppressed at distances larger than an attometer (10^-18 m), about 0.1% of the diameter of a proton. Moreover, for ordinary matter, the weak force is more important because of the tiny Higgs couplings to light quarks and electrons. For example, for the proton the Higgs force is a thousand times weaker than the weak force, and for the electron it is a hundred thousand times weaker. Finally, there are no known particles interacting only via the Higgs force and gravity (though dark matter in some hypothetical models has this property), so in practice the Higgs force is always a tiny correction to more powerful forces that shape the structure of atoms and nuclei. This is again in contrast to the weak force, which is particularly relevant for neutrinos, which are immune to strong and electromagnetic forces.
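The attometer figure follows from the usual rule that a force carried by a particle of mass m has a range of about hbar/(m c). Here is a small Python sketch of that estimate. The numerical inputs (a Higgs mass of about 125 GeV, a W mass of about 80.4 GeV, and hbar*c = 197.327 MeV fm) are standard values I am plugging in; they are not quoted in the text above, and the function names are made up.

```python
import math

HBAR_C_MEV_FM = 197.327   # hbar * c in MeV * femtometres (assumed standard value)

def yukawa_range_m(mass_gev):
    """Range hbar / (m c) of a force mediated by a particle of the given mass, in metres."""
    range_fm = HBAR_C_MEV_FM / (mass_gev * 1000.0)   # mass converted to MeV
    return range_fm * 1e-15

def yukawa_suppression(distance_m, mass_gev):
    """Exponential factor exp(-r / range) in a Yukawa potential."""
    return math.exp(-distance_m / yukawa_range_m(mass_gev))

higgs_mass_gev = 125.0    # approximate measured Higgs mass (assumption)
w_mass_gev = 80.4         # approximate W boson mass, for comparison (assumption)

print("Higgs force range: %.2e m" % yukawa_range_m(higgs_mass_gev))
print("Weak force range:  %.2e m" % yukawa_range_m(w_mass_gev))
print("Suppression at 1e-17 m: %.2e" % yukawa_suppression(1e-17, higgs_mass_gev))
```

The range comes out between one and two attometers, and the exponential factor already kills the interaction at distances only ten times larger.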

Nevertheless, this new paper argues that the situation is not hopeless, and that the current experimental sensitivity is good enough to start probing the Higgs force. The authors propose to do it by means of atomic spectroscopy. Frequency measurements of atomic transitions have reached the stunning accuracy of order 10^-18. The Higgs force creates a Yukawa type potential between the nucleus and orbiting electrons, which leads to a shift of the atomic levels. The effect is tiny; in particular, it is always smaller than the analogous shift due to the weak force. This is a serious problem, because calculations of the leading effects may not be accurate enough to extract the subleading Higgs contribution. Fortunately, there may be tricks to reduce the uncertainties. One is to measure the isotope shift of transition frequencies for several isotope pairs. The theory says that the leading atomic interactions should give rise to a universal linear relation (the so-called King's relation) between isotope shifts for different transitions. The Higgs and weak interactions should lead to a violation of King's relation. Given many uncertainties plaguing calculations of atomic levels, it may still be difficult to ever claim a detection of the Higgs force. More realistically, one can try to set limits on the Higgs couplings to light fermions which will be better than the current collider limits.

Atomic spectroscopy is way above my head, so I cannot judge if the proposal is realistic. There are a few practical issues to resolve before the Higgs force is mastered into a lightsaber. However, it is possible that a new front to study the Higgs boson will be opened in the near future. These studies will provide information about the Higgs couplings to light Standard Model fermions, which is complementary to the information obtained from collider searches.

by Jester at January 23, 2016 05:23 PM

January 20, 2016

Axel Maas - Looking Inside the Standard Model

More similar than expected
Some time ago I wrote about a project that a master's student and I had embarked upon: Using a so-called supersymmetric theory - or SUSY theory for short - to better understand ordinary theories.

Well, this work has come to fruition, both in the form of the completion of the master project as well as new insights written up in a paper. This time I would like to present these results a little bit.

To start, let me briefly rehearse what we did, and why. One of the aims of our research is to better understand how the theories we use to describe nature work. A particular feature of these theories is redundancy. This redundancy makes many calculations possible, but at the same time introduces new problems, mainly about how to get unique results.

Now, in theories like the standard model, we have all problems at the same time: All the physics, all the formalism, and especially all the redundancy. But this is a tedious mess. It is therefore best to reduce the complexity and solve one problem at a time. This is done by taking a simpler theory, which has only one of the problems. This is what we did.

We took a (maximal) SUSY theory. In such a theory, the supersymmetry is very constraining, and a lot of features are exactly known. But the implications of redundancy are not. So we hoped that by applying the same procedures we use to deal with the redundancy in ordinary theories to this theory, we could check whether our approach is valid.

Of course, the first, and expected, finding was that even a very constraining theory is not simple. When it comes to technical details, anything interesting becomes a hard problem. So it required a lot of grinding work before we got results. I will not bore you with the details. If you want them, you can find them in the paper. No, here I want to discuss the final result.

The first finding was a rather comforting one. Doing the same things to this theory that we do to ordinary theories did not do too much damage. Using these approximations, the final results were still in agreement with what we do know exactly about this theory. This was a relief, because this lends a tiny amount of support more to what we are usually doing.

The real surprise was, however, a very different one. We knew that this theory shows a very different kind of behavior than all the theories we are usually dealing with. So we did expect that, even if our methods work, the results will still be drastically different from the other cases we are dealing with. But this was not so.

To understand better what we have found, it is necessary to know that this theory is similar in structure to a conventional theory. This conventional one is a theory of gluons, but without quarks to make the strong interactions complete. In the SUSY theory, we also have gluons. In addition, we have further new particles, which are needed to make the theory supersymmetric.

The first surprise was that the gluons behaved unexpectedly similarly to their counterparts in the original theory. Of course, there are differences, but these differences were expected. They came from the differences between the two theories. But where they could be similar, they were. And not roughly so, but surprisingly precisely so. We have an idea why this could be the case, because there is one structural property, which is very restricting, and which appears in both theories. But we know that this is not enough, as we know other theories where this is still different, despite also having this one constraining structure. Since the way in which the gluons are similar is strongly influenced by the redundancy features of both theories, we can hope that this means we are treating the redundancy in a reasonable way.

The second surprise was that the new particles mirror the behavior of the gluons. Even though these particles are connected by supersymmetry to the gluons, the connection would have allowed many possible shapes of relations. But no, the relation is an almost exact mirror. And this time, there is no constraining structure which gives us a clue why, out of all possible relations, this one is picked. However, this is again related to redundancy, and perhaps, just speculating here, this could indicate more about how this redundancy works.

In total, we have learned quite a lot. We have more support for what we are doing in ordinary theories. We have seen that some structures might be more universal than expected. And we may even have a clue in which direction we could learn more about how to deal with the redundancy in more immediately relevant theories.

by Axel Maas at January 20, 2016 05:24 PM

Symmetrybreaking - Fermilab/SLAC

Is the neutrino its own antiparticle?

The mysterious particle could hold the key to why matter won out over antimatter in the early universe.

Almost every particle has an antimatter counterpart: a particle with the same mass but opposite charge, among other qualities. 

This seems to be true of neutrinos, tiny particles that are constantly streaming through us. Judging by the particles released when a neutrino interacts with other matter, scientists can tell when they’ve caught a neutrino versus an antineutrino. 

But certain characteristics of neutrinos and antineutrinos make scientists wonder: Are they one and the same? Are neutrinos their own antiparticles?

This isn’t unheard of. Gluons and even Higgs bosons are thought to be their own antiparticles. But if scientists discover neutrinos are their own antiparticles, it could be a clue as to where they get their tiny masses—and whether they played a part in the existence of our matter-dominated universe. 

Dirac versus Majorana

The idea of the antiparticle came about in 1928 when British physicist Paul Dirac developed what became known as the Dirac equation. His work sought to explain what happened when electrons moved at close to the speed of light. But his calculations resulted in a strange requirement: that electrons sometimes have negative energy.

“When Dirac wrote down his equation, that’s when he learned antiparticles exist,” says André de Gouvêa, a theoretical physicist and professor at Northwestern University. “Antiparticles are a consequence of his equation.”

In 1932, physicist Carl Anderson discovered the antimatter partner of the electron that Dirac had foreseen. He called it the positron, a particle like an electron but with a positive charge.

Dirac predicted that, in addition to having opposite charges, antimatter partners should have another opposite feature called chirality, which represents one of the inherent quantum properties a particle has. A particle can have either a right-handed or left-handed chirality.

Dirac’s equation allowed for neutrinos and antineutrinos to be different particles, and, as a result, four types of neutrino were possible: neutrinos with left- and right-handed chirality and antineutrinos with left- and right-handed chirality. 

But if the neutrinos had no mass, as scientists thought at the time, only left-handed neutrinos and right-handed antineutrinos needed to exist.

In 1937, Italian physicist Ettore Majorana debuted another theory: Neutrinos and antineutrinos are actually the same thing. The Majorana equation described neutrinos that, if they happened to have mass after all, could turn into antineutrinos and then back into neutrinos again. 


Artwork by Sandbox Studio, Chicago with Ana Kova

The matter-antimatter imbalance

Whether neutrino masses were zero remained a mystery until 1998, when the Super-Kamiokande and SNO experiments found they do indeed have very small masses—an achievement recognized with the 2015 Nobel Prize for Physics. Since then, experiments have cropped up across Asia, Europe and North America searching for hints that the neutrino is its own antiparticle.

The key to finding this evidence is something called lepton number conservation. Scientists consider it a fundamental law of nature that lepton number is conserved, meaning that the number of leptons and anti-leptons involved in an interaction should remain the same before and after the interaction occurs.

Scientists think that, just after the big bang, the universe should have contained equal amounts of matter and antimatter. The two types of particles should have interacted, gradually canceling one another until nothing but energy was left behind. Somehow, that’s not what happened.

Finding out that lepton number is not conserved would open up a loophole that would allow for the current imbalance between matter and antimatter. And neutrino interactions could be the place to find that loophole.

Neutrinoless double-beta decay

Scientists are looking for lepton number violation in a process called double beta decay, says SLAC theorist Alexander Friedland, who specializes in the study of neutrinos.

In its common form, double beta decay is a process in which a nucleus decays into a different nucleus and emits two electrons and two antineutrinos. This balances leptonic matter and antimatter both before and after the decay process, so it conserves lepton number.

If neutrinos are their own antiparticles, it’s possible that the antineutrinos emitted during double beta decay could annihilate one another and disappear, violating lepton number conservation. This is called neutrinoless double beta decay.
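The lepton-number bookkeeping behind these two decays can be written out explicitly. The short Python sketch below is only an illustration: it assigns the conventional lepton numbers (+1 for an electron, -1 for an antineutrino, 0 for a nucleus) and compares the totals before and after each decay. The particle labels and the helper function are hypothetical names chosen for this example.

```python
# Conventional lepton numbers: leptons +1, antileptons -1, nuclei 0.
LEPTON_NUMBER = {"nucleus": 0, "electron": +1, "antineutrino": -1}

def lepton_number(particles):
    """Total lepton number of a list of particle labels."""
    return sum(LEPTON_NUMBER[name] for name in particles)

initial = ["nucleus"]

# Ordinary double beta decay: two electrons and two antineutrinos are emitted.
two_neutrino = ["nucleus", "electron", "electron", "antineutrino", "antineutrino"]

# Neutrinoless double beta decay: only the two electrons come out.
neutrinoless = ["nucleus", "electron", "electron"]

for label, final in (("ordinary", two_neutrino), ("neutrinoless", neutrinoless)):
    change = lepton_number(final) - lepton_number(initial)
    print(f"{label} double beta decay: change in lepton number = {change}")
```

The ordinary decay leaves the total unchanged, while the neutrinoless version would create two units of lepton number out of nothing.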

Such a process would favor matter over antimatter, creating an imbalance.

“Theoretically it would cause a profound revolution in our understanding of where particles get their mass,” Friedland says. “It would also tell us there has to be some new physics at very, very high energy scales—that there is something new in addition to the Standard Model we know and love.”

It’s possible that neutrinos and antineutrinos are different, and that there are two neutrino and anti-neutrino states, as called for in Dirac’s equation. The two missing states could be so elusive that physicists have yet to spot them.

But spotting evidence of neutrinoless double beta decay would be a sign that Majorana had the right idea instead—neutrinos and antineutrinos are the same.

“These are very difficult experiments,” de Gouvêa says. “They’re similar to dark matter experiments in the sense they have to be done in very quiet environments with very clean detectors and no radioactivity from anything except the nucleus you're trying to study.”

Physicists are still evaluating their understanding of the elusive particles.

“There have been so many surprises coming out of neutrino physics,” says Reina Maruyama, a professor at Yale University associated with the CUORE neutrinoless double beta decay experiment. “I think it’s really exciting to think about what we don’t know.”

by Signe Brewster at January 20, 2016 05:00 PM

January 19, 2016

Symmetrybreaking - Fermilab/SLAC

A speed trap for dark matter

Analyzing the motion of X-ray sources could help researchers identify dark matter signals.  

Dark matter or not dark matter? That is the question when it comes to the origin of intriguing X-ray signals scientists have found coming from space.

In a theory paper published today in Physical Review Letters, scientists have suggested a surprisingly simple way of finding the answer: by setting up a speed trap for the enigmatic particles.

Eighty-five percent of all matter in the universe is dark: It doesn’t emit light, nor does it interact much with regular matter other than through gravity.

The nature of dark matter remains one of the biggest mysteries of modern physics. Most researchers believe that the invisible substance is made of fundamental particles, but so far they’ve evaded detection. One way scientists hope to prove their particle assumption is by searching the sky for energetic light that would emerge when dark matter particles decayed or annihilated each other in space.

Over the past couple of years, several groups analyzing data from two X-ray satellites—the European Space Agency’s XMM-Newton and NASA’s Chandra X-ray space observatories—reported the detection of faint X-rays with a well-defined energy of 3500 electronvolts (3.5 keV). The signal emanated from the center of the Milky Way; its nearest neighbor galaxy, Andromeda; and a number of galaxy clusters.

Some scientists believe it might be a telltale sign of decaying dark matter particles called sterile neutrinos—hypothetical heavier siblings of the known neutrinos produced in fusion reactions in the sun, radioactive decays and other nuclear processes. However, other researchers argue that there could be more mundane astrophysical origins such as hot gases.

There might be a straightforward way of distinguishing between the two possibilities, suggest researchers from Ohio State University and the Kavli Institute for Particle Astrophysics and Cosmology, a joint institute of Stanford University and the US Department of Energy's SLAC National Accelerator Laboratory.

It involves taking a closer look at the Doppler shifts of the X-ray signal. The Doppler effect is the shift of a signal to higher or lower frequencies depending on the relative velocity between the signal source and its observer. It’s used, for instance, in roadside speed traps by the police, but it could also help astrophysicists “catch” dark matter particles.

“On average, dark matter moves differently than gas,” says study co-author Ranjan Laha from KIPAC. “Dark matter has random motion, whereas gas rotates with the galaxies to which it is confined. By measuring the Doppler shifts in different directions, we can in principle tell whether a signal—X-rays or any other frequency—stems from decaying dark matter particles or not.”

Researchers would even know if the signal were caused by the observation instrument itself, because then the Doppler shift would be zero for all directions.

Although the approach is promising, it can’t yet be applied to the 3.5-keV X-rays because the associated Doppler shifts are very small. Current instruments either don’t have enough energy resolution for the analysis or they don’t operate in the right energy range.
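To get a feel for how small these shifts are, here is a rough estimate using the first-order Doppler formula, dE = E * v / c. The velocity of 200 km/s is an assumed, illustrative value of the order of galactic speeds; it is not a number taken from the paper.

```python
SPEED_OF_LIGHT_KM_S = 299_792.458  # exact by definition of the metre

def doppler_energy_shift_ev(line_energy_ev, velocity_km_s):
    """First-order (non-relativistic) Doppler shift: dE = E * v / c."""
    return line_energy_ev * velocity_km_s / SPEED_OF_LIGHT_KM_S

line_energy_ev = 3500.0          # the 3.5 keV line discussed in the article
assumed_velocity_km_s = 200.0    # illustrative assumption, not from the paper

shift_ev = doppler_energy_shift_ev(line_energy_ev, assumed_velocity_km_s)
fractional_shift = shift_ev / line_energy_ev

print(f"shift: {shift_ev:.2f} eV, fractional: {fractional_shift:.1e}")
```

Under that assumption the shift is a couple of electronvolts on a 3500 eV line, a fractional change of less than a tenth of a percent, which is why the energy resolution of the instrument matters so much.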

However, this situation may change very soon with ASTRO-H, an X-ray satellite of the Japan Aerospace Exploration Agency, whose launch is planned for early this year. As the researchers show in their paper, it will have just the right specifications to return a verdict on the mystery X-ray line. Dark matter had better watch its speed.

by Manuel Gnida at January 19, 2016 03:11 PM

January 17, 2016

Jester - Resonaances

750 and what next
A week has passed since the LHC jamboree, but the excitement about the 750 GeV diphoton excess has not abated. So far, the scenario from 2011 repeats itself. A significant but not definitive signal is spotted in the early data set by the ATLAS and CMS experiments. This announcement is wrapped in multiple layers of caution and skepticism by experimentalists, but is universally embraced by theorists. What is unprecedented is the scale of the theorists' response, which took the form of a hep-ph tsunami. I still need time to digest this feast, and pick out the interesting bits among the general citation fishing. So today I won't write about the specific models in which the 750 GeV particle could fit: I promise a post on that after the New Year (anyway, the short story is that, oh my god, it could be just anybody). Instead, I want to write about one point that was elucidated by the early papers, namely that the diphoton resonance signal is unlikely to be on its own, and there should be accompanying signals in other channels. In the best case scenario, confirmation of the diphoton signal may come from analyzing the existing data in other channels collected this year or in run-1.

First of all, there should be a dijet signal. Since the new particle is almost certainly produced via gluon collisions, it must be able to decay to gluons as well, by time-reversing the production process. This would show up at the LHC as a pair of energetic jets with an invariant mass of 750 GeV. Moreover, in the simplest models the 750 GeV particle decays to gluons most of the time. The precise dijet rate is very model-dependent, and in some models it is too small to ever be observed, but typical scenarios predict dijet cross sections of order 1-10 picobarns. This would mean that thousands of such events have been produced in the LHC run-1 and this year in run-2. The plot on the right shows one example of a parameter space (green) overlaid with contours of dijet cross section (red lines) and limits from dijet resonance searches in run-1 with 8 TeV proton collisions (red area). Dijet resonance searches are routine at the LHC; however, experimenters usually focus on the high-energy end of the spectrum, far above 1 TeV invariant mass. In fact, the 750 GeV region is not covered at all by the recent LHC searches at 13 TeV proton collision energy.
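The step from picobarn cross sections to "thousands of events" is just cross section times integrated luminosity. A minimal sketch, assuming a round integrated luminosity of 3 inverse femtobarns purely for illustration (the post itself does not quote a luminosity, and the cross section also differs between 8 and 13 TeV):

```python
FB_PER_PB = 1000.0  # 1 picobarn = 1000 femtobarns

def expected_events(cross_section_pb, luminosity_fb_inv):
    """Number of produced events: N = sigma * integrated luminosity."""
    return cross_section_pb * FB_PER_PB * luminosity_fb_inv

assumed_luminosity_fb_inv = 3.0  # hypothetical round number, not from the post

events_low = expected_events(1.0, assumed_luminosity_fb_inv)
events_high = expected_events(10.0, assumed_luminosity_fb_inv)

print(f"1 pb  -> {events_low:.0f} events")
print(f"10 pb -> {events_high:.0f} events")
```

These are produced events before any detector acceptance or selection efficiency, so the number that survives in an actual search would be smaller.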

The next important conclusion is that there should be matching signals in other diboson channels at the 750 GeV invariant mass. For the 125 GeV Higgs boson, the signal was originally discovered in both the γγ and the ZZ final states, while in the WW channel the signal is currently similarly strong. If the 750 GeV particle were anything like the Higgs, the resonance should actually first show in the ZZ and WW final states (due to the large coupling to longitudinal polarizations of vector bosons which is a characteristic feature of Higgs-like particles). From the non-observation of anything interesting in run-1 one can conclude that there must be little Higgsiness in the 750 GeV particle, less than 10%. Nevertheless, even if the particle has nothing to do with the Higgs (for example, if it's a pseudo-scalar), it should still decay to diboson final states once in a while. This is because a neutral scalar cannot couple directly to photons, and the coupling has to arise at the quantum level through some other new electrically charged particles, see the diagram above. The latter couple not only to photons but also to Z bosons, and sometimes to W bosons too. While the details of the branching fractions are highly model-dependent, diboson signals with rates comparable to the diphoton one are generically predicted. In this respect, the decays of the 750 GeV particle to one photon and one Z boson emerge as a new interesting battleground. For the 125 GeV Higgs boson, decays to Zγ have not been observed yet, but in the heavier mass range the sensitivity is apparently better. ATLAS made a search for high-mass Zγ resonances in the run-1 data, and their limits already put non-trivial constraints on some models explaining the 750 GeV excess. Amusingly, the ATLAS Zγ search has a 1 sigma excess at 730 GeV... CMS has no search in this mass range at all, and both experiments are yet to analyze the run-2 data in this channel.
So, in principle, it is quite possible that we learn something interesting even before the new round of collisions starts at the LHC.

Another generic prediction is that there should be vector-like quarks or other new colored particles just around the corner. As mentioned above, such particles are necessary to generate an effective coupling of the 750 GeV particle to photons and gluons. In order for those couplings to be large enough to explain the observed signal, at least one of the new states should have mass below ~1.5 TeV. Limits on vector-like quarks depend on what they decay to, but the typical sensitivity in run-1 is around 800 GeV. In run-2, CMS already presented a search for a charge 5/3 quark decaying to a top quark and a W boson, and they were able to improve the run-1 limits on the new quark's mass from 800 GeV up to 950 GeV. Limits on other types of new quarks should follow shortly.

On a somewhat more speculative note, ATLAS claims that the best fit to the data is obtained if the 750 GeV resonance is wider than the experimental resolution. While the statistical significance of this statement is not very high, it would have profound consequences if confirmed. A large width is possible only if the 750 GeV particle decays to final states other than photons and gluons. An exciting possibility is that the large width is due to decays to a new hidden sector with new light particles very weakly or not at all coupled to the Standard Model. If these particles do not leave any trace in the detector then the signal is the same monojet signature as that of dark matter: an energetic jet emitted before the collision without matching activity on the other side of the detector. In fact, dark matter searches in run-1 practically exclude the possibility that the large width can be accounted for solely by invisible decays (see comments #2 and #13 below). However, if the new particles in the hidden sector couple weakly to the known particles, they can decay back to our sector, possibly after some delay, leading to complicated exotic signals in the detector. This is the so-called hidden valley scenario that my fellow blogger has been promoting for some time. If the 750 GeV particle is confirmed to have a large width, the motivation for this kind of new physics will become very strong. Many of the possible signals that one can imagine in this context are yet to be searched for.

Dijets, dibosons, monojets, vector-like quarks, hidden valley... experimentalists will have their hands full this winter. A negative result in any of these searches would not strongly disfavor the diphoton signal, but would provide important clues for model building. A positive signal would make all hell break loose, assuming it hasn't yet. So, we are waiting eagerly for further results from the LHC, which should show up around the time of the Moriond conference in March. Watch out for rumors on blogs and Twitter ;)

by Jester at January 17, 2016 07:11 PM

The n-Category Cafe

A Compositional Framework for Markov Processes

Last summer my students Brendan Fong and Blake Pollard visited me at the Centre for Quantum Technologies, and we figured out how to understand open continuous-time Markov chains! I think this is a nice step towards understanding the math of living systems.

Admittedly, it’s just a small first step. But I’m excited by this step, since Blake and I have been trying to get this stuff to work for a couple years, and it finally fell into place. And we think we know what to do next. Here’s our paper:

And here’s the basic idea….

Open detailed balanced Markov processes

A continuous-time Markov chain is a way to specify the dynamics of a population which is spread across some finite set of states. Population can flow between the states. The larger the population of a state, the more rapidly population flows out of the state. Because of this property, under certain conditions the populations of the states tend toward an equilibrium where at any state the inflow of population is balanced by its outflow.

In applications to statistical mechanics, we are often interested in equilibria such that for any two states connected by an edge, say $i$ and $j$, the flow from $i$ to $j$ equals the flow from $j$ to $i$. A continuous-time Markov chain with a chosen equilibrium having this property is called ‘detailed balanced’.

I’m getting tired of saying ‘continuous-time Markov chain’, so from now on I’ll just say ‘Markov process’, just because it’s shorter. Okay? That will let me say the next sentence without running out of breath:

Our paper is about open detailed balanced Markov processes.

Here’s an example:

The detailed balanced Markov process itself consists of a finite set of states together with a finite set of edges between them, with each state $i$ labelled by an equilibrium population $q_i > 0$, and each edge $e$ labelled by a rate constant $r_e > 0$.

These populations and rate constants are required to obey an equation called the ‘detailed balance condition’. This equation means that in equilibrium, the flow from $i$ to $j$ equals the flow from $j$ to $i$. Do you see how it works in this example?
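For readers who like to poke at numbers, here is a minimal sketch of the detailed balance condition on a made-up three-state chain. Everything in it is an illustrative assumption: the populations, the rates, and the convention that `H[i][j]` is the rate constant for the edge from state `j` to state `i` (this matches the matrix notation used further down in this post, but the chain is not the one from the picture). The sketch checks the condition and then integrates the master equation with plain forward Euler steps to watch the populations relax to `q`.

```python
# Hypothetical chain 0 <-> 1 <-> 2; all numbers are illustrative.
q = [1.0, 2.0, 4.0]  # chosen equilibrium populations, q_i > 0

# H[i][j] is the rate constant for the edge from state j to state i.
H = [
    [0.0, 1.0, 0.0],
    [2.0, 0.0, 0.5],
    [0.0, 1.0, 0.0],
]

def is_detailed_balanced(H, q, tol=1e-12):
    """Check that the flow j -> i equals the flow i -> j at populations q."""
    n = len(q)
    return all(
        abs(H[i][j] * q[j] - H[j][i] * q[i]) <= tol
        for i in range(n)
        for j in range(n)
    )

def master_equation_rhs(H, p):
    """dp_i/dt = sum_j (H[i][j] * p[j] - H[j][i] * p[i])."""
    n = len(p)
    return [
        sum(H[i][j] * p[j] - H[j][i] * p[i] for j in range(n))
        for i in range(n)
    ]

def evolve(H, p, dt=0.01, steps=5000):
    """Integrate the master equation with forward Euler steps."""
    for _ in range(steps):
        dp = master_equation_rhs(H, p)
        p = [p_i + dt * dp_i for p_i, dp_i in zip(p, dp)]
    return p

balanced = is_detailed_balanced(H, q)
final_p = evolve(H, [7.0, 0.0, 0.0])  # same total population as q

print(balanced, [round(x, 6) for x in final_p])
```

Starting with all the population in state 0, the closed chain settles down to the chosen equilibrium, and at that point the flow along each edge is matched by the flow back.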

To get an ‘open’ detailed balanced Markov process, some states are designated as inputs or outputs. In general each state may be specified as both an input and an output, or as inputs and outputs multiple times. See how that’s happening in this example? It may seem weird, but it makes things work better.

People usually say Markov processes are all about how probabilities flow from one state to another. But we work with un-normalized probabilities, which we call ‘populations’, rather than probabilities that must sum to 1. The reason is that in an open Markov process, probability is not conserved: it can flow in or out at the inputs and outputs. We allow it to flow both in and out at both the input states and the output states.

Our most fundamental result is that there’s a category $\mathrm{DetBalMark}$ where a morphism is an open detailed balanced Markov process. We think of it as a morphism from its inputs to its outputs.

We compose morphisms in $\mathrm{DetBalMark}$ by identifying the output states of one open detailed balanced Markov process with the input states of another. The populations of identified states must match. For example, we may compose this morphism $N$:

with the previously shown morphism $M$ to get this morphism $M \circ N$:

And here’s our second most fundamental result: the category $\mathrm{DetBalMark}$ is actually a dagger compact category. This lets us do other stuff with open Markov processes. An important one is ‘tensoring’, which lets us take two open Markov processes like $M$ and $N$ above and set them side by side, giving $M \otimes N$:

The compactness is also important. This means we can take some inputs of an open Markov process and turn them into outputs, or vice versa. For example, using the compactness of $\mathrm{DetBalMark}$ we can get this open Markov process from $M$:

In fact all the categories in our paper are dagger compact categories, and all our functors preserve this structure. Dagger compact categories are a well-known framework for describing systems with inputs and outputs, so this is good.

The analogy to electrical circuits

In a detailed balanced Markov process, population can flow along edges. In the detailed balanced equilibrium, without any flow of population from outside, the flow from state $i$ to state $j$ will be matched by the flow back from $j$ to $i$. The populations need to take specific values for this to occur.

In an electrical circuit made of linear resistors, charge can flow along wires. In equilibrium, without any driving voltage from outside, the current along each wire will be zero. The potentials will be equal at every node.

This sets up an analogy between detailed balanced continuous-time Markov chains and electrical circuits made of linear resistors! I love analogy charts, so this makes me very happy:

Circuits       Detailed balanced Markov processes
potential      population
current        flow
conductance    rate constant
power          dissipation

This analogy is already well known. Schnakenberg used it in his book Thermodynamic Network Analysis of Biological Systems. So, our main goal is to formalize and exploit it. This analogy extends from systems in equilibrium to the more interesting case of nonequilibrium steady states, which are the main topic of our paper.

Earlier, Brendan and I introduced a way to ‘black box’ a circuit and define the relation it determines between potential-current pairs at the input and output terminals. This relation describes the circuit’s external behavior as seen by an observer who can only perform measurements at the terminals.

An important fact is that black boxing is ‘compositional’: if one builds a circuit from smaller pieces, the external behavior of the whole circuit can be determined from the external behaviors of the pieces. For category theorists, this means that black boxing is a functor!

Our new paper with Blake develops a similar ‘black box functor’ for detailed balanced Markov processes, and relates it to the earlier one for circuits.

When you black box a detailed balanced Markov process, you get the relation between population–flow pairs at the terminals. (By the ‘flow at a terminal’, we more precisely mean the net population outflow.) This relation holds not only in equilibrium, but also in any nonequilibrium steady state. Thus, black boxing an open detailed balanced Markov process gives its steady state dynamics as seen by an observer who can only measure populations and flows at the terminals.

The principle of minimum dissipation

At least since the work of Prigogine, it’s been widely accepted that a large class of systems minimize entropy production in a nonequilibrium steady state. But people still fight about the precise boundary of this class of systems, and even the meaning of this ‘principle of minimum entropy production’.

For detailed balanced open Markov processes, we show that a quantity we call the ‘dissipation’ is minimized in any steady state. This is a quadratic function of the populations and flows, analogous to the power dissipation of a circuit made of resistors. We make no claim that this quadratic function actually deserves to be called ‘entropy production’. Indeed, Schnakenberg has convincingly argued that they are only approximately equal.

But still, the ‘dissipation’ function is very natural and useful—and Prigogine’s so-called ‘entropy production’ is also a quadratic function.
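Here is a small numerical sanity check of the minimum dissipation idea, on a hypothetical three-state chain where states 0 and 2 are terminals with populations held fixed and state 1 is free. The quadratic function used below, a sum over edges of a conductance times the squared difference of `p_i / q_i`, is written by analogy with the power of a resistor circuit; its overall normalization is an assumption of this sketch and is not claimed to match the convention in the paper.

```python
# Hypothetical chain 0 <-> 1 <-> 2; all numbers are illustrative.
q = [1.0, 2.0, 4.0]  # equilibrium populations

# H[i][j] is the rate constant for the edge from state j to state i.
H = [
    [0.0, 1.0, 0.0],
    [2.0, 0.0, 0.5],
    [0.0, 1.0, 0.0],
]

# One conductance per undirected edge, using C_ij = H_ij * q_j.
conductance = {
    (0, 1): H[0][1] * q[1],
    (1, 2): H[1][2] * q[2],
}

def dissipation(p):
    """Quadratic function of the populations, circuit-power style."""
    return sum(
        c * (p[i] / q[i] - p[j] / q[j]) ** 2
        for (i, j), c in conductance.items()
    )

# Terminal populations are held fixed from outside (assumed values).
p0, p2 = 3.0, 4.0

# Steady state at the interior state 1: inflow equals outflow.
steady_p1 = (H[1][0] * p0 + H[1][2] * p2) / (H[0][1] + H[2][1])

# Brute-force scan of the interior population for the minimum.
candidates = [k / 100 for k in range(0, 801)]
best_p1 = min(candidates, key=lambda p1: dissipation([p0, p1, p2]))

print(steady_p1, best_p1)
```

In this toy case the interior population that makes the master equation stationary is the same one that minimizes the quadratic function with the terminals held fixed, which is the shape of the statement made above.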

Black boxing

I’ve already mentioned the category $\mathrm{DetBalMark}$, where a morphism is an open detailed balanced Markov process. But our paper needs two more categories to tell its story! There’s the category of circuits, and the category of linear relations.

A morphism in the category $\mathrm{Circ}$ is an open electrical circuit made of resistors: that is, a graph with each edge labelled by a ‘conductance’ $c_e > 0$, together with specified input and output nodes:

A morphism in the category $\mathrm{LinRel}$ is a linear relation $L : U \rightsquigarrow V$ between finite-dimensional real vector spaces $U$ and $V$. This is nothing but a linear subspace $L \subseteq U \oplus V$. Just as relations generalize functions, linear relations generalize linear functions!

In our previous paper, Brendan and I introduced these two categories and a functor between them, the ‘black box functor’:

$$\blacksquare \colon \mathrm{Circ} \to \mathrm{LinRel}$$

The idea is that any circuit determines a linear relation between the potentials and net current flows at the inputs and outputs. This relation describes the behavior of a circuit of resistors as seen from outside.

Our new paper introduces a black box functor for detailed balanced Markov processes:

$$\square \colon \mathrm{DetBalMark} \to \mathrm{LinRel}$$

We draw this functor as a white box merely to distinguish it from the other black box functor. The functor $\square$ maps any detailed balanced Markov process to the linear relation obeyed by populations and flows at the inputs and outputs in a steady state. In short, it describes the steady state behavior of the Markov process ‘as seen from outside’.

How do we manage to black box detailed balanced Markov processes? We do it using the analogy with circuits!

The analogy becomes a functor

Every analogy wants to be a functor. So, we make the analogy between detailed balanced Markov processes and circuits precise by turning it into a functor:

$$K \colon \mathrm{DetBalMark} \to \mathrm{Circ}$$

This functor converts any open detailed balanced Markov process into an open electrical circuit made of resistors. This circuit is carefully chosen to reflect the steady-state behavior of the Markov process. Its underlying graph is the same as that of the Markov process. So, the ‘states’ of the Markov process are the same as the ‘nodes’ of the circuit.

Both the equilibrium populations at states of the Markov process and the rate constants labelling edges of the Markov process are used to compute the conductances of edges of this circuit. In the simple case where the Markov process has exactly one edge from any state $i$ to any state $j$, the rule is this:

$$C_{i j} = H_{i j} q_j$$

where:

  • $q_j$ is the equilibrium population of the $j$th state of the Markov process,
  • $H_{i j}$ is the rate constant for the edge from the $j$th state to the $i$th state of the Markov process, and
  • $C_{i j}$ is the conductance (that is, the reciprocal of the resistance) of the wire from the $j$th node to the $i$th node of the resulting circuit.

The detailed balance condition for Markov processes says precisely that the matrix $C_{i j}$ is symmetric! This is just right for an electrical circuit made of resistors, since it means that the resistance of the wire from node $i$ to node $j$ equals the resistance of the same wire in the reverse direction, from node $j$ to node $i$.
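To see the symmetry claim in action, here is a minimal sketch that applies the rule quoted above (conductance equals rate constant times equilibrium population of the source state) to two made-up rate matrices. The numbers and the helper names `conductances` and `is_symmetric` are hypothetical and chosen only for illustration.

```python
def conductances(H, q):
    """Apply the rule C[i][j] = H[i][j] * q[j] to every pair of states."""
    n = len(q)
    return [[H[i][j] * q[j] for j in range(n)] for i in range(n)]

def is_symmetric(C, tol=1e-12):
    """True if C[i][j] equals C[j][i] for all i, j (within tol)."""
    n = len(C)
    return all(
        abs(C[i][j] - C[j][i]) <= tol for i in range(n) for j in range(n)
    )

q = [1.0, 2.0, 4.0]  # illustrative equilibrium populations

# H[i][j] is the rate constant for the edge from state j to state i.
H_balanced = [
    [0.0, 1.0, 0.0],
    [2.0, 0.0, 0.5],
    [0.0, 1.0, 0.0],
]

# Same chain with one rate changed, so detailed balance fails for q.
H_unbalanced = [
    [0.0, 1.0, 0.0],
    [2.0, 0.0, 1.0],
    [0.0, 1.0, 0.0],
]

C_balanced = conductances(H_balanced, q)
C_unbalanced = conductances(H_unbalanced, q)

print(C_balanced, is_symmetric(C_balanced))
print(C_unbalanced, is_symmetric(C_unbalanced))
```

With the balanced rates every edge gets the same conductance in both directions, so the result makes sense as a resistor network. With the modified rates the matrix is no longer symmetric, and there is no ordinary resistor circuit to assign to it.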

A triangle of functors

If you paid careful attention, you’ll have noticed that I’ve described a triangle of functors:

And if you’ve got the tao of category theory flowing in your veins, you’ll be wondering if this diagram commutes.

In fact, this triangle of functors does not commute! However, a general lesson of category theory is that we should only expect diagrams of functors to commute up to natural isomorphism, and this is what happens here:

The natural transformation $\alpha$ ‘corrects’ the black box functor for resistors to give the one for detailed balanced Markov processes.

The functors $\square$ and $\blacksquare \circ K$ are actually equal on objects. An object in $\mathrm{DetBalMark}$ is a finite set $X$ with each element $i \in X$ labelled by a positive population $q_i$. Both functors map this object to the vector space $\mathbb{R}^X \oplus \mathbb{R}^X$. For the functor $\square$, we think of this as a space of population-flow pairs. For the functor $\blacksquare \circ K$, we think of it as a space of potential-current pairs. The natural transformation $\alpha$ then gives a linear relation

$$\alpha_{X,q} \colon \mathbb{R}^X \oplus \mathbb{R}^X \rightsquigarrow \mathbb{R}^X \oplus \mathbb{R}^X$$

in fact an isomorphism of vector spaces, which converts potential-current pairs into population-flow pairs in a manner that depends on the $q_i$. I’ll skip the formula; it’s in the paper.

But here’s the key point. The naturality of $\alpha$ actually allows us to reduce the problem of computing the functor $\square$ to the problem of computing $\blacksquare$. Suppose

$$M \colon (X,q) \to (Y,r)$$

is any morphism in $\mathrm{DetBalMark}$. The object $(X,q)$ is some finite set $X$ labelled by populations $q$, and $(Y,r)$ is some finite set $Y$ labelled by populations $r$. Then the naturality of $\alpha$ means that this square commutes:

Since $\alpha_{X,q}$ and $\alpha_{Y,r}$ are isomorphisms, we can solve for the functor $\square$ as follows:

$$\square(M) = \alpha_{Y,r} \circ \blacksquare K(M) \circ \alpha_{X,q}^{-1}$$

This equation has a clear intuitive meaning! It says that to compute the behavior of a detailed balanced Markov process, namely $\square(M)$, we convert it into a circuit made of resistors and compute the behavior of that, namely $\blacksquare K(M)$. This is not equal to the behavior of the Markov process, but we can compute that behavior by converting the input populations and flows into potentials and currents, feeding them into our circuit, and then converting the outputs back into populations and flows.

What we really did

So that’s a sketch of what we did, and I hope you ask questions if it’s not clear. But I also hope you read our paper! Here’s what we actually do in there. After an introduction and summary of results:

  • Section 3 defines open Markov processes and the open master equation.
  • Section 4 introduces detailed balance for open Markov processes.
  • Section 5 recalls the principle of minimum power for open circuits made of linear resistors, and explains how to black box them.
  • Section 6 introduces the principle of minimum dissipation for open detailed balanced Markov processes, and describes how to black box these.
  • Section 7 states the analogy between circuits and detailed balanced Markov processes in a formal way.
  • Section 8 describes how to compose open Markov processes, making them into the morphisms of a category.
  • Section 9 does the same for detailed balanced Markov processes.
  • Section 10 describes the ‘black box functor’ that sends any open detailed balanced Markov process to the linear relation describing its external behavior, and recalls the similar functor for circuits.
  • Section 11 makes the analogy between open detailed balanced Markov processes and open circuits even more formal, by making it into a functor. We prove that together with the two black box functors, this forms a triangle that commutes up to natural isomorphism.
  • Section 12 is about geometric aspects of this theory. We show that linear relations in the image of these black box functors are Lagrangian relations between symplectic vector spaces. We also show that the master equation can be seen as a gradient flow equation.
  • Section 13 is a summary of what we have learned.

Finally, Appendix A is a quick tutorial on decorated cospans. This is a key mathematical tool in our work, developed by Brendan in an earlier paper.

by john at January 17, 2016 04:04 AM

The n-Category Cafe

Homotopy of Operads and Grothendieck-Teichmüller Groups

Benoit Fresse has finished a big two-volume book on operads, which you can now see on his website:

He writes:

The first aim of this book project is to give an overall reference, starting from scratch, on the application of methods of algebraic topology to operads. To be more specific, one of our main objectives is the development of a rational homotopy theory for operads. Most definitions, notably fundamental concepts of operad and homotopy theory, are carefully reviewed in order to make our account accessible to a broad readership, which should include graduate students, as well as researchers coming from the various fields of mathematics related to our main topics.

The second purpose of the book is to explain, from a homotopical viewpoint, a deep relationship between operads and Grothendieck-Teichmüller groups. This connection, which has been foreseen by M. Kontsevich (from researches on the deformation quantization process in mathematical physics), gives a new approach to understanding internal symmetries of structures occurring in various constructions of algebra and topology. In the book, we set up the background required by an in-depth study of this subject, and we make precise the interpretation of the Grothendieck-Teichmüller group in terms of the homotopy of operads. The book is actually organized for this ultimate objective, which readers can take either as a main motivation or as a leading example to learn about general theories.

The first volume is over 500 pages:

Contents: Introduction to the general theory of operads. Introduction to $E_n$-operads. Relationship between $E_2$-operads and (braided) monoidal categories. Applications of Hopf algebras to the Malcev completion of groups, of groupoids, and of operads in groupoids. Operadic definition of the Grothendieck-Teichmüller groups and of the set of Drinfeld’s associators. Appendices on free operads, trees and the cotriple resolution of operads.

The second volume is over 700 pages:

Contents: Introduction to general methods of the theory of model categories. The homotopy theory of modules, algebras, and the rational homotopy of spaces. The (rational) homotopy of operads. Applications of the rational homotopy theory to $E_n$-operads. Homotopy spectral sequences and the computation of homotopy automorphism spaces of operads. Applications to $E_2$-operads and the homotopy interpretation of the Grothendieck-Teichmüller group. Appendix on cofree cooperads and the Koszul duality of operads.

by john at January 17, 2016 03:24 AM

The n-Category Cafe

Thinking about Grothendieck

Here’s a new piece:

It’s short. I’ll quote just enough to make you want to read more.

During the early 60’s his conversations had a secure calmness. He would offer mathematical ideas with a smile that always had an expanse of generosity in it. Firm feet on the ground; sometimes barefoot. Transparency: his feelings towards people, towards things, were straightforwardly felt, straightforwardly expressed — often garnished with a sprig of morality. But perhaps the word ‘morality’ doesn’t set the right tone: one expects a dour or dire music to accompany any moral message. Grothendieck’s opinions, observations, would be delivered with an upbeat, an optimism, a sense that “nothing could be easier in the world” than to view things as he did. In fact, as many people have mentioned, Grothendieck didn’t butt against obstacles, but rather he arranged for obstacles to be dissolved even before he approached them. The mathematical road, he would seem to say, shows itself to be ‘the correct way’ by how easy it is to travel along it. This is, of course, a vastly different ‘ease’ than what was an intellectual abomination to Grothendieck: something he called, with horror, “tourner la manivelle” (or ‘cranking it out’).

by john at January 17, 2016 03:23 AM

January 16, 2016

Sean Carroll - Preposterous Universe

Quantum Fluctuations
[View the story “Quantum Fluctuations” on Storify]

by Sean Carroll at January 16, 2016 10:50 PM

Jon Butterworth - Life and Physics



Last updated:
February 09, 2016 08:21 PM
All times are UTC.