Particle Physics Planet

February 21, 2018

Christian P. Robert - xi'an's og

Larry Brown (1940-2018)

Just learned a few minutes ago that my friend Larry Brown has passed away today, after fiercely fighting cancer till the end. My thoughts of shared loss and deep support first go to my friend Linda, his wife, and to their children. And to all their colleagues and friends at Wharton. I have known Larry for all of my career, from working on his papers during my PhD to being a temporary tenant in his Cornell University office in White Hall while he was mostly away on sabbatical during the academic year 1988-1989, and then periodically meeting with him in Cornell and then Wharton along the years. He and Linda were always unbelievably welcoming and I fondly remember many times at their place or in superb restaurants in Philly and elsewhere. And of course remembering just as fondly the many chats we had along these years about decision theory, admissibility, James-Stein estimation, and all aspects of mathematical statistics he loved and managed at an ethereal level of abstraction. His book on exponential families remains to this day one of the central books in my library, to which I kept referring on a regular basis… For certain, I will miss the friend and the scholar along the coming years, but keep returning to this book and have shared memories coming back to me as I will browse through its yellowed pages and typewriter style. Farewell, Larry, and thanks for everything!

by xi'an at February 21, 2018 07:48 PM

ZapperZ - Physics and Physicists

The Dark Life Of The Higgs Boson
I decided to modify a bit the title of the Symmetry article that I'm linking to, because in that article, the possible link between the Higgs boson and dark matter is made. This allows for the study of the decay of the Higgs to be used to detect the presence of dark matter.

The Standard Model not only predicts all the different possible decays of Higgs bosons, but how favorable each decay is. For instance, it predicts that about 60 percent of Higgs bosons will transform into a pair of bottom quarks, whereas only 0.2 percent will transform into a pair of photons. If the experimental results show Higgs bosons decaying into certain particles more or less often than predicted, it could mean that a few Higgs bosons are sneaking off and transforming into dark matter.

Of course, these kinds of precision measurements cannot tell scientists if the Higgs is evolving into dark matter as part of its decay path—only that it is behaving strangely. To catch the Higgs in the act, scientists need irrefutable evidence of the Higgs schmoozing with dark matter.

So there you have it.

If you are not up to speed on the discovery of the Higgs (i.e. you've been living under a rock for the past few years), I've mentioned a link to a nice update here.


by ZapperZ at February 21, 2018 02:09 PM

Lubos Motl - string vacua and pheno

Limited belief in experts: hard anecdotal facts vs experts' lore
SUSY and Ledecká's idiosyncrasies

People are pretty much divided into two groups: those who divide people into two groups and those who don't. ;-) Also, they're divided into those who love to defend the status of "widely respected experts" and those who despise any "authorities".

Richard Feynman has said that "science is the belief in the ignorance of experts". On the other hand, his colleague Murray Gell-Mann, when I debated these things with him during the 2005 Sidneyfest, was mocking Feynman whose teeth were completely decaying etc. because he didn't trust experts (and e.g. the superstition that one should brush his teeth). The two men have clearly stood on the opposite sides of the axis I want to discuss. Both of them have been immensely successful which proves that "you don't have to be exactly in the middle".

Most people choose to be in the middle when it comes to lots of opinions. It's a convenient attitude. The golden mean often ends up being rather extreme. The contemporary postmodern, extreme, politically correct attitudes have become so widespread in the West because the extreme leftists were able to convince the "convenient people in the middle" that joining the extreme left-wing cult is the right way to stay in the middle which is so important. That's why whole nations such as Germany are full of psychopaths defending lunacies (such as the open-door immigration policies) who scream that they're sane.

This tweet from The New York Times contains a video that showed all the female contestants in super-G which was won by snowboarder Ester Ledecká (CZE). Don't forget that the actual race was a slalom so the trajectories have been straightened up. She was going on the left side from the likes of Goggia (ITA) and Vonn (USA). That's not surprising because she has primarily been a snowboarder and that is a left-wing sport. ;-) Also, the great finish has been more important for her than the beginning of her run.

During her run, the Czech public TV was airing the ice-hockey match against Canada – it's not surprising given the popularity of ice-hockey in Czechia and the continuing success of our team: the Czech team got into the semifinals after it outscored both Canada and the U.S. (today) in the penalty shootouts. Nevertheless, with some delay, we could see how the reporters commented on the race.

In the studio, they had Mr Petr Vichnar – who is, along with Mr Robert Záruba, one of the sports reporting superstars who began their career as skillful young men during the late advanced socialism. And he had an expert in the studio, too: Ms Lucie Hrstková-Pešánová (she will be referred to as Ms Lucie) who competed in Alpine skiing a decade ago, when Czechia couldn't quite compare to the global elite yet.

Their narration has been widely discussed in the Czech media – and in the comment sections of the Czech Internet. Up to the very finish, you couldn't have figured out that Ledecká's run was great. Well, in fact, you would have believed it was absolutely terrible. In particular, Ms Lucie was bombarding Ledecká with remarks about constant mistakes. Three mistakes were painted as nearly fatal ones. Like most other commenters, I think that Ms Lucie was speaking in a tone that was mocking Ms Ledecká, too. I could hear "it's so cute that Ledecká even dares to compete against the real skiers, like what I used to be" in between the lines.

Now, I have no doubt that her comments about Ledecká's mistakes were partially based on some kind of an expertise that is inaccessible to us, the mortals. Folks like Ms Lucie have been fed lots of the wisdom about the right way to behave in every curve, about the need to stick to the optimal path and where the optimal path is. And she has internalized lots of this wisdom by attaching lots of her own experience. So if you wanted to hear an expert and be assured that someone is looking at Ledecká with all this expertise, you could have been satisfied.

But she and Mr Vichnar had to be missing something essential because after Ledecká was torn to pieces by them, she completed the run and won the race. The previous sentence was meant to be completely analogous to Feynman's "something has to be wrong because the airplanes don't land" in the cargo cult science talk.

Clearly, what they have been missing was her time. Her time was promising throughout the run and it showed up in green thrice – indicating that she was the leader. For some reason, they didn't notice. Because time decides about the winner, everything else should adapt to the desire to improve the time. You may have some knowledge about the optimal path and the right way to bend your body in one curve or another. But if this knowledge doesn't materially help to improve your time – or if ignoring the knowledge doesn't hurt someone's time – then the knowledge isn't terribly valuable. It is effectively false.

The wisdom about the optimal path and other things may be shared by the community of coaches, some of the accomplished athletes of the past, and some pure theorists, too. But is it really true and essential? Isn't it just some group think, a bunch of collectively shared superstitions or half-truths? If a young woman manages to win while she ignores most of it, there is a pretty good reason to think that this lore – or group think – isn't so true or at least isn't so essential, isn't there?

When the final time showed that Ledecká won the event, Ms Lucie changed her tone dramatically. She started to yell and her high-pitched voice made it impossible to convey any useful information (from Mr Petr Vichnar) at that moment. I think that this yelling could have been partly staged – she felt she needed to quickly compensate for her ludicrously negative reporting during the soon-to-be-legendary run itself.

So while I don't think that some catastrophe has occurred because Ledecká's run was commented on very negatively while it was in progress, I do agree with most of the commenters under the articles who think that Ms Lucie – and even Mr Vichnar – did a rather poor, perhaps embarrassing, job in this case. I would even say that the TV viewers were being misled about some key information during the run. Mr Petr Vichnar would have to err dozens of times to lose most of his credibility in my eyes – and he would have some credibility left even afterwards.

Supersymmetry: belief despite the LHC

OK, so in the example above – I could obviously pick hundreds of other, recent or less recent, examples but I wanted to have an example in mind – I was among the people who prefer to trust "hard facts" and not the status of some "respected experts" such as Ms Lucie. But it's not my "dogmatic attitude". From hundreds of opposing examples, let me pick the following one:
Like a majority of experts who would agree that their field is fundamental high-energy physics, I keep on thinking that supersymmetry is exploited by Nature at some higher energy scale, despite the negative results of the LHC's search for new physics (including supersymmetry) so far.
You could say that in this case, and in many others, I am on the side of the "expertise against the hard facts". The LHC has said something negative about supersymmetry so far but I still think that it's very likely that supersymmetry is relevant in Nature.

In some cases, I am pro-hard-facts, in others, I am pro-subtle-expertise, if you wish. Is it a contradiction? Of course, it is not one. It is not a contradiction because the two situations aren't equivalent. In particular, there is a fundamental difference:
Ms Ester Ledecká's gold medal sharply and rigorously rules out the claim that she has made some fatal mistakes during her run.


The failure to find supersymmetry at the LHC so far doesn't directly imply that supersymmetry isn't there in Nature. It is at most some circumstantial evidence capable of quantitatively reducing our confidence in supersymmetry.
These have been two examples of mine. In some cases, the hard facts are more important than some possible respectability of experts such as Ms Lucie because the true disagreement (the truly big question we are assumed to care about) is about some particular quantity – like the competitiveness of her time – and some hard facts completely settle the question.

On the other hand, in some cases, like the case of SUSY, hard facts such as the invisibility of supersymmetry at the LHC don't directly settle the "big questions". So it makes sense to keep on treating the broader, less tangible, but more abstract arguments known to the experts as comparably important as before. String theory requires supersymmetry in the promising realistic vacua so there has to be supersymmetry in Nature, not to mention a few other arguments that are still standing.

In the real world, we encounter lots of situations. Are they more similar to the case of Ledecká's alleged mistakes where the gold medal seems to be the ultimate hard fact that settles the discussion? Or are they more similar to supersymmetry searched for at the LHC which simply doesn't have enough capacity to decide some truly big questions about supersymmetry in Nature?

My point is that real-world questions may fall on both sides of this dichotomy – and everywhere in between, too. Sometimes it's good to dismiss the comments by the experts because they have really been shown invalid (i.e. shown to be a collectively shared superstition or group think) by some hard facts; sometimes, it isn't the case because the hard facts – anecdotal evidence – just aren't enough to change our knowledge about some bigger, more general questions.

Yes, I often end up being on the pro-expert side in these discussions, as well as the anti-expert side.

What seems remarkable to me is that a big majority of the people who love to comment on things are either "fanatically pro-expert" or "fanatically anti-expert" activists. You may figure out which of these two camps is theirs – and their opinions about basically everything become completely predictable (as long as one possible answer to the question is much more defended by some "respected authorities"). People in the pro-expert camp will defend the experts and the "respected authorities" despite any facts, including the hardest ones; and people in the anti-expert camp will be satisfied with arbitrarily weak, vaguely related, anecdotal evidence to strengthen their view that all experts are crooks and there's nothing valuable in any expertise in the world at all.

Needless to say, I think that both of these extreme camps are comparably naive and borderline dishonest. There can't be systematic progress without any experts or expertise at all; but there can't be systematic progress when experts or authorities are considered infallible, either. In many questions that are affected by this pro-expert/anti-expert tension, your thinking should be more nuanced and if you just bring your pro-expert or anti-expert prejudices, you're just an animal whose presence in the debates is counterproductive.

by Luboš Motl at February 21, 2018 12:12 PM

Peter Coles - In the Dark

Why I’m taking part in the UCU Strike Action

In case you weren’t aware, from tomorrow (22nd February) the University and College Union (UCU) is taking industrial action over proposed drastic cuts to staff pensions funded by the Universities Superannuation Scheme (USS). You can find some background to the pensions dispute here (and in related articles). A clear explanation of why the employers’ justification for these cuts is little more than fraudulent is given here and here you can find an example of the effect of the proposed changes on a real person’s pension (ie a cut of almost 50%). I also blogged about this a few weeks ago. There’s no doubt whose side the Financial Times is on, either.

I am not a member of UCU – I left its forerunner organisation the Association of University Teachers (AUT) as a result of its behaviour when I was at the University of Nottingham – but I will be participating in the industrial action, which takes place over four weeks as follows:

  • Week one – Thursday 22 and Friday 23 February (two days)
  • Week two – Monday 26, Tuesday 27 and Wednesday 28 February (three days)
  • Week three – Monday 5, Tuesday 6, Wednesday 7 and Thursday 8 March (four days)
  • Week four – Monday 12, Tuesday 13, Wednesday 14, Thursday 15 and Friday 16 March (five days)

This is a bit complicated for me because I only work half-time at Cardiff University (usually Mondays, Tuesdays and half of Wednesdays) and at Maynooth University the rest of the time. The USS only covers UK universities, and the dispute does not apply in the Republic of Ireland (though it does affect higher education institutions in Northern Ireland) so I won’t be on strike when I’m working for Maynooth University, which includes the first two strike days (tomorrow and Friday). I will be participating in industrial action next week, however, and have today sent an announcement to my students that, unless they hear from me that the strike has been called off, there will be no lectures on 27th February, 6th March or 13th March.

All staff will be docked pay for days not worked owing to strike action, of course, but that will be far less than the amount to be lost in these pension cuts. In my case I will be docked the equivalent of three weeks’ pay, as the 2.5 days a week I work are all strike days in Weeks 2-4. Moreover, I shall be leaving the UK for Ireland this summer and the pension cuts will not affect my pension anyway – any changes will not be made until after I’ve left the USS scheme. Nevertheless, this is an important issue and I feel it is right to take a stand.

One final comment. Last week Cardiff University sent an email to staff including a link to a website that stated:

If staff refuse to cross a picket line and they are not a member of UCU they will be in breach of their contract of employment with the University.

In fact, any strike action (even by a union member) is a breach of contract. The law however prevents employers dismissing staff who participate in industrial action, provided that it is lawful (i.e. following a ballot, and with due notice given to the employer, etc). The government website makes it clear that non-union members have exactly the same protection as union members in this regard. The Cardiff website has now been changed, but I’m very unhappy that this extremely misleading communication was sent out in the first place.

I sincerely hope that there is a negotiated settlement to this issue. Nobody wants to go on strike, especially when it has the potential to damage students’ learning. But there comes a point where you have to draw a line in the sand, and we have reached that point. I hope I’m proved wrong, but I think this could be a very prolonged and very unpleasant dispute.

by telescoper at February 21, 2018 11:58 AM

The n-Category Cafe

Cartesian Bicategories

guest post by Daniel Cicala and Jules Hedges

We continue the Applied Category Theory Seminar with a discussion of Carboni and Walters’ paper Cartesian Bicategories I. The star of this paper is the notion of ‘bicategories of relations’. This is an abstraction of relations internal to a category. As such, this paper provides excellent, if technical, examples of internal relations and other internal category theory concepts. In this post, we discuss bicategories of relations while occasionally pausing to enjoy some internal category theory such as relations, adjoints, monads, and the Kleisli construction.

We’d like to thank Brendan Fong and Nina Otter for running such a great seminar. We’d also like to thank Paweł Sobociński and John Baez for helpful discussions.

Shortly after Bénabou introduced bicategories, a program was initiated to study these through profunctor bicategories. Carboni and Walters, however, decided to study bicategories with a more relational flavor. This is not quite as far a departure as one might think. Indeed, relations and profunctors are connected. Let’s recall two facts:

  • a profunctor from $C$ to $D$ is a functor from $D^{op} \times C$ to $Set$, and

  • a relation between sets $x$ and $y$ can be described with a $\{0,1\}$-valued matrix of size $x \times y$.

Heuristically, profunctors can be thought of as a generalization of relations when considering profunctors as “$\mathbf{Set}$-valued matrices of size $\text{ob} (C) \times \text{ob} (D)$”. As such, a line between profunctors and relations appears. In Cartesian Bicategories I, authors Carboni and Walters walk this line and continue a study of bicategories from a relational viewpoint.
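To make the matrix picture of relations concrete, here is a minimal Python sketch. It assumes finite sets whose elements are indexed by position, and the function name `compose_matrices` is made up for this illustration. Composition is the usual matrix product with "or" in place of addition and "and" in place of multiplication.

```python
def compose_matrices(r, s):
    """Compose a relation r (an x-by-y 0/1 matrix) with s (a y-by-z 0/1 matrix).

    Entry (i, k) of the result is 1 exactly when some middle index j
    satisfies r[i][j] == 1 and s[j][k] == 1.
    """
    middle = range(len(s))
    return [
        [int(any(r[i][j] and s[j][k] for j in middle)) for k in range(len(s[0]))]
        for i in range(len(r))
    ]


# A relation from a 2-element set to a 3-element set ...
r = [[1, 0, 0],
     [0, 1, 1]]
# ... followed by a relation from the 3-element set to a 2-element set.
s = [[0, 1],
     [0, 0],
     [1, 0]]

composite = compose_matrices(r, s)
print(composite)
```

Replacing the truth values 0 and 1 by sets, "or" by disjoint union and "and" by Cartesian product gives the heuristic passage from relations to profunctors mentioned above.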

The primary accomplishment of this paper is to characterize ‘bicategories of internal relations’ $\mathbf{Rel}(E)$ and of ‘ordered objects’ $\mathbf{Ord}(E)$ in a regular category $E$. To do this, the authors begin by introducing the notion of Cartesian bicategory, an early example of a bicategory with a monoidal product. They then explore bicategories of relations, which are Cartesian bicategories whose objects are Frobenius monoids. The name “bicategories of relations” indicates their close relationship with classical relations $\mathbf{Rel}$.

We begin by defining the two most important examples of a bicategory of relations: $\mathbf{Rel}(E)$ and $\mathbf{Ord}(E)$. Knowing these bicategories will ground us as we wade through the theory of Cartesian bicategories. We finish by characterizing $\mathbf{Rel}(E)$ and $\mathbf{Ord}(E)$ in terms of the developed theory.

Internal relations

In set theory, a relation from $x$ to $y$ is a subset of $x \times y$. In category theory, things become more subtle. A relation $r$ from $x$ to $y$ internal to a category $C$ is a ‘jointly monic’ $C$-span $x \xleftarrow{r_0} \hat{r} \xrightarrow{r_1} y$. That is, for any arrows $a , b \colon u \to \hat{r}$ such that $r_0 a = r_0 b$ and $r_1 a = r_1 b$ hold, we have $a = b$. In a category with products, this definition simplifies substantially; it is merely a monic arrow $r \colon \hat{r} \to x \times y$.
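For intuition, the joint monicity condition can be checked directly in the category of finite sets. The sketch below is only an illustration under that assumption. The name `is_jointly_monic` and the dictionary encoding of the two legs are invented here. Since finite sets have products, the test reduces to asking whether the pairing of the two legs is injective.

```python
def is_jointly_monic(apex, r0, r1):
    """Check that the span x <-r0- apex -r1-> y is jointly monic in finite sets.

    apex is a list of elements; r0 and r1 are dicts sending each apex
    element to its image in x and in y. The span is jointly monic exactly
    when distinct apex elements never share both images.
    """
    pairs = [(r0[e], r1[e]) for e in apex]
    return len(set(pairs)) == len(pairs)


apex = ["p", "q"]
r0 = {"p": 1, "q": 1}

# p and q agree under r0 but differ under r1, so the span is a relation.
good = is_jointly_monic(apex, r0, {"p": "a", "q": "b"})

# p and q agree under both legs, so this span is not a relation.
bad = is_jointly_monic(apex, r0, {"p": "a", "q": "a"})

print(good, bad)
```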

Given a span $x \xleftarrow{c} w \xrightarrow{d} y$ and the relation $r \coloneqq \langle r_0 , r_1 \rangle$ from above, we say that $c$ is $r$-related to $d$ if there is an arrow $h \colon w \to \hat{r}$ so that $r_0 h = c$ and $r_1 h = d$, i.e. so that the evident triangles commute. We will write $r \colon c \nrightarrow d$ when $c$ is $r$-related to $d$.

While we can talk about relations internal to any category, we cannot generally assemble them into another category. However, if we start with a regular category $E$, then there is a bicategory $\mathbf{Rel}(E)$ of relations internal to $E$. The objects are those of $E$. The arrows are the relations internal to $E$ with composition given by pullback.

Additionally, we have a unique 2-cell, written $r \leq s$, whenever $s \colon r_0 \nrightarrow r_1$. Diagrammatically, $r \leq s$ if there exists an arrow $\hat{r} \to \hat{s}$ commuting with the legs of the two spans.
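In the special case where $E$ is the category of finite sets, both the composition and the 2-cells can be written out in a few lines of Python. This is a sketch under that assumption, with relations encoded as sets of pairs and with the helper names `compose_relations` and `leq` invented for the example. One point is not spelled out above and is added here as the standard construction in a regular category: the pullback span need not be jointly monic, so the composite is taken to be its image.

```python
def compose_relations(r, s):
    """Compose r (a set of pairs in x * y) with s (a set of pairs in y * z)."""
    # Pullback of the right leg of r against the left leg of s:
    # all pairs of witnesses that agree on the middle element.
    pullback = [((a, b), (b2, c)) for (a, b) in r for (b2, c) in s if b == b2]
    # Image of the pullback in x * z; duplicates collapse here.
    return {(a, c) for ((a, _), (_, c)) in pullback}


def leq(r, s):
    """The unique 2-cell r <= s exists exactly when r is included in s."""
    return r <= s


r = {(1, "a"), (2, "a"), (2, "b")}
s = {("a", "X"), ("b", "X")}

composite = compose_relations(r, s)
print(composite)
print(leq({(1, "X")}, composite))
```

Here the pullback has three elements while the composite has two, which is why the image step matters: the pair (2, "X") has two different witnesses.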

Internal ordered objects

We are quite used to the idea of having an order on a set. But what about an order on a category? This is captured by $\mathbf{Ord}(E)$, the bicategory of ordered objects and ideals in a regular category $E$.

The objects of $\mathbf{Ord}(E)$ are ordered objects in $E$. An ordered object is a pair $(x,r)$ consisting of an $E$-object $x$ and a reflexive and transitive relation $r : x \to x$ internal to $E$.

(Puzzle: $r$ is a monic of type $r \colon \hat{r} \to x \times x$. Both reflexivity and transitivity can be defined using morphisms. What are the domains and codomains? What properties should be satisfied?)

The arrows of $\mathbf{Ord}(E)$ are a sort of ‘order preserving relation’ called an ideal. Precisely, an ideal $f \colon (x,r) \to (y,s)$ between ordered objects is a relation $f \colon x \nrightarrow y$ such that given

  • morphisms $a , a' , b , b'$ with a common domain $z$, and

  • relations $r \colon a \nrightarrow a'$, $f \colon a' \nrightarrow b'$, and $s \colon b' \nrightarrow b$

then $f \colon a \nrightarrow b$.

In $\mathbf{Set}$, an ordered object is a preordered set and an ideal $f \colon (x,r) \to (y,s)$ is a directed subset of $x \times y$ with the property that if it contains $s$ and $s' \leq s$, then it contains $s'$.
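The general ideal condition can be tested mechanically for finite preordered sets. The following Python sketch assumes that setting and checks the condition on elements rather than on generalized elements with domain $z$. The name `is_ideal` and the encoding of preorders and relations as sets of pairs are choices made for this illustration only.

```python
def is_ideal(f, r, s):
    """Check the ideal condition for f between preorders (x, r) and (y, s).

    r, s and f are sets of pairs. The condition reads: whenever
    (a, a1) is in r, (a1, b1) is in f and (b1, b) is in s,
    the pair (a, b) must also be in f.
    """
    return all(
        (a, b) in f
        for (a, a1) in r
        for (a2, b1) in f if a2 == a1
        for (b2, b) in s if b2 == b1
    )


# The usual order on {0, 1}, used on both sides.
order = {(0, 0), (0, 1), (1, 1)}

# The order relation itself is an ideal from ({0, 1}, <=) to itself.
order_is_ideal = is_ideal(order, order, order)

# {(1, 0)} is not: 0 <= 1 and 1 f 0 would force 0 f 0.
single_pair_is_ideal = is_ideal({(1, 0)}, order, order)

print(order_is_ideal, single_pair_is_ideal)
```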

There is at most a single 2-cell between parallel arrows in $\mathbf{Ord}(E)$. Given $f , g \colon (x,r) \to (y,s)$, write $f \leq g$ whenever $g \colon f_0 \nrightarrow f_1$.

Cartesian Bicategories

Now that we know what bicategories we have the pleasure of working with, we can move forward with the theoretical aspects. As we work through the upcoming definitions, it is helpful to recall our motivating examples $\mathbf{Rel}(E)$ and $\mathbf{Ord}(E)$.

As mentioned above, in the early days of bicategory theory, mathematicians would study bicategories as $V$-enriched profunctor bicategories for some suitable $V$. A shrewd observation was made that when $V$ is Cartesian, a $V$-profunctor bicategory has several important commonalities with $\mathbf{Rel}(E)$ and $\mathbf{Ord}(E)$. Namely, there is a Cartesian product $\otimes$, plus for each object $x$, a diagonal arrow $\Delta \colon x \to x \otimes x$ and terminal object $\epsilon \colon x \to I$. With this insight, Carboni and Walters decided to take this structure as primitive.

To simplify coherence, we only look at locally posetal bicategories (i.e. $\mathbf{Pos}$-enriched categories). This renders 2-dimensional coherences redundant as all parallel 2-cells manifestly commute. This assumption also endows each hom-poset with 2-cells $\leq$ and, as we will see, local meets $\wedge$. For the remainder of this article, all bicategories will be locally posetal unless otherwise stated.

Definition. A locally posetal bicategory <semantics>B<annotation encoding="application/x-tex">B</annotation></semantics> is Cartesian when equipped with

  • a symmetric tensor product <semantics>BBB<annotation encoding="application/x-tex">B \otimes B \to B</annotation></semantics>,

  • a cocommutative comonoid structure, <semantics>Δ x:xxx<annotation encoding="application/x-tex">\Delta_x \colon x \to x \otimes x</annotation></semantics>, and <semantics>ϵ x:xI<annotation encoding="application/x-tex">\epsilon_x \colon x \to I</annotation></semantics>, on every <semantics>B<annotation encoding="application/x-tex">B</annotation></semantics>-object <semantics>x<annotation encoding="application/x-tex">x</annotation></semantics>

such that

  • every 1-arrow <semantics>r:xy<annotation encoding="application/x-tex">r \colon x \to y</annotation></semantics> is a lax comonoid homomorphism, i.e.

<semantics>Δ yr(rr)Δ xandϵ yrϵ x<annotation encoding="application/x-tex"> \Delta_y r \leq ( r \otimes r ) \Delta_x \quad \text{and} \quad \epsilon_y r \leq \epsilon_x </annotation></semantics>

  • for all objects <semantics>x<annotation encoding="application/x-tex">x</annotation></semantics>, both <semantics>Δ x<annotation encoding="application/x-tex">\Delta_x</annotation></semantics> and <semantics>ϵ x<annotation encoding="application/x-tex">\epsilon_x</annotation></semantics> have right adjoints <semantics>Δ x *<annotation encoding="application/x-tex">\Delta^\ast_x</annotation></semantics> and <semantics>ϵ x *<annotation encoding="application/x-tex">\epsilon^\ast_x</annotation></semantics>.

Moreover, <semantics>(Δ x,ϵ x)<annotation encoding="application/x-tex">(\Delta_x , \epsilon_x)</annotation></semantics> is the only cocommutative comonoid structure on <semantics>x<annotation encoding="application/x-tex">x</annotation></semantics> admitting right adjoints.

(Question: This definition contains a slight ambiguity in the authors’ use of the term “moreover”. Is the uniqueness property of the cocommutative comonoid structure an additional axiom or does it follow from the other axioms?)

If you’re not accustomed to thinking about adjoints internal to a general bicategory, place yourself for a moment in <semantics>Cat<annotation encoding="application/x-tex">\mathbf{Cat}</annotation></semantics>. Recall that an adjunction is merely a pair of arrows (the adjoint functors) together with a pair of 2-cells (unit and counit) obeying certain equations. But this sort of data can exist in any bicategory, not just <semantics>Cat<annotation encoding="application/x-tex">\mathbf{Cat}</annotation></semantics>. It is worth spending a minute to feel comfortable with this concept because, in what follows, adjoints play an important role.

Observe that the right adjoints <semantics>Δ x *<annotation encoding="application/x-tex">\Delta^\ast_x</annotation></semantics> and <semantics>ϵ x *<annotation encoding="application/x-tex">\epsilon^\ast_x</annotation></semantics> turn <semantics>x<annotation encoding="application/x-tex">x</annotation></semantics> into a commutative monoid object, hence a bimonoid. The (co)commutative (co)monoid structure on an object <semantics>x<annotation encoding="application/x-tex">x</annotation></semantics> extends to a tensor product on <semantics>xx<annotation encoding="application/x-tex">x \otimes x</annotation></semantics> as seen in this string diagram:

Ultimately, we want to think of arrows in a Cartesian bicategory as generalized relations. What other considerations are required to do this? To answer this, it is helpful to first think about what a generalized function should be.

For the moment, let’s use our <semantics>Set<annotation encoding="application/x-tex">\mathbf{Set}</annotation></semantics> based intuition. For a relation to be a function, we ask that every element of the domain is related to an element of the codomain (entireness) and that the relationship is unique (determinism). How do we encode these requirements into this new, general situation? Again, let’s use intuition from relations in <semantics>Set<annotation encoding="application/x-tex">\mathbf{Set}</annotation></semantics>. Let <semantics>rx×y<annotation encoding="application/x-tex">r \nrightarrow x \times y</annotation></semantics> be a relation and <semantics>r y×x<annotation encoding="application/x-tex">r^\circ \nrightarrow y \times x</annotation></semantics> be the relation defined by <semantics>r :yx<annotation encoding="application/x-tex"> r^{\circ} \colon y \nrightarrow x </annotation></semantics> whenever <semantics>r:xy<annotation encoding="application/x-tex">r \colon x \nrightarrow y</annotation></semantics>. To say that <semantics>r<annotation encoding="application/x-tex">r</annotation></semantics> is entire is equivalent to saying that the composite relation <semantics>r r<annotation encoding="application/x-tex">r^\circ r</annotation></semantics> contains the identity relation on <semantics>x<annotation encoding="application/x-tex">x</annotation></semantics> (puzzle). To say that <semantics>r<annotation encoding="application/x-tex">r</annotation></semantics> is deterministic is to say that the composite relation <semantics>rr <annotation encoding="application/x-tex">rr^\circ</annotation></semantics> is contained by the identity (another puzzle). These two containments are concisely expressed by writing <semantics>1r r<annotation encoding="application/x-tex">1 \leq r^\circ r</annotation></semantics> and <semantics>rr 1<annotation encoding="application/x-tex">r r^\circ \leq 1</annotation></semantics>. 
Hence <semantics>r<annotation encoding="application/x-tex">r</annotation></semantics> and <semantics>r <annotation encoding="application/x-tex">r^\circ</annotation></semantics> form an adjoint pair! This leads us to the following definition.
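The two puzzles above can be checked by brute force for relations between finite sets. The following is a minimal sketch, assuming relations are stored as Python sets of pairs; the helper names (`compose`, `converse`, `is_entire`, `is_deterministic`) and the sample relations are illustrative choices, not anything taken from the paper.

```python
# Relations between finite sets, stored as sets of (source, target) pairs.

def compose(s, r):
    """Return "s after r": a is related to c iff a r b and b s c for some b."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

def converse(r):
    """Return the relation with every pair reversed."""
    return {(b, a) for (a, b) in r}

def identity(xs):
    """Return the identity relation on the set xs."""
    return {(a, a) for a in xs}

def is_entire(r, x):
    """Check 1_x <= (converse of r) after r."""
    return identity(x) <= compose(converse(r), r)

def is_deterministic(r, y):
    """Check r after (converse of r) <= 1_y."""
    return compose(r, converse(r)) <= identity(y)

x = {1, 2, 3}
y = {"a", "b"}
f = {(1, "a"), (2, "a"), (3, "b")}                 # an honest function
partial = {(1, "a"), (2, "b")}                     # 3 has no image
multi = {(1, "a"), (1, "b"), (2, "a"), (3, "b")}   # 1 has two images

print(is_entire(f, x), is_deterministic(f, y))
print(is_entire(partial, x), is_deterministic(partial, y))
print(is_entire(multi, x), is_deterministic(multi, y))
```

Only `f` passes both checks, which is the sense in which the two inequalities single out functions among relations.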

Definition. An arrow of a Cartesian bicategory is a map when it has a right adjoint. Maps are closed under identity and composition. Hence, for any Cartesian bicategory <semantics>B<annotation encoding="application/x-tex">B</annotation></semantics>, there is the full subbicategory <semantics>Map(B)<annotation encoding="application/x-tex">\mathbf{Map}(B)</annotation></semantics> whose arrows are the maps in <semantics>B<annotation encoding="application/x-tex">B</annotation></semantics>.

(Puzzle: There is an equivalence of categories <semantics>EMap(Rel(E))<annotation encoding="application/x-tex">E \simeq \mathbf{Map}(\mathbf{Rel}(E))</annotation></semantics> for a regular category <semantics>E<annotation encoding="application/x-tex">E</annotation></semantics>. What does this say for <semantics>E:=Set<annotation encoding="application/x-tex">E := \mathbf{Set}</annotation></semantics>?)

We can now state what appears as Theorem 1.6 of the paper. Recall that <semantics>() *<annotation encoding="application/x-tex">(-)^\ast</annotation></semantics> refers to the right adjoint.

Theorem. Let <semantics>B<annotation encoding="application/x-tex">B</annotation></semantics> be a locally posetal bicategory. If <semantics>B<annotation encoding="application/x-tex">B</annotation></semantics> is Cartesian, then

  • <semantics>Map(B)<annotation encoding="application/x-tex">\mathbf{Map}(B)</annotation></semantics> has finite bicategorical products <semantics><annotation encoding="application/x-tex">\otimes</annotation></semantics>,

  • the hom-posets have finite meets <semantics><annotation encoding="application/x-tex">\wedge</annotation></semantics> (i.e. categorical products) and the identity arrow in <semantics>B(I,I)<annotation encoding="application/x-tex">B(I,I)</annotation></semantics> is maximal (i.e. a terminal object), and

  • bicategorical products and the biterminal object in <semantics>Map(B)<annotation encoding="application/x-tex">\mathbf{Map}(B)</annotation></semantics> may be chosen so that <semantics>rs=(p *rp)(p *sp)<annotation encoding="application/x-tex">r \otimes s = (p^\ast r p) \wedge (p^\ast s p)</annotation></semantics>, where <semantics>p<annotation encoding="application/x-tex">p</annotation></semantics> denotes the appropriate projection.

Conversely, if the first two properties are satisfied and the third defines a tensor product, then <semantics>B<annotation encoding="application/x-tex">B</annotation></semantics> is Cartesian.

This theorem gives a nice characterisation of Cartesian bicategories. The first two axioms are straightforward enough, but what is the significance of the above tensor product equation?

It’s actually quite painless when you break it down. Note, every bicategorical product <semantics><annotation encoding="application/x-tex">\otimes</annotation></semantics> comes with projections <semantics>p<annotation encoding="application/x-tex">p</annotation></semantics> and inclusions <semantics>p *<annotation encoding="application/x-tex">p^\ast</annotation></semantics>. Now, take <semantics>r:wy<annotation encoding="application/x-tex">r \colon w \to y</annotation></semantics> and <semantics>s:xz<annotation encoding="application/x-tex">s \colon x \to z</annotation></semantics>, which give <semantics>rs:wxyz<annotation encoding="application/x-tex">r \otimes s \colon w \otimes x \to y \otimes z</annotation></semantics>. One canonical arrow of type <semantics>wxyz<annotation encoding="application/x-tex">w \otimes x \to y \otimes z</annotation></semantics> is <semantics>p *rp<annotation encoding="application/x-tex">p^\ast r p</annotation></semantics>, which first projects to <semantics>w<annotation encoding="application/x-tex">w</annotation></semantics>, arrives at <semantics>y<annotation encoding="application/x-tex">y</annotation></semantics> via <semantics>r<annotation encoding="application/x-tex">r</annotation></semantics>, and then includes into <semantics>yz<annotation encoding="application/x-tex">y \otimes z</annotation></semantics>. The other arrow is similar, except we first project to <semantics>x<annotation encoding="application/x-tex">x</annotation></semantics>. The above theorem says that by combining these two arrows with a meet <semantics><annotation encoding="application/x-tex">\wedge</annotation></semantics>, the only available operation, we get our tensor product.
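To see the formula at work, here is a small sketch for relations between finite sets, assuming the tensor is the Cartesian product of sets and the right adjoint of a projection is its converse relation. The sets `w`, `x`, `y`, `z`, the relations `r` and `s`, and all helper names are made up for the illustration.

```python
from itertools import product

def compose(s, r):
    """Return "s after r" for relations stored as sets of pairs."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

def converse(r):
    """Return the relation with every pair reversed."""
    return {(b, a) for (a, b) in r}

def projection(xs, ys, index):
    """Graph of the projection from xs x ys onto factor number `index`."""
    return {(pair, pair[index]) for pair in product(xs, ys)}

w, x, y, z = {0, 1}, {"u", "v"}, {"a", "b"}, {"p", "q"}
r = {(0, "a"), (1, "a"), (1, "b")}   # r : w -> y
s = {("u", "p"), ("v", "q")}         # s : x -> z

p_w, p_x = projection(w, x, 0), projection(w, x, 1)
p_y, p_z = projection(y, z, 0), projection(y, z, 1)

# Project to one factor, apply the relation, include into the codomain.
via_r = compose(converse(p_y), compose(r, p_w))
via_s = compose(converse(p_z), compose(s, p_x))

# The local meet of two relations is their intersection.
tensor = via_r & via_s

# The componentwise product relation, computed directly for comparison.
expected = {((a, b), (c, d)) for (a, c) in r for (b, d) in s}
print(tensor == expected)
```

Each of `via_r` and `via_s` constrains only one coordinate, so intersecting them recovers the relation that constrains both.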

Characterizing bicategories of internal relations

The next stage is to add to Cartesian bicategories the property that each object is a Frobenius monoid. In this section we will study such bicategories and see that Cartesian plus Frobenius provides a reasonable axiomatization of relations.

Recall that an object with monoid and comonoid structures is called a Frobenius monoid if the equation

holds. If you’re not familiar with this equation, it has an interesting history as outlined by Walters. Now, if every object in a Cartesian bicategory is a Frobenius monoid, we call it a bicategory of relations. This term is a bit overworked as it commonly refers to <semantics>Rel(E)<annotation encoding="application/x-tex">\mathbf{Rel}(E)</annotation></semantics>. Therefore, we will be careful to call the latter a “bicategory of internal relations”.
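For readers without the string diagram to hand, the Frobenius equation is usually written algebraically as follows. This is the standard form of the law, stated here in the notation of this post (comultiplication <semantics>Δ x<annotation encoding="application/x-tex">\Delta_x</annotation></semantics>, multiplication <semantics>Δ x *<annotation encoding="application/x-tex">\Delta^\ast_x</annotation></semantics>, composition read right to left); it is a transcription under that assumption rather than a copy of the figure.

```latex
(1 \otimes \Delta^\ast_x)(\Delta_x \otimes 1)
  \;=\; \Delta_x \, \Delta^\ast_x
  \;=\; (\Delta^\ast_x \otimes 1)(1 \otimes \Delta_x)
```

All three composites are arrows of type <semantics>xxxx<annotation encoding="application/x-tex">x \otimes x \to x \otimes x</annotation></semantics>.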

Why are bicategories of relations better than simply Cartesian bicategories? For one, they admit a compact closed structure! This appears as Theorem 2.4 in the paper.

Theorem. A bicategory of relations has a compact closed structure. Objects are self-dual via the unit

<semantics>Δϵ x *:Ixx<annotation encoding="application/x-tex"> \Delta \epsilon^\ast_x \colon I \to x \otimes x </annotation></semantics>

and counit

<semantics>ϵΔ x *:xxI.<annotation encoding="application/x-tex"> \epsilon \Delta^\ast_x \colon x \otimes x \to I. </annotation></semantics>

Moreover, the dual <semantics>r <annotation encoding="application/x-tex">r^\circ</annotation></semantics> of any arrow <semantics>r:xy<annotation encoding="application/x-tex">r \colon x \to y</annotation></semantics> satisfies

<semantics>(rid)Δ(1r )Δr<annotation encoding="application/x-tex"> (r \otimes id) \Delta \leq (1 \otimes r^\circ) \Delta r </annotation></semantics>

and

<semantics>Δ *(rid)rΔ *(1r ).<annotation encoding="application/x-tex"> \Delta^\ast (r \otimes id) \leq r \Delta^\ast (1 \otimes r^\circ). </annotation></semantics>

Or if you prefer string diagrams, the above inequalities are respectively


Because a bicategory of relations is Cartesian, maps are still present. In fact, they have a very nice characterization here.

Lemma. In a bicategory of relations, an arrow <semantics>r<annotation encoding="application/x-tex">r</annotation></semantics> is a map iff it is a (strict) comonoid homomorphism iff <semantics>rr <annotation encoding="application/x-tex">r \dashv r^\circ</annotation></semantics>.

As one would hope, the adjoint of a map corresponds with the involution coming from the compact closed structure. The following corollary provides further evidence that maps are well-behaved.

Corollary. In a bicategory of relations:

  • <semantics>f<annotation encoding="application/x-tex">f</annotation></semantics> is a map implies <semantics>f =f *<annotation encoding="application/x-tex">f^\circ = f^\ast</annotation></semantics>. In particular, multiplication is adjoint to comultiplication and the unit is adjoint to the counit.

  • for maps <semantics>f<annotation encoding="application/x-tex">f</annotation></semantics> and <semantics>g<annotation encoding="application/x-tex">g</annotation></semantics>, if <semantics>fg<annotation encoding="application/x-tex">f \leq g</annotation></semantics> then <semantics>f=g<annotation encoding="application/x-tex">f=g</annotation></semantics>.

But maps don’t merely behave in a nice way. They also contain a lot of the information about a Cartesian bicategory and, when working with bicategories of relations, the local information is quite fruitful too. This is made precise in the following corollary.

Corollary. Let <semantics>F:BD<annotation encoding="application/x-tex">F \colon B \to D</annotation></semantics> be a pseudofunctor between bicategories of relations. The following are equivalent:

  • <semantics>F<annotation encoding="application/x-tex">F</annotation></semantics> strictly preserves the Frobenius structure.

  • The restriction <semantics>F:Map(B)Map(D)<annotation encoding="application/x-tex">F \colon \mathbf{Map}(B) \to \mathbf{Map}(D)</annotation></semantics> strictly preserves the comonoid structure.

  • <semantics>F<annotation encoding="application/x-tex">F</annotation></semantics> preserves local meets and <semantics>I<annotation encoding="application/x-tex">I</annotation></semantics>.

Characterizing bicategories of internal relations

The entire point of the theory developed above is to be able to prove things about certain classes of bicategories. In this section, we provide a characterization theorem for bicategories of internal relations. Freyd had already given this characterization using allegories. However, he relied on a proof by contradiction whereas using bicategories of relations allows for a constructive proof.

A bicategory of relations is meant to generalize bicategories of internal relations. Given a bicategory of relations, we’d like to know when an arrow is “like an internal relation”.

Definition. An arrow <semantics>r:xy<annotation encoding="application/x-tex">r \colon x \to y</annotation></semantics> has a tabulation if there exist maps <semantics>f:zx<annotation encoding="application/x-tex">f \colon z \to x</annotation></semantics> and <semantics>g:zy<annotation encoding="application/x-tex">g \colon z \to y</annotation></semantics> such that <semantics>r=gf *<annotation encoding="application/x-tex">r = g f^\ast</annotation></semantics> and <semantics>f *fg *g=1 z<annotation encoding="application/x-tex">f^\ast f \wedge g^\ast g = 1_z</annotation></semantics>.

This definition seems bizarre on its face, but it really is analogous to the jointly monic span-definition of an internal relation. That <semantics>r=gf *<annotation encoding="application/x-tex">r = g f^\ast</annotation></semantics> is saying that <semantics>r<annotation encoding="application/x-tex">r</annotation></semantics> is like a span <semantics>xfzgy<annotation encoding="application/x-tex">x \xleftarrow{f} z \xrightarrow{g} y</annotation></semantics>. The equation <semantics>f *fg *g=1 z<annotation encoding="application/x-tex">f^\ast f \wedge g^\ast g = 1_z</annotation></semantics> implies that this span is jointly monic (puzzle).
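For relations between finite sets the puzzle can be tested directly. The sketch below assumes that maps are functions (stored as their graphs) and that the right adjoint of a map is its converse, as in the corollary above; the helper names and the sample span are illustrative only.

```python
def compose(s, r):
    """Return "s after r" for relations stored as sets of pairs."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

def converse(r):
    """Return the relation with every pair reversed."""
    return {(b, a) for (a, b) in r}

def identity(xs):
    """Return the identity relation on the set xs."""
    return {(a, a) for a in xs}

def graph(func, domain):
    """Turn a Python function into its graph, a relation."""
    return {(a, func(a)) for a in domain}

def meet_is_identity(f, g, z):
    """Check that (f* f) meet (g* g) equals 1_z."""
    return (compose(converse(f), f) & compose(converse(g), g)) == identity(z)

z = {0, 1, 2, 3}
f = graph(lambda a: a // 2, z)      # f : z -> {0, 1}
g = graph(lambda a: a % 2, z)       # g : z -> {0, 1}
g_const = graph(lambda a: 0, z)     # a second leg that forgets everything

r = compose(g, converse(f))         # r = g f*, a relation {0, 1} -> {0, 1}

print(meet_is_identity(f, g, z))        # the pair (f, g) is jointly injective
print(meet_is_identity(f, g_const, z))  # 0 and 1 are identified by both legs
print(sorted(r))
```

Here `f* f` relates two elements of `z` exactly when `f` sends them to the same place, so the meet is the identity exactly when no two distinct elements are identified by both legs at once.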

A bicategory of relations is called functionally complete if every arrow <semantics>r:xI<annotation encoding="application/x-tex">r \colon x \to I</annotation></semantics> has a tabulation <semantics>i:x rx<annotation encoding="application/x-tex">i \colon x_r \to x</annotation></semantics> and <semantics>t:x rI<annotation encoding="application/x-tex">t \colon x_r \to I</annotation></semantics>. One can show that the existence of these tabulations together with compact closedness is sufficient to obtain a unique (up to isomorphism) tabulation for every arrow. We now provide the characterization, presented as Theorem 3.5.

Theorem. Let <semantics>B<annotation encoding="application/x-tex">B</annotation></semantics> be a functionally complete bicategory of relations. Then:

  • <semantics>Map(B)<annotation encoding="application/x-tex">\mathbf{Map}(B)</annotation></semantics> is a regular category (all 2-arrows are trivial by an above corollary)

  • There is a biequivalence of bicategories <semantics>Rel(Map(B))B<annotation encoding="application/x-tex">\mathbf{Rel}(\mathbf{Map}(B)) \simeq B</annotation></semantics> obtained by sending the relation <semantics>f,g<annotation encoding="application/x-tex">\langle f,g \rangle</annotation></semantics> of <semantics>Rel(Map(B))<annotation encoding="application/x-tex">\mathbf{Rel}(\mathbf{Map}(B))</annotation></semantics> to the arrow <semantics>gf <annotation encoding="application/x-tex">g f^\circ</annotation></semantics> of <semantics>B<annotation encoding="application/x-tex">B</annotation></semantics>.

So all functionally complete bicategories of relations are bicategories of internal relations. An interesting question is whether any regular category can be realized as <semantics>Map(B)<annotation encoding="application/x-tex">\mathbf{Map}(B)</annotation></semantics> for some functionally complete bicategory of relations. Perhaps a knowledgeable passerby will gift us with an answer in the comments!

From this theorem, we can classify some important types of categories. For instance, bicategories of relations internal to a Heyting category are exactly the functionally complete bicategories of relations having all right Kan extensions. Bicategories of relations internal to an elementary topos are exactly the functionally complete bicategories of relations <semantics>B<annotation encoding="application/x-tex">B</annotation></semantics> such that <semantics>B(x,)<annotation encoding="application/x-tex">B(x,-)</annotation></semantics> is representable in <semantics>Map(B)<annotation encoding="application/x-tex">\mathbf{Map}(B)</annotation></semantics> for all objects <semantics>x<annotation encoding="application/x-tex">x</annotation></semantics>.

Characterizing ordered object bicategories

The goal of this section is to characterize the bicategory <semantics>Ord(E)<annotation encoding="application/x-tex">\mathbf{Ord}(E)</annotation></semantics> of ordered objects and ideals. We already introduced <semantics>Ord(E)<annotation encoding="application/x-tex">\mathbf{Ord}(E)</annotation></semantics> earlier, but that definition isn’t quite abstract enough for our purposes. An equivalent way of defining an ordered object in <semantics>E<annotation encoding="application/x-tex">E</annotation></semantics> is as an <semantics>E<annotation encoding="application/x-tex">E</annotation></semantics>-object <semantics>x<annotation encoding="application/x-tex">x</annotation></semantics> together with a relation <semantics>r<annotation encoding="application/x-tex">r</annotation></semantics> on <semantics>x<annotation encoding="application/x-tex">x</annotation></semantics> such that <semantics>1r<annotation encoding="application/x-tex">1 \leq r</annotation></semantics> and <semantics>rrr<annotation encoding="application/x-tex">r r \leq r</annotation></semantics>. Does this data look familiar? An ordered object in <semantics>E<annotation encoding="application/x-tex">E</annotation></semantics> is simply a monad in <semantics>Rel(E)<annotation encoding="application/x-tex">\mathbf{Rel}(E)</annotation></semantics>!
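In the case of finite sets this is easy to test: the unit inequality is reflexivity and the multiplication inequality is transitivity. The following sketch assumes relations are sets of pairs; `is_monad` and the sample relations are hypothetical names chosen for the illustration.

```python
def compose(s, r):
    """Return "s after r" for relations stored as sets of pairs."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

def identity(xs):
    """Return the identity relation on the set xs."""
    return {(a, a) for a in xs}

def is_monad(r, x):
    """Check 1 <= r (reflexive) and r r <= r (transitive)."""
    return identity(x) <= r and compose(r, r) <= r

x = {1, 2, 3}
leq = {(a, b) for a in x for b in x if a <= b}     # the usual order
not_reflexive = {(1, 2)}
not_transitive = identity(x) | {(1, 2), (2, 3)}    # (1, 3) is missing

print(is_monad(leq, x))
print(is_monad(not_reflexive, x))
print(is_monad(not_transitive, x))
print(compose(leq, leq) == leq)
```

The last line reflects a general fact in the locally posetal setting: from 1 ≤ r we get r ≤ r r, so together with r r ≤ r a monad always satisfies r r = r.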

(Puzzle: What is a monad in a general bicategory? Hint: how are adjoints defined in a general bicategory?)

Quite a bit is known about monads, and we can now apply that knowledge to our study of <semantics>Ord(E)<annotation encoding="application/x-tex">\mathbf{Ord}(E)</annotation></semantics>.

Recall that any monad in <semantics>Cat<annotation encoding="application/x-tex">\mathbf{Cat}</annotation></semantics> gives rise to a category of adjunctions. The initial object of this category is the Kleisli category. Since the Kleisli category can be defined using a universal property, we can define a Kleisli object in any bicategory. In general, a Kleisli object for a monad <semantics>t:xx<annotation encoding="application/x-tex">t \colon x \to x</annotation></semantics> need not exist but when it does, it is defined as an arrow <semantics>k:xx t<annotation encoding="application/x-tex">k : x \to x_t</annotation></semantics> plus a 2-arrow <semantics>θ:ktk<annotation encoding="application/x-tex">\theta \colon k t \to k</annotation></semantics> such that, given any arrow <semantics>f:xy<annotation encoding="application/x-tex">f \colon x \to y</annotation></semantics> and 2-arrow <semantics>α:ftf<annotation encoding="application/x-tex">\alpha \colon f t \to f</annotation></semantics>, there exists a unique arrow <semantics>h:x ty<annotation encoding="application/x-tex">h \colon x_t \to y</annotation></semantics> such that <semantics>hk=f<annotation encoding="application/x-tex">h k = f</annotation></semantics>. The pasting diagrams involved also commute:

As in the case of working inside <semantics>Cat<annotation encoding="application/x-tex">\mathbf{Cat}</annotation></semantics>, we would expect <semantics>k<annotation encoding="application/x-tex">k</annotation></semantics> to be on the left of an adjoint pair, and indeed it is. We get a right adjoint <semantics>k *<annotation encoding="application/x-tex">k^\ast</annotation></semantics> such that the composite <semantics>k *k<annotation encoding="application/x-tex">k^\ast k</annotation></semantics> is our original monad <semantics>t<annotation encoding="application/x-tex">t</annotation></semantics>. The benefit of working in the locally posetal case is we also have that <semantics>kk *=1<annotation encoding="application/x-tex">k k^\ast = 1</annotation></semantics>. This realizes <semantics>t<annotation encoding="application/x-tex">t</annotation></semantics> as an idempotent:

<semantics>tt=k *kk *k=k *k=t.<annotation encoding="application/x-tex"> t t = k^\ast k k^\ast k = k^\ast k = t. </annotation></semantics>

It follows that the Kleisli object construction is exactly an idempotent splitting of <semantics>t<annotation encoding="application/x-tex">t</annotation></semantics>! This means we can start with an exact category <semantics>E<annotation encoding="application/x-tex">E</annotation></semantics> and construct <semantics>Ord(E)<annotation encoding="application/x-tex">\mathbf{Ord}(E)</annotation></semantics> by splitting the idempotents of <semantics>Rel(E)<annotation encoding="application/x-tex">\mathbf{Rel}(E)</annotation></semantics>. With this in mind, we move on to the characterization, presented as Theorem 4.6.

Theorem. A bicategory <semantics>B<annotation encoding="application/x-tex">B</annotation></semantics> is biequivalent to a bicategory <semantics>Ord(E)<annotation encoding="application/x-tex">\mathbf{Ord}(E)</annotation></semantics> if and only if

  • <semantics>B<annotation encoding="application/x-tex">B</annotation></semantics> is Cartesian,

  • every monad in <semantics>B<annotation encoding="application/x-tex">B</annotation></semantics> has a Kleisli object,

  • for each object <semantics>x<annotation encoding="application/x-tex">x</annotation></semantics>, there is a monad on <semantics>x<annotation encoding="application/x-tex">x</annotation></semantics> and a Frobenius monoid <semantics>x 0<annotation encoding="application/x-tex">x_0</annotation></semantics> that is isomorphic to the monad’s Kleisli object,

  • given a Frobenius monoid x and <semantics>f:xx<annotation encoding="application/x-tex">f \colon x \to x</annotation></semantics> with <semantics>f1<annotation encoding="application/x-tex">f \leq 1</annotation></semantics>, <semantics>f<annotation encoding="application/x-tex">f</annotation></semantics> splits.

Final words

The authors go on to look more closely at bicategories of relations inside Grothendieck topoi and abelian categories. Both of these are regular categories, and so fit into the picture we’ve just painted. However, each has additional properties and structure that compel further study.

Much of what we have done can be done in greater generality. For instance, we can drop the locally posetal requirement. However, this would greatly complicate matters by requiring non-trivial coherence conditions.

by john at February 21, 2018 07:07 AM

February 20, 2018

Christian P. Robert - xi'an's og

bad graphics and poor statistics

Reading through The Guardian website, I came across this terrible graphic about US airlines 2016 comparison for killing pests pets they carry. Beyond the gross imprecision resulting from resorting to a (gross) dead dog scale to report integers, the impression of Hawaiian Airlines having a beef with pets is just misleading: there were three animal deaths on this company for that year. And nine on United Airlines (including the late giant rabbit). The law of small numbers in action! Computing a basic p-value (!) based on a Poisson approximation (the most pet friendly distribution) does not even exclude Hawaiian Airlines. Without even considering the possibility that, among the half-million plus pets travelling on US airlines in 2016, some would have died anyway but it happened during a flight. (As a comparison, there are “between 114 and 360 medical” in-flight [human] deaths per year. For what it’s worth.) The scariest part of The Guardian article [beyond the reliance on terrible graphs!] is the call to end pets travelling as cargo, meaning they would join their owner in the cabin. As if stag and hen [parties] were not enough of a travelling nuisance..!

by xi'an at February 20, 2018 11:18 PM

Emily Lakdawalla - The Planetary Society Blog

Goodbye, ISS. Hello, private space stations?
The International Space Station may go away in 2025. Will private space stations be ready to fill the gap?

February 20, 2018 10:00 PM

Emily Lakdawalla - The Planetary Society Blog

Opportunity's sol 5000 self-portrait
Last week the Mars Exploration Rover Opportunity celebrated its 5000th sol on Mars, and it celebrated by taking the first complete Mars Exploration Rover self-portrait.

February 20, 2018 05:10 PM

Peter Coles - In the Dark

Learning Technology

I’m just taking a tea break in the Data Innovation Research Institute. Today has been a very busy day as I have to finish off a lot of things by tomorrow, for reasons that I’ll make clear in my next post…

It struck me when I was putting on the brew how much more technology we use for teaching now than when I was a student. I think many of my colleagues make far more effective use of the available technology than I do, but I do my best to overcome my Luddite tendencies. Reflecting on today’s teaching makes me feel just a little less like a dinosaur.

This morning I gave a two-hour lecture on my Cardiff module Physics of the Early Universe which, as usual, I recorded using our Panopto system. Although there was a problem with some of the University’s central file storage this morning, which made me a bit nervous about whether the lecture recording would work, it did. Predictably I couldn’t access the network drives from the PC in the lecture theatre, but I had anticipated that and took everything I needed on a memory stick.

After a short break for lunch I checked the lecture recording and made it available for registered students via the Virtual Learning Environment (VLE), known to its friends as Learning Central. I use this as a sort of repository of stuff connected with the module: notes, list of textbooks, problem sets, model answers, instructions and, of course, recorded lectures. The students also submit their coursework assignment (an essay) through this system, through the plagiarism detection software Turnitin.

This afternoon the students on my Computational Physics course in Maynooth University had a lab test, the first of four such tests, this one consisting of a short coding exercise. There are two lab sessions per week for this class, one on Thursdays (when I am normally in Maynooth to help supervise) and another on Tuesdays (when I am normally in Cardiff). I have a number of exercises, which are similar in scope but different in detail (to prevent copying) and the Tuesday lab has a completely different set of exercises from the Thursday one. In each exercise the students have to write a simple Python script to plot graphs of a function and its derivative (computed numerically) using matplotlib. The students upload their script and pictures of the plot to the VLE used in Maynooth, which is called Moodle.

In the manner of a TV chef announcing `here’s one I did earlier’, this is a sample output produced by my `model’ code:

I wonder if you can guess of what function this is the derivative? By the way, in this context `model’ does not mean `a standard of excellence’ but `an imitation of something’ (me being an imitation of a computational physicist). Anyway, students will get marks for producing plots that look right, but also for presenting a nice (commented!) bit of code.
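For anyone curious what such an exercise involves, a script of roughly the kind described might look like the sketch below. It is not the `model’ code, and the function (sine), the interval, the step size and the output file name are all assumptions made for the illustration.

```python
import math

def central_difference(f, x, h=1e-5):
    """Estimate f'(x) with a symmetric (central) finite difference."""
    return (f(x + h) - f(x - h)) / (2.0 * h)

def sample(f, a, b, n=200):
    """Tabulate f and its numerical derivative at n points on [a, b]."""
    xs = [a + (b - a) * i / (n - 1) for i in range(n)]
    ys = [f(x) for x in xs]
    dys = [central_difference(f, x) for x in xs]
    return xs, ys, dys

def plot(xs, ys, dys, filename="derivative.png"):
    """Plot the function and its derivative and save the figure to a file."""
    # matplotlib is imported here so the numerical part runs without it.
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(xs, ys, label="f(x)")
    ax.plot(xs, dys, label="f'(x), central difference")
    ax.set_xlabel("x")
    ax.legend()
    fig.savefig(filename)

# Assumed example function: f(x) = sin(x) on [0, 2*pi].
xs, ys, dys = sample(math.sin, 0.0, 2.0 * math.pi)
```

Calling `plot(xs, ys, dys)` then writes the figure to disk. The central difference is used here because its error shrinks with the square of the step size, which is usually good enough for a plot.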

This afternoon I’m on Cardiff time but I was able to keep an eye on the submissions coming in to Moodle in case something went wrong. It seemed to work out OK, but the main problem now is that I’ve got 20-odd bits of code to mark! That will have to wait until I’m properly on Maynooth time!

Now, back to the grind…

by telescoper at February 20, 2018 05:07 PM

The n-Category Cafe

A Categorical Semantics for Causal Structure

guest post by Joseph Moeller and Dmitry Vagner

We begin the Applied Category Theory Seminar by discussing the paper A categorical semantics for causal structure by Aleks Kissinger and Sander Uijlen.

Compact closed categories have been used in categorical quantum mechanics to give a structure for talking about quantum processes. However, they prove insufficient to handle higher order processes, in other words, processes of processes. This paper offers a construction for a <semantics>*<annotation encoding="application/x-tex">\ast</annotation></semantics>-autonomous extension of a given compact closed category which allows one to reason about higher order processes in a non-trivial way.

We would like to thank Brendan Fong, Nina Otter, Joseph Hirsh and Tobias Heindel as well as the other participants for the discussions and feedback.


We begin with a discussion about the types of categories which we will be working with, and the diagrammatic language we use to reason about these categories.


Recall the following diagrammatic language we use to reason about symmetric monoidal categories. Objects are represented by wires. Arrows can be graphically encoded as

Composition <semantics><annotation encoding="application/x-tex">\circ</annotation></semantics> and <semantics><annotation encoding="application/x-tex">\otimes</annotation></semantics> are depicted vertically and horizontally

satisfying the properties

and the interchange law

If <semantics>I<annotation encoding="application/x-tex">I</annotation></semantics> is the unit object, and <semantics>A<annotation encoding="application/x-tex">A</annotation></semantics> is an object, we call arrows <semantics>IA<annotation encoding="application/x-tex">I \to A</annotation></semantics> states, <semantics>AI<annotation encoding="application/x-tex">A \to I</annotation></semantics> effects, and <semantics>II<annotation encoding="application/x-tex">I \to I</annotation></semantics> numbers.

The identity morphism on an object is only displayed as a wire, and both <semantics>I<annotation encoding="application/x-tex">I</annotation></semantics> and its identity morphism are not displayed.

Compact closed categories

A symmetric monoidal category is compact closed if each object <semantics>A<annotation encoding="application/x-tex">A</annotation></semantics> has a dual object <semantics>A *<annotation encoding="application/x-tex">A^\ast</annotation></semantics> with arrows

<semantics>η A:IA *A,<annotation encoding="application/x-tex">\eta_A \colon I \to A^\ast \otimes A,</annotation></semantics>


<semantics>ϵ A:AA *I,<annotation encoding="application/x-tex">\epsilon_A \colon A \otimes A^\ast \to I,</annotation></semantics>

depicted as <semantics><annotation encoding="application/x-tex">\cup</annotation></semantics> and <semantics><annotation encoding="application/x-tex">\cap</annotation></semantics> and obeying the zigzag identities:

Given a process <semantics>f:AB<annotation encoding="application/x-tex">f \colon A \to B</annotation></semantics> in a compact closed category, we can construct a state <semantics>ρ f:IA *B<annotation encoding="application/x-tex">\rho_f \colon I \to A^\ast \otimes B</annotation></semantics> by defining

<semantics>ρ f=(1 A *f)η A.<annotation encoding="application/x-tex">\rho_f = (1_{A^\ast} \otimes f) \circ \eta_A.</annotation></semantics>

This gives a correspondence which is called “process-state duality”.

An example

Let <semantics>Mat( +)<annotation encoding="application/x-tex">Mat(\mathbb{R}_+)</annotation></semantics> be the category in which objects are natural numbers, and morphisms <semantics>mn<annotation encoding="application/x-tex">m \to n</annotation></semantics> are <semantics>n×m<annotation encoding="application/x-tex">n\times m</annotation></semantics> <semantics> +<annotation encoding="application/x-tex">\mathbb{R}_+</annotation></semantics>-matrices with composition given by the usual multiplication of matrices. This category is made symmetric monoidal with tensor defined by <semantics>nm=nm<annotation encoding="application/x-tex">n \otimes m = nm</annotation></semantics> on objects and the Kronecker product of matrices on arrows, <semantics>(fg) ij kl=f i kg j l<annotation encoding="application/x-tex">(f \otimes g)^{kl}_{ij} = f^k_i g^l_j</annotation></semantics>. For example

<semantics>[1 2 3 4][0 5 6 7]=[1[0 5 6 7] 2[0 5 6 7] 3[0 5 6 7] 4[0 5 6 7]]=[0 5 0 10 6 7 12 14 0 15 0 20 18 21 24 28]<annotation encoding="application/x-tex"> \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \otimes \begin{bmatrix} 0 & 5 \\ 6 & 7 \end{bmatrix} = \begin{bmatrix} 1\cdot \begin{bmatrix} 0 & 5 \\ 6 & 7 \end{bmatrix} & 2\cdot \begin{bmatrix} 0 & 5 \\ 6 & 7 \end{bmatrix} \\ 3 \cdot \begin{bmatrix} 0 & 5 \\ 6 & 7 \end{bmatrix} &4\cdot \begin{bmatrix} 0 & 5 \\ 6 & 7 \end{bmatrix} \end{bmatrix} = \begin{bmatrix} 0 & 5 & 0 & 10 \\ 6 & 7 & 12 & 14 \\ 0 & 15 & 0 & 20 \\ 18 & 21 & 24 & 28 \end{bmatrix} </annotation></semantics>

The unit with respect to this tensor is <semantics>1<annotation encoding="application/x-tex">1</annotation></semantics>. States in this category are column vectors, effects are row vectors, and numbers are <semantics>1×1<annotation encoding="application/x-tex">1\times 1</annotation></semantics> matrices, in other words, numbers. Composing a state <semantics>ρ:1n<annotation encoding="application/x-tex">\rho\colon 1 \to n</annotation></semantics> with an effect <semantics>π:n1<annotation encoding="application/x-tex">\pi\colon n \to 1</annotation></semantics> gives their dot product. To define a compact closed structure on this category, let <semantics>n *:=n<annotation encoding="application/x-tex">n^\ast := n</annotation></semantics>. Then <semantics>η n:1n 2<annotation encoding="application/x-tex">\eta_n \colon 1 \to n^2</annotation></semantics> and <semantics>ε n:n 21<annotation encoding="application/x-tex">\varepsilon_n \colon n^2 \to 1</annotation></semantics> are given by the Kronecker delta.
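To make this example concrete, here is a minimal sketch in plain Python, with matrices as lists of rows. The helper names (`identity`, `compose`, `kron`, `cup`, `cap`) are our own choices for illustration and do not come from the paper. It recomputes the Kronecker product from the example above, composes a state with an effect, and checks one of the zigzag identities for the object 3.

```python
def identity(n):
    """Identity morphism on the object n, as an n x n matrix."""
    return [[1 if i == j else 0 for j in range(n)] for i in range(n)]

def compose(f, g):
    """f after g: the ordinary matrix product of f and g."""
    return [[sum(f[i][k] * g[k][j] for k in range(len(g)))
             for j in range(len(g[0]))] for i in range(len(f))]

def kron(f, g):
    """Tensor of morphisms: the Kronecker product of f and g."""
    return [[a * b for a in f_row for b in g_row]
            for f_row in f for g_row in g]

def cup(n):
    """eta_n : 1 -> n^2, a column vector with 1 at the positions (i, i)."""
    return [[1 if i == j else 0] for i in range(n) for j in range(n)]

def cap(n):
    """epsilon_n : n^2 -> 1, a row vector with 1 at the positions (i, i)."""
    return [[1 if i == j else 0 for i in range(n) for j in range(n)]]

# The Kronecker product from the example above.
product = kron([[1, 2], [3, 4]], [[0, 5], [6, 7]])

# An effect composed with a state is a 1 x 1 matrix holding the dot product.
number = compose([[1, 2, 3]], [[4], [5], [6]])

# One zigzag identity: (epsilon_n (x) 1_n) o (1_n (x) eta_n) = 1_n, for n = 3.
zigzag = compose(kron(cap(3), identity(3)), kron(identity(3), cup(3)))
```

Here `zigzag` comes out equal to `identity(3)`, which is the sense in which the Kronecker delta supplies the cup and cap.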

A Categorical Framework for Causality

Encoding causality

The main construction in this paper requires what is called a precausal category. In a precausal category, we demand that every system has a discard effect, which is a process <semantics>A:AI<annotation encoding="application/x-tex">_A \colon A \to I</annotation></semantics>. This collection of effects must be compatible with <semantics><annotation encoding="application/x-tex">\otimes</annotation></semantics>:

  • <semantics>AB=<annotation encoding="application/x-tex">_{A \otimes B} = </annotation></semantics><semantics>A<annotation encoding="application/x-tex">_A </annotation></semantics><semantics>B<annotation encoding="application/x-tex">_B</annotation></semantics>

  • <semantics>I=1<annotation encoding="application/x-tex">_I = 1</annotation></semantics>

A process <semantics>Φ:AB<annotation encoding="application/x-tex">\Phi \colon A \to B</annotation></semantics> is called causal if discarding <semantics>B<annotation encoding="application/x-tex">B</annotation></semantics> after having done <semantics>Φ<annotation encoding="application/x-tex">\Phi</annotation></semantics> is the same as just discarding <semantics>A<annotation encoding="application/x-tex">A</annotation></semantics>.

If <semantics>A *<annotation encoding="application/x-tex">A^\ast</annotation></semantics> has discarding, we can produce a state for <semantics>A<annotation encoding="application/x-tex">A</annotation></semantics> by spawning an <semantics>A<annotation encoding="application/x-tex">A</annotation></semantics> and <semantics>A *<annotation encoding="application/x-tex">A^\ast</annotation></semantics> pair, then discarding the <semantics>A *<annotation encoding="application/x-tex">A^\ast</annotation></semantics>:

In the case of <semantics>Mat( +)<annotation encoding="application/x-tex">Mat(\mathbb{R}_+)</annotation></semantics>, the discard effect is given by the row vector of <semantics>1<annotation encoding="application/x-tex">1</annotation></semantics>’s: <semantics>1:=(11)<annotation encoding="application/x-tex">\mathbf{1}:=(1\cdots 1)</annotation></semantics>. Composing a matrix with the discard effect sums the entries of each column. So if a matrix is a causal process, then its column vectors have entries that sum to <semantics>1<annotation encoding="application/x-tex">1</annotation></semantics>. Thus causal processes in <semantics>Mat( +)<annotation encoding="application/x-tex">Mat(\mathbb{R}_+)</annotation></semantics> are stochastic maps.
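The column-sum argument can be checked directly. The following is a small sketch in plain Python. The names `discard`, `compose` and `is_causal` and the numerical tolerance are assumptions made for the illustration. It tests causality exactly as stated: discarding the output after the process must equal discarding the input.

```python
def discard(n):
    """Discard effect on the object n: a 1 x n row vector of ones."""
    return [[1] * n]

def compose(f, g):
    """f after g: the ordinary matrix product of f and g."""
    return [[sum(f[i][k] * g[k][j] for k in range(len(g)))
             for j in range(len(g[0]))] for i in range(len(f))]

def is_causal(phi, tol=1e-9):
    """True if phi has non-negative entries and discard o phi = discard."""
    n_out, n_in = len(phi), len(phi[0])
    non_negative = all(x >= 0 for row in phi for x in row)
    column_sums = compose(discard(n_out), phi)[0]
    return non_negative and all(
        abs(s - d) <= tol for s, d in zip(column_sums, discard(n_in)[0])
    )

stochastic = [[0.9, 0.2],
              [0.1, 0.8]]   # each column sums to 1
doubling = [[1, 1],
            [1, 1]]         # each column sums to 2

print(is_causal(stochastic), is_causal(doubling))
```

The first matrix is a stochastic map and passes the check. The second fails because discarding its output yields the row vector (2, 2) rather than (1, 1).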

A process <semantics>Φ:ABAB<annotation encoding="application/x-tex">\Phi \colon A \otimes B \to A' \otimes B'</annotation></semantics> is one-way signalling with <semantics>AB<annotation encoding="application/x-tex">A \preceq B</annotation></semantics> if

and <semantics>BA<annotation encoding="application/x-tex">B \preceq A</annotation></semantics> if

and non-signalling if both <semantics>AB<annotation encoding="application/x-tex">A\preceq B</annotation></semantics> and <semantics>BA<annotation encoding="application/x-tex">B\preceq A</annotation></semantics>.

The intuition here is that <semantics>AB<annotation encoding="application/x-tex">A \preceq B</annotation></semantics> means <semantics>B<annotation encoding="application/x-tex">B</annotation></semantics> cannot signal to <semantics>A<annotation encoding="application/x-tex">A</annotation></semantics>; the formal condition encodes the fact that had <semantics>B<annotation encoding="application/x-tex">B</annotation></semantics> influenced the transformation from <semantics>A<annotation encoding="application/x-tex">A</annotation></semantics> to <semantics>A<annotation encoding="application/x-tex">A'</annotation></semantics>, then it couldn’t have been discarded prior to it occurring.

Consider the following example: let <semantics>A<annotation encoding="application/x-tex">A</annotation></semantics> be a cup of tea, and <semantics>B<annotation encoding="application/x-tex">B</annotation></semantics> a glass of milk. Let <semantics>Φ<annotation encoding="application/x-tex">\Phi</annotation></semantics> be the process of pouring half of <semantics>B<annotation encoding="application/x-tex">B</annotation></semantics> into <semantics>A<annotation encoding="application/x-tex">A</annotation></semantics> and then mixing, to form <semantics>A<annotation encoding="application/x-tex">A'</annotation></semantics>, a milk tea, and <semantics>B<annotation encoding="application/x-tex">B'</annotation></semantics>, a half-glass of milk. Clearly this process would not be the same if we started by discarding the milk. Our intuition is that the milk “signalled” to, or influenced, the tea, and hence intuitively we do not have <semantics>AB<annotation encoding="application/x-tex">A \preceq B</annotation></semantics>.
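The tea and milk intuition can be imitated in the matrix example with two bits. The sketch below rests on an assumed reading of the condition: we take A ⪯ B to mean that discarding the B' output of Φ gives the same result as discarding B first and then running some process from A to A'. For matrices this says that the marginal on A' does not depend on the B input. All function names are hypothetical, and rows are indexed by (a', b') and columns by (a, b).

```python
def discard_b_output(phi, a_out, b_out):
    """Sum over the B' index of each row (a', b'), leaving rows indexed by a'."""
    n_cols = len(phi[0])
    return [[sum(phi[ap * b_out + bp][col] for bp in range(b_out))
             for col in range(n_cols)] for ap in range(a_out)]

def a_before_b(phi, a_in, b_in, a_out, b_out, tol=1e-9):
    """Assumed test for A <= B: the A' marginal ignores the B input."""
    m = discard_b_output(phi, a_out, b_out)
    return all(abs(m[ap][a * b_in + b] - m[ap][a * b_in]) <= tol
               for ap in range(a_out)
               for a in range(a_in)
               for b in range(b_in))

def deterministic(rule):
    """Matrix of a deterministic process on two bits; rule maps (a, b) to (a', b')."""
    return [[1 if rule(a, b) == (ap, bp) else 0
             for a in range(2) for b in range(2)]
            for ap in range(2) for bp in range(2)]

# Both systems are left alone: nothing flows from B to A'.
untouched = deterministic(lambda a, b: (a, b))

# A toy stand-in for pouring milk into tea: the A' output copies the B input.
milk_into_tea = deterministic(lambda a, b: (b, b))

print(a_before_b(untouched, 2, 2, 2, 2))      # B does not influence A'
print(a_before_b(milk_into_tea, 2, 2, 2, 2))  # B influences A'
```

Under this reading the first process satisfies the condition and the second does not, matching the intuition that the milk signals to the tea.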

A compact closed category <semantics>C<annotation encoding="application/x-tex">\C</annotation></semantics> is precausal if

1) Every system <semantics>A<annotation encoding="application/x-tex">A</annotation></semantics> has a discard process <semantics>A<annotation encoding="application/x-tex">_A</annotation></semantics>

2) For every system <semantics>A<annotation encoding="application/x-tex">A</annotation></semantics>, the dimension is invertible

3) <semantics>C<annotation encoding="application/x-tex">\C</annotation></semantics> has enough causal states

4) Second order causal processes factor

From the definition, we can begin to exclude certain causal situations from systems in precausal categories. In Theorem 3.12, we see that precausal categories do not admit ‘time-travel’.

Theorem   If there are systems <semantics>A<annotation encoding="application/x-tex">A</annotation></semantics>, <semantics>B<annotation encoding="application/x-tex">B</annotation></semantics>, <semantics>C<annotation encoding="application/x-tex">C</annotation></semantics> such that

then <semantics>AI<annotation encoding="application/x-tex">A \cong I</annotation></semantics>.

In precausal categories, we have processes that we call first order causal. However, higher order processes collapse into first order processes, because precausal categories are compact closed. For example, letting <semantics>AB:=A *B<annotation encoding="application/x-tex">A \Rightarrow B := A^\ast\otimes B</annotation></semantics>,

<semantics>(A⇒B)⇒C=(A *⊗B) *⊗C≅B⇒A⊗C<annotation encoding="application/x-tex"> (A\Rightarrow B)\Rightarrow C = (A^\ast\otimes B)^\ast\otimes C \cong B\Rightarrow A\otimes C </annotation></semantics>

We can see this happens because of the condition <semantics>AB(A *B *) *<annotation encoding="application/x-tex">A \otimes B \cong (A^\ast \otimes B^\ast)^\ast</annotation></semantics>. Weakening this condition of compact closed categories yields <semantics>*<annotation encoding="application/x-tex">\ast</annotation></semantics>-autonomous categories. From a precausal category <semantics>C<annotation encoding="application/x-tex">\C</annotation></semantics>, we construct a category <semantics>Caus[C]<annotation encoding="application/x-tex">Caus[\C]</annotation></semantics> of higher order causal relations.
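For readers who want the intermediate steps of the collapse, here is one way to spell it out. It uses only facts available in a compact closed category: the dual of a tensor is the tensor of the duals (which follows from the condition just quoted), the double dual of an object is isomorphic to the object, and the tensor is symmetric and associative up to isomorphism.

```latex
\begin{align*}
(A \Rightarrow B) \Rightarrow C
  &= (A^\ast \otimes B)^\ast \otimes C
     && \text{definition of } \Rightarrow \\
  &\cong (A^{\ast\ast} \otimes B^\ast) \otimes C
     && (X \otimes Y)^\ast \cong X^\ast \otimes Y^\ast \\
  &\cong (A \otimes B^\ast) \otimes C
     && A^{\ast\ast} \cong A \\
  &\cong B^\ast \otimes (A \otimes C)
     && \text{symmetry and associativity} \\
  &= B \Rightarrow (A \otimes C)
     && \text{definition of } \Rightarrow
\end{align*}
```

The second step is exactly the one that fails in a category that is only *-autonomous, which is why higher order types need not collapse there.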

The category of higher order causal processes

Given a set of states <semantics>cC(I,A)<annotation encoding="application/x-tex">c \subseteq \C(I,A)</annotation></semantics> for a system <semantics>A<annotation encoding="application/x-tex">A</annotation></semantics>, define its dual by

<semantics>c *={πA *|ρc,πρ=1}C(I,A *)<annotation encoding="application/x-tex"> c^\ast = \{\pi \in A^\ast |\, \forall \rho \in c ,\, \pi \circ \rho = 1\} \subseteq \C(I, A^\ast) </annotation></semantics>

Then we say a set of states <semantics>cC(I,A)<annotation encoding="application/x-tex">c \subseteq \C(I,A)</annotation></semantics> is closed if <semantics>c=c **<annotation encoding="application/x-tex">c=c^{\ast\ast}</annotation></semantics>, and flat if there are invertible scalars <semantics>λ<annotation encoding="application/x-tex">\lambda</annotation></semantics>, <semantics>μ<annotation encoding="application/x-tex">\mu</annotation></semantics> such that <semantics>λ<annotation encoding="application/x-tex">\lambda</annotation></semantics> <semantics>c<annotation encoding="application/x-tex">\in c</annotation></semantics>, <semantics>μ<annotation encoding="application/x-tex">\mu</annotation></semantics> <semantics>c *<annotation encoding="application/x-tex">\in c^\ast</annotation></semantics>.

Now we can define the category <semantics>Caus[C]<annotation encoding="application/x-tex">Caus[\C]</annotation></semantics>. Let the objects be pairs <semantics>(A,c A)<annotation encoding="application/x-tex">(A, c_A)</annotation></semantics> where <semantics>c A<annotation encoding="application/x-tex">c_A</annotation></semantics> is a closed and flat set of states of the system <semantics>AC<annotation encoding="application/x-tex">A \in \C</annotation></semantics>. A morphism <semantics>f:(A,c A)(B,c B)<annotation encoding="application/x-tex">f \colon (A, c_A) \to (B, c_B)</annotation></semantics> is a morphism <semantics>f:AB<annotation encoding="application/x-tex">f \colon A \to B</annotation></semantics> in <semantics>C<annotation encoding="application/x-tex">\C</annotation></semantics> such that if <semantics>ρc A<annotation encoding="application/x-tex">\rho \in c_A</annotation></semantics>, then <semantics>fρc B<annotation encoding="application/x-tex">f \circ \rho \in c_B</annotation></semantics>. This category is a symmetric monoidal category with <semantics>(A,c A)(B,c B)=(AB,c AB)<annotation encoding="application/x-tex">(A, c_A) \otimes (B, c_B) = (A \otimes B, c_{A \otimes B})</annotation></semantics>. Further, it’s <semantics>*<annotation encoding="application/x-tex">\ast</annotation></semantics>-autonomous, so higher order processes won’t necessarily collapse into first order.

A first order system in <semantics>Caus[C]<annotation encoding="application/x-tex">Caus[\C]</annotation></semantics> is one of the form <semantics>(A,{<annotation encoding="application/x-tex">(A, \{</annotation></semantics><semantics>A} *)<annotation encoding="application/x-tex">_A\}^\ast)</annotation></semantics>. First order systems are closed under <semantics><annotation encoding="application/x-tex">\otimes</annotation></semantics>. In fact, <semantics>C C<annotation encoding="application/x-tex">C_C</annotation></semantics> admits a full and faithful monoidal embedding into <semantics>Caus[C]<annotation encoding="application/x-tex">Caus[\C]</annotation></semantics> by assigning systems to their corresponding first order systems <semantics>A(A,{<annotation encoding="application/x-tex">A \mapsto (A, \{</annotation></semantics><semantics>A} *)<annotation encoding="application/x-tex">_A\}^\ast)</annotation></semantics>.

For an example of a higher order process in <semantics>Caus[Mat( +)]<annotation encoding="application/x-tex">Caus[Mat(\mathbb{R}_+)]</annotation></semantics>, consider a classical switch. Let

<semantics>ρ 0=[1 0],ρ 1=[0 1],ρ i=ρ i T,<annotation encoding="application/x-tex">\rho_0 = \begin{bmatrix} 1\\0 \end{bmatrix}, \quad \rho_1 = \begin{bmatrix} 0\\1 \end{bmatrix}, \quad \rho'_i = \rho_i^T,</annotation></semantics>

and let <semantics>s<annotation encoding="application/x-tex">s</annotation></semantics> be the second order process

This process is of type <semantics>XC(AB)(BA)C<annotation encoding="application/x-tex">X \otimes C \multimap (A \multimap B) \otimes (B \multimap A) \multimap C'</annotation></semantics>, where the two middle inputs take types <semantics>AB<annotation encoding="application/x-tex">A \multimap B</annotation></semantics> on the left and <semantics>BA<annotation encoding="application/x-tex">B \multimap A</annotation></semantics> on the right. Since <semantics>ρ iρ j<annotation encoding="application/x-tex">\rho'_i \circ \rho_j</annotation></semantics> is <semantics>1<annotation encoding="application/x-tex">1</annotation></semantics> if <semantics>i=j<annotation encoding="application/x-tex">i=j</annotation></semantics> and <semantics>0<annotation encoding="application/x-tex">0</annotation></semantics> otherwise, plugging either <semantics>ρ 0<annotation encoding="application/x-tex">\rho_0</annotation></semantics> or <semantics>ρ 1<annotation encoding="application/x-tex">\rho_1</annotation></semantics> into the bottom left input switches the order in which the <semantics>AB<annotation encoding="application/x-tex">A \multimap B</annotation></semantics> and <semantics>BA<annotation encoding="application/x-tex">B \multimap A</annotation></semantics> processes are composed in the final output process. This second order process is causal because

The authors go on to prove in Theorem 6.17 that a switch cannot be causally ordered, indicating that this process is genuinely second order.

by john at February 20, 2018 04:51 AM

February 19, 2018

Christian P. Robert - xi'an's og

the first Bayesian

In the first issue of Statistical Science for this year (2018), Stephen Stigler pursues the origins of Bayesianism as attributable to Richard Price, main author of Bayes’ Essay. (This incidentally relates to an earlier ‘Og piece on that notion!) Steve points out the considerable inputs of Price on this Essay, even though the mathematical advance is very likely to be entirely Bayes’. It may however well be Price who initiated Bayes’ reflections on the matter, towards producing a counter-argument to Hume’s “Of Miracles”.

“Price’s caution in addressing the probabilities of hypotheses suggested by data is rare in early literature.”

A section of the paper is about Price’s approach to data-determined hypotheses and the fact that considering such hypotheses cannot easily fit within a Bayesian framework. As stated by Price, “it would be improbable as infinite to one”. Which is a nice way to address the infinite mass prior.


by xi'an at February 19, 2018 11:18 PM

Emily Lakdawalla - The Planetary Society Blog

An Interplanetary Mateship: The Planetary Society Continues our Australian Initiative
Thanks to recent investments by our members in The Planetary Society’s Space Policy & Advocacy program, we now have the resources to institute a strategic effort to support the exploration of space in an international context.

February 19, 2018 11:00 PM

The n-Category Cafe

Gradual Typing

(Guest post by Max New)

Dan Licata and I have just put up a paper on the arxiv with a syntax and semantics for a gradually typed programming language, which is a kind of synthesis of statically typed and dynamically typed programming styles. The central insight of the paper is to show that the dynamic type checking used in gradual typing has the structure of a proarrow equipment. Using this we can show that some traditional definitions of dynamic type checks can be proven to be in fact unique solutions to the specifications provided by the structure of an equipment. It’s a classic application of category theory: finding a universal property to better understand what was previously an ad-hoc construction.

The paper is written for computer scientists, so I’ll try to provide a more category-theorist-accessible intro here.

Dynamic Typing

First, a very brief introduction to what dynamic typing is. To put it very simply, dynamically typed languages are languages where there is only one type, the “dynamic type” which I’ll call <semantics>?<annotation encoding="application/x-tex">?</annotation></semantics> (more details here). They have the same relationship to typed languages that monoids do to categories or operads to multicategories. So for example, in a dynamically typed language the function application rule looks like

<semantics>Γt:?Γu:?Γt(u):?<annotation encoding="application/x-tex">\frac{\Gamma \vdash t : ?\,\, \Gamma \vdash u : ?}{\Gamma \vdash t(u) : ?}</annotation></semantics>

That is, it always type-checks, whereas in a typed language the <semantics>t<annotation encoding="application/x-tex">t</annotation></semantics> above would need to be of a function type <semantics>AB<annotation encoding="application/x-tex">A \to B</annotation></semantics> and <semantics>u<annotation encoding="application/x-tex">u</annotation></semantics> would have to be of type <semantics>A<annotation encoding="application/x-tex">A</annotation></semantics>. However, these languages are not like the pure untyped lambda calculus where everything is a function: they will also include other types of data like numbers, pairs, strings etc. So whereas in a statically typed language the syntax <semantics>3(3)<annotation encoding="application/x-tex">3(3)</annotation></semantics> would be rejected at type-checking time, we can think of the dynamic language as delaying this type error until run-time, when the program <semantics>3(3)<annotation encoding="application/x-tex">3(3)</annotation></semantics> will crash and say “3 is not a function”. This is implemented essentially by representing a value in the dynamic language as an element of a coproduct of all the primitive types in the language. Then <semantics>3<annotation encoding="application/x-tex">3</annotation></semantics> becomes <semantics>(number,3)<annotation encoding="application/x-tex">(number, 3)</annotation></semantics> and the implementation for the function application form will see that the value is not tagged as a function and crash. On the other hand if we have an application <semantics>(function,λx.t)((number,3))<annotation encoding="application/x-tex">(function, \lambda x. 
t)((number, 3))</annotation></semantics> the implementation will see that the tag is a function and then run <semantics>t[(number,3)/x]<annotation encoding="application/x-tex">t[(number,3)/x]</annotation></semantics>, which may itself raise a dynamic type error later if <semantics>t<annotation encoding="application/x-tex">t</annotation></semantics> uses its input as a pair, or a function.
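The tagging scheme just described can be sketched in a few lines of Python. The names `number`, `function`, `apply` and `DynamicTypeError` are hypothetical and stand in for whatever a real implementation would use. Values are (tag, payload) pairs, and application inspects the tag before running anything.

```python
class DynamicTypeError(Exception):
    """Raised at run-time when a value carries the wrong tag."""

def number(n):
    """Inject a number into the dynamic type."""
    return ("number", n)

def function(f):
    """Inject a function on dynamic values into the dynamic type."""
    return ("function", f)

def apply(t, u):
    """Function application t(u): check the tag of t, then call its payload."""
    tag, payload = t
    if tag != "function":
        raise DynamicTypeError(f"{payload} is not a function")
    return payload(u)

# (function, λx. x) applied to (number, 3) runs the body and returns its argument.
result = apply(function(lambda x: x), number(3))

# 3(3) passes "type checking" but fails when it runs.
try:
    apply(number(3), number(3))
    message = None
except DynamicTypeError as err:
    message = str(err)
```

The failing application leaves `message` holding the text "3 is not a function", mirroring the run-time crash described above.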

So with all that ugliness, why do we care about these languages? Well, it is very easy to write down a program and just run it without fighting with the type checker. Anyone who’s had to restructure their Coq or Agda program to convince the type checker that it is terminating should be able to relate here. Furthermore, even if I do think these languages are ugly and awful, it is a fact that they are extremely popular: JavaScript, Python, Perl, Ruby, PHP, LISP/Scheme are all used to write real working code and we certainly don’t want to chuck out all of those codebases and laboriously rewrite them in a typed language.

Gradual Typing

Gradually typed languages give us a better way: they embed the dynamic language in them, but we can also program in a statically typed style and have it interoperate with the dynamically typed code. This means new code can be written in a statically typed language while still being able to call the legacy dynamically typed code without having to sacrifice all of the benefits of static typing.

To a first approximation, a gradually typed language is a statically typed language with dynamic type errors, a distinguished dynamic type <semantics>?<annotation encoding="application/x-tex">?</annotation></semantics> and a distinguished way to coerce values from any type to and from the dynamic type (possibly causing type errors). Then a dynamically typed program can be compiled to the gradual language by translating the implicit dynamic type checking to explicit casts.

For a gradually typed language to deserve the name, it should be on the one hand typed, meaning the types have their intended categorical semantics as products, exponentials, etc., and on the other hand it should satisfy graduality. Graduality of a language means that the transition from a very loose, dynamic style to a precise, statically typed style should be as smooth as possible. More concretely, it means that changing the types used in the program to be less dynamic should lead to a refinement of the program’s behavior: if the term satisfies the new type, it should behave as before, but otherwise it should produce a dynamic type error.

We can formalize this idea by modeling our gradually typed language as a category internal to preorders: types and terms have related “dynamism” orderings (denoted by <semantics><annotation encoding="application/x-tex">\sqsubseteq</annotation></semantics>) and all type and term constructors are monotone with respect to these orderings. Then we can characterize the dynamic type as being the most dynamic type and the type error as a least dynamic term of each type. Making everything internal to preorders reproduces exactly the rules of type and term dynamism that programming language researchers have developed based on their operational intuitions.
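As a rough illustration of the type dynamism ordering, the sketch below checks it for a toy grammar of types. The grammar (base types as strings, constructors as tuples such as `("fun", A, B)` and `("pair", A, B)`) and the name `less_dynamic` are assumptions made for the example. Only two facts are taken from the text: the dynamic type is the most dynamic type, and every type constructor is monotone.

```python
DYN = "?"  # the dynamic type

def less_dynamic(a, b):
    """Return True when a ⊑ b, i.e. a is at most as dynamic as b."""
    if b == DYN:
        return True          # ? is the most dynamic type
    if a == b:
        return True          # reflexivity
    same_constructor = (
        isinstance(a, tuple) and isinstance(b, tuple)
        and a[0] == b[0] and len(a) == len(b)
    )
    if same_constructor:
        # Constructors are monotone: compare the arguments pointwise.
        return all(less_dynamic(x, y) for x, y in zip(a[1:], b[1:]))
    return False

precise = ("fun", "num", "num")   # num -> num
looser = ("fun", DYN, "num")      # ?   -> num

print(less_dynamic(precise, looser))  # replacing num by ? makes a type more dynamic
print(less_dynamic(looser, precise))  # the converse does not hold
```

Note that the function constructor is treated as monotone in both arguments here, as the text states for all type constructors.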

We can then make a simple logic of type and term dynamism that we call “Preorder Type Theory”. Since we are doing cartesian type theory, it is the internal logic of virtually cartesian preorder categories, rather than plain preorder categories, due to the structure of contexts. In the paper I present VCP categories by a direct definition, but behind the scenes I used Cruttwell-Shulman’s notion of normalized T-monoid to figure out what the definition should be.

Casts and Equipments

While preorder type theory can express the basic ideas of a dynamic type and type errors, the key ingredient of gradual typing that we are missing is that we should be able to cast terms of one type to another so that we can still program in a dynamically typed style.

Gradually typed languages in the literature do this by having a cast form <semantics>BAt<annotation encoding="application/x-tex">\langle B \Leftarrow A\rangle t</annotation></semantics> in their language whose semantics is defined by induction on <semantics>A,B<annotation encoding="application/x-tex">A,B</annotation></semantics>. Our approach is to carve out 2 subsets of these casts which have universal properties with respect to the preorder category structure, which are the upcasts <semantics>BAt<annotation encoding="application/x-tex">\langle B \leftarrowtail A\rangle t</annotation></semantics> and downcasts <semantics>ABu<annotation encoding="application/x-tex">\langle A \twoheadleftarrow B\rangle u</annotation></semantics> which can only be formed when <semantics>AB<annotation encoding="application/x-tex">A \sqsubseteq B</annotation></semantics>. Then one of those “oblique” casts can be defined using the dynamic type: <semantics>BAt=B??At<annotation encoding="application/x-tex">\langle B \Leftarrow A\rangle t = \langle B \twoheadleftarrow ?\rangle \langle ? \leftarrowtail A\rangle t</annotation></semantics>.
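The decomposition of an oblique cast into an upcast followed by a downcast can be sketched for base types only. The sketch assumes dynamic values are (tag, payload) pairs. The names `upcast`, `downcast`, `oblique_cast` and `DynamicTypeError` are hypothetical, and casts at function or product types are deliberately left out.

```python
class DynamicTypeError(Exception):
    """Raised when a downcast from the dynamic type fails."""

def upcast(tag):
    """Upcast from the base type named tag into the dynamic type: never fails."""
    return lambda value: (tag, value)

def downcast(tag):
    """Downcast from the dynamic type to the base type named tag: may fail."""
    def project(dyn):
        actual, value = dyn
        if actual != tag:
            raise DynamicTypeError(f"expected {tag}, got {actual}")
        return value
    return project

def oblique_cast(target, source):
    """Cast from source to target by going up to ? and back down."""
    up, down = upcast(source), downcast(target)
    return lambda t: down(up(t))

# Casting a number to number is the identity.
same = oblique_cast("number", "number")(3)

# Casting a number to string type-checks statically but fails when run.
try:
    oblique_cast("string", "number")(3)
    failure = None
except DynamicTypeError as err:
    failure = str(err)
```

In this toy setting a downcast after the matching upcast returns the original value, so the only way an oblique cast can fail is in its downcast half.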

What universal property do the upcasts and downcasts have? Well, first we notice that term dynamism gives a relational profunctor between terms of type <semantics>A<annotation encoding="application/x-tex">A</annotation></semantics> and type <semantics>B<annotation encoding="application/x-tex">B</annotation></semantics> (i.e. a relation that is monotone in <semantics>B<annotation encoding="application/x-tex">B</annotation></semantics> and antitone in <semantics>A<annotation encoding="application/x-tex">A</annotation></semantics>). Then we characterize the upcast and downcast as left- and right-representables for that profunctor, which consequently means the upcast is left adjoint to the downcast. More explicitly, this means for <semantics>t:A,u:B<annotation encoding="application/x-tex">t : A, u : B</annotation></semantics> we have <semantics>BAtututABu<annotation encoding="application/x-tex">\langle B \leftarrowtail A\rangle t \sqsubseteq u \iff t \sqsubseteq u \iff t \sqsubseteq \langle A \twoheadleftarrow B\rangle u</annotation></semantics>

By uniqueness of representables, this defines the upcasts and downcasts uniquely up to order-equivalence, giving us a specification for the casts. We show that this specification, combined with the monotonicity of all the term connectives and the <semantics>β,η<annotation encoding="application/x-tex">\beta,\eta</annotation></semantics> rules of the connectives allow us to derive the standard implementations of these casts as the unique solutions to this specification.

If you’re familiar with equipments this structure should look familiar: it gives a functor from the thin category ordering the objects to the category of adjunctions in the term category. This is a special case of a proarrow equipment where we view our preorder category as a “vertically thin” double category. So we extend our syntax to be the internal logic of vertically thin cartesian preorder categories that are equipments, and can model type error, dynamic type, functions and products and we call that system “Gradual Type Theory”. Then in that syntax we give some synthetic proofs of the uniqueness theorems for the casts and a few other theorems about equipments with certain universal properties.

Constructing Models

Finally, we give a construction for models of gradual type theory that extends Dana Scott’s classical models of dynamic typing to gradual typing. This connects our semantics to previous programming languages approaches and proves consistency for the system.

Briefly, we start with a locally thin 2-category <semantics>C<annotation encoding="application/x-tex">C</annotation></semantics> (such as the category of domains) and construct an equipment by defining the vertical arrows to be coreflections in <semantics>C<annotation encoding="application/x-tex">C</annotation></semantics>, which are adjunctions where the right adjoint is a retract of the left. Then we pick an object <semantics>d<annotation encoding="application/x-tex">d</annotation></semantics> in <semantics>C<annotation encoding="application/x-tex">C</annotation></semantics> and take a kind of vertical slice category <semantics>C/d<annotation encoding="application/x-tex">C/d</annotation></semantics>. So the objects are coreflections into <semantics>d<annotation encoding="application/x-tex">d</annotation></semantics> and the vertical arrows are commuting triangles of coreflections.

We use coreflections rather than just adjunctions for 2 reasons: first a sociological reason is that the casts people have developed are coreflections, and second a technical reason is that every coreflection is a monomorphism and so when we take the slice category we get a thin category and so our double category becomes a preorder category. This allows us to interpret type and term dynamism as orderings rather than a more complex category structure.

We then show how Dana Scott’s domain theoretic models of dynamic typing extend to a model of gradual type theory (where the type error is a diverging program) and another model where the type error and diverging program are different.

by shulman at February 19, 2018 07:50 PM

Emily Lakdawalla - The Planetary Society Blog

Ten times the solar system reminded us sample collection is hard
Some of the biggest discoveries we make in planetary science rely on the seemingly simple act of picking up and analyzing pieces of other worlds. When things go awry, scientists and engineers can sometimes squeeze amazing science out of a tough situation.

February 19, 2018 03:52 PM

Peter Coles - In the Dark

The Philharmonia Orchestra: Beethoven & Mahler

I spent yesterday afternoon at a very enjoyable concert at St David’s Hall in Cardiff for a programme of music by Beethoven and Mahler given by the Philharmonia Orchestra under Principal Guest Conductor Jakub Hrůša. The picture above was taken about 10 minutes before the concert started, from my seat in Tier 1. Quite a few people arrived between then and the beginning of the performance, but there wasn’t a very big audience. St David’s Hall may have been less than half full but those who did come were treated to some fantastic playing.

The first half of the concert consisted of Beethoven’s Piano Concerto No. 1 (in C) with soloist Piotr Anderszewski. This work was actually composed after his Piano Concerto No. 2 but was published first. It consists of three movements, an expansive slow movement (marked Largo) sandwiched between two sprightly up-tempo movements, marked Allegro con brio and Rondo-Allegro Scherzando, respectively. I think the first part of the last movement, full of energy and wit, is the best part of this work and Anderszewski played it with genuine sparkle. His performance was very well received, and he rounded it off with a charming encore in the form of a piece for solo piano by Bartok.

After the wine break we returned to find the piano gone, and the orchestra greatly expanded for a performance of Mahler’s Symphony No. 5, the fourth movement of which (the ‘Adagietto’) is probably Mahler’s best-known music (made famous by its use in Visconti’s 1971 film Death in Venice). This lovely movement is sometimes performed on its own – a practice Mahler himself encouraged – but I think it’s particularly powerful when heard in its proper context, embedded in a large orchestral work that lasts well over an hour.

Although nominally five movements, this work is really in three sections: the first section consists of the first two movements (the first starting with Trauermarsch (a funeral march), and the second a stormy and at times savage movement, punctuated with brief interludes of peace). The last section consists of the beautiful Adagietto 4th movement (played entirely on the strings) followed by an energetic and ultimately triumphant finale. In between there’s an extended Scherzo, which is (unusually for Mahler) rather light and cheerful. Roughly speaking this symphony follows a trajectory from darkness into light and, although it certainly doesn’t go in a straight line, and does start with a death march, this is undoubtedly one of Mahler’s cheerier works!

The Philharmonia Orchestra gave a very accomplished and passionate reading of this piece, with especially fine playing from the brass section (who have a lot to do). The exuberant ending brought many members of the audience to their feet and rightly so, as it was a very fine performance – the best I’ve heard live of this work.

by telescoper at February 19, 2018 02:20 PM

John Baez - Azimuth

Complex Adaptive System Design (Part 7)

In March, I’ll be talking at Spencer Breiner‘s workshop on Applied Category Theory at the National Institute of Standards and Technology. I’ll be giving a joint talk with John Foley about our work using operads to design networks. This work is part of the Complex Adaptive System Composition and Design Environment project being done by Metron Scientific Solutions and managed by John Paschkewitz at DARPA.

I’ve written about this work before:

• Complex Adaptive Systems: Part 1, Part 2, Part 3, Part 4, Part 5, Part 6.

But we’ve done a lot more, and my blog articles are having trouble keeping up! So I’d like to sketch out the big picture as it stands today.

If I had to summarize, I’d say we’ve developed a formalism for step-by-step compositional design and tasking, using commitment networks. But this takes a while to explain.

Here’s a very simple example of a commitment network:

It has four nodes, which represent agents: a port, a helicopter, a UAV (an unmanned aerial vehicle, or drone) and a target. The edges between these nodes describe relationships between these agents. Some of these relationships are ‘commitments’. For example, the edges labelled ‘SIR’ say that one agent should ‘search, intervene and rescue’ the other.

Our framework for dealing with commitment networks has some special features. It uses operads, but this isn’t really saying much. An ‘operad’ is a bunch of ways of sticking things together. An ‘algebra’ of the operad gives a particular collection of these things, and says what we get when we stick them together. These concepts are extremely general, so there’s a huge diversity of operads, each with a huge diversity of algebras. To say one is using operads to solve a problem is a bit like saying one is using math. What matters more is the specific kind of operad one is using, and how one is using it.

For our work, we needed to develop a new class of operads called network operads, which are described here:

• John Baez, John Foley, Joseph Moeller and Blake Pollard, Network models.

In this paper we mainly discuss communication networks. Subsequently we’ve been working on a special class of network operads that describe how to build commitment networks.

Here are some of the key ideas:

• Using network operads we can build bigger networks from smaller ones by overlaying them. David Spivak’s operad of wiring diagrams only let us ‘wire together’ smaller networks to form bigger ones:

Here networks X1, X2 and X3 are being wired together to form Y.

Network operads also let us wire together networks, but in addition they let us take one network:

and overlay another:

to create a larger network:

This is a new methodology for designing systems. We’re all used to building systems by wiring together subsystems: anyone who has a home stereo system has done this. But overlaying systems lets us do more. For example, we can take two plans of action involving the same collection of agents, and overlay them to get a new plan. We’ve all done this, too: you tell a bunch of people to do things… and then tell the same people, or an overlapping set of people, to do some other things. But lots of problems can arise if you aren’t careful. A mathematically principled approach can avoid some of these problems.

• The nodes of our networks represent agents of various types. The edges represent various relationships between agents. For example, they can represent communication channels. But more interestingly, they can represent commitments. For example, we can have an edge from A to B saying that agent A has to go rescue agent B. We call this kind of network a commitment network.

• By overlaying commitment networks, we can not only build systems out of smaller pieces but also build complicated plans by overlaying smaller pieces of plans. Since ‘tasking’ means telling a system what to do, we call this compositional tasking.

• If one isn’t careful, overlaying commitment networks can produce conflicts. Suppose we have a network with an edge saying that agent A has to rescue agent B. On top of this we overlay a network with an edge saying that A has to rescue agent C. If A can’t do both of these tasks at once, what should A do? There are various choices. We need to build a specific choice into the framework, so we can freely overlay commitment networks and get a well-defined result that doesn’t overburden the agents involved. We call this automatic deconflicting.

• Our approach to automatic deconflicting uses an idea developed by the famous category theorist Bill Lawvere: graphic monoids. I’ll explain these later, along with some of their side-benefits.

• Network operads should let us do step-by-step compositional tasking. In other words, they should let us partially automate the process of tasking networks of agents, both

1) compositionally: tasking smaller networks and then sticking them together, e.g. by overlaying them, to get larger networks, and

2) in a step-by-step way, starting at a low level of detail and then increasing the amount of detail.

To do this we need not just operads but their algebras.

• Remember, a network operad is a bunch of ways to stick together networks of some kind, e.g. by overlaying them. An algebra of this operad specifies a particular collection of networks of this kind, and says what we actually get when we stick them together.

So, a network operad can have one algebra in which things are described in a bare-bones, simplified way, and another algebra in which things are described in more detail. Indeed it will typically have many algebras, corresponding to many levels of detail, but for simplicity let’s just think about two.

When we have a ‘less detailed’ algebra A and a ‘more detailed’ algebra A', they will typically be related by a map

f \colon A' \to A

which ‘forgets the extra details’. This sort of map is called a homomorphism of algebras. We give examples in our paper Network models.

But what we usually want to do, when designing a system, is not forget extra detail, but rather add extra detail to a rough specification. There is not always a systematic way to do this. If there is, then we may have a homomorphism

g \colon A \to A'

going back the other way. This lets us automate the process of filling in the details. But we can’t usually count on being able to do this. So, often we may have to start with an element of A and search for an element of A' that is mapped to it by f \colon A' \to A. And typically we want this element to be optimal, or at least ‘good enough’, according to some chosen criteria. Expressing this idea formally helps us figure out how to automate the search. John Foley, in particular, has been working on this.
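To make the search idea concrete, here is a minimal sketch in Python. It is a toy, not the formalism from our paper: the 'rough' algebra A is modelled as sets of edges, the 'detailed' algebra A' as sets of (edge, vehicle) pairs, and the vehicle names, costs and the feasibility rule are all made up for illustration. The map `forget` plays the role of f, and `best_lift` does a brute-force search for a cheapest detailed network lying over a given rough one.

```python
from itertools import product

# Hypothetical data: which vehicles exist and a made-up cost for using each.
VEHICLES = {"helicopter": 5.0, "uav": 1.0, "boat": 2.0}


def forget(detailed):
    """Toy version of f: A' -> A. Drop the vehicle attached to each edge."""
    return frozenset(edge for edge, _vehicle in detailed)


def candidates(rough):
    """Enumerate every way of assigning one vehicle to each edge of `rough`."""
    edges = sorted(rough)
    for choice in product(VEHICLES, repeat=len(edges)):
        yield frozenset(zip(edges, choice))


def cost(detailed):
    """Total cost of a detailed network (sum of the made-up vehicle costs)."""
    return sum(VEHICLES[vehicle] for _edge, vehicle in detailed)


def best_lift(rough, feasible=lambda detailed: True):
    """Cheapest feasible element of A' that f maps to `rough`, or None."""
    lifts = [d for d in candidates(rough) if forget(d) == rough and feasible(d)]
    return min(lifts, key=cost, default=None)


# Example: two commitments out of the port; assume a UAV cannot reach the target.
rough_plan = frozenset({("port", "target"), ("port", "uav_base")})
plan = best_lift(
    rough_plan,
    feasible=lambda d: (("port", "target"), "uav") not in d,
)
```

Brute force is only sensible for tiny examples like this one. The point is just that once f and the optimality criterion are written down explicitly, 'filling in the details' becomes an ordinary search problem.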

That’s an overview of our ideas.

Next, for the mathematically inclined, I want to give a few more details on one of the new elements not mentioned in our Network models paper: ‘graphic monoids’.

Graphic monoids

In our paper Network models we explain how the ‘overlay’ operation makes the collection of networks involving a given set of agents into a monoid. A monoid is a set M with a product that is associative and has an identity element 1:

(xy)z = x(yz)
1 x = x = x 1

In our application, this product is overlaying two networks.

A graphic monoid is one in which the graphic identity

x y x = x y

holds for all x,y.

To understand this identity, let us think of the elements of the monoid as “commitments”. The product x y means “first committing to do x, then committing to do y”. The graphic identity says that if we first commit to do x, then y, and then x again, it’s the same as first committing to do x and then y. Committing to do x again doesn’t change anything!

In particular, in any graphic monoid we have

xx = x 1 x = x 1 = x

so making the same commitment twice is the same as making it once. Mathematically we say every element x of a graphic monoid is idempotent:

x^2 = x

A commutative monoid obeying this law x^2 = x automatically obeys the graphic identity, since then

x y x = x^2 y = x y

But for a noncommutative monoid, the graphic identity is stronger than x^2 = x. It says that after committing to x, no matter what intervening commitments one might have made, committing to x again has no further effect. In other words: the intervening commitments did not undo the original commitment, so making the original commitment a second time has no effect! This captures the idea of how promises should behave.
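Here is a small Python sketch of a noncommutative graphic monoid, in case a concrete example helps. It is a toy model chosen for illustration, not the monoid of any commitment network model from our work: elements are tuples of distinct commitments, and the product appends to x whatever commitments in y are not already in x. The commitment names are made up.

```python
from itertools import permutations, product

# Hypothetical commitment names, used only for this illustration.
COMMITMENTS = ("rescue_B", "rescue_C", "refuel")

IDENTITY = ()  # the empty list of commitments


def overlay(x, y):
    """Product xy: keep x, then add the commitments of y not already made in x."""
    return x + tuple(c for c in y if c not in x)


# All tuples of distinct commitments, of every length (16 elements here).
ELEMENTS = [
    p for r in range(len(COMMITMENTS) + 1) for p in permutations(COMMITMENTS, r)
]


def is_monoid(elements):
    """Check the identity and associativity laws by brute force."""
    units = all(overlay(IDENTITY, x) == x == overlay(x, IDENTITY) for x in elements)
    assoc = all(
        overlay(overlay(x, y), z) == overlay(x, overlay(y, z))
        for x, y, z in product(elements, repeat=3)
    )
    return units and assoc


def is_graphic(elements):
    """Check the graphic identity x y x = x y for all pairs."""
    return all(
        overlay(overlay(x, y), x) == overlay(x, y)
        for x, y in product(elements, repeat=2)
    )
```

In this toy monoid the order of commitments matters, so it is not commutative, yet committing to something a second time never changes anything, which is exactly what the graphic identity demands.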

As I said, for any network model, the set of all networks involving a fixed set of agents is a monoid. In a commitment network model, this monoid is required to be a graphic monoid. Joseph Moeller is writing a paper that shows how to construct a large class of commitment network models. We will follow this with a paper illustrating how to use these in compositional tasking.

For now, let me just mention a side-benefit. In any graphic monoid we can define a relation x \le y by

x \le y \; \iff \; x a = y for some a

This makes the graphic monoid into a partially ordered set, meaning that these properties hold:

reflexivity: x \le x

transitivity: x \le y , y \le z \; \implies \; x \le z

antisymmetry: x \le y, y \le x \; \implies \; x = y

In the context of commitment networks, x \le y means that starting from x we can reach y by making some further commitment a: that is, x a = y for some a. So, as we ‘task’ a collection of agents by giving them more and more commitments, we move up in this partial order.
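The partial order can be checked by brute force in a toy example. The sketch below assumes a made-up graphic monoid, not one taken from our papers: elements are tuples of distinct commitments, and the product xy appends to x the commitments of y that x lacks. In this particular toy, x \le y turns out to mean simply that x is an initial segment of y.

```python
from itertools import permutations, product

# Hypothetical commitment names, used only for this illustration.
COMMITMENTS = ("rescue_B", "rescue_C", "refuel")


def overlay(x, y):
    """Product xy in the toy monoid: x followed by the new commitments of y."""
    return x + tuple(c for c in y if c not in x)


ELEMENTS = [
    p for r in range(len(COMMITMENTS) + 1) for p in permutations(COMMITMENTS, r)
]


def leq(x, y):
    """x <= y iff x a = y for some element a of the monoid."""
    return any(overlay(x, a) == y for a in ELEMENTS)


def is_partial_order(elements):
    """Check reflexivity, transitivity and antisymmetry of leq by brute force."""
    reflexive = all(leq(x, x) for x in elements)
    transitive = all(
        leq(x, z)
        for x, y, z in product(elements, repeat=3)
        if leq(x, y) and leq(y, z)
    )
    antisymmetric = all(
        x == y for x, y in product(elements, repeat=2) if leq(x, y) and leq(y, x)
    )
    return reflexive and transitive and antisymmetric


def is_prefix(x, y):
    """True when the tuple x is an initial segment of the tuple y."""
    return y[: len(x)] == x
```

So in this example 'moving up in the partial order' is literally extending a list of commitments, with the empty list at the bottom.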

by John Baez at February 19, 2018 01:09 AM

February 18, 2018

Tommaso Dorigo - Scientificblogging

Searching For Light Dark Matter: A Gedankenexperiment
Dark Matter (DM), the mysterious substance that vastly dominates the total mass of our universe, is certainly one of the most surprising and tough puzzles of contemporary science. We do not know what DM is, but on the other hand we have a large body of evidence that there must be "something" in the universe that causes a host of effects we observe and which would have no decent explanation otherwise. 

read more

by Tommaso Dorigo at February 18, 2018 11:06 AM

February 17, 2018

Peter Coles - In the Dark

Today’s Earthquake in Wales

Just after 2.30 this afternoon I felt a vibration of my house in Cardiff, initially like a heavy truck going outside but then a distinct ‘bump’. The whole house moved, but only for less than a second and no damage ensued.

I thought it was a minor tremor, and out of curiosity I looked on Twitter to see how widespread were the reports. The answer was very:

It seems it was an earthquake of Magnitude 4.7, centered near (or, presumably, under) Neath. That’s actually pretty big by UK standards.

Thankfully I don’t think anyone has been hurt.

Anyone else feel it?

P. S. I learned today that the Welsh word for ‘earthquake’ is daeargryn.

by telescoper at February 17, 2018 03:33 PM

Peter Coles - In the Dark

Flying back to Wales

Today, as usual, I took the morning flight back from Dublin to Cardiff. This was the first time this year it’s been clear enough and light enough to see anything from the cabin so I took this snap out of the window as we reached the Welsh coast. You can see the curve of Cardigan Bay reasonably well.

The propeller was working, by the way…

by telescoper at February 17, 2018 12:54 PM

John Baez - Azimuth

Applied Category Theory at NIST

I think it’s really cool how applied category theory is catching on. My former student Blake Pollard is working at the National Institute of Standards and Technology on applications of category theory to electrical engineering. He’s working with Spencer Breiner… and now Breiner is running a workshop on this stuff:

• Applied Category Theory: Bridging Theory & Practice, March 15–16, 2018, NIST, Gaithersburg, Maryland, USA.

It’s by invitation only, but I can’t resist mentioning its existence. Here’s the idea:

What: The Information Technology Laboratory at NIST is pleased to announce a workshop on Applied Category Theory to be held at NIST’s Gaithersburg, Maryland campus on March 15 & 16, 2018. The meeting will focus on practical avenues for introducing methods from category theory into real-world applications, with an emphasis on examples rather than theorems.

Who: The workshop aims to bring together two distinct groups. First, category theorists interested in pursuing applications outside of the usual mathematical fields. Second, domain experts and research managers from industry, government, science and engineering who have in mind potential domain applications for categorical methods.

Intended Outcomes: A proposed landscape of potential CT applications and the infrastructure needed to realize them, together with a 5-10 year roadmap for developing the field of applied category theory. This should include perspectives from industry, academia and government as well as major milestones, potential funding sources, avenues for technology transfer and necessary improvements in tool support and methodology. Exploratory collaborations between category theorists and domain experts. We will ask that each group come prepared to meet the other side. Mathematicians should be prepared with concrete examples that demonstrate practical applications of CT in an intuitive way. Domain experts should bring to the table specific problems to which they can devote time and/or funding as well as some reasons about why they think CT might be relevant to this application.

Invited Speakers:
John Baez (University of California at Riverside) and John Foley (Metron Scientific Solutions).
Bob Coecke (University of Oxford).
Dusko Pavlovic (University of Hawaii).

Some other likely participants include Chris Boner (Metron), Arquimedes Canedo (Siemens at Princeton), Stephane Dugowson (Supméca), William Edmonson (North Carolina A&T), Brendan Fong (MIT), Mark Fuge (University of Maryland), Jack Gray (Penumbra), Steve Huntsman (BAE Systems), Patrick Johnson (Dassault Systèmes), Al Jones (NIST), Cliff Joslyn (Pacific Northwest National Laboratory), Richard Malek (NSF), Tom Mifflin (Metron), Ira Monarch (Carnegie Mellon), John Paschkewitz (DARPA), Evan Patterson (Stanford), Blake Pollard (NIST), Emilie Purvine (Pacific Northwest National Laboratory), Mark Raugas (Pacific Northwest National Laboratory), Bill Regli (University of Maryland), Michael Robinson (American U.), Alberto Speranzon (Honeywell Aerospace), David Spivak (MIT), Eswaran Subrahmanian (Carnegie Mellon), Jamie Vicary (Birmingham and Oxford), and Ryan Wisnesky (Categorical Informatics).

A bunch of us will stay on into the weekend and talk some more. I hope we make a lot of progress—and I plan to let you know how it goes!

by John Baez at February 17, 2018 05:35 AM

February 16, 2018

Lubos Motl - string vacua and pheno

Does neutron decay to dark matter?
Three days ago, the Quanta Magazine published a playful simple article on particle physics
Neutron Lifetime Puzzle Deepens, but No Dark Matter Seen
The neutron's lifetime is some 15 minutes but there seems to be a cool, increasingly sharp discrepancy. If you measure how many neutrons are left in a "bottle" after time \(t\), it seems that there's one decay in 14:39 minutes. But if you measure a neutron "beam" and the protons that appear, it seems that they're being converted at the rate of one new proton per 14:48 minutes.

This neutron's logo is actually from some cryptocurrency network.

So the neutrons are apparently decaying about 1% faster than the protons are born. No other decays of neutrons are known. Relativistic effects for the beam are negligible.
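The "about 1%" figure follows directly from the two lifetimes quoted above. Here is the arithmetic as a short Python sketch; it uses only the 14:39 and 14:48 values from the text and assumes simple exponential decay, so the decay rate is the inverse of the lifetime.

```python
# Lifetimes quoted in the text, converted to seconds.
bottle_lifetime_s = 14 * 60 + 39  # neutrons disappearing from the "bottle"
beam_lifetime_s = 14 * 60 + 48    # protons appearing in the "beam"

# Rate = 1 / lifetime for exponential decay.
total_rate = 1.0 / bottle_lifetime_s   # all neutron decays
proton_rate = 1.0 / beam_lifetime_s    # decays that produce a proton

# Fraction of decays that would have to end in something other than a proton.
missing_fraction = 1.0 - proton_rate / total_rate

print(f"missing fraction: {missing_fraction:.2%}")
```

The difference is only 9 seconds out of roughly 880, which is why the effect is a 1% one.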

If this discrepancy is real, there seems to be 1% of the neutron decays that go to something else. One month ago, a paper by Bartosz Fornal and Benjamin Grinstein promoted the idea that the neutron could very well decay to new, invisible particles.

The decay would be\[

n\to \chi\gamma

\] where \(\chi\) is a new, dark, spin-1/2 fermion, and \(\gamma\) is a photon. The mass of \(\chi\) has to be just slightly smaller than the neutron mass – a coincidence ;-) needed to avoid some sick decays in nuclear physics – and the photons \(\gamma\) from such decays must have a completely universal, fixed energy (calculable from the \(\chi\) mass) which is a universal constant that may a priori belong to an interval and is of order \(1\MeV\).
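The fixed photon energy is plain two-body kinematics: in the neutron rest frame, energy and momentum conservation give \(E_\gamma = (m_n^2 - m_\chi^2)/(2 m_n)\) in units with \(c=1\). The Python sketch below evaluates this. The neutron mass is a rounded value, and the \(\chi\) mass of 938.0 MeV is purely a hypothetical input picked for illustration, not a number taken from the paper.

```python
import math

M_NEUTRON_MEV = 939.565  # neutron mass in MeV (rounded)


def photon_energy(m_chi, m_n=M_NEUTRON_MEV):
    """Photon energy in MeV for n -> chi + gamma in the neutron rest frame."""
    if not 0.0 <= m_chi < m_n:
        raise ValueError("the decay needs 0 <= m_chi < m_n")
    return (m_n**2 - m_chi**2) / (2.0 * m_n)


def conserves_energy(m_chi, m_n=M_NEUTRON_MEV):
    """Check E_gamma + E_chi = m_n, with |p_chi| = E_gamma by momentum balance."""
    e_gamma = photon_energy(m_chi, m_n)
    e_chi = math.sqrt(m_chi**2 + e_gamma**2)
    return math.isclose(e_gamma + e_chi, m_n, rel_tol=1e-12)


# Hypothetical chi mass, chosen only to show the order of magnitude.
example_energy = photon_energy(938.0)
print(f"E_gamma = {example_energy:.3f} MeV")
```

Because the \(\chi\) is so close in mass to the neutron, the photon energy is nearly the mass difference, which is why it comes out of order \(1\MeV\).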

In the effective Lagrangian, the decays are enabled by the quadratic and cubic terms \(\bar n \chi\) and \(\bar n \chi \gamma\), i.e. by a neutron-newfermion mixing; and by a cubic interaction that looks like the mixing with the extra photon (you need the photon to be represented by the whole \(F_{\mu\nu}\) in this cubic term, not just \(A_\mu\), by gauge invariance). So for a while, by the mixing interaction, the neutron changes to a virtual \(\chi\), and the \(\chi\) decays to a real neutron \(n\) and a real photon \(\gamma\).

The authors claim that such a model is consistent with everything we know.

One week ago, an experimental preprint claimed that they're sure that these photons of energy comparable to \(1\MeV\) don't exist so the theory is ruled out.

Well, maybe the visible photon among the decay products should be replaced with some dark boson as well? ;-) At any rate, it's an intriguing anomaly and an equally attractive (albeit obvious) strategy to explain it.

by Luboš Motl at February 16, 2018 06:47 PM

ZapperZ - Physics and Physicists

Observation of 3-Photon Bound States
They seem to be making steady and impressive progress along this line.

A new paper in Science[1] has shown an impressive result of the possibility of causing 3 different photons to be "bound" or entangled with one another after traversing through a cold rubidium atom gas.

In controlled experiments, the researchers found that when they shone a very weak laser beam through a dense cloud of ultracold rubidium atoms, rather than exiting the cloud as single, randomly spaced photons, the photons bound together in pairs or triplets, suggesting some kind of interaction — in this case, attraction — taking place among them.

Now, without going overboard with the superlatives, it must be stressed that this does not occur in vacuum, i.e. 3 photons just don't say hi to one another and decide to hang out together. The presence of the cold rubidium gas is essential for a photon to bind with one of the atoms to form a polariton:

The researchers then developed a hypothesis to explain what might have caused the photons to interact in the first place. Their model, based on physical principles, puts forth the following scenario: As a single photon moves through the cloud of rubidium atoms, it briefly lands on a nearby atom before skipping to another atom, like a bee flitting between flowers, until it reaches the other end.

If another photon is simultaneously traveling through the cloud, it can also spend some time on a rubidium atom, forming a polariton — a hybrid that is part photon, part atom. Then two polaritons can interact with each other via their atomic component. At the edge of the cloud, the atoms remain where they are, while the photons exit, still bound together. The researchers found that this same phenomenon can occur with three photons, forming an even stronger bond than the interactions between two photons.

This has almost the same flavor as the "attraction" between two electrons in a superconductor to form the bound Cooper pairs, which requires a background of lattice ion vibration or virtual phonons to mediate the coupling.

So photons can talk to one another, and in this case, 3 of them can hang out together. They just need a matchmaker as an intermediary, since they are just way too shy to do it on their own.

And with that sugary concoction, I think I need more coffee this morning.


[1] Q-Y Liang et al., Science v.359, p.783 (2018).

by ZapperZ at February 16, 2018 02:16 PM

February 15, 2018

Emily Lakdawalla - The Planetary Society Blog

Simulating Mars in the Middle East
The Austrian Space Forum is leading a four-week Mars mission in Oman's Dhofar Desert.

February 15, 2018 12:00 PM

The n-Category Cafe

Physics and 2-Groups

The founding vision of this blog, higher gauge theory, progresses. Three physicists have just brought out Exploring 2-Group Global Symmetries. Two of the authors have worked with Nathan Seiberg, a highly influential physicist, who, along with three others, had proposed in 2014 a program to study higher form field symmetries in QFT (Generalized Global Symmetries), without apparently being aware that this idea was considered before.

Eric Sharpe then published an article the following year, Notes on generalized global symmetries in QFT, explaining how much work had already been done along these lines:

The recent paper [1] proposed a more general class of symmetries that should be studied in quantum field theories: in addition to actions of ordinary groups, it proposed that we should also consider ‘groups’ of gauge fields and higher-form analogues. For example, Wilson lines can act as charged objects under such symmetries. By using various defects, the paper [1] described new characterizations of gauge theory phases.

Now, one can ask to what extent it is natural for n-forms as above to form a group. In particular, because of gauge symmetries, the group multiplication will not be associative in general, unless perhaps one restricts to suitable equivalence classes, which does not seem natural in general. A more natural understanding of such symmetries is in terms of weaker structures known as 2-groups and higher groups, in which associativity is weakened to hold only up to isomorphisms.

There are more 2-groups and higher groups than merely, ‘groups’ of gauge fields and higher-form tensor potentials (connections on bundles and gerbes), and in this paper we will give examples of actions of such more general higher groups in quantum field theory and string theory. We will also propose an understanding of certain anomalies as transmutations of symmetry groups of classical theories into higher group actions on quantum theories.

In the new paper by Cordova, Dumitrescu, and Intriligator, along with references to some early work by John and Urs, Sharpe’s article is acknowledged, and yet taken to be unusual in considering more than discrete 2-groups:

The continuous 2-group symmetries analyzed in this paper have much in common with their discrete counterparts. Most discussions of 2-groups in the literature have focused on the discrete case (an exception is [17]). (p. 26)

Maybe it’s because they’ve largely encountered people extending Dijkgraaf-Witten theory, such as in Higher symmetry and gapped phases of gauge theories where the authors:

… study topological field theory describing gapped phases of gauge theories where the gauge symmetry is partially Higgsed and partially confined. The TQFT can be formulated both in the continuum and on the lattice and generalizes Dijkgraaf-Witten theory by replacing a finite group by a finite 2-group,

but then Sharpe had clearly stated that

The purpose of this paper is to merely to link the recent work [1] to other work on 2-groups, to review a few highlights, and to provide a few hopefully new results, proposals, and applications,

and he refers to a great many items on higher gauge theory, much of it using continuous, and smooth, 2-groups and higher groups.

Still, even if communication channels aren’t as fast as they might be, things do seem to be moving.

by david at February 15, 2018 09:40 AM

February 14, 2018

ZapperZ - Physics and Physicists

Light From A Single Strontium Atom
The image of light from a single strontium atom in an atom trap has won the Engineering and Physical Sciences Research Council photography competition.

You can see a more detailed photo of it on Science Alert.

Unfortunately, there is a bit of a misconception going on here. You are not actually seeing the single strontium atom, because it highly depends on what you mean by "seeing". The laser excites the single strontium atom, and then the strontium atom relaxes and releases energy in the form of light. This is the light that you are seeing, and it is probably a result of one or more atomic transitions in the atom, but certainly not all of them.

So you're seeing light due to the atomic transition of the atom. You are not actually seeing the atom itself, as proclaimed by some website. This is the nasty obstacle that the general public has to wade through when reading something like this. We need to make it very clear when we report this to the media on what it really is in no uncertain terms, because they WILL try to sensationalize it as much as they can.


by ZapperZ at February 14, 2018 03:42 PM

February 13, 2018

ZapperZ - Physics and Physicists

What's So Important About The g-2 Experiment?
If it is covered on CNN, then it has to be big-enough news. :)

I mentioned earlier that the g-2 experiment at Fermilab was about to start (it has started now), which is basically a continuation and refinement of what was done several years ago at Brookhaven. In case the importance of this experiment escapes you, Don Lincoln of Fermilab has written a piece on the CNN website on this experiment and why it is being done.

If you are not in science, you need to keep in mind this important theme: scientists, and definitely physicists, like it A LOT when we see hints at something that somehow does not fit with our current understanding. We like it when we see discrepancies of our results with the things that we already know.

This may sound odd to many people, but it is true! This is because this is why many of us get into this field in the first place: to explore new and uncharted territories! Results that do not fit with our current understanding give hints at new physics, something beyond what we already know. This is exploration in the truest sense.

This is why there were people who actually were disappointed that we saw the Higgs, and within the energy range that the Standard Model predicted. It is why many, especially theorists working on Supersymmetry, are disappointed that the results out of the LHC so far are within what the Standard Model has predicted.


by ZapperZ at February 13, 2018 03:42 PM

ZapperZ - Physics and Physicists

Shedding Light On Radiation Reaction
This is basically an inverse Compton scattering. The latest experiment that studies this has been getting a bit of a press, because of the sensationalistic claims of light "stopping" electrons in their tracks.

A review of the experiment, and the theory behind this, is sufficiently covered in APS Physics, and you do get free access to the actual paper itself in PRX. But after all the brouhaha, this is the conclusion we get:

The differing conclusions in these papers serve as a call to improve the quantum theory for radiation reaction. But it must be emphasized that the new data are too statistically weak to claim evidence of quantum radiation reaction, let alone to decide that one existing model is better than the others. Progress on both fronts will come from collecting more collision events and attaining a more stable electron bunch from laser-wakefield acceleration. Additional information could come from pursuing complementary experimental approaches to observing radiation reaction (for example, Ref. [7]), which may be possible with the next generation of high-intensity laser systems [8]. In the meantime, experiments like those from the Mangles and Zepf teams are ushering in a new era in which the interaction between matter and ultraintense laser light is being used to investigate fundamental phenomena, some of which have never before been studied in the lab.

I know that they need a very high-energy electron beam, but the laser wakefield technique that they used seems to be providing a larger spread in energy than what they can resolve:

Both experiments obtained only a small number of such successful events, mainly because it was difficult to achieve a good spatiotemporal overlap between the laser pulse and the electron bunch, each of which has a duration of only a few tens of femtoseconds and is just a few micrometers in width. A further complication was that the average energy of the laser-wakefield-accelerated electrons fluctuated by an amount comparable to the energy loss from radiation reaction.

I suppose this is the first step in trying to sort this out, and I have no doubt that there will be an improvement in such an experiment soon.


by ZapperZ at February 13, 2018 02:07 PM

February 12, 2018

CERN Bulletin


Cooperative open to international civil servants. We welcome you to discover the advantages and discounts negotiated with our suppliers either on our website or at our information office located at CERN, on the ground floor of bldg. 504, open Monday through Friday from 12.30 to 15.30.

February 12, 2018 05:02 PM

CERN Bulletin

The ILOAT celebrated its 90th anniversary!

The beginning of a new year is an excellent opportunity to look back on the highlights of the previous year. The fast pace at which the CERN Staff Association has to handle current issues unfortunately left us with no time to present a jubilee that deserves to be brought to your attention, the 90th anniversary of the International Labour Organisation Administrative Tribunal (ILOAT), celebrated on 5 May 2017. [1]







Symposium in honour of the 90th anniversary of the Tribunal
Image: Aung Lwin (International Labour Organization Photo Collection)

On this occasion, two representatives of the CERN Staff Association participated in the symposium on the “90 years of contribution to the creation of international civil service law”.

Since 1947, the International Labour Organisation Administrative Tribunal (ILOAT) has heard complaints from serving and former officials of international organisations that have recognised its jurisdiction. This is the case for CERN, which states in its Rules and Regulations that a decision may be challenged by filing a complaint to the ILOAT, when the decision is final, i.e. if a decision cannot be challenged internally within the Organization, or when internal procedures have been exhausted (S VI 1.03 Procedures for the settlement of disputes). For the members of the personnel, the ILOAT is therefore the last recourse to assert their rights.

The ILOAT is currently open to more than 58 000 international civil servants and former civil servants of 62 international organisations. Following its 125th session in January 2018, its jurisprudence comprises more than 3 981 judgments available in both French and English.

Now, let us go back to the symposium held on 5 May 2017.

Among the 200 participants there were panellists including lawyers representing staff unions, complainants, international organisations, as well as judges and legal officers of other courts of justice and administrative tribunals.

The agenda of the day was structured into three sessions:

  • The various facets of the law governing the relations between international civil servants and international organisations;
  • General principles of law applied by the Tribunal;
  • Milestones of the Tribunal’s case law.

Several general principles of law of the international civil service were thus brought forward and illustrated with examples taken from various judgments of the Tribunal. Among these principles, our representatives found the following particularly interesting:

  • duty of care,
  • dignity and undue or unnecessary hardship,
  • good faith and principle of proportionality,
  • principle of equality of arms,
  • duty to protect and assist.

Translating general principles into practice in cases of harassment, the Tribunal’s competence in salary related issues, and the respect of due process of law when exhausting internal remedies were also among the topics that caught their attention.

Moreover, we would like to mention the presentation on the “Principle of acquired rights” by colleagues of the CERN Legal Service (Eva-Maria Gröniger-Voss, Kirsten Baxter, Arthur Nguyen), who were invited by the ILOAT among other global experts of international law. Our representatives were particularly interested in the concept of “cumulative impact of measures” (so-called “salami” tactic), which may ultimately give rise to a violation of acquired rights – is this a topic of relevance at CERN?

The CERN Staff Association wishes the International Labour Organisation Administrative Tribunal, an institution that is essential for the good functioning of international organisations, a long life and every success in its highly delicate mission of “Promoting jobs, protecting people”.

The full publication based on the symposium is available online:

[1] ILOAT website:

Translated from the French original

February 12, 2018 04:02 PM

CERN Bulletin

Crèche and School: Open Day on Saturday, 3 March

Open Day at Crèche and School of the CERN Staff Association

Are you considering enrolling your child at the Crèche and School of the CERN Staff Association?

If you work at CERN, then this event is for you: come visit the school and meet the Management

on Saturday 3 March 2018 from 10 am to 12 noon

It will be our pleasure to present to you our structure, its projects and premises, and answer any questions you may have.

Please sign up for one of the two sessions via Doodle before Wednesday 28 February 2018:

February 12, 2018 04:02 PM

February 11, 2018

Clifford V. Johnson - Asymptotia

Conversation Piece

I wrote a piece for The Conversation recently that is making the rounds, drawing on lots of research sources (including reading some comics from the 1960s!). You might like it. Here it is:

The hidden superpower of 'Black Panther': Scientist role models

King of a technologically advanced country, Black Panther is a scientific genius.
Marvel Studios

Clifford Johnson, University of Southern California – Dornsife College of Letters, Arts and Sciences

I’m not the first to say that the upcoming Marvel movie “Black Panther” will be an important landmark. Finally a feature film starring a black superhero character will be part of the Marvel Cinematic Universe – a successful run of intertwined movies that began with “Iron Man” in 2008. While there have been other superhero movies with a black lead character – “Hancock” (2008), “Blade” (1998), “Spawn” (1997) or even “The Meteor Man” (1993) – this film is significant because of the recent remarkable rise of the superhero film from the nerdish fringe to part of mainstream culture.

Huge audiences will see a black lead character – not a sidekick or part of a team – in a superhero movie by a major studio, with a black director (Ryan Coogler), black writers and a majority black cast. This is a significant step toward diversifying our culture by improving the lackluster representation of minorities in our major media. It’s also a filmmaking landmark because black creators have been given access to the resources and platforms needed to bring different storytelling perspectives into our mainstream culture.

Last year’s “Wonder Woman” forged a similar path. In that case, a major studio finally decided to commit resources to a superhero film headlined by a female character and directed by a woman, Patty Jenkins. Female directors are a minority in the movie industry. Jenkins brought a new perspective to this kind of action movie, and there was a huge positive response from audiences in theaters worldwide.

Above and beyond all this, “Black Panther” also has the potential to break additional ground in a way most people may not realize: In the comics, [...]

The post Conversation Piece appeared first on Asymptotia.

by Clifford at February 11, 2018 03:14 PM

John Baez - Azimuth

Linguistics Using Category Theory


Now students in the Applied Category Theory 2018 school are reading about categories applied to linguistics. Read the blog article here for more:

• Jade Master and Cory Griffith, Linguistics using category theory, The n-Category Café, 6 February 2018.

This was written by my grad student Jade Master along with Cory Griffith, an undergrad at Stanford. Jade is currently working with me on the semantics of open Petri nets.

What’s the basic idea of this linguistics and category theory stuff? I don’t know much about this, but I can say a bit.

Since category theory is great for understanding the semantics of programming languages, it makes sense to try it for human languages, even though they’re much harder. The first serious attempt I know of was by Jim Lambek, who introduced pregroup grammars in 1958:

• Joachim Lambek, The mathematics of sentence structure, Amer. Math. Monthly 65 (1958), 154–170.

In this article he hid the connection to category theory. But when you start diagramming sentences or phrases using his grammar, as below, you get planar string diagrams as shown above. So it’s not surprising—if you’re in the know—that he’s secretly using monoidal categories where every object has a right dual and, separately, a left dual.

This fact is just barely mentioned in the Wikipedia article:

Pregroup grammar.

but it’s explained in more detail here:

• A. Preller and J. Lambek, Free compact 2-categories, Mathematical Structures in Computer Science 17 (2005), 309-340.

This stuff is hugely fun, so I’m wondering why I never looked into it before! When I talked to Lambek, who is sadly no longer with us, it was mainly about his theories relating particle physics to quaternions.

Recently Mehrnoosh Sadrzadeh and Bob Coecke have taken up Lambek’s ideas, relating them to the category of finite-dimensional vector spaces. Choosing a monoidal functor from a pregroup grammar to this category allows one to study linguistics using linear algebra! This simplifies things, perhaps a bit too much—but it makes it easy to do massive computations, which is very popular in this age of “big data” and machine learning.

It also sets up a weird analogy between linguistics and quantum mechanics, which I’m a bit suspicious of. While the category of finite-dimensional vector spaces with its usual tensor product is monoidal, and has duals, it’s symmetric, so the difference between writing a word to the left of another and writing it to the right of another gets washed out! I think instead of using vector spaces one should use modules of some noncommutative Hopf algebra, or something like that. Hmm… I should talk to those folks.

To discuss this, please visit The n-Category Café, since there’s a nice conversation going on there and I don’t want to split it. There has also been a conversation on Google+, and I’ll quote some of it here, so you don’t have to run all over the internet.

Noam Zeilberger wrote:

You might have been simplifying things for the post, but a small comment anyways: what Lambek introduced in his original paper are these days usually called “Lambek grammars”, and not exactly the same thing as what Lambek later introduced as “pregroup grammars”. Lambek grammars actually correspond to monoidal biclosed categories in disguise (i.e., based on left/right division rather than left/right duals), and may also be considered without a unit (as in his original paper). (I only have a passing familiarity with this stuff, though, and am not very clear on the difference in linguistic expressivity between grammars based on division vs grammars based on duals.)

Noam Zeilberger wrote:

If you haven’t seen it before, you might also like Lambek’s followup paper “On the calculus of syntactic types”, which generalized his original calculus by dropping associativity (so that sentences are viewed as trees rather than strings). Here are the first few paragraphs from the introduction:

…and here is a bit near the end of the 1961 paper, where he made explicit how derivations in the (original) associative calculus can be interpreted as morphisms of a monoidal biclosed category:

John Baez wrote:

Noam Zeilberger wrote: “what Lambek introduced in his original paper are these days usually called “Lambek grammars”, and not exactly the same thing as what Lambek later introduced as “pregroup grammars”.”

Can you say what the difference is? I wasn’t simplifying things on purpose; I just don’t know this stuff. I think monoidal biclosed categories are great, and if someone wants to demand that the left or right duals be inverses, or that the category be a poset, I can live with that too…. though if I ever learned more linguistics, I might ask why those additional assumptions are reasonable. (Right now I have no idea how reasonable the whole approach is to begin with!)

Thanks for the links! I will read them in my enormous amounts of spare time. :-)

Noam Zeilberger wrote:

As I said it’s not clear to me what the linguistic motivations are, but the way I understand the difference between the original “Lambek” grammars and (later introduced by Lambek) pregroup grammars is that it is precisely analogous to the difference between a monoidal category with left/right residuals and a monoidal category with left/right duals. Lambek’s 1958 paper was building off the idea of “categorial grammar” introduced earlier by Ajdukiewicz and Bar-Hillel, where the basic way of combining types was left division A\B and right division B/A (with no product).

Noam Zeilberger wrote:

At least one seeming advantage of the original approach (without duals) is that it permits interpretations of the “semantics” of sentences/derivations in cartesian closed categories. So it’s in harmony with the approach of “Montague semantics” (mentioned by Richard Williamson over at the n-Cafe) where the meanings of natural language expressions are interpreted using lambda calculus. What I understand is that this is one of the reasons Lambek grammar started to become more popular in the 80s, following a paper by Van Benthem where he observed that such lambda terms denoting the meanings of expressions could be computed via “homomorphism” from syntactic derivations in Lambek grammar.

Jason Nichols wrote:

John Baez, as someone with a minimal understanding of set theory, lambda calculus, and information theory, what would you recommend as background reading to try to understand this stuff?

It’s really interesting, and looks relevant to work I do with NLP and even abstract syntax trees, but reading the papers and wiki pages, I feel like there’s a pretty big gap to cross between where I am, and where I’d need to be to begin to understand this stuff.

John Baez wrote:

Jason Nichols: I suggest trying to read some of Lambek’s early papers, like this one:

• Joachim Lambek, The mathematics of sentence structure, Amer. Math. Monthly 65 (1958), 154–170.

(If you have access to the version at the American Mathematical Monthly, it’s better typeset than this free version.) I don’t think you need to understand category theory to follow them, at least not this first one. At least for starters, knowing category theory mainly makes it clear that the structures he’s trying to use are not arbitrary, but “mathematically natural”. I guess that as the subject develops further, people take more advantage of the category theory and it becomes more important to know it. But anyway, I recommend Lambek’s papers!

Borislav Iordanov wrote:

Lambek was an amazing teacher, I was lucky to have him in my undergrad. There is a small and very approachable book on his pregroups treatment that he wrote shortly before he passed away: “From Word to Sentence: a computational algebraic approach to grammar”. It’s plain algebra and very fun. Sadly looks like out of print on Amazon, but if you can find it, well worth it.

Andreas Geisler wrote:

One immediate concern for me here is that this seems (don’t have the expertise to be sure) to repeat a very old mistake of linguistics, long abandoned:

Words do not have atomic meanings. They are not a part of some 1:1 lookup table.

The most likely scenario right now is that our brains store meaning as a continuously accumulating set of connections that ultimately are impacted by every instance of a form we’ve ever heard/seen.

So, you shall know a word by all the company you’ve ever seen it in.

Andreas Geisler wrote:

John Baez I am a linguist by training, you’re welcome to borrow my brain if you want. You just have to figure out the words to use to get my brain to index what you need, as I don’t know the category theory stuff at all.

It’s a question of interpretation. I am also a translator, so I might be of some small assistance there as well, but it’s not going to be easy either way I am afraid.

John Baez wrote:

Andreas Geisler wrote: “I might be of some small assistance there as well, but it’s not going to be easy either way I am afraid.”

No, it wouldn’t. Alas, I don’t really have time to tackle linguistics myself. Mehrnoosh Sadrzadeh is seriously working on category theory and linguistics. She’s one of the people leading a team of students at this Applied Category Theory 2018 school. She’s the one who assigned this paper by Lambek, which 2 students blogged about. So she would be the one to talk to.

Andreas Geisler wrote: “So, you shall know a word by all the company you’ve ever seen it in.”

Yes, that quote appears in the blog article by the students, which my post here was merely an advertisement for.

by John Baez at February 11, 2018 12:26 AM

February 10, 2018

The n-Category Cafe

Linguistics Using Category Theory

guest post by Cory Griffith and Jade Master

Most recently, the Applied Category Theory Seminar took a step into linguistics by discussing the 2010 paper Mathematical Foundations for a Compositional Distributional Model of Meaning, by Bob Coecke, Mehrnoosh Sadrzadeh, and Stephen Clark.

Here is a summary and discussion of that paper.

In recent years, well known advances in AI, such as the development of AlphaGo and the ongoing development of self driving cars, have sparked interest in the general idea of machines examining and trying to understand complex data. In particular, a variety of accounts of successes in natural language processing (NLP) have reached wide audiences (see, for example, The Great AI Awakening).

One key tool for NLP practitioners is the concept of distributional semantics. There is a saying due to Firth that is so often repeated in NLP papers and presentations that even mentioning its ubiquity has become a cliche:

“You shall know a word by the company it keeps.”

The idea is that if we want to know if two words have similar meanings, we should examine the words they are used in conjunction with, and in some way measure how much overlap there is. While direct ancestry of this concept can be traced at least back to Wittgenstein, and the idea of characterizing an object by its relationship with other objects is one category theorists are already fond of, distributional semantics is distinguished by its essentially statistical methods. The variations are endless and complex, but in the cases relevant to our discussion, one starts with a corpus and a suitable way of determining what the context of a word is (simply being nearby, having a grammatical relationship, being in the same corpus at all, etc.) and ends up with a vector space in which the words in the corpus each specify a point. The distances between vectors (for an appropriate definition of distance) then correspond to relationships in meaning, often in surprising ways. The creators of the GloVe algorithm give the example of a vector space in which $king - man + woman = queen$.
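To make the "corpus in, vectors out" recipe concrete, here is a minimal sketch. It is not the GloVe algorithm: it simply counts which words occur within a small window of each other and compares the resulting count vectors by cosine similarity. The toy corpus, the window size, and the function names `context_vectors` and `cosine` are all made up for illustration.

```python
from collections import Counter
from math import sqrt

# Toy corpus, invented for illustration only.
corpus = [
    "the cat chased the mouse",
    "the dog chased the cat",
    "the dog ate the bone",
    "the cat ate the fish",
]

def context_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` positions."""
    vectors = {}
    for sentence in sentences:
        words = sentence.split()
        for i, word in enumerate(words):
            lo, hi = max(0, i - window), min(len(words), i + window + 1)
            context = words[lo:i] + words[i + 1:hi]
            vectors.setdefault(word, Counter()).update(context)
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = sqrt(sum(x * x for x in u.values()))
    norm_v = sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

vectors = context_vectors(corpus)
print(cosine(vectors["cat"], vectors["dog"]))   # words used in similar contexts
print(cosine(vectors["cat"], vectors["bone"]))  # words used in less similar contexts
```

Here "context" means "simply being nearby"; a grammatical notion of context would need a parser, and real systems reweight or factorize the raw counts rather than using them directly.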

There is also a “top down,” relatively syntax oriented analysis of meaning called categorial grammar. Categorial grammar has no accepted formal definition, but the underlying philosophy, called the principle of compositionality, is this: a meaningful sentence is composed of parts, each of which itself has a meaning. To determine the meaning of the sentence as a whole, we may combine the meanings of the constituent parts according to rules which are specified by the syntax of the sentence. Mathematically, this amounts to constructing some algebraic structure which represents grammatical rules. When this algebraic structure is a category, we call it a grammar category.

The Paper


Pregroups are the algebraic structure that this paper uses to model grammar. A pregroup $P$ is a type of partially ordered monoid. Writing $x \to y$ to specify that $x \leq y$ in the order relation, we require the following additional property: for each $p \in P$, there exists a left adjoint $p^l$ and a right adjoint $p^r$, such that $p^l p \to 1 \to p p^l$ and $p p^r \to 1 \to p^r p$. Since pregroups are partial orders, we can regard them as categories. The monoid multiplication and adjoints then upgrade the category of a pregroup to a compact closed category. The equations referenced above are exactly the snake equations.

We can define a pregroup generated by a set $X$ by freely adding adjoints, units and counits to the free monoid on $X$. Our grammar categories will be constructed as follows: take certain symbols, such as $n$ for noun and $s$ for sentence, to be primitive. We call these “word classes.” Generate a pregroup from them. The morphisms in the resulting category represent “grammatical reductions” of strings of word classes, with a particular string being deemed “grammatical” if it reduces to the word class $s$. For example, construct the pregroup $Preg(\{n,s\})$ generated by $n$ and $s$. A transitive verb can be thought of as accepting two nouns, one on the left and one on the right, and returning a sentence. Using the powerful graphical language for compact closed categories, we can represent this as

Using the adjunctions, we can turn the two inputs into outputs to get

Therefore the type of a verb is $n^r s n^l$. Multiplying this on the left and right by $n$ allows us to apply the counits of $n$ to reduce $n \cdot (n^r s n^l) \cdot n$ to the type $s$, as witnessed by
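The same reduction can also be checked mechanically. The sketch below is an illustration, not code from the paper: it encodes a simple type as a pair `(base, z)`, where `z = -1` stands for a left adjoint, `0` for the plain type and `+1` for a right adjoint (an encoding chosen here for convenience), and greedily cancels adjacent pairs of the form $p^l p$ or $p p^r$. A greedy left-to-right pass is an assumption that works for this example; it is not a complete decision procedure for every pregroup grammar.

```python
def reduce_types(types):
    """Greedily cancel adjacent pairs x^(z) x^(z+1) -> 1 using a stack.

    Each simple type is (base, z): z = -1 is a left adjoint, 0 the plain
    type, +1 a right adjoint. Covers the contractions p^l p -> 1 and p p^r -> 1.
    """
    stack = []
    for base, z in types:
        if stack and stack[-1] == (base, z - 1):
            stack.pop()          # contraction: the pair reduces to the unit
        else:
            stack.append((base, z))
    return stack

N = ("n", 0)                                          # noun
TRANSITIVE_VERB = [("n", 1), ("s", 0), ("n", -1)]     # n^r s n^l

sentence = [N] + TRANSITIVE_VERB + [N]                # n (n^r s n^l) n
print(reduce_types(sentence))                         # what is left after cancelling
```

A string counts as grammatical in this sketch when the stack that remains is exactly `[("s", 0)]`.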

Let $(\mathbf{FVect},\otimes, \mathbb{R})$ be the symmetric monoidal category of finite dimensional vector spaces and linear transformations with the standard tensor product. Since any vector space we use in our applications will always come equipped with a basis, these vector spaces are all endowed with an inner product. Note that $\mathbf{FVect}$ has a compact closed structure. The unit is the diagonal

$$\begin{array}{cccc} \eta^l = \eta^r \colon & \mathbb{R} & \to & V \otimes V \\ & 1 & \mapsto & \sum_i \overrightarrow{e_i} \otimes \overrightarrow{e_i} \end{array}$$

and the counit is a linear extension of the inner product

$$\begin{array}{cccc} \epsilon^l = \epsilon^r \colon & V \otimes V & \to & \mathbb{R} \\ & \sum_{i,j} c_{i j} \vec{v_i} \otimes \vec{w_j} & \mapsto & \sum_{i,j} c_{i j} \langle \vec{v_i}, \vec{w_j} \rangle. \end{array}$$
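As a sanity check on these two maps, here is a small sketch that verifies one snake equation numerically, namely that applying $\epsilon$ to the first two factors of $v \otimes \eta(1)$ gives back $v$. It assumes the standard basis and the standard inner product; the function names `eta`, `epsilon` and `snake` are chosen here for illustration.

```python
def eta(dim):
    """eta(1) = sum_i e_i (x) e_i, stored as coefficients c[i][j] of e_i (x) e_j."""
    return [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]

def epsilon(v, w):
    """epsilon(v (x) w) = <v, w>, the standard inner product."""
    return sum(a * b for a, b in zip(v, w))

def snake(v):
    """Apply (epsilon (x) id) to v (x) eta(1); the result should equal v."""
    dim = len(v)
    coeffs = eta(dim)
    basis = [[1.0 if k == i else 0.0 for k in range(dim)] for i in range(dim)]
    result = [0.0] * dim
    for i in range(dim):
        for j in range(dim):
            # Term v (x) e_i (x) e_j: contract v with e_i, keep e_j.
            result[j] += coeffs[i][j] * epsilon(v, basis[i])
    return result

print(snake([2.0, -1.0, 0.5]))
```

The computation amounts to $\sum_i \langle v, e_i \rangle e_i = v$, which is just the expansion of $v$ in an orthonormal basis.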

The Model of Meaning

Let $(P, \cdot)$ be a pregroup. The ingenious idea that the authors of this paper had was to combine categorial grammar with distributional semantics. We can rephrase their construction in more general terms by using a compact closed functor

$$F \colon (P, \cdot) \to (\mathbf{FVect}, \otimes, \mathbb{R}).$$

Unpacking this a bit, we assign each word class a vector space whose basis is a chosen finite set of context words. To each type reduction in $P$, we assign a linear transformation. Because $F$ is strictly monoidal, a string of word classes $p_1 p_2 \cdots p_n$ maps to a tensor product of vector spaces $V_1 \otimes V_2 \otimes \cdots \otimes V_n$.

To compute the meaning of a string of words you must:

  1. Assign to each word a string of symbols $p_1 p_2 \cdots p_n$ according to the grammatical types of the word and your choice of pregroup formalism. This is nontrivial. For example, many nouns can also be used as adjectives.

  2. Compute the correlations between each word in your string and the context words of the chosen vector space (see the example below) to get a vector $v_1 \otimes \cdots \otimes v_n \in V_1 \otimes \cdots \otimes V_n$.

  3. Choose a type reduction $f \colon p_1 p_2 \cdots p_n \to q_1 q_2 \cdots q_m$ in your grammar category (there may not always be a unique type reduction).

  4. Apply $F(f)$ to your vector $v_1 \otimes \cdots \otimes v_n$.

  5. You now have a vector in whatever space you reduced to. This is the “meaning” of the string of words, according to your model.

This sweeps some things under the rug, because A. Preller proved that strict monoidal functors from a pregroup to $\mathbf{FVect}$ actually force the relevant spaces to have dimension at most one. So for each word type, the best we can do is one context word. This is bad news, but the good news is that this problem disappears when more complicated grammar categories are used. In Lambek vs. Lambek, monoidal bi-closed categories are used, which allow for this functorial description. So even though we are not really dealing with a functor when the domain is a pregroup, it is a functor in spirit and thinking of it this way will allow for generalization into more complicated models.

An Example

As before, we use the pregroup $Preg(\{n,s\})$. The nouns that we are interested in are

$$\{ Maria, John, Cynthia \}$$

These nouns form the basis vectors of our noun space. In the order they are listed, they can be represented as

$$\begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}.$$

The “sentence space” $F(s)$ is taken to be a one dimensional space in which $0$ corresponds to false and the basis vector $1_S$ corresponds to true. As before, transitive verbs have type $n^r s n^l$, so using our functor $F$, verbs will live in the vector space $N \otimes S \otimes N$. In particular, the verb “like” can be expressed uniquely as a linear combination of its basis elements. With knowledge of who likes whom, we can encode this information into a matrix where the $ij$-th entry corresponds to the coefficient in front of $v_i \otimes 1_S \otimes v_j$. Specifically, we have

$$\begin{bmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}.$$

The $ij$-th entry is $1$ if person $i$ likes person $j$ and $0$ otherwise. To compute the meaning of the sentence “Maria likes Cynthia”, you compute the matrix product

$$\begin{bmatrix} 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix} \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix} = 1$$

This means that the sentence “Maria likes Cynthia” is true.
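The computation is small enough to reproduce in a few lines. The sketch below uses the noun ordering and the “likes” matrix from the example; the helper names `one_hot` and `meaning` are invented here. It contracts the subject and object vectors against the verb, which is the matrix product written above.

```python
NOUNS = ["Maria", "John", "Cynthia"]

# LIKES[i][j] == 1 if person i likes person j (the matrix from the example).
LIKES = [
    [1, 0, 1],
    [1, 1, 0],
    [1, 0, 1],
]

def one_hot(name):
    """Basis vector of the noun space for the given name."""
    return [1 if n == name else 0 for n in NOUNS]

def meaning(subject, verb, obj):
    """Contract subject (x) verb (x) object down to the one-dimensional sentence space."""
    s, o = one_hot(subject), one_hot(obj)
    return sum(s[i] * verb[i][j] * o[j]
               for i in range(len(s)) for j in range(len(o)))

print(meaning("Maria", LIKES, "Cynthia"))  # 1, i.e. true, as in the text
print(meaning("John", LIKES, "Cynthia"))   # 0, i.e. false for this matrix
```

Because the nouns are basis vectors, the contraction just reads off a single matrix entry; with non-basis noun vectors the same function would return a weighted sum of entries.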

Food for Thought

As we said above, this model does not always give a unique meaning to a string of words, because at various points there are choices that need to be made. For example, the phrase “squad helps dog bite victim” has a different meaning depending on whether you take “bite” to be a verb or a noun. Also, if you reduce “dog bite victim” before applying it to the verb, you will get a different meaning than if you reduce “squad helps dog” and apply it to the verb “bite”. On the one hand, this is a good thing because those sentences should have different meanings. On the other hand, the presence of choices makes it harder to use this model in a practical algorithm.

Some questions arose which we did not have a clear way to address. Tensor products of spaces of high dimension quickly achieve staggering dimensionality. Can this be addressed? How would one actually fit empirical data into this model? The “likes” example, which required us to know exactly who likes whom, illustrates the potentially inaccessible information that seems to be necessary to assign vectors to words in a way compatible with the formalism. Admittedly, this is a necessary consequence of the fact that the evaluation is of the truth or falsity of the statement, but the issue also arises in general cases. Can this be resolved? In the paper, the authors are concerned with determining the meaning of grammatical sentences (although we can just as easily use non-grammatical strings of words), so that the computed meaning is always a vector in the sentence space $F(s)$. What are the useful choices of structure for the sentence space?
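To put a number on the dimensionality concern, here is a back-of-the-envelope sketch. The dimensions are assumptions picked for illustration (300 is a common size for word-embedding spaces, not a figure from the paper); the only fact used is that the dimension of a tensor product is the product of the dimensions.

```python
from math import prod

def tensor_dim(*dims):
    """Dimension of a tensor product of spaces with the given dimensions."""
    return prod(dims)

noun_dim = 300      # assumed size of the noun space N
sentence_dim = 300  # assumed size of the sentence space S

# A transitive verb has type n^r s n^l, so it lives in N (x) S (x) N.
verb_dim = tensor_dim(noun_dim, sentence_dim, noun_dim)
print(verb_dim)  # number of coefficients needed for a single transitive verb
```

Under these assumptions a single transitive verb already needs 27 million coefficients, which is why the choice of sentence space matters so much in practice.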

This paper was not without precedent: suggestions and models related to its concepts had been floating around beforehand, and could be helpful in understanding the development of the central ideas. For example, Aerts and Gabora proposed elaborating on vector space models of meaning, incidentally using tensors as part of an elaborate quantum mechanical framework. Notably, they claimed their formalism solved the “pet fish” problem: English speakers rate goldfish as very poor representatives of fish as a whole, and of pets as a whole, but consider goldfish to be excellent representatives of “pet fish.” Existing descriptions of meaning in compositional terms struggled with this. In The Harmonic Mind, first published in 2005, Smolensky and Legendre argued for the use of tensor products in marrying linear algebra and formal grammar models of meaning. Mathematical Foundations for a Compositional Distributional Model of Meaning represents a crystallization of all this into a novel and exciting construction, which continues to be widely cited and discussed.

We would like to thank Martha Lewis, Brendan Fong, Nina Otter, and the other participants in the seminar.

by john at February 10, 2018 12:01 AM

February 08, 2018

Sean Carroll - Preposterous Universe

Why Is There Something, Rather Than Nothing?

A good question!

Or is it?

I’ve talked before about the issue of why the universe exists at all (1, 2), but now I’ve had the opportunity to do a relatively careful job with it, courtesy of Eleanor Knox and Alastair Wilson. They are editing an upcoming volume, the Routledge Companion to the Philosophy of Physics, and asked me to contribute a chapter on this topic. Final edits aren’t done yet, but I’ve decided to put the draft on the arxiv:

Why Is There Something, Rather Than Nothing?
Sean M. Carroll

It seems natural to ask why the universe exists at all. Modern physics suggests that the universe can exist all by itself as a self-contained system, without anything external to create or sustain it. But there might not be an absolute answer to why it exists. I argue that any attempt to account for the existence of something rather than nothing must ultimately bottom out in a set of brute facts; the universe simply is, without ultimate cause or explanation.

As you can see, my basic tack hasn’t changed: this kind of question might be the kind of thing that doesn’t have a sensible answer. In our everyday lives, it makes sense to ask “why” this or that event occurs, but such questions have answers only because they are embedded in a larger explanatory context. In particular, because the world of our everyday experience is an emergent approximation with an extremely strong arrow of time, such that we can safely associate “causes” with subsequent “effects.” The universe, considered as all of reality (i.e. let’s include the multiverse, if any), isn’t like that. The right question to ask isn’t “Why did this happen?”, but “Could this have happened in accordance with the laws of physics?” As far as the universe and our current knowledge of the laws of physics are concerned, the answer is a resounding “Yes.” The demand for something more — a reason why the universe exists at all — is a relic piece of metaphysical baggage we would be better off to discard.

This perspective gets pushback from two different sides. On the one hand we have theists, who believe that they can answer why the universe exists, and the answer is God. As we all know, this raises the question of why God exists; but aha, say the theists, that’s different, because God necessarily exists, unlike the universe which could plausibly have not. The problem with that is that nothing exists necessarily, so the move is pretty obviously a cheat. I didn’t have a lot of room in the paper to discuss this in detail (in what after all was meant as a contribution to a volume on the philosophy of physics, not the philosophy of religion), but the basic idea is there. Whether or not you want to invoke God, you will be left with certain features of reality that have to be explained by “and that’s just the way it is.” (Theism could possibly offer a better account of the nature of reality than naturalism — that’s a different question — but it doesn’t let you wiggle out of positing some brute facts about what exists.)

The other side are those scientists who think that modern physics explains why the universe exists. It doesn’t! One purported answer — “because Nothing is unstable” — was never even supposed to explain why the universe exists; it was suggested by Frank Wilczek as a way of explaining why there is more matter than antimatter. But any such line of reasoning has to start by assuming a certain set of laws of physics in the first place. Why is there even a universe that obeys those laws? This, I argue, is not a question to which science is ever going to provide a snappy and convincing answer. The right response is “that’s just the way things are.” It’s up to us as a species to cultivate the intellectual maturity to accept that some questions don’t have the kinds of answers that are designed to make us feel satisfied.

by Sean Carroll at February 08, 2018 05:19 PM

February 07, 2018

Tommaso Dorigo - Scientificblogging

Roberto Carlin To Lead CMS Experiment In 2019-20
Great news for the CMS experiment - and for Italy, and for my institution, Padova, where I coordinate accelerator-based physics research for INFN. Professor Roberto Carlin, a longtime member of the CMS experiment, where he has taken many important roles in the construction and operations of the experiment, and recently was deputy spokesperson, has now been elected spokesperson. This consolidates a "rule" which sees Italian physicists at the lead of the experiment every other term, after Tonelli (2010-12) and Camporesi (2014-16). 


by Tommaso Dorigo at February 07, 2018 09:43 PM

Axel Maas - Looking Inside the Standard Model

How large is an elementary particle?
Recently, in the context of a master's thesis, our group has begun to determine the size of the W boson. The natural questions about this project are: Why do you do that? Do we not know it already? And do elementary particles have a size at all?

It is best to answer these questions in reverse order.

So, do elementary particles have a size at all? Well, elementary particles are called elementary as they are the most basic constituents. In our theories today, they start out as pointlike. Only particles made from other particles, so-called bound states like a nucleus or a hadron, have a size. And now comes the but.

First of all, we do not yet know whether our elementary particles are really elementary. They may also be bound states of even more elementary particles. But in experiments we can only determine upper bounds to the size. Making better experiments will reduce this upper bound. Eventually, we may see that a particle previously thought of as point-like has a size. This has happened quite frequently over time. It always opened up a new level of elementary particle theories. Therefore measuring the size is important. But for us, as theoreticians, this type of question is only important if we have an idea about what could be the more elementary particles. And while some of our research is going into this direction, this project is not.

The other issue is that quantum effects give all elementary particles an 'apparent' size. This comes about by how we measure the size of a particle. We do this by shooting some other particle at it, and measure how strongly it becomes deflected. A truly pointlike particle has a very characteristic reflection profile. But quantum effects allow for additional particles to be created and destroyed in the vicinity of any particle. Especially, they allow for the existence of another particle of the same type, at least briefly. We cannot distinguish whether we hit the original particle or one of these. Since they are not at the same place as the original particle, their average distance looks like a size. This gives even a pointlike particle an apparent size, which we can measure. In this sense even an elementary particle has a size.

So, how can we then distinguish this size from an actual size of a bound state? We can do this by calculations. We determine the apparent size due to the quantum fluctuations and compare it to the measurement. Deviations indicate an actual size. This is because for a real bound state we can scatter somewhere in its structure, and not only in its core. This difference looks pictorially like this:

So, do we know the size already? Well, as said, we can only determine upper limits. Searching for them is difficult, and often goes via detours. One such detour is via so-called anomalous couplings. Measuring how they depend on energy provides indirect information on the size. There is an active program at CERN underway to do this experimentally. The results so far say that the size of the W is below 0.0000000000000001 meter. This seems tiny, but in the world of particle physics this is not that strong a limit.
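One way to see why this is not a strong limit is to convert the length into an energy scale and compare it with the W itself. The following is a rough sketch, not part of the original analysis: it uses the standard conversion constant ħc ≈ 197.327 MeV·fm and takes a W mass of about 80.4 GeV as an assumed input.

```python
HBAR_C_MEV_FM = 197.327   # hbar * c in MeV * femtometre (rounded)
W_MASS_GEV = 80.4         # approximate W boson mass, assumed input

def length_to_energy_gev(length_m):
    """Energy scale (GeV) corresponding to a length, via E = hbar*c / length."""
    length_fm = length_m / 1e-15
    return HBAR_C_MEV_FM / length_fm / 1000.0

def reduced_compton_wavelength_m(mass_gev):
    """Reduced Compton wavelength hbar*c / (m c^2), returned in metres."""
    return HBAR_C_MEV_FM / (mass_gev * 1000.0) * 1e-15

size_bound_m = 1e-16   # the upper limit quoted in the text
energy_scale = length_to_energy_gev(size_bound_m)
ratio = size_bound_m / reduced_compton_wavelength_m(W_MASS_GEV)

print(f"energy scale probed by the bound: {energy_scale:.2f} GeV")
print(f"bound / reduced Compton wavelength of the W: {ratio:.0f}")
```

Under these assumptions the bound corresponds to an energy scale of roughly 2 GeV, and it is about forty times larger than the natural quantum length scale of a particle as heavy as the W.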

And now the interesting question: Why do we do this? As written, we do not want to make the W a bound state of something new. But one of our main research topics is driven by an interesting theoretical structure. If the standard model is taken seriously, the particle which we observe in an experiment and call the W is actually not the W of the underlying theory. Rather, it is a bound state, which is very, very similar to the elementary particle, but actually built from the elementary particles. The difference has been so small that identifying one with the other was a very good approximation up to today. But with better and better experiments this may change. Thus, we need to test this.

Because the thing we measure is then a bound state, it should have a (probably tiny) size. This would be a hallmark of this theoretical structure, and a sign that we have understood it. If the size is such that it could actually be measured at CERN, then this would be an important test of our theoretical understanding of the standard model.

However, this is not a simple quantity to calculate. Bound states are intrinsically complicated. Thus, we use simulations for this purpose. In fact, we actually go over the same detour as the experiments, and will determine an anomalous coupling. From this we then infer the size indirectly. In addition, the need to perform efficient simulations forces us to simplify the problem substantially. Hence, we will not get the perfect number. But we may get the order of magnitude, or be perhaps within a factor of two, or so. And this is all we need to currently say whether a measurement is possible, or whether this will have to wait for the next generation of experiments. And thus whether we will know whether we understood the theory within a few years or within a few decades.

by Axel Maas at February 07, 2018 11:18 AM

Lubos Motl - string vacua and pheno

Let's build a 500 TeV collider under the sea
In his text Unreasonably Big Physics, Tetragraviton classifies the Texan SSC collider as marginally reasonable but other proposed projects are said to be unreasonable.

They include a wonderful 2017 collider proposal in the Gulf of Mexico. The structure would host some new, potentially clever 4-tesla dipoles and would be located 100 meters below sea level between Houston and Merida.

The collision energy would be an intriguing \(2\times 250\TeV=500\TeV\), almost 40 times higher than the current LHC beam, and the luminosity would trump the LHC by orders of magnitude, too. The depth is high enough not to annoy fish and to protect the tunnel against the hurricanes and the radius-of-300-kilometers ring would be far enough from beaches not to interfere with shipping. Quite generally, I think that the potentially brilliant idea that sea colliders could be more practical than the underground colliders should be honestly considered.
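As a rough consistency check on these numbers, the textbook relation between momentum, dipole field and bending radius for a singly charged particle, p [GeV/c] ≈ 0.2998 · B [T] · ρ [m], can be applied to the quoted 4-tesla dipoles and 250 TeV beams. This is only a sketch: the 13 TeV figure for the LHC collision energy is an assumed input for the comparison, and the "filling fraction" is simply the ratio of the required bending radius to the quoted 300 km ring radius.

```python
def bending_radius_km(beam_energy_tev, dipole_field_t):
    """Bending radius needed to hold an ultra-relativistic proton beam.

    Uses p [GeV/c] ~= 0.2998 * B [T] * rho [m] for a singly charged particle.
    """
    momentum_gev = beam_energy_tev * 1000.0
    rho_m = momentum_gev / (0.2998 * dipole_field_t)
    return rho_m / 1000.0

BEAM_ENERGY_TEV = 250.0      # per beam, from the text
DIPOLE_FIELD_T = 4.0         # from the text
RING_RADIUS_KM = 300.0       # from the text
LHC_COLLISION_TEV = 13.0     # assumed LHC collision energy for comparison

rho_km = bending_radius_km(BEAM_ENERGY_TEV, DIPOLE_FIELD_T)
filling_fraction = rho_km / RING_RADIUS_KM
energy_ratio = 2 * BEAM_ENERGY_TEV / LHC_COLLISION_TEV

print(f"required bending radius: {rho_km:.0f} km")
print(f"fraction of the ring that must be dipoles: {filling_fraction:.2f}")
print(f"collision energy relative to the LHC: {energy_ratio:.1f}x")
```

Under these assumptions the required bending radius comes out a little over 200 km, so a 300 km ring leaves room for straight sections and other magnets, and the energy gain is just under a factor of 40.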

The cost is supposed to be comparable to the planned Chinese or European colliders – which means it's supposed to be very cheap. The adjective "cheap" is mine and unavoidably involves some subjective judgement. But I simply think that if someone finds a collider of this energy and this price "expensive", then he dislikes particle physics and it's bad if Tetragraviton belongs to that set.

Tetragraviton also mentioned a 15-year-old Japanese proposal for a very strong neutrino beam that would cost hundreds of billions of dollars and that would prematurely detonate the nuclear bombs across the world. It's handy. ;-)

I don't know exactly why you would want to do that but I know why we want a \(500\TeV\) collider. Every child knows why we want a \(500\TeV\) collider (or a plastic pony for Missy).

Well, I completely disagree with Tetragraviton that the Gulf of Mexico collider is unreasonable or impossible. If the calculations are right, it's actually a proposal you can't refuse. For the funds that only exceed the cost of the LHC by a small factor, we could increase the energy by a factor of 40. Isn't it wonderful?

He's not terribly specific about the arguments for his criticism but in between the lines, it seems that he finds tens of billions of dollars to be too much. Those amounts may be higher than his wealth but he's not supposed to pay for the whole thing. The world's GDP approaches $100 trillion a year. It's around $250 billion a day – including weekends – or $10 billion per hour.

Every hour, the world produces wealth equal to the cost of the LHC collider, so the Gulf of Mexico collider could be equivalent to just a few hours of mankind's productive activity. Of course, some people may claim that it's arrogant to assume that the whole of mankind contributes to something as esoteric as particle physics.

First of all, it's not arrogant – on the contrary, it's arrogant for someone to suggest that a human being could ignore particle physics. Take into account that the extraterrestrials are watching us: Wouldn't you be terribly ashamed of the human race if it acted as a bunch of stinky pigs who won't dedicate even a few hours of their work to such groundbreaking projects of global importance? Second of all, even if you compare the tens of billions of dollars to the funding for science only, it's small. Science may be getting roughly 1% of the global GDP which is one trillion dollars per year (globally). So such a unique project could still be equivalent just to weeks of the global spending on science.
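The arithmetic behind these comparisons is easy to reproduce. In the sketch below the $100 trillion GDP and the 1% science share are the round figures from the text, while the $30 billion project cost is a hypothetical stand-in for "tens of billions".

```python
WORLD_GDP_PER_YEAR = 100e12   # dollars, round figure from the text
SCIENCE_SHARE = 0.01          # roughly 1% of GDP, from the text
PROJECT_COST = 30e9           # hypothetical "tens of billions" price tag

def gdp_per_day(gdp_per_year):
    """World output per day, weekends included."""
    return gdp_per_year / 365

def gdp_per_hour(gdp_per_year):
    """World output per hour."""
    return gdp_per_day(gdp_per_year) / 24

def hours_of_world_output(cost, gdp_per_year):
    """How many hours of global output a project costs."""
    return cost / gdp_per_hour(gdp_per_year)

def weeks_of_science_budget(cost, gdp_per_year, share):
    """How many weeks of global science spending a project costs."""
    weekly_science = gdp_per_year * share / 52
    return cost / weekly_science

print(f"per day:  ${gdp_per_day(WORLD_GDP_PER_YEAR) / 1e9:.0f} billion")
print(f"per hour: ${gdp_per_hour(WORLD_GDP_PER_YEAR) / 1e9:.1f} billion")
print(f"project:  {hours_of_world_output(PROJECT_COST, WORLD_GDP_PER_YEAR):.1f} hours of world output")
print(f"project:  {weeks_of_science_budget(PROJECT_COST, WORLD_GDP_PER_YEAR, SCIENCE_SHARE):.1f} weeks of science spending")
```

With these inputs the daily and hourly figures come out slightly above the rounded $250 billion and $10 billion quoted above, and a $30 billion project is under three hours of world output or under two weeks of global science spending.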

It's totally counterproductive for Tetragraviton to spread his small-aß sentiments indicating that science shouldn't deserve tens-of-billions-of-dollars scientific projects. Mankind is getting richer, the rich enough countries can surely feed everybody and the poor countries may join as well, and there will be an increasing pile of excess cash (and workers who want some well-defined job).

It's natural for creative people and especially dreamers to have increasingly demanding visions and unless we screw something up big time, it should be increasingly easy to make these dreams come true. On top of that, every investment should compare costs and benefits – their differences and ratios. If a collider project increases the center-of-mass energy much more significantly than costs, then it simply deserves the particle physicists', engineers', and sponsors' attention.

Pure science will probably not get above $100 billion projects soon. But if you had some big project that would be somewhat scientific but also apparently very useful for lots of people or nations, I do believe that even multi-trillion projects should be possible.

The whole Apollo Program (whose outcome were all the men on the Moon) cost $25 billion of 1973 dollars which is translated to $110 billion of 2018 dollars. NASA's spending as a percentage of the U.S. Fed government's expenses peaked in 1966, under Lyndon Johnson's watch, when it was 4.41% or $6 (old big) billion. That one-year spending for one "applied scientific" institution already trumps the cost of the LHC when you convert it to current dollars.

Lunar missions have become boring for the taxpayers but other things may get hot again. Maybe there are great reasons to drill a hole through the Earth, build a tunnel around the Earth's circumference, or bring the ocean to the middle of Sahara, among hundreds of similar things I could generate effectively. Tetragraviton represents a textbook example of what Czechs call a near-wall-šitter (přizdisráč), a frightened man without self-confidence and ambitions. The Academia is full of this attitude, especially if you look at some typical bureaucrats in the scientific environment (who got to their chair mostly for their invisibility). But that's not the right attitude for those who should make similar big decisions. That's not the attitude of the men who change the world. That's not the men whom I really admire.

by Luboš Motl at February 07, 2018 05:52 AM

John Baez - Azimuth

A Categorical Semantics for Causal Structure


The school for Applied Category Theory 2018 is up and running! Students are blogging about papers! The first blog article is about a diagrammatic method for studying causality:

• Joseph Moeller and Dmitry Vagner, A categorical semantics for causal structure, The n-Category Café, 22 January 2018.

Make sure to read the whole blog conversation, since it helps a lot. People were confused about some things at first.

Joseph Moeller is a grad student at U.C. Riverside working with me and a company called Metron Scientific Solutions on “network models”—a framework for designing networks, which the Coast Guard is already interested in using for their search and rescue missions:

• John C. Baez, John Foley, Joseph Moeller and Blake S. Pollard, Network Models. (Blog article here.)

Dmitry Vagner is a grad student at Duke who is very enthusiastic about category theory. Dmitry has worked with David Spivak and Eugene Lerman on open dynamical systems and the operad of wiring diagrams.

It’s great to see these students digging into the wonderful world of category theory and its applications.

To discuss this stuff, please go to The n-Category Café.

by John Baez at February 07, 2018 05:29 AM

February 05, 2018

Matt Strassler - Of Particular Significance

In Memory of Joe Polchinski, the Brane Master

This week, the community of high-energy physicists — of those of us fascinated by particles, fields, strings, black holes, and the universe at large — is mourning the loss of one of the great theoretical physicists of our time, Joe Polchinski. It pains me deeply to write these words.

Everyone who knew him personally will miss his special qualities — his boyish grin, his slightly wicked sense of humor, his charming way of stopping mid-sentence to think deeply, his athleticism and friendly competitiveness. Everyone who knew his research will feel the absence of his particular form of genius, his exceptional insight, his unique combination of abilities, which I’ll try to sketch for you below. Those of us who were lucky enough to know him both personally and scientifically — well, we lose twice.


Polchinski — Joe, to all his colleagues — had one of those brains that works magic, and works magically. Scientific minds are as individual as personalities. Each physicist has a unique combination of talents and skills (and weaknesses); in modern lingo, each of us has a superpower or two. Rarely do you find two scientists who have the same ones.

Joe had several superpowers, and they were really strong. He had a tremendous knack for looking at old problems and seeing them in a new light, often overturning conventional wisdom or restating that wisdom in a new, clearer way. And he had prodigious technical ability, which allowed him to follow difficult calculations all the way to the end, on paths that would have deterred most of us.

One of the greatest privileges of my life was to work with Joe, not once but four times. I think I can best tell you a little about him, and about some of his greatest achievements, through the lens of that unforgettable experience.

[To my colleagues: this post was obviously written in trying circumstances, and it is certainly possible that my memory of distant events is foggy and in error.  I welcome any corrections that you might wish to suggest.]

Our papers between 1999 and 2006 were a sequence of sorts, aimed at understanding more fully the profound connection between quantum field theory — the language of particle physics — and string theory — best-known today as a candidate for a quantum theory of gravity. In each of those papers, as in many thousands of others written after 1995, Joe’s most influential contribution to physics played a central role. This was the discovery of objects known as “D-branes”, which he found in the context of string theory. (The term is a generalization of the word `membrane’.)

I can already hear the Lee Smolins and Peter Woits of the world screaming at me. ‘A discovery in string theory,’ some will shout, pounding the table, ‘an untested theory that’s not even wrong, should not be called a discovery in physics.’ Pay them no mind; they’re not even close, as you’ll see by the end of my remarks.

The Great D-scovery

In 1989, Joe, working with two young scientists, Jin Dai and Rob Leigh, was exploring some details of string theory, and carrying out a little mathematical exercise. Normally, in string theory, strings are little lines or loops that are free to move around anywhere they like, much like particles moving around in this room. But in some cases, particles aren’t in fact free to move around; you could, for instance, study particles that are trapped on the surface of a liquid, or trapped in a very thin whisker of metal. With strings, there can be a new type of trapping that particles can’t have — you could perhaps trap one end, or both ends, of the string within a surface, while allowing the middle of the string to move freely. The place where a string’s end may be trapped — whether a point, a line, a surface, or something more exotic in higher dimensions — is what we now call a “D-brane”.  [The `D’ arises for uninteresting technical reasons.]

Joe and his co-workers hit the jackpot, but they didn’t realize it yet. What they discovered, in retrospect, was that D-branes are an automatic feature of string theory. They’re not optional; you can’t choose to study string theories that don’t have them. And they aren’t just surfaces or lines that sit still. They’re physical objects that can roam the world. They have mass and create gravitational effects. They can move around and scatter off each other. They’re just as real, and just as important, as the strings themselves!


Fig. 1: D branes (in green) are physical objects on which a fundamental string (in red) can terminate.

It was as though Joe and his collaborators started off trying to understand why the chicken crossed the road, and ended up discovering the existence of bicycles, cars, trucks, buses, and jet aircraft.  It was that unexpected, and that rich.

And yet, nobody, not even Joe and his colleagues, quite realized what they’d done. Rob Leigh, Joe’s co-author, had the office next to mine for a couple of years, and we wrote five papers together between 1993 and 1995. Yet I think Rob mentioned his work on D-branes to me just once or twice, in passing, and never explained it to me in detail. Their paper had less than twenty citations as 1995 began.

In 1995 the understanding of string theory took a huge leap forward. That was the moment when it was realized that all five known types of string theory are different sides of the same die — that there’s really only one string theory.  A flood of papers appeared in which certain black holes, and generalizations of black holes — black strings, black surfaces, and the like — played a central role. The relations among these were fascinating, but often confusing.

And then, on October 5, 1995, a paper appeared that changed the whole discussion, forever. It was Joe, explaining D-branes to those of us who’d barely heard of his earlier work, and showing that many of these black holes, black strings and black surfaces were actually D-branes in disguise. His paper made everything clearer, simpler, and easier to calculate; it was an immediate hit. By the beginning of 1996 it had 50 citations; twelve months later, the citation count was approaching 300.

So what? Great for string theorists, but without any connection to experiment and the real world.  What good is it to the rest of us? Patience. I’m just getting to that.

What’s it Got to Do With Nature?

Our current understanding of the make-up and workings of the universe is in terms of particles. Material objects are made from atoms, themselves made from electrons orbiting a nucleus; and the nucleus is made from neutrons and protons. We learned in the 1970s that protons and neutrons are themselves made from particles called quarks and antiquarks and gluons — specifically, from a “sea” of gluons and a few quark/anti-quark pairs, within which sit three additional quarks with no anti-quark partner… often called the `valence quarks’.  We call protons and neutrons, and all other particles with three valence quarks, `baryons’.  (Note that there are no particles with just one valence quark, or two, or four — all you get is baryons, with three.)

In the 1950s and 1960s, physicists discovered short-lived particles much like protons and neutrons, with a similar sea, but which  contain one valence quark and one valence anti-quark. Particles of this type are referred to as “mesons”.  I’ve sketched a typical meson and a typical baryon in Figure 2.  (The simplest meson is called a “pion”; it’s the most common particle produced in the proton-proton collisions at the Large Hadron Collider.)



Fig. 2: Baryons (such as protons and neutrons) and mesons each contain a sea of gluons and quark-antiquark pairs; baryons have three unpaired “valence” quarks, while mesons have a valence quark and a valence anti-quark.  (What determines whether a quark is valence or sea involves subtle quantum effects, not discussed here.)

But the quark/gluon picture of mesons and baryons, back in the late 1960s, was just an idea, and it was in competition with a proposal that mesons are little strings. These are not, I hasten to add, the “theory of everything” strings that you learn about in Brian Greene’s books, which are a billion billion times smaller than a proton. In a “theory of everything” string theory, often all the types of particles of nature, including electrons, photons and Higgs bosons, are tiny tiny strings. What I’m talking about is a “theory of mesons” string theory, a much less ambitious idea, in which only the mesons are strings.  They’re much larger: just about as long as a proton is wide. That’s small by human standards, but immense compared to theory-of-everything strings.

Why did people think mesons were strings? Because there was experimental evidence for it! (Here’s another example.)  And that evidence didn’t go away after quarks were discovered. Instead, theoretical physicists gradually understood why quarks and gluons might produce mesons that behave a bit like strings. If you spin a meson fast enough (and this can happen by accident in experiments), its valence quark and anti-quark may separate, and the sea of objects between them forms what is called a “flux tube.” See Figure 3. [In certain superconductors, somewhat similar flux tubes can trap magnetic fields.] It’s kind of a thick string rather than a thin one, but still, it shares enough properties with a string in string theory that it can produce experimental results that are similar to string theory’s predictions.


Fig. 3: One reason mesons behave like strings in experiment is that a spinning meson acts like a thick string, with the valence quark and anti-quark at the two ends.

And so, from the mid-1970s onward, people were confident that quantum field theories like the one that describes quarks and gluons can create objects with stringy behavior. A number of physicists — including some of the most famous and respected ones — made a bolder, more ambitious claim: that quantum field theory and string theory are profoundly related, in some fundamental way. But they weren’t able to be precise about it; they had strong evidence, but it wasn’t ever entirely clear or convincing.

In particular, there was an important unresolved puzzle. If mesons are strings, then what are baryons? What are protons and neutrons, with their three valence quarks? What do they look like if you spin them quickly? The sketches people drew looked something like Figure 4. A baryon would perhaps become three joined flux tubes (with one possibly much longer than the other two), each with its own valence quark at the end.  In a stringy cartoon, that baryon would be three strings, each with a free end, with the strings attached to some sort of junction. This junction of three strings was called a “baryon vertex.”  If mesons are little strings, the fundamental objects in a string theory, what is the baryon vertex from the string theory point of view?!  Where is it hiding — what is it made of — in the mathematics of string theory?


Fig. 4: A fast-spinning baryon looks vaguely like the letter Y — three valence quarks connected by flux tubes to a “baryon vertex”.  A cartoon of how this would appear from a stringy viewpoint, analogous to Fig. 3, leads to a mystery: what, in string theory, is this vertex?!

[Experts: Notice that the vertex has nothing to do with the quarks. It’s a property of the sea — specifically, of the gluons. Thus, in a world with only gluons — a world whose strings naively form loops without ends — it must still be possible, with sufficient energy, to create a vertex-antivertex pair. Thus field theory predicts that these vertices must exist in closed string theories, though they are linearly confined.]


The baryon puzzle: what is a baryon from the string theory viewpoint?

No one knew. But isn’t it interesting that the most prominent feature of this vertex is that it is a location where a string’s end can be trapped?

Everything changed in the period 1997-2000. Following insights from many other physicists, and using D-branes as the essential tool, Juan Maldacena finally made the connection between quantum field theory and string theory precise. He was able to relate strings with gravity and extra dimensions, which you can read about in Brian Greene’s books, with the physics of particles in just three spatial dimensions, similar to those of the real world, with only non-gravitational forces.  It was soon clear that the most ambitious and radical thinking of the ’70s was correct — that almost every quantum field theory, with its particles and forces, can alternatively be viewed as a string theory. It’s a bit analogous to the way that a painting can be described in English or in Japanese — fields/particles and strings/gravity are, in this context, two very different languages for talking about exactly the same thing.

The saga of the baryon vertex took a turn in May 1998, when Ed Witten showed how a similar vertex appears in Maldacena’s examples. [Note added: I had forgotten that two days after Witten’s paper, David Gross and Hirosi Ooguri submitted a beautiful, wide-ranging paper, whose section on baryons contains many of the same ideas.] Not surprisingly, this vertex was a D-brane — specifically a D-particle, an object on which the strings extending from freely-moving quarks could end. It wasn’t yet quite satisfactory, because the gluons and quarks in Maldacena’s examples roam free and don’t form mesons or baryons. Correspondingly the baryon vertex isn’t really a physical object; if you make one, it quickly diffuses away into nothing. Nevertheless, Witten’s paper made it obvious what was going on. To the extent real-world mesons can be viewed as strings, real-world protons and neutrons can be viewed as strings attached to a D-brane.


The baryon puzzle, resolved.  A baryon is made from three strings and a point-like D-brane. [Note there is yet another viewpoint in which a baryon is something known as a skyrmion, a soliton made from meson fields — but that is an issue for another day.]

It didn’t take long for more realistic examples, with actual baryons, to be found by theorists. I don’t remember who found one first, but I do know that one of the earliest examples showed up in my first paper with Joe, in the year 2000.


Working with Joe

That project arose during my September 1999 visit to the KITP (Kavli Institute for Theoretical Physics) in Santa Barbara, where Joe was a faculty member. Some time before that I happened to have studied a field theory (called N=1*) that differed from Maldacena’s examples only slightly, but in which meson-like objects do form. One of the first talks I heard when I arrived at KITP was by Rob Myers, about a weird property of D-branes that he’d discovered. During that talk I made a connection between Myers’ observation and a feature of the N=1* field theory, and I had one of those “aha” moments that physicists live for. I suddenly knew what the string theory that describes the N=1*  field theory must look like.

But for me, the answer was bad news. To work out the details was clearly going to require a very difficult set of calculations, using aspects of string theory about which I knew almost nothing [non-holomorphic curved branes in high-dimensional curved geometry.] The best I could hope to do, if I worked alone, would be to write a conceptual paper with lots of pictures, and far more conjectures than demonstrable facts.

But I was at KITP.  Joe and I had had a good personal rapport for some years, and I knew that we found similar questions exciting. And Joe was the brane-master; he knew everything about D-branes. So I decided my best hope was to persuade Joe to join me. I engaged in a bit of persistent cajoling. Very fortunately for me, it paid off.

I went back to the east coast, and Joe and I went to work. Every week or two Joe would email some research notes with some preliminary calculations in string theory. They had such a high level of technical sophistication, and so few pedagogical details, that I felt like a child; I could barely understand anything he was doing. We made slow progress. Joe did an important warm-up calculation, but I found it really hard to follow. If the warm-up string theory calculation was so complex, had we any hope of solving the full problem?  Even Joe was a little concerned.

And then one day, I received a message that resounded with a triumphant cackle, a sort of “we got ’em!” that anyone who knew Joe will recognize. Through a spectacular trick, he’d figured out how to use his warm-up example to make the full problem easy! Instead of months of work ahead of us, we were essentially done.

From then on, it was great fun! Almost every week had the same pattern. I’d be thinking about a quantum field theory phenomenon that I knew about, one that should be visible from the string viewpoint — such as the baryon vertex. I knew enough about D-branes to develop a heuristic argument about how it should show up. I’d call Joe and tell him about it, and maybe send him a sketch. A few days later, a set of notes would arrive by email, containing a complete calculation verifying the phenomenon. Each calculation was unique, a little gem, involving a distinctive investigation of exotically-shaped D-branes sitting in a curved space. It was breathtaking to witness the speed with which Joe worked, the breadth and depth of his mathematical talent, and his unmatched understanding of these branes.

[Experts: It’s not instantly obvious that the N=1* theory has physical baryons, but it does; you have to choose the right vacuum, where the theory is partially Higgsed and partially confining. Then to infer, from Witten’s work, what the baryon vertex is, you have to understand brane crossings (which I knew about from Hanany-Witten days): Witten’s D5-brane baryon vertex operator creates a  physical baryon vertex in the form of a D3-brane 3-ball, whose boundary is an NS 5-brane 2-sphere located at a point in the usual three dimensions. And finally, a physical baryon is a vertex with n strings that are connected to nearby D5-brane 2-spheres. See chapter VI, sections B, C, and E, of our paper from 2000.]

Throughout our years of collaboration, it was always that way when we needed to go head-first into the equations; Joe inevitably left me in the dust, shaking my head in disbelief. That’s partly my weakness… I’m pretty average (for a physicist) when it comes to calculation. But a lot of it was Joe being so incredibly good at it.

Fortunately for me, the collaboration was still enjoyable, because I was almost always able to keep pace with Joe on the conceptual issues, sometimes running ahead of him. Among my favorite memories as a scientist are moments when I taught Joe something he didn’t know; he’d be silent for a few seconds, nodding rapidly, with an intent look — his eyes narrow and his mouth slightly open — as he absorbed the point.  “Uh-huh… uh-huh…”, he’d say.

But another side of Joe came out in our second paper. As we stood chatting in the KITP hallway, before we’d even decided exactly which question we were going to work on, Joe suddenly guessed the answer! And I couldn’t get him to explain which problem he’d solved, much less the solution, for several days!! It was quite disorienting.

This was another classic feature of Joe. Often he knew he’d found the answer to a puzzle (and he was almost always right), but he couldn’t say anything comprehensible about it until he’d had a few days to think and to turn his ideas into equations. During our collaboration, this happened several times. (I never said “Use your words, Joe…”, but perhaps I should have.) Somehow his mind was working in places that language doesn’t go, in ways that none of us outside his brain will ever understand. In him, there was something of an oracle.

Looking Toward The Horizon

Our interests gradually diverged after 2006; I focused on the Large Hadron Collider [also known as the Large D-brane Collider], while Joe, after some other explorations, ended up thinking about black hole horizons and the information paradox. But I enjoyed his work from afar, especially when, in 2012, Joe and three colleagues (Ahmed Almheiri, Don Marolf, and James Sully) blew apart the idea of black hole complementarity, widely hoped to be the solution to the paradox. [I explained this subject here, and also mentioned a talk Joe gave about it here.]  The wreckage is still smoldering, and the paradox remains.

Then Joe fell ill, and we began to lose him, at far too young an age.  One of his last gifts to us was his memoirs, which taught each of us something about him that we didn’t know.  Finally, on Friday last, he crossed the horizon of no return.  If there’s no firewall there, he knows it now.

What, we may already wonder, will Joe’s scientific legacy be, decades from now?  It’s difficult to foresee how a theorist’s work will be viewed a century hence; science changes in unexpected ways, and what seems unimportant now may become central in future… as was the path for D-branes themselves in the course of the 1990s.  For those of us working today, D-branes in string theory are clearly Joe’s most important discovery — though his contributions to our understanding of black holes, cosmic strings, and aspects of field theory aren’t soon, if ever, to be forgotten.  But who knows? By the year 2100, string theory may be the accepted theory of quantum gravity, or it may just be a little-known tool for the study of quantum fields.

Yet even if the latter were to be string theory’s fate, I still suspect it will be D-branes that Joe is remembered for. Because — as I’ve tried to make clear — they’re real.  Really real.  There’s one in every proton, one in every neutron. Our bodies contain them by the billion billion billions. For that insight, that elemental contribution to human knowledge, our descendants can blame Joseph Polchinski.

Thanks for everything, Joe.  We’ll miss you terribly.  You so often taught us new ways to look at the world — and even at ourselves.



by Matt Strassler at February 05, 2018 03:59 PM

February 04, 2018

Lubos Motl - string vacua and pheno

Experiments may only measure gauge-invariant, coordinate-independent quantities
I have finally found the time to read every page of the 12+7+16 pages of the Japanese papers on the Earth's contribution to the muon's \(g-2\), and it's clear that if I had opened all the papers before I wrote my first quick blog post about them, the post wouldn't have been written, because the papers are childish and ludicrously wrong. Let me start with superficial observations about their style and background.

First, the authors are clearly not professional particle physicists. You won't find any Feynman diagrams – or the words "loop", "Feynman", "diagram", for that matter – in any of the three papers. Well, particle physicists would generally agree that you need Feynman diagrams – and probably multiloop ones – to discuss the muon's magnetic moment at the state-of-the-art precision.

Instead, we clearly read papers by 2-3 Japanese men who tell us: Look how completely stupid and pretentious particle physicists have been with their Feynman diagrams, loop integrals, all this difficult garbage. We can just use some high school and undergraduate physics and find something more interesting. The Japanese men use this simple enough toolkit because they're basically experimenters and their toolkit could be enough.

Well, that outcome would be juicy, indeed, but it's unlikely – and in this case, they haven't beaten the professional techniques with those of the average Joe. People often have other agendas than science – they would love it if experimenters trumped theorists in doing theoretical calculations; or if African physicists or women in science trumped the white men. Well, most of the time, they just don't. More generally, you shouldn't put ideological agendas and wishful thinking above the scientific evidence.

Their calculation of the extra effect that is supposed to cure the muon anomaly is intrinsically a classical, undergraduate calculation and their result is something like\[

\mu_{\rm m}^{\rm eff} \!\simeq\!
(1\!+\!3\phi/c^2)\,\mu_{\rm m}

\] which says that the muon's magnetic moment should be adjusted by a correction proportional to the gravitational potential \(\phi\). And they substitute the Earth's gravitational potential. Well, if I had paid attention to that sentence in the first abstract, I wouldn't have written my TRF blog post, I think. It's silly.

First, as some people have mentioned, if some new effect were proportional to the gravitational potential, it would be the contribution to the potential from the Sun, and not Earth, that would dominate. Please, do this calculation with me. The gravitational potential is equal to \(GM/R\), up to a sign.

The Sun is heavier (higher \(M\)) but further from us (greater \(R\)) than the center of our planet. Which celestial body wins? Well, the mass ratio is\[

\frac{M_{\rm Sun}}{M_{\rm Earth}} = \frac{2\times 10^{30}\,{\rm kg}}{6\times 10^{24}\,{\rm kg}} \approx 330,000.

\] Similarly,\[

\frac{R_{\rm Sun-Earth}}{R_{\rm Earth}} = \frac{150\times 10^{6}\,{\rm km}}{6,378\,{\rm km}} \approx 23,500.

\] So the mass ratio and distance ratio are \(330,000\) and \(23,500\), respectively. Clearly, the mass ratio dominates, and the gravitational potential from the heavier Sun is therefore larger by a factor of \(330,000/23,500\approx 14\). To claim that an effect is proportional to the gravitational potential, to consider the Earth, but to neglect the Sun is just wrong.
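The comparison is easy to reproduce numerically. The following is only a minimal sketch using the rounded masses and distances quoted above; the value of Newton's constant, the speed of light, and all variable names are assumptions added for illustration. It also evaluates the dimensionless combination \(3\phi/c^2\) that appears in the formula above.

```python
# Rounded inputs taken from the text; G and C are standard values (assumed here).
G = 6.674e-11          # Newton's constant, m^3 kg^-1 s^-2
C = 2.998e8            # speed of light, m/s
M_SUN = 2e30           # kg
M_EARTH = 6e24         # kg
R_SUN_EARTH = 150e9    # Sun-Earth distance, m
R_EARTH = 6.378e6      # Earth's radius, m


def potential(mass, distance):
    """Magnitude of the Newtonian potential GM/R (sign ignored)."""
    return G * mass / distance


phi_sun = potential(M_SUN, R_SUN_EARTH)
phi_earth = potential(M_EARTH, R_EARTH)
ratio = phi_sun / phi_earth

print("phi_Sun / phi_Earth =", round(ratio, 1))
print("3*phi_Earth/c^2     =", 3 * phi_earth / C**2)
print("3*phi_Sun/c^2       =", 3 * phi_sun / C**2)
```

With these rounded inputs the Sun's potential at the Earth's location comes out more than ten times larger than the Earth's own potential at its surface, in line with the estimate above.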

(One could also discuss the contributions to the potential from the location of the Solar System within the galaxy and similar things. The Solar System's speed within the galaxy is 230 km per second, about 7.7 times higher than the Earth's orbital speed of 30 km per second. Because the gravitational potential \(\phi\sim v^2/2\) for circular orbits, the galactic contribution is some 60 times greater than the Sun's. Well, this 60 is probably just a rough estimate that would be accurate if the potential's profile in the galaxy were \(1/r\), which it's not. And by the way, if the dependence on the Sun's gravitational potential mattered, there would be seasonal variations of the muon's magnetic moment and lots of other weird effects.)

Note that the Sun won only because we considered \(GM/R\). If we had a higher integer power of \(R\) in the denominator, the Earth would win because the huge distance of the Sun would "matter" twice or thrice. The effect proportional to \(GM/R^2\) would be proportional to the gravitational acceleration – i.e. the acceleration of the lab bound to the Earth's surface relative to the freely falling frame. And \(GM/R^3\) is the scaling of the tidal forces (accelerations), and those are some components of the Riemann curvature tensor.
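To see how quickly the balance flips, one can tabulate the Sun-to-Earth ratio of \(GM/R^n\) for \(n=1,2,3\). This is a rough sketch with the same rounded numbers as above; the function name `sun_to_earth_ratio` is made up for the illustration.

```python
M_SUN = 2e30           # kg
M_EARTH = 6e24         # kg
R_SUN_EARTH = 150e9    # m
R_EARTH = 6.378e6      # m


def sun_to_earth_ratio(power):
    """Ratio of GM/R**power for the Sun (at Earth's location) to the Earth (at its surface).

    Newton's constant cancels in the ratio, so it is omitted.
    """
    mass_ratio = M_SUN / M_EARTH
    distance_ratio = R_SUN_EARTH / R_EARTH
    return mass_ratio / distance_ratio**power


labels = {1: "potential", 2: "acceleration", 3: "tidal force"}
for power, label in labels.items():
    value = sun_to_earth_ratio(power)
    winner = "Sun" if value > 1 else "Earth"
    print(f"n={power} ({label}): Sun/Earth = {value:.3g} -> {winner} dominates")
```

Only the \(n=1\) case favors the Sun; for the acceleration and the tidal terms the extra powers of the distance ratio hand the win to the Earth by several orders of magnitude.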

So the Earth could win if we considered gravitational accelerations or tidal forces (=curvatures in GR). Those would be exactly the situations that would be plausible because the claim that an effect depends on the acceleration of the lab or non-uniformity of the gravitational field is compatible with the equivalence principle. (But those corrections would be far too tiny to be seen experimentally.) But the dependence of the muon's magnetic moment on the gravitational potential violates the equivalence principle.

The people defending these Japanese papers tend to say "there can be lots of terms, mess, blah blah blah," and we can surely get the desired terms (they claimed to cancel the 3.6-sigma deviation with an unexpected precision of 0.1 sigma, another point that should raise your eyebrows). However, the very point of the equivalence principle – and other big principles and symmetries of modern physics – is that Nature is not that messy. She's clean if you grab Her by Her pußy properly and some facts hold regardless of the complexity of any experiments.

Haters of physics love to dismiss physicists' knowledge – the world is so complex and mysterious, anyway (here I am basically quoting that young woman from the humanities who has sometimes asked for the translations to English). Well, much of the progress in science is all about the invalidation of these pessimistic claims and physicists have gotten extremely far in that process, indeed. Lots of things have been demystified and decomplexified – the hopeless complexity is just apparent, on the surface, and physicists know how to look beneath the surface even if mortals can't. They can still make lots of very precise and correct statements about things, even seemingly complicated things.

For example, the equivalence principle says that if you perform an experiment inside a small enough and freely falling lab which has no windows, the results don't allow you to figure out whether you're in a gravitational field or not. If the ratio of the electron's and muon's magnetic moments depended on your being near Earth, you could say whether you're near the Earth inside that lab, and the equivalence principle would be violated. That's it.

Aside from their complete misunderstanding of the equivalence principle, neglecting the Sun which is actually dominant, and failing to discuss any Feynman diagrams at all, the papers also misstate the ratio of the electric and magnetic fields in the muon \((g-2)\) experiments by three orders of magnitude, and do other things – see phenomenologist Mark Goodsell's comments about that, plus his argumentation why the effect has to cancel. A fraction of these problems is enough to conclude that the Japanese papers are just some mathematical masturbation that you shouldn't read in detail because it's a waste of time. After some time you spend with the paper, you will become almost certain that they just tried to find a formula that numerologically agrees with the muon anomaly. Gravitational potentials looked attractive to them, they have used them, and they cooked (more precisely, kooked) a rationalization afterwards. Well, this research strategy is driven by random guesses and wishful thinking and it's not surprising that it fails most of the time.

(I agree with phenomenologist Mark Goodsell, however, who says that this "inverse" approach is often right because research is a creative process. Of course I often try to guess the big results first and then complete the "details", too. One just needs to avoid fooling himself.)

But I want to return to the title. Defenders of the paper, like TRF commenter Mike, tell us that experiments usually measure lots of coordinate-dependent effects etc. Well, this is the fundamental claim that shows that Mike – and surely others – completely misunderstands the meaning of the equivalence principle and gauge symmetries in modern physics.

The meaning of coordinate redefinitions in the general theory of relativity is that those are the group of local, gauge transformations of that theory and only "invariant" quantities that are independent of these transformations may be measured! Only the invariant quantities are "real". That's really the point of the adjectives such as "invariant". That's how gauge transformations differ from any other transformation of observable quantities. They're transformations that are purely imaginary and take place in the theorist's imagination – or, more precisely, in the intermediate stages of a theorist's calculation. But the final predictions of the experiments are always coordinate-independent.

In particular, rigid rulers and perfect clocks may only measure the proper distances and proper times. It's totally similar with the coordinate dependence of other quantities beyond distances and times – and with all other gauge invariances in physics.

The case of electromagnetism is completely analogous. Electrodynamics has the \(U(1)\) gauge symmetry. The 4-potential \(A_\mu\) transforms as \[

A_\mu \to A_\mu + \partial_\mu \lambda

\] which is why it depends on the choice of the gauge – or changes after a gauge transformation defined by the parameter \(\lambda(x,y,z,t)\). That's why experiments can't measure it. The choice of \(\lambda\) is up to a theorist's free will. You can choose any gauge you want and experimenters can't measure what you want. They measure the actual object you're thinking about; they don't measure your free will.

On the contrary, the electromagnetic field strength doesn't transform,\[

F_{\mu\nu}\to F_{\mu\nu}.

\] That's why electric and magnetic fields at a known point in the spacetime may be measured by experiments – their magnitude may appear on the displays. Also, the Aharonov-Bohm experiment may measure the integral \(\oint A_\mu dx^\mu\) over a contour surrounding a solenoid modulo \(2\pi\). That seemingly depends on \(A_\mu\) but the contour integral of \(A_\mu\) modulo \(2\pi\) (in proper units) is actually as invariant as \(F_{\mu\nu}\). After all, \(\oint A_\mu dx^\mu=\int F\cdot dS\) is the magnetic flux through the contour. And that's why it can be measured (even if the particle avoids the "bulk" of the contour's interior) – by looking at the phase shift affecting the location of some interference maxima and minima.
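The invariance of the contour integral can be checked numerically in a toy setup. The sketch below is not tied to any real experiment: it assumes a uniform magnetic field along \(z\) described in the symmetric gauge, picks an arbitrary single-valued gauge function \(\lambda(x,y)=xy+\sin x\), and compares the loop integral of the vector potential around a circle before and after the gauge transformation. All names and parameter values are hypothetical.

```python
import math

B = 1.5        # uniform field strength along z, arbitrary units (assumed)
RADIUS = 2.0   # radius of the circular contour (assumed)


def a_symmetric(x, y):
    """Vector potential in the symmetric gauge; its curl is B along z."""
    return (-0.5 * B * y, 0.5 * B * x)


def grad_lambda(x, y):
    """Gradient of the arbitrary gauge function lambda(x, y) = x*y + sin(x)."""
    return (y + math.cos(x), x)


def a_transformed(x, y):
    """Gauge-transformed potential A + grad(lambda)."""
    ax, ay = a_symmetric(x, y)
    gx, gy = grad_lambda(x, y)
    return (ax + gx, ay + gy)


def loop_integral(field, radius, steps=20000):
    """Midpoint-rule approximation of the line integral of `field` around a circle."""
    total = 0.0
    dtheta = 2.0 * math.pi / steps
    for k in range(steps):
        theta = (k + 0.5) * dtheta
        x, y = radius * math.cos(theta), radius * math.sin(theta)
        dx = -radius * math.sin(theta) * dtheta
        dy = radius * math.cos(theta) * dtheta
        fx, fy = field(x, y)
        total += fx * dx + fy * dy
    return total


flux_expected = B * math.pi * RADIUS**2
flux_before = loop_integral(a_symmetric, RADIUS)
flux_after = loop_integral(a_transformed, RADIUS)
print(flux_expected, flux_before, flux_after)
```

The potential itself changes at every point of the contour, yet both loop integrals agree with the flux \(B\pi r^2\) up to numerical error, because the gradient of a single-valued \(\lambda\) integrates to zero around a closed loop.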

The case of coordinate dependence is completely analogous. One can choose coordinates in many ways in GR – they're like the choice of gauges in electromagnetism. But the experiments can't measure artifacts of a choice of coordinates. They can only measure real effects – effects that may be discussed in terms of gauge-invariant (including coordinate-independent) concepts. For example, when LIGO sees seismic noise, the seismic noise isn't just an artifact of someone's choice of coordinates. The seismic activity is a genuine time dependence of the proper distances between rocks inside Earth – and the "time" in this sentence could be defined as some proper time, e.g. one measured by clocks attached to these rocks, too.

I must add a disclaimer. Coordinates may be defined as some proper distances and proper times based on some real-world objects. For example, you may specify a place in continental Europe (at any altitude up to kilometers) as the place's proper distance from a point in Yekaterinburg, in Reykjavik, and Cairo. Three proper distances \((s_Y, s_R, s_C)\) are a good enough replacement for the Cartesian \((x,y,z)\) – one must be careful that such maps aren't always one-to-one, however (for example, points above and below the plane of the triangle have the same value of the three proper distances).
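The failure of one-to-one-ness is easy to exhibit in a toy model. The sketch below assumes flat Euclidean space and places three hypothetical anchor points in the plane \(z=0\); the coordinates are invented for illustration and are not the real positions of the cities. A point above the plane and its mirror image below it get the same triple \((s_Y, s_R, s_C)\).

```python
import math

# Hypothetical anchor positions in km, all in the plane z = 0 (illustration only).
ANCHORS = {
    "Y": (3000.0, 1500.0, 0.0),
    "R": (-2500.0, 2000.0, 0.0),
    "C": (500.0, -2500.0, 0.0),
}


def distance_coordinates(point):
    """Return the triple of Euclidean distances from `point` to the three anchors."""
    return tuple(math.dist(point, anchor) for anchor in ANCHORS.values())


above = (100.0, 200.0, 5.0)    # 5 km above the plane of the triangle
below = (100.0, 200.0, -5.0)   # mirror image, 5 km below the plane

print(distance_coordinates(above))
print(distance_coordinates(below))
```

The two printed triples coincide, so an extra bit of information (which side of the plane) is needed before the three distances work as coordinates.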

So we could be worried that the three rulers – that measure proper distances from these three cities – are a counterexample to my statement about the coordinate independence of measurable quantities. Experiments directly measured some coordinates. But that's only because we had to define the coordinates to be the proper distances in the first place! So the first step in an experiment, the measurement of the proper distances of an object in Europe from the three cities, shouldn't even be considered a measurement yet. It should be considered a calibration. Experiments that probe laws of physics – e.g. a proposed effect that depends on the place in Europe – may only begin once you measure something else on top of the three proper distances! And note that when coordinates are equal to three proper distances and experiments measure these three numbers for an object, it's still true that "experiments only measure invariant quantities". The quantities are both invariant and (someone's particular chosen) coordinates.

So Mike uses lots of the right words from the physicist's toolkit but the deep statements are just wrong. He completely misunderstands the equivalence principle and gauge symmetries – principles that underlie much of modern physics and allow us to be sure about so many things despite the apparent complexity of many situations and gadgets. One may say lots of things and write Japanese papers but they're virtually guaranteed to be wrong.

The muon \((g-2)\) anomaly hasn't been explained away. Properties of elementary particles measured inside localized labs cannot depend on the gravitational potential. And even if something depended on the gravitational potential, the Sun's contribution would be dominant. If the equivalence principle holds – and there are lots of confirmations and reasons to think it's true – the muon's magnetic moment is a universal constant of Nature so it just cannot depend on the environment (the quantities describing the local gravitational field).

One could still be worried that the experimenters who measure the muon's magnetic moment have made a mistake and they also included some corrections that depend on the gravitational field that they shouldn't, and the terms from the Japanese papers should be interpreted as the "fixes" that subtract these specious corrections. But I don't think that this worry is justified because the equivalence principle says that it's really impossible to see, in a closed lab, whether you're inside the Earth's gravitational field or in outer space.

You can only measure the acceleration (using accelerometers on your smartphone, for example) or the non-uniformities of the field (if the lab is sufficiently large). But those effects only influence the elementary particles to a tiny extent that isn't measurable. So if one believes this statement – or if he believes that the effect should be proportional to a gravitational potential – the muon \((g-2)\) experimenters cannot introduce the gravitational-potential-dependent mistake even if they wanted.

(Well, if they really wanted, they could simply add the error deliberately. "We want the muon magnetic moment to depend on the gravitational potential, so we just adjusted the readings from our apparatuses to bring you a wrong result. Our friend is a NASA astronaut and we wanted to make her and women in science more relevant, so we added her altitude to the muon's magnetic moment." Well, I don't think that they're doing it. They just ignore the gravitational field-related issues and the basic principles of physics show why this strategy is consistent and the results are actually relevant in any gravitational field. It's just wrong to "adjust" results directly measured in a closed localized lab by any gravitational potentials and similar factors. This disagreement may have almost "moral" dimensions. Some people would want the experimenters to add lots of corrections, which makes it "very scientific", they think. But in good science, experimenters never add corrections of a type they don't fully understand or don't fully report – they are supposed to be comprehensible and report the readings of their displays or process them in ways that they have completely mastered and described. In this context, experimenters just never adjust any readings by any Earth's gravitational corrections and theorists understand that, and why the results obtained in this way are still relevant even outside Earth's gravitational field.)

In some broad sense, the same comments apply to global symmetries, too. The special theory of relativity has the Lorentz and Poincaré symmetries. The Lorentz symmetry (a modern deformation of the Galilean symmetry) guarantees that you can't experimentally determine whether the train where you perform your experiments is moving. So if someone measured the muon magnetic moment inside a uniformly moving train, he couldn't measure any terms proportional to the train speed \(v\) even if he wanted! That's simply impossible by the symmetry – you just can't see any \(v\) or a non-trivial function of \(v\) on your experimental apparatuses' display.

You could only measure \(v\) or things that depend on \(v\) if you did something with results of experiments done both inside the moving train and inside the railway station – and on top of that, these two labs (static and moving) would have to interact or communicate with each other in some way. The muon experimenters (and their colleagues whose results needed to be relied upon) haven't done any experiment outside the Earth's gravitational field where \(\phi\approx 0\), I think, so they couldn't have made any comparison like that, which is why their results – functions of readings on their displays – just can't depend on \(\phi\). Again, the reason is analogous to the reason why the experiments done in the train cannot depend on the train speed.

These are some very general, basic, precious principles underlying modern physics. Every good theoretical physicist loves them, knows them, and appreciates them (and appreciates Albert Einstein who has really brought us this new, powerful way of thinking that's been extended in so many amazing ways). The Japanese papers are an example of ambitious work by self-confident people who don't know these rudiments of modern theoretical physics and who want to pretend that it doesn't matter. But it does matter a great deal. When it comes to the framing of the papers by many, the papers aren't anything else than just another attempted attack on theoretical physics as a discipline. Look how stupid the theoretical physicists are, using all the overly complex mathematical formalism and overlooking the obvious things that may be discussed using the undergraduate formalism. (Tommaso Dorigo makes this excited claim in between the lines and a reader of his blog makes it explicitly.) Except that the graduate school formalism is needed to discuss the magnetic moments of leptons at the state-of-the-art precision.

And after all, it's even untrue that the professional particle physicists – with their Feynman diagrams and loops etc. – are making things more complicated than they need to be and than the Japanese folks. Unlike the Japanese men and similar folks, the competent theoretical particle physicists maximally utilize the deep principles such as symmetries that greatly simplify situations and calculations. It's really the Japanese men who dedicate dozens of pages to the calculation of an effect that is known to be zero by a straightforward symmetry-based argument!

In other words, if you don't know the tools of mathematics and modern theoretical physics, you're bound to spend most of your time lost inside seemingly complicated stuff and expressions that the competent folks immediately see to equal zero – or another simple value (and you're almost guaranteed to make lots of errors in that context). Competent theoretical physicists still spend lots of time with complicated stuff, but that's because everything that could have been simplified has been simplified.

If you want to be a good or at least decent theoretical physicist, you just need to learn those methods and principles and you need to learn them properly. If you think that experiments "usually measure gauge-variant or coordinate-dependent quantities", you have learned almost nothing from modern physics yet.

by Luboš Motl at February 04, 2018 10:00 AM

Lubos Motl - string vacua and pheno

Joe Polchinski, 1954-2018
February 2nd was a rather miserable day. The Dow sank by some 2.5%, the worst drop since the Brexit referendum. Even the cryptocurrency believers were feeling lousy – the Bitcoin touched $7,550 and this event has surely convinced many that it could go to zero.

It was a much worse day for theoretical physics because Joe Polchinski died of brain cancer in the morning.

Things were surely more problem free when I took that picture (his main picture on Wikipedia) on a beach in Santa Barbara. I believe that Joe was the captain of a string theory soccer team on that day and that team defeated some condensed matter chaps or someone like that. Polchinski was also a vigorous bike rider in the mountains and other things.

Well, Joe's friend and ex-classmate who is an important member of the TRF community told me about the bad news very early. Now it's everywhere on Twitter (newest) etc. Here is the official letter by the chancellor to the UCSB.

Polchinski was a kind man and a top physicist. I won't try to review his life here because I don't feel enthusiastic for such things now – check a text about Polchinski's memories published half a year ago.

There are 175 TRF blog posts that contain the word "Polchinski" and I have found 756 e-mails with the word in my Gmail archive. Yes, I am still proud that my surname appears 128 times on a web page of Polchinski's. As a part-time guest blogger, Polchinski expanded top ten results of string theory to top twelve. He's been one of the best scientists who sometimes confronted the critics of string theory (see e.g. his string theory to the rescue).

Polchinski wrote one of the most famous string theory textbooks, did the most groundbreaking work to invent D-branes and make them useful. He has played with the black hole information puzzle, wormholes, and similar hard things in quantum gravity a lot – and coined Polchinski's paradox, firewall, and Everett phone in this context. Some of them are believed by most of his colleagues to be wrong but they still showed his passion and creativity.


Among the 187 papers with his name on Inspire, 15 are renowned (above 500 citations). It may be interesting to enumerate them:
Quite a list, and it's just a "demo".

R.I.P., Joe.

by Luboš Motl at February 04, 2018 08:04 AM

John Baez - Azimuth


Mike Stay is applying category theory to computation at a new startup called Pyrofex. And this startup has now entered a deal with RChain.

But let me explain why I’m interested. I’m interested in applied category theory… but this is special.

Mike Stay came to work with me at U.C. Riverside after getting a master’s in computer science at the University of Auckland, where he worked with Cristian Calude on algorithmic randomness. For example:

• Cristian S. Calude and Michael A. Stay, From Heisenberg to Gödel via Chaitin, International Journal of Theoretical Physics 44 (2005), 1053–1065.

• Cristian S. Calude and Michael A. Stay, Most programs stop quickly or never halt, Advances in Applied Mathematics, 40 (2008), 295–308.

It seems like ages ago, but I dimly remember looking at his application, seeing the title of the first of these two papers, and thinking “he’s either a crackpot, or I’m gonna like him”.

You see, the lure of relating Gödel’s theorem to Heisenberg’s uncertainty principle is fatally attractive to many people who don’t really understand either. I looked at the paper, decided he wasn’t a crackpot… and yes, it turned out I liked him.

(A vaguely similar thing happened with my student Chris Rogers, who’d written some papers applying techniques from general relativity to the study of crystals. As soon as I assured myself this stuff was for real, I knew I’d like him!)

Since Mike knew a lot about computer science, his presence at U. C. Riverside emboldened me to give a seminar on classical versus quantum computation. I used this as an excuse to learn the old stuff about the lambda-calculus and cartesian closed categories. When I first started, I thought the basic story would be obvious: people must be making up categories where the morphisms describe processes of computation.

But I soon learned I was wrong: people were making up categories where objects were data types, but the morphisms were equivalence classes of things going between data types, and this equivalence relation completely washed out the difference between, say, a program that actually computes 237 × 419 and a program that just prints out 99303, which happens to be the answer to that problem.
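To make the point concrete, here is a toy illustration in Python (not anything from the category-theoretic literature; the function names and the step counter are invented for the example). Both programs denote the same number, so they are identified once we only look at input-output behavior, yet one of them does real work and the other does none.

```python
def multiply_by_repeated_addition(a, b):
    """Actually compute a * b, counting the additions performed along the way."""
    total = 0
    steps = 0
    for _ in range(b):
        total += a
        steps += 1
    return total, steps


def print_the_answer():
    """Return the answer as a hard-coded constant, with no computation at all."""
    return 99303, 0


computed, computed_steps = multiply_by_repeated_addition(237, 419)
constant, constant_steps = print_the_answer()

print("same result:", computed == constant)
print("steps taken:", computed_steps, "versus", constant_steps)
```

The results agree while the step counts differ, and it is exactly that second piece of information that the equivalence relation throws away.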

In other words, the actual process of computation was not visible in the category-theoretic framework. I decided that to make it visible, what we really need are 2-categories in which 2-morphisms are ‘processes of computation’. Or in the jargon: objects are types, morphisms are terms, and 2-morphisms are rewrites.
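To make the distinction concrete, here is a minimal Python sketch. It is only a toy, not the construction from any of the papers cited here: the names `Num`, `Mul`, `step` and `rewrites` are made up for illustration, and the "calculus" has nothing but integer multiplication. Terms play the role of morphisms, and each single rewrite step plays the role of a 2-morphism.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Num:
    """A term that is already a value."""
    value: int


@dataclass(frozen=True)
class Mul:
    """A term that still has a multiplication left to perform."""
    left: "Term"
    right: "Term"


Term = Union[Num, Mul]


def step(term: Term) -> Optional[Term]:
    """Apply one leftmost rewrite; return None if the term is in normal form."""
    if isinstance(term, Num):
        return None
    if isinstance(term.left, Num) and isinstance(term.right, Num):
        return Num(term.left.value * term.right.value)
    reduced_left = step(term.left)
    if reduced_left is not None:
        return Mul(reduced_left, term.right)
    # The left side is a value, so the right side must still be reducible.
    return Mul(term.left, step(term.right))


def rewrites(term: Term) -> list:
    """Return the whole rewrite sequence from the term to its normal form."""
    path = [term]
    while (nxt := step(path[-1])) is not None:
        path.append(nxt)
    return path


computed = rewrites(Mul(Num(237), Num(419)))  # actually multiplies
printed = rewrites(Num(99303))                # just holds the answer

print(computed[-1] == printed[-1])            # same normal form
print(len(computed) - 1, len(printed) - 1)    # number of rewrite steps taken
```

In the usual 1-categorical picture the two terms are identified because they have the same normal form. Keeping the rewrite sequence around is what lets you see that one of them did some work and the other did none.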

It turned out people had already thought about this:

• Barnaby P. Hilken, Towards a proof theory of rewriting: the simply-typed 2λ-calculus, Theor. Comp. Sci. 170 (1996), 407–444.

• R. A. G. Seely, Weak adjointness in proof theory in Proc. Durham Conf. on Applications of Sheaves, Springer Lecture Notes in Mathematics 753, Springer, Berlin, 1979, pp. 697–701.

• R. A. G. Seely, Modeling computations: a 2-categorical framework, in Proc. Symposium on Logic in Computer Science 1987, Computer Society of the IEEE, pp. 65–71.

But I felt this viewpoint wasn’t nearly as popular as it should be. It should be very popular, at least among theoretical computer scientists, because it describes what’s actually going on in the lambda-calculus. If you read a big fat book on the lambda-calculus, like Barendregt’s The Lambda Calculus: Its Syntax and Semantics, you’ll see it spends a lot of time on reduction strategies: that is, ways of organizing the process of computation. All this is buried in the usual 1-categorical treatment. It’s living up at the 2-morphism level!

Mike basically agreed with me. We started by writing an introduction to the usual 1-categorical stuff, and how it shows up in many different fields:

• John Baez and Michael Stay, Physics, topology, logic and computation: a Rosetta Stone, in New Structures for Physics, ed. Bob Coecke, Lecture Notes in Physics vol. 813, Springer, Berlin, 2011, pp. 95–172.

For financial reasons he had to leave U. C. Riverside and take a job at Google. But he finished his Ph.D. at the University of Auckland, with Cristian Calude and me as co-advisors. And a large chunk of his thesis dealt with cartesian closed 2-categories and their generalizations suitable for quantum computation:

• Michael Stay, Compact closed bicategories, Theory and Applications of Categories 31 (2016), 755–798.

Great stuff! My students these days are building on this math and marching ahead.

I said Mike ‘basically’ agreed with me. He agreed that we need to go beyond the usual 1-categorical treatment to model the process of computation. But when it came to applying this idea to computer science, Mike wasn’t satisfied with thinking about the lambda-calculus. That’s an old model of computation: it’s okay for a single primitive computer, but not good for systems where different parts are sending messages to each other, like the internet, or even multiprocessing in a single computer. In other words, the lambda-calculus doesn’t really handle the pressing issues of concurrency and distributed computation.

So, Mike wanted to think about more modern formalisms for computation, like the pi-calculus, using 2-categories.
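One feature that sets the pi-calculus apart is that processes can send channels themselves as messages. Here is a rough Python sketch of that single idea, using threads and queues as stand-ins for processes and channels. The names `server` and `client` are made up for illustration, and this is an analogy only, not an implementation of the pi-calculus.

```python
import queue
import threading


def server(requests: queue.Queue) -> None:
    """Receive a reply channel as a message, then answer along that channel."""
    reply_channel = requests.get()
    reply_channel.put(237 * 419)


def client(requests: queue.Queue, results: list) -> None:
    """Create a fresh private channel, send the channel itself, await the reply."""
    reply_channel = queue.Queue()   # plays the role of a freshly created name
    requests.put(reply_channel)     # the message being sent is a channel
    results.append(reply_channel.get())


requests = queue.Queue()
results = []
threads = [
    threading.Thread(target=server, args=(requests,)),
    threading.Thread(target=client, args=(requests, results)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results)  # [99303]
```

The two threads run concurrently and only interact by passing messages, which is the kind of situation the lambda-calculus has no direct way to describe.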

He left Google and wrote some papers with Greg Meredith on these ideas, for example:

• Michael Stay and Lucius Gregory Meredith, Higher category models of the pi-calculus.

• Michael Stay and Lucius Gregory Meredith, Representing operational semantics with enriched Lawvere theories.

The second one takes a key step: moving away from 2-categories to graph-enriched categories, which are simpler and perhaps better.

Then, after various twists and turns, he started a company called Pyrofex with Nash Foster. Or perhaps I should say Foster started a company with Mike, since Foster is the real bigshot of the two. Here’s what their webpage says:


My name is Nash Foster, and together with my friend and colleague Mike Stay, I recently founded a company called Pyrofex. We founded our company for one simple reason: we love to build large-scale distributed systems that are always reliable and secure and we wanted to help users like yourself do it more easily.

When Mike and I founded the company, we felt strongly that several key advances in programming language theory would ease the development of every day large-scale systems. However, we were not sure exactly how to expose the APIs to users or how to build the tool-sets. Grand visions compete with practical necessity, so we spent months talking to users just like you to discover what kinds of things you were most interested in. We spent many hours at white-boards, checking our work against the best CS theory we know. And, of course, we have enjoyed many long days of hacking in the hopes that we can produce something new and useful, for you.

I think this is the most exciting time in history to be a programmer (and I wrote my first program in the early 1980’s). The technologies available today are varied and compelling and exciting. I hope that you’ll be as excited as I am about some of the ideas we discovered while building our software.

And on January 8th, 2018, their company entered into an arrangement with Greg Meredith’s company RChain. Below I’ll quote part of an announcement. I don’t know much about this stuff, at least not yet. But I’m happy to see some good ideas getting applied in the real world, and especially happy to see Mike doing it.

The RChain Cooperative & Pyrofex Corporation announce Strategic Partnership

The RChain Cooperative and Pyrofex Corporation today announced strategically important service contracts and an equity investment intended to deliver several mutually beneficial blockchain solutions. RChain will acquire 1.1 million shares of Pyrofex Common Stock as a strategic investment. The two companies will ink separate service contracts to reinforce their existing relationship and help to align their business interests.

Pyrofex will develop critical tools and platform components necessary for the long-term success of the RChain platform. These tools are designed to leverage RChain’s unique blockchain environment and make blockchain development simpler, faster, and more effective than ever before. Under these agreements, Pyrofex will develop the world’s first decentralized IDE for writing blockchain smart contracts on the RChain blockchain.

Pyrofex also commits to continuing the core development of RChain’s blockchain platform and to organizing RChain’s global developer events and conferences.

Comments on the News

“We’re thrilled to have an opportunity to strengthen our relationship with the RChain Cooperative in 2018. Their commitment to open-source development mirrors our own corporate values. It’s a pleasure to have such a close relationship with a vibrant open-source community. I’ve rarely seen the kind of excitement the Coop’s members share and we look forward to delivering some great new technology this year.” — Nash E. Foster, Cofounder & CEO, Pyrofex Corp.

“Intuitive development tools are important for us and the blockchain ecosystem as a whole; we’re incredibly glad Pyrofex intends to launch their tools on RChain first. But, Ethereum has been a huge supporter of RChain and we’re pleased that Pyrofex intends to support Solidity developers as well. Having tools that will make it possible for developers to migrate smart contracts between blockchains is going to create tremendous possibilities.” — Lucius Greg Meredith, President, RChain Cooperative

Background

Pyrofex is a software development company co-founded by Dr. Michael Stay, PhD and Nash Foster in 2016. Dr. Stay and Greg Meredith are long-time colleagues and collaborators whose mutual research efforts form the mathematical foundations of RChain’s technology. One example of the work that Greg and Mike have collaborated on is the work on the LADL (Logic as Distributed Law) algorithm. You can watch Dr. Stay present the latest research from the RChain Developers retreat.

Pyrofex and its development team should be familiar to those who follow the RChain Cooperative. They currently employ 14 full-time and several part-time developers dedicated to RChain platform development. Pyrofex CEO Nash Foster and Lead Project Manager Medha Parlikar have helped grow RChain’s development team to an impressive 20+ Core devs with plans on doubling by mid 2018. The team now includes multiple PhDs, ex-Googlers, and other world class talents.

Every Wednesday, you will find Medha on our debrief updating the community with the latest developments in RChain. Here she is announcing the recent node.hello release along with a demo from core developer Chris Kirkwood-Watts.

The working relationship between the RChain Cooperative and Pyrofex has gone so well that the Board of Directors and the community at large have supported Pyrofex’s proposal to develop Cryptofex, the much needed developer tool kit for the decentralized world.

The RChain Cooperative is ecstatic to further develop its relationship with the Pyrofex team.

“As we fly towards Mercury and beyond, we all could use better tools.”
— The RChain Co-op Team.

Listen to the announcement from Greg Meredith as well as a short Q&A with Pyrofex CEO Nash Foster, from a recent community debrief.

The following is an excerpt from the soon to be released Cryptofex Whitepaper.

The Problem: Writing Software is Hard, Compiling is Harder

In 1983, Borland Software Corporation acquired a small compiler called Compas Pascal and released it in the United States as Turbo Pascal. It was the first product to integrate a compiler and the editor in which software was written and for nearly a decade Borland’s products defined the market for integrated development environments (IDEs).

The year after Borland released Turbo Pascal, Ken Thompson observed the distinct and unique dangers associated with compiler technologies. In his famous Turing Award acceptance speech, Thompson described a mechanism by which a virus can be injected into a compiler such that every binary compiled with that compiler will replicate the virus.

“In demonstrating the possibility of this kind of attack, I picked on the C compiler. I could have picked on any program-handling program such as an assembler, a loader, or even hardware microcode. As the level of program gets lower, these bugs will be harder and harder to detect. A well installed microcode bug will be almost impossible to detect.” — Ken Thompson

Unfortunately, many developers today remain stuck in a world constructed in the early 1980’s. IDEs remain essentially the same, able to solve only those problems that neatly fit onto their laptop’s single Intel CPU. But barely a month ago, on 22nd November 2017, the Intel Corporation released a critical firmware update to the Intel Management Engine and in the intervening weeks, the public at large has become aware of the “Meltdown” bug. The IME and other components are exactly the sort of low-level microcode applications that Thompson warned about. Intel has demonstrated perfectly that in the past 33 years, we have learned little and gone nowhere.

Ironically, we have had a partial solution to these problems for nearly a decade. In 2009, David A. Wheeler published his PhD dissertation, in which he proposed a mechanism by which multiple compilers can be used to verify the correctness of a compiler output. Such a mechanism turns out to be tailor-made for the decentralized blockchain environment. Combining Wheeler’s mechanism with a set of economic incentives for compile farms to submit correct outputs gives us a very real shot at correcting a problem that has plagued us for more than 30 years.

The Solution: A Distributed and Decentralized Toolchain

If we crack open the development environments at companies like Google and Amazon, many of us would be surprised to discover that very few programs are compiled on a single machine. Already, the most sophisticated organizations in the world have moved to a distributed development environment. This allows them to leverage the cloud, bringing high-performance distributed computing to bear on software development itself. At Google, many thousands of machines churn away compiling code, checking it for correctness, and storing objects to be re-used later. Through clever use of caching and “hermetic” builds, Google makes its builds faster and more computationally efficient than could possibly be done on individual developer workstations. Unfortunately, most of us cannot afford to dedicate thousands of machines to compilation.

The open-source community might be able to build large scale shared compilation environments on the Internet, but Ken Thompson explained to us why we could not trust a shared environment for these workloads. However, in the age of blockchain, it’s now possible to build development environments that harness the power of large-scale compute to compile and check programs against programmer intent. Secure, cheap, and fast — we can get all three.

CryptoFex is just such a Decentralized Integrated Development Environment (DIDE) allowing software engineers to author, test, compile, and statically check their code to ensure that it is secure, efficient, and scalable.

by John Baez at February 04, 2018 01:53 AM

February 02, 2018

Tommaso Dorigo - Scientificblogging

Your Main Reference For Resonances Decaying To Boson Pairs
Some shameless self-promotion is in order today, as my review titled "Hadron Collider Searches for Diboson Resonances", meant for publication in the prestigious journal "Progress in Particle and Nuclear Physics", has been made available on the Cornell arXiv.
My review covers the topic quite extensively, as it is not constrained in length as other reviews usually are. At 76 pages, and with 500 references, it aims to be the main reference on this type of physics for the next five years or so - at least, this is the stipulation with PPNP. Whether I managed to make it such is something to be judged by others.

The plan of the work is as follows:

read more

by Tommaso Dorigo at February 02, 2018 09:06 AM

February 01, 2018

Tommaso Dorigo - Scientificblogging

Gravitational Effects Explain Muon Magnetic Moment Anomaly Away!!
The field of particle physics is populated with believers and skeptics. The believers will try to convince you that new physics is about to be discovered, or that it is anyway within close reach. The skeptics will on the other hand look at the mass of confirmations of the current theory (the Standard Model) and claim that any speculation about the existence of discoverable new phenomena has no basis.

read more

by Tommaso Dorigo at February 01, 2018 03:36 PM

Clifford V. Johnson - Asymptotia

Two Events of Interest!

This week there are two opportunities to hear me talk about The Dialogues in person, for you to ask questions, and to get your own personally signed copy!

On Thursday at 7:00pm I'll be chatting with writer M. G. Lord at Vroman's bookstore in Pasadena. We'll talk about the book of course, but probably about science and art and writing and a whole lot of things in general. There'll be a Q&A of course. Then I'll sign the books you bring to me. More details here.

On Friday at 7:30pm I'll chat with Griffith Observatory's curator Laura Danly as part of their excellent All Space Considered show. It'll be up at the Observatory and there'll be a Q&A and of course books on hand for you to obtain and I'll sign them for you. More details here.

Come to one, or come to both if you like, as time, geography, and tastes dictate. They'll be quite different events with different emphases!

by Clifford at February 01, 2018 04:25 AM

January 31, 2018

Clifford V. Johnson - Asymptotia

Conversation Piece

I wrote a piece for The Conversation two weeks ago. It turned out to be very well read. It concerns science, entertainment, and culture. I also discuss aspects of how my work on the book fits into the larger arc of my work on engaging the public with science. I hope that you like it. -cvj

New ways scientists can help put science back into popular culture

Science is one thread of culture – and entertainment, including graphic books, can reflect that.
'The Dialogues,' by Clifford V. Johnson (MIT Press 2017), CC BY-ND

Clifford Johnson, University of Southern California – Dornsife College of Letters, Arts and Sciences

How often do you, outside the requirements of an assignment, ponder things like the workings of a distant star, the innards of your phone camera, or the number and layout of petals on a flower? Maybe a little bit, maybe never. Too often, people regard science as sitting outside the general culture: A specialized, difficult topic carried out by somewhat strange people with arcane talents. It’s somehow not for them.

But really science is part of the wonderful tapestry of human culture, intertwined with things like art, music, theater, film and even religion. These elements of our culture help us understand and celebrate our place in the universe, navigate it and be in dialogue with it and each other. Everyone should be able to engage freely in whichever parts of the general culture they choose, from going to a show or humming a tune to talking about a new movie over dinner.

Science, though, gets portrayed as opposite to art, intuition and mystery, as though knowing in detail how that flower works somehow undermines its beauty. As a practicing physicist, I disagree. Science can enhance our appreciation of the world around us. It should be part of our general culture, accessible to all. Those “special talents” required in order to engage with and even contribute to science are present in all of us.

So how do we bring about a change? I think using the tools of the general [...]

by Clifford at January 31, 2018 07:14 AM

January 29, 2018

Sean Carroll - Preposterous Universe

Guest Post: Nicole Yunger Halpern on What Makes Extraordinary Science Extraordinary

Nicole Yunger Halpern is a theoretical physicist at Caltech’s Institute for Quantum Information and Matter (IQIM).  She blends quantum information theory with thermodynamics and applies the combination across science, including to condensed matter; black-hole physics; and atomic, molecular, and optical physics. She writes for Quantum Frontiers, the IQIM blog, every month.

What makes extraordinary science extraordinary?

Political junkies watch C-SPAN. Sports fans watch ESPN. Art collectors watch Christie’s. I watch scientists respond to ideas.

John Preskill—Caltech professor, quantum-information theorist, and my PhD advisor—serves as the Chief Justice John Roberts of my C-SPAN. Ideas fly during group meetings, at lunch outside a campus cafeteria, and in John’s office. Many ideas encounter a laconicism compared with which Ernest Hemingway babbles. “Hmm,” I hear. “Ok.” “Wait… What?”

The occasional idea provokes an “mhm.” The final syllable has a higher pitch than the first. Usually, the inflection change conveys agreement and interest. Receiving such an “mhm” brightens my afternoon like a Big Dipper sighting during a 9 PM trudge home.

Hearing “That’s cool,” “Nice,” or “I’m excited,” I cartwheel internally.

What distinguishes “ok” ideas from “mhm” ideas? Peeling the Preskillite trappings off this question reveals its core: What distinguishes good science from extraordinary science?

I’ve been grateful for opportunities to interview senior scientists, over the past few months, from coast to coast. The opinions I collected varied. Several interviewees latched onto the question as though they pondered it daily. A couple of interviewees balked (I don’t know; that’s tricky…) but summoned up a sermon. All the responses fired me up: The more wisps of mist withdrew from the nature of extraordinary science, the more I burned to contribute.

I’ll distill, interpret, and embellish upon the opinions I received. Italics flag lines that I assembled to capture ideas that I heard, as well as imperfect memories of others’ words. Quotation marks surround lines that others constructed. Feel welcome to chime in, in the “comments” section.

One word surfaced in all, or nearly all, my conversations: “impact.” Extraordinary science changes how researchers across the world think. Extraordinary science reaches beyond one subdiscipline.

This reach reminded me of answers to a question I’d asked senior scientists when in college: “What do you mean by ‘beautiful’?” Replies had varied, but a synopsis had crystallized: “Beautiful science enables us to explain a lot with a little.” Schrödinger’s equation, which describes how quantum systems evolve, fits on one line. But the equation describes electrons bound to atoms, particles trapped in boxes, nuclei in magnetic fields, and more. Beautiful science, which overlaps with extraordinary science, captures much of nature in a small net.

Inventing a field constitutes extraordinary science. Examples include the fusion of quantum information with high-energy physics. Entanglement, quantum computation, and error correction are illuminating black holes, wormholes, and space-time.

Extraordinary science surprises us, revealing faces that we never expected nature to wear. Many extraordinary experiments generate data inexplicable with existing theories. Some extraordinary theory accounts for puzzling data; some extraordinary theory provokes experiments. I graduated from the Perimeter Scholars International Masters program,  at the Perimeter Institute for Theoretical Physics, almost five years ago. Canadian physicist Art McDonald presented my class’s commencement address. An interest in theory, he said, brought you to this institute. Plunge into theory, if you like. Theorem away. But keep a bead on experiments. Talk with experimentalists; work to understand them. McDonald won a Nobel Prize, two years later, for directing the Sudbury Neutrino Observatory (SNO). (SNOLab, with the Homestake experiment, revealed properties of subatomic particles called “neutrinos.” A neutrino’s species can change, and neutrinos have tiny masses. Neutrinos might reveal why the universe contains more matter than antimatter.)

Not all extraordinary theory clings to experiment like bubblegum to hair. Elliott Lieb and Mary Beth Ruskai proved that quantum entropies obey an inequality called “strong subadditivity” (SSA).  Entropies quantify uncertainty about which outcomes measurements will yield. Experimentalists could test SSA’s governance of atoms, ions, and materials. But no physical platform captures SSA’s essence.

Abstract mathematics underlies Lieb and Ruskai’s theorem: convexity and concavity (properties of functions), the Golden-Thompson inequality (a theorem about exponentials of matrices), etc. Some extraordinary theory dovetails with experiment; some wings away.

One interviewee sees extraordinary science in foundational science. At our understanding’s roots lie ideas that fertilize diverse sprouts. Other extraordinary ideas provide tools for calculating, observing, or measuring. Richard Feynman sped up particle-physics computations, for instance, by drawing diagrams.  Those diagrams depict high-energy physics as the creation, separation, recombination, and annihilation of particles. Feynman drove not only a technical, but also a conceptual, advance. Some extraordinary ideas transform our views of the world.

Difficulty preoccupied two experimentalists. An experiment isn’t worth undertaking, one said, if it isn’t difficult. A colleague, said another, “does the impossible and makes it look easy.”

Simplicity preoccupied two theorists. I wrung my hands, during year one of my PhD, in an email to John. The results I’d derived—now that I’d found them— looked as though I should have noticed them months earlier. What if the results lacked gristle? “Don’t worry about too simple,” John wrote back. “I like simple.”

Another theorist agreed: Simplification promotes clarity. Not all simple ideas “go the distance.” But ideas run farther when stripped down than when weighed down by complications.

Extraordinary scientists have a sense of taste. Not every idea merits exploration. Identifying the ideas that do requires taste, or style, or distinction. What distinguishes extraordinary science? More of the theater critic and Julia Child than I expected five years ago.

With gratitude to the thinkers who let me pick their brains.

by Sean Carroll at January 29, 2018 05:45 PM

Tommaso Dorigo - Scientificblogging

Dijet Resonances With A Photon, And The 150 GeV ATLAS Excess
A search just posted on the Cornell arXiv by the ATLAS Collaboration caught my eye today, as it involves a signature which I have been interested in for quite a while. This is the final state of proton-proton collisions that includes an energetic photon and a recoiling dijet system, when the dijet system may be the result of the decay of a W or Z boson - or of a heavier partner of the Z, called Z'.

The setting

read more

by Tommaso Dorigo at January 29, 2018 12:57 PM

Georg von Hippel - Life on the lattice

Looking for guest blogger(s) to cover LATTICE 2018
Since I will not be attending LATTICE 2018 for some excellent personal reasons, I am looking for a guest blogger or even better several guest bloggers from the lattice community who would be interested in covering the conference. Especially for advanced PhD students or junior postdocs, this might be a great opportunity to get your name some visibility. If you are interested, drop me a line either in the comment section or by email (my university address is easy to find).

by Georg v. Hippel at January 29, 2018 11:49 AM

January 25, 2018

Alexey Petrov - Symmetry factor

Rapid-response (non-linear) teaching: report

Some of you might remember my previous post about non-linear teaching, where I described a new teaching strategy that I came up with and was about to implement in teaching my undergraduate Classical Mechanics I class. Here I want to report on the outcomes of this experiment and share some of my impressions on teaching.

Course description

Our Classical Mechanics class is a gateway class for our physics majors. It is the first class they take after they are done with general physics lectures. So the students are already familiar with the (simpler version of the) material they are going to be taught. The goal of this class is to start molding physicists out of physics students. It is a rather small class (max allowed enrollment is 20 students; I had 22 in my class), which makes professor-student interaction rather easy.

Rapid-response (non-linear) teaching: generalities

To motivate the method that I proposed, I looked at some studies in experimental psychology, in particular in memory and learning studies. What I was curious about is how much is currently known about the process of learning and what suggestions I can take from the psychologists who know something about the way our brain works in retaining the knowledge we receive.

As it turns out, there are some studies on this subject (I have references, if you are interested). The earliest ones go back to the 1880s, when the German psychologist Hermann Ebbinghaus hypothesized the way our brain retains information over time. The “forgetting curve” that he introduced gives an approximate representation of information retention as a function of time. His studies have been replicated with similar conclusions in recent experiments.

[Figure: Ebbinghaus forgetting curve]

The upshot of these studies is that loss of learned information is pretty much exponential; as can be seen from the figure on the left, in about a day we only retain about 40% of what we learned.
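To get a feeling for the numbers, here is a small Python sketch. It assumes a pure exponential law R(t) = exp(-t/tau), which is a simplification of the measured curve, and it fixes the time constant tau from the single data point quoted above (about 40% retained after one day). The function name `retention` and the list of sample times are illustrative choices only.

```python
import math


def retention(t_days: float, tau_days: float) -> float:
    """Fraction of material retained after t_days, assuming R(t) = exp(-t / tau)."""
    return math.exp(-t_days / tau_days)


# Calibrate tau so that retention after one day is 40% (assumed data point).
tau = -1.0 / math.log(0.40)
print(f"tau = {tau:.2f} days")

for t in (0.0, 0.5, 1.0, 2.0, 7.0):
    print(f"after {t:4.1f} days: {100 * retention(t, tau):5.1f}% retained")
```

Under this assumption the retained fraction after two days is 0.4 squared, or 16%, which is why waiting a full week before the first retrieval is so costly.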

Psychologists also learned that one of the ways to overcome the loss of information is to (meaningfully) retrieve it: this is how learning  happens. Retrieval is critical for robust, durable, and long-term learning. It appears that every time we retrieve learned information, it becomes more accessible in the future. It is, however, important how we retrieve that stored information: simple re-reading of notes or looking through the examples will not be as effective as re-working the lecture material. It is also important how often we retrieve the stored info.

So, here is what I decided to change in the way I teach my class in light of the above-mentioned information (no pun intended).

Rapid-response (non-linear) teaching: details

To counter the single-day information loss, I changed the way homework is assigned: instead of assigning homework sets with 3-4-5 problems per week, I introduced two types of homework assignments: short homeworks and projects.

Short homework assignments are single-problem assignments given after each class that must be done by the next class. They are designed such that a student needs to re-derive material that was discussed previously in class (with a small new twist added). For example, if the block-down-to-incline problem was discussed in class, the short assignment asks the student to redo the problem with a different choice of coordinate axes. This way, instead of doing an assignment at the last minute at the end of the week, the students are forced to work out what they just learned in class every day (meaningful retrieval)!

The second type of assignment, project homework assignments, is designed to develop an understanding of how topics in a given chapter relate to each other. There are as many project assignments as there are chapters. Students get two weeks to complete them.

At the end, the students get to solve approximately the same number of problems over the course of the semester.

For a professor, the introduction of short homework assignments changes the way class material is presented. Depending on how students performed on the previous short homework, I adjusted the material (both speed and volume) that we discussed in class. I also designed examples for the future sections in such a way that I could repeat parts of the topic that posed some difficulties in comprehension. Overall, instead of a usual “linear” propagation of the course, we moved along something akin to helical motion, returning and spending more time on topics that students found more difficult (hence “rapid-response or non-linear” teaching).

Other things were easy to introduce: for instance, using the Socratic method in doing examples. The lecture itself was an open discussion between the prof and students.


[Figure: exam performance versus total points on short homework assignments]

So, I have implemented this method in teaching the Classical Mechanics I class in the Fall 2017 semester. It was not an easy exercise, mostly because it was the first time I was teaching this class and had no grader help. I would say the results confirmed my expectations: introduction of short homework assignments helps students to perform better on the exams. Now, my statistics are still limited: I only had 20 students in my class. Yet, among students there were several who decided to either largely ignore short homework assignments or did them irregularly. They were given zero points for each missed short assignment. All students generally did well on their project assignments, yet there appears to be some correlation (see graph above) between the total number of points acquired on short homework assignments and exam performance (measured by a total score on the Final and two midterms). This makes me think that short assignments were beneficial for students. I plan to teach this course again next year, which will increase my statistics.
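For readers who want to quantify this kind of relationship on their own class data, here is a minimal Python sketch of the Pearson correlation coefficient. The function name `pearson_r` is an illustrative choice, and the two lists below are made-up placeholder numbers, not the scores from this class.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant data")
    return cov / math.sqrt(var_x * var_y)


# Placeholder data only (NOT real student scores): short-homework points
# and total exam score for five hypothetical students.
homework_points = [0, 10, 20, 30, 40]
exam_scores = [50, 55, 70, 65, 90]

print(f"r = {pearson_r(homework_points, exam_scores):.2f}")
```

With only about 20 students, a single value of r should be read cautiously. It says nothing by itself about causation, since students who skip short assignments may differ from their classmates in other ways too.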

I was quite surprised that my students generally liked this way of teaching. In fact, they were disappointed that I decided not to apply this method for the Mechanics II class that I am teaching this semester. They also found that problems assigned in projects were considerably harder than the problems from the short assignments (this is how it was supposed to be).

For me, this was not an easy semester. I had to develop my set of lectures, so big thanks go to my colleagues Joern Putschke and Rob Harr who made their notes available. I spent a lot of time preparing this course, which, I think, affected my research output last semester. Yet, most difficulties are mainly Wayne State-specific: Wayne State does not provide TAs for small classes, so I had to not only design all homework assignments, but also grade them (on top of developing the lectures from the ground up). During the semester, it was important to grade short assignments on the same day I received them to re-tune lectures; this did take a lot of my time. I would say TAs would certainly help to run this course, so I’ll be applying for some internal WSU educational grants to continue development of this method. I plan to employ it again next year to teach Classical Mechanics.

by apetrov at January 25, 2018 08:18 PM

January 22, 2018

Axel Maas - Looking Inside the Standard Model

Finding - and curing - disagreements
The topic of grand-unified theories came up in the blog several times, most recently last year in January. To briefly recap, such theories, called GUTs for short, predict that all three forces between elementary particles emerge from a single master force. That would explain a lot of unconnected observations we have in particle physics. For example, why atoms are electrically neutral. The latter we can describe, but not yet explain.

However, if such a GUT exists, then it must not only explain the forces, but also somehow why we see the numbers and kinds of elementary particles we observe in nature. And now things become complicated. As discussed in the last entry on GUTs, there may be a serious issue in how we determine which particles are actually described by such a theory.

To understand how this issue comes about, I need to put together many different things my research partners and I have worked on during the last couple of years. All of these issues are actually put into an expert language in the review of which I talked in the previous entry. It is now finished, and if you're interested, you can get it free from here. But it is very technical.

So, let me explain it less technically.

Particle physics is actually superinvolved. If we would like to write down a theory which describes what we see, and only what we see, it would be terribly complicated. It is much more simple to introduce redundancies in the description, so-called gauge symmetries. This makes life much easier, though still not easy. However, the most prominent feature is that we add auxiliary particles to the game. Of course, they cannot be really seen, as they are just auxiliary. Some of them are very obviously unphysical, called therefore ghosts. They can be taken care of comparatively simply. For others, this is less simple.

Now, it turns out that the weak interaction is a very special beast. In this case, there is a unique one-to-one identification between a really observable particle and an auxiliary particle. Thus, it is almost correct to identify both. But this is due to the very special structure of this part of particle physics.

Thus, a natural question is whether, even if it is special, it is justified to do the same for other theories. Well, in some cases, this seems to be the case. But we suspected that this may not be the case in general. And especially not in GUTs.

Now, recently we were going about this much more systematically. You can again access the (very, very technical) result for free here. There, we looked at a very generic class of such GUTs. Well, we actually looked at the most relevant part of them, and still by far not all of them. We also ignored a lot of stuff, e.g. what would become quarks and leptons, and concentrated only on the generalization of the weak interaction and the Higgs.

We then checked, based on our earlier experiences and methods, whether a one-to-one identification of experimentally accessible and auxiliary particles works. And it does essentially never. Visually, this result looks like

On the left, it is seen that everything works nicely with a one-to-one identification in the standard model. On the right, if one-to-one identification would work in a GUT, everything would still be nice. But our more precise calculation shows that the actual situation, which would be seen in an experiment, is different. There is no one-to-one identification possible. And thus the prediction of the GUT differs from what we already see in experiments. Thus, a previously good GUT candidate is no longer good.

Though more checks are needed, as always, this is a baffling, and at the same time very discomforting, result.

Baffling, as we originally expected to have problems only under very special circumstances. It now appears that actually the standard model of particles is the very special case, and having problems is the standard.

It is discomforting because in the powerful method of perturbation theory the one-to-one identification is essentially always made. As this tool is widely used, this seems to question the validity of many predictions on GUTs. That could have far-reaching consequences. Is this the case? Do we need to forget everything about GUTs we learned so far?

Well, not really, for two reasons. One is that we also showed that methods almost as easily handleable as perturbation theory can be used to fix the problems. This is good, because more powerful methods, like the simulations we used before, are much more cumbersome. However, this leaves us with the problem of having made so far wrong predictions. Well, this we cannot change. But this is just normal scientific progress. You try, you check, you fail, you improve, and then you try again.

And, in fact, this does not mean that GUTs are wrong. Just that we need to consider somewhat different GUTs, and make the predictions more carefully next time. Which GUTs we need to look at we still need to figure out, and that will not be simple. But, fortunately, the improved methods mentioned beforehand can use much of what has been done so far, so most technical results are still unbelievably useful. This will help enormously in finding GUTs which are applicable, and yield a consistent picture, without the one-to-one identification. GUTs are not dead. They likely just need a bit of changing.

This is indeed a dramatic development. But one which fits logically and technically to the improved understanding of the theoretical structures underlying particle physics, which were developed over the last decades. Thus, we are confident that this is just the next logical step in our understanding of how particle physics works.

by Axel Maas at January 22, 2018 04:54 PM

January 21, 2018

Clifford V. Johnson - Asymptotia


(Process from the book.)

The book is appearing on bookshelves in the UK this week! Warehouses filling up for UK online shipping too.

-cvj

The post Thinking… appeared first on Asymptotia.

by Clifford at January 21, 2018 07:41 PM

January 17, 2018

Sean Carroll - Preposterous Universe

Beyond Falsifiability

I have a backlog of fun papers that I haven’t yet talked about on the blog, so I’m going to try to work through them in reverse chronological order. I just came out with a philosophically-oriented paper on the thorny issue of the scientific status of multiverse cosmological models:

Beyond Falsifiability: Normal Science in a Multiverse
Sean M. Carroll

Cosmological models that invoke a multiverse – a collection of unobservable regions of space where conditions are very different from the region around us – are controversial, on the grounds that unobservable phenomena shouldn’t play a crucial role in legitimate scientific theories. I argue that the way we evaluate multiverse models is precisely the same as the way we evaluate any other models, on the basis of abduction, Bayesian inference, and empirical success. There is no scientifically respectable way to do cosmology without taking into account different possibilities for what the universe might be like outside our horizon. Multiverse theories are utterly conventionally scientific, even if evaluating them can be difficult in practice.

This is well-trodden ground, of course. We’re talking about the cosmological multiverse, not its very different relative the Many-Worlds interpretation of quantum mechanics. It’s not the best name, as the idea is that there is only one “universe,” in the sense of a connected region of space, but of course in an expanding universe there will be a horizon past which it is impossible to see. If conditions in far-away unobservable regions are very different from conditions nearby, we call the collection of all such regions “the multiverse.”

There are legitimate scientific puzzles raised by the multiverse idea, but there are also fake problems. Among the fakes is the idea that “the multiverse isn’t science because it’s unobservable and therefore unfalsifiable.” I’ve written about this before, but shockingly not everyone immediately agreed with everything I have said.

Back in 2014 the Edge Annual Question was “What Scientific Theory Is Ready for Retirement?”, and I answered Falsifiability. The idea of falsifiability, pioneered by philosopher Karl Popper and adopted as a bumper-sticker slogan by some working scientists, is that a theory only counts as “science” if we can envision an experiment that could potentially return an answer that was utterly incompatible with the theory, thereby consigning it to the scientific dustbin. Popper’s idea was to rule out so-called theories that were so fuzzy and ill-defined that they were compatible with literally anything.

As I explained in my short write-up, it’s not so much that falsifiability is completely wrong-headed, it’s just not quite up to the difficult task of precisely demarcating the line between science and non-science. This is well-recognized by philosophers; in my paper I quote Alex Broadbent as saying

It is remarkable and interesting that Popper remains extremely popular among natural scientists, despite almost universal agreement among philosophers that – notwithstanding his ingenuity and philosophical prowess – his central claims are false.

If we care about accurately characterizing the practice and principles of science, we need to do a little better — which philosophers work hard to do, while some physicists can’t be bothered. (I’m not blaming Popper himself here, nor even trying to carefully figure out what precisely he had in mind — the point is that a certain cartoonish version of his views has been elevated to the status of a sacred principle, and that’s a mistake.)

After my short piece came out, George Ellis and Joe Silk wrote an editorial in Nature, arguing that theories like the multiverse served to undermine the integrity of physics, which needs to be defended from attack. They suggested that people like me think that “elegance [as opposed to data] should suffice,” that sufficiently elegant theories “need not be tested experimentally,” and that I wanted “to weaken the testability requirement for fundamental physics.” All of which is, of course, thoroughly false.

Nobody argues that elegance should suffice — indeed, I explicitly emphasized the importance of empirical testing in my very short piece. And I’m not suggesting that we “weaken” anything at all — I’m suggesting that we physicists treat the philosophy of science with the intellectual care that it deserves. The point is not that falsifiability used to be the right criterion for demarcating science from non-science, and now we want to change it; the point is that it never was, and we should be more honest about how science is practiced.

Another target of Ellis and Silk’s ire was Richard Dawid, a string theorist turned philosopher, who wrote a provocative book called String Theory and the Scientific Method. While I don’t necessarily agree with Dawid about everything, he does make some very sensible points. Unfortunately he coins the term “non-empirical theory confirmation,” which was an extremely bad marketing strategy. It sounds like Dawid is saying that we can confirm theories (in the sense of demonstrating that they are true) without using any empirical data, but he’s not saying that at all. Philosophers use “confirmation” in a much weaker sense than that of ordinary language, to refer to any considerations that could increase our credence in a theory. Of course there are some non-empirical ways that our credence in a theory could change; we could suddenly realize that it explains more than we expected, for example. But we can’t simply declare a theory to be “correct” on such grounds, nor was Dawid suggesting that we could.

In 2015 Dawid organized a conference on “Why Trust a Theory?” to discuss some of these issues, which I was unfortunately not able to attend. Now he is putting together a volume of essays, both from people who were at the conference and some additional contributors; it’s for that volume that this current essay was written. You can find other interesting contributions on the arxiv, for example from Joe Polchinski, Eva Silverstein, and Carlo Rovelli.

Hopefully with this longer format, the message I am trying to convey will be less amenable to misconstrual. Nobody is trying to change the rules of science; we are just trying to state them accurately. The multiverse is scientific in an utterly boring, conventional way: it makes definite statements about how things are, it has explanatory power for phenomena we do observe empirically, and our credence in it can go up or down on the basis of both observations and improvements in our theoretical understanding. Most importantly, it might be true, even if it might be difficult to ever decide with high confidence whether it is or not. Understanding how science progresses is an interesting and difficult question, and should not be reduced to brandishing bumper-sticker mottos to attack theoretical approaches to which we are not personally sympathetic.

by Sean Carroll at January 17, 2018 04:44 PM

January 13, 2018

Clifford V. Johnson - Asymptotia

Process for The Dialogues

Over on instagram (@asymptotia – and maybe here too, not sure) I’ll be posting some images of developmental drawings I did for The Dialogues, sometimes alongside the finished panels in the book. It is often very interesting to see how a finished thing came to be, and so that’s why …

The post Process for The Dialogues appeared first on Asymptotia.

by Clifford at January 13, 2018 12:55 AM

January 06, 2018

Jon Butterworth - Life and Physics

Atom Land: A Guided Tour Through the Strange (and Impossibly Small) World of Particle Physics

Book review in Publishers’ Weekly.

Butterworth (Most Wanted Particle), a CERN alum and professor of physics at University College London, explains everything particle physics from antimatter to Z bosons in this charming trek through a landscape of “the otherwise invisible.” His accessible narrative cleverly relates difficult concepts, such as wave-particle duality or electron spin, in bite-size bits. Readers become explorers on Butterworth’s metaphoric map…

by Jon Butterworth at January 06, 2018 05:13 PM

December 30, 2017

Cormac O’Raifeartaigh - Antimatter (Life in a puzzling universe)

A week’s research and a New Year resolution

If anyone had suggested a few years ago that I would forgo a snowsports holiday in the Alps for a week’s research, I would probably not have believed them. Yet here I am, sitting comfortably in the library of the Dublin Institute for Advanced Studies.

The School of Theoretical Physics at the Dublin Institute for Advanced Studies

It’s been a most satisfying week. One reason is that a change truly is as good as a rest – after a busy teaching term, it’s very enjoyable to spend some time in a quiet spot, surrounded by books on the history of physics. Another reason is that one can accomplish an astonishing amount in one week’s uninterrupted study. That said, I’m not sure I could do this all year round, I’d miss the teaching!

As regards a resolution for 2018, I’ve decided to focus on getting a book out this year. For some time, I have been putting together a small introductory book on the big bang theory, based on a public lecture I give to diverse audiences, from amateur astronomers to curious taxi drivers. The material is drawn from a course I teach at both Waterford Institute of Technology and University College Dublin and is almost in book form already. The UCD experience is particularly useful, as the module is aimed at first-year students from all disciplines.

Of course, there are already plenty of books out there on this topic. My students have a comprehensive reading list, which includes classics such as A Brief History of Time (Hawking), The First Three Minutes (Weinberg) and The Big Bang (Singh). However, I regularly get feedback to the effect that the books are too hard (Hawking) or too long (Singh) or out of date (Weinberg). So I decided a while ago to put together my own effort; a useful exercise if nothing else comes of it.

In particular, I intend to take a historical approach to the story. I’m a great believer in the ‘how-we-found-out’ approach to explaining scientific theories (think for example of that great BBC4 documentary on the discovery of oxygen). My experience is that a historical approach allows the reader to share the excitement of discovery and makes technical material much easier to understand. In addition, much of the work of the early pioneers remains relevant today. The challenge will be to present a story that is also concise – that’s the hard part!

by cormac at December 30, 2017 04:26 PM

December 28, 2017

Life as a Physicist

Christmas Project

Every Christmas I try to do some sort of project. Something new. Sometimes it turns into something real, and lasts for years. Sometimes it goes nowhere. Normally, I have an idea of what I’m going to attempt – usually it has been bugging me for months and I can’t wait till break to get it started. This year, I had none.

But, I arrived home at my parents’ house in New Jersey and there it was waiting for me. The house is old – more than 200 years old – and the steam furnace had just been replaced. For those of you unfamiliar with this method of heating a house: it is noisy! The furnace boils water, and the steam is forced up through the pipes to cast iron radiators. The radiators hiss through valves as the air is forced up – an iconic sound from my childhood. Eventually, after traveling sometimes four floors, the super hot steam reaches the end of a radiator and the valve shuts off. The valves are cool – heat sensitive! The radiator, full of hot steam, then warms the room – and rather effectively.

The bane of this system, however, is that it can leak. And you have no idea where the leak is in the whole house! The only way you know: the furnace reservoir needs refilling too often. So… the problem: how to detect that the reservoir needs refilling? Especially with this new modern furnace which can automatically refill its reservoir.

Me: Oh, look, there is a little LED that comes on when the automatic refilling system comes on! I can watch that! Dad: Oh, look, there is a little light that comes on when the water level is low. We can watch that.

Dad’s choice of tools: a wifi cam that is triggered by noise. Me: A Raspberry Pi 3, a photo-resistor, and a capacitor. Hahahaha. Game on!

What’s funny? Neither of us has detected a water refill since we started this project. In the first picture at the right you can see both of our devices – in the foreground taped to the gas input line is the CAM watching the water refill light through a mirror, and in the background (look for the yellow tape) is the Pi taped to the refill controller (and the capacitor and sensor hanging down looking at the LED on the bottom of the box).

I chose the Pi because I’ve used it once before – for a Spotify end-point. But never for anything that it is designed for. An Arduino is almost certainly better suited to this – but I wasn’t confident that I could get it up and running in the 3 days I had to make this (including time for ordering and shipping of all parts from Amazon). It was a lot of fun! And consumed a bunch of time. “Hey, where is Gordon? He needs to come for Christmas dinner!” “Wait, are you working on Christmas day?” – for once I could answer that last one with an honest no! Hahaha.

I learned a bunch:

  • I had to solder! It has been a loooong time since I’ve done that. My first graduate student, whom I made learn how to solder before I let him graduate, would have laughed at how rusty my skills were!
  • I was surprised to learn, at the start, that the Pi has no analog to digital converter. I stole a quick and dirty trick that lots of people have used to get around this problem: time how long it takes to charge a capacitor up with a photoresistor. This is probably the biggest source of noise in my system, but does for crude measurements.
  • I got to write all my code in Python. Even interrupt handling (ok, no callbacks, but still!)
  • The Pi, by default, runs a full build of Linux. Also, Python 3! I made full use of this – all my code is in Python, and a bit in bash to help it get going. I used things like cron and pip – they were either there, or trivial to install. Really, for this project, I was never conscious of the Pi being anything less than a full computer.
  • At first I tried to write auto detection code – that would see any changes in the light levels and write them to a file… which was then served on a nginx simple webserver (seriously – that was about 2 lines of code to install). But the noise in the system plus the fact that we’ve not had a fill so I don’t know what my signal looks like yet… So, that code will have to be revised.
  • In the end, I have to write a file with the raw data in it, and analyze that – at least, until I know what an actual signal looks like. So… how to get that data off the Pi – especially given that I can’t access it anymore now that I’ve left New Jersey? In the end I used some Python code to push the files to OneDrive. Other than figuring out how to deal with OAuth2, it was really easy (and I’m still not done fighting the authentication battle). What will happen if/when it fails? Well… I’ve recorded the commands my Dad will have to execute to get the new authentication files down there. Hopefully there isn’t going to be an expiration!
  • To analyze the raw data I’ve used a new tool I’ve recently learned at work: numpy and Jupyter notebooks. They allow me to produce a plot like this one. The dip near the left hand side of the plot is my Dad shining the flashlight at my sensors to see if I could actually see anything. The joker.
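The capacitor-timing trick from the list above can be illustrated without any hardware. The following is only a minimal sketch of the underlying RC charging model, not the code running on the Pi: the function names, the 3.3 V supply and the 1.3 V input threshold are all assumptions made for illustration. The idea is that a brighter light lowers the photoresistor's resistance, so the capacitor reaches the pin's threshold voltage sooner, and the measured time can be turned back into a resistance.

```python
import math


def charge_time(resistance_ohm, capacitance_f, v_supply=3.3, v_threshold=1.3):
    """Time (s) for a capacitor charged through a resistor to rise from 0 V
    to v_threshold. Both voltages are assumed values, not measured ones."""
    # RC charging curve: V(t) = v_supply * (1 - exp(-t / (R * C)))
    return -resistance_ohm * capacitance_f * math.log(1.0 - v_threshold / v_supply)


def resistance_from_time(t_seconds, capacitance_f, v_supply=3.3, v_threshold=1.3):
    """Invert charge_time: estimate the photoresistor's resistance (ohm)
    from a measured charge time."""
    return -t_seconds / (capacitance_f * math.log(1.0 - v_threshold / v_supply))


if __name__ == "__main__":
    # Hypothetical values: a 1 microfarad capacitor, photoresistor in dark vs. light.
    for label, r in (("dark", 100_000.0), ("lit", 5_000.0)):
        t = charge_time(r, 1e-6)
        print(f"{label}: R = {r:.0f} ohm -> charge time = {t * 1e3:.2f} ms")
```

Since the charge time is proportional to the resistance, even a crude timing loop gives a usable relative light level, which is all that is needed to spot an LED turning on.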

Pretty much the only thing I’d used before was Linux, and some very simple things with an older Raspberry Pi 2. If anyone is on the fence about this – I’d definitely recommend trying it out. It is very easy and there are 1000’s of web pages with step by step instructions for most things you’ll want to do!

    by gordonwatts at December 28, 2017 06:25 AM

    December 15, 2017

    Andrew Jaffe - Leaves on the Line

    WMAP Breaks Through

    It was announced this morning that the WMAP team has won the $3 million Breakthrough Prize. Unlike the Nobel Prize, which infamously is only awarded to three people each year, the Breakthrough Prize was awarded to the whole 27-member WMAP team, led by Chuck Bennett, Gary Hinshaw, Norm Jarosik, Lyman Page, and David Spergel, but including everyone through postdocs and grad students who worked on the project. This is great, and I am happy to send my hearty congratulations to all of them (many of whom I know well and am lucky to count as friends).

    I actually knew about the prize last week as I was interviewed by Nature for an article about it. Luckily I didn’t have to keep the secret for long. Although I admit to a little envy, it’s hard to argue that the prize wasn’t deserved. WMAP was ideally placed to solidify the current standard model of cosmology, a Universe dominated by dark matter and dark energy, with strong indications that there was a period of cosmological inflation at very early times, which had several important observational consequences. First, it made the geometry of the Universe — as described by Einstein’s theory of general relativity, which links the contents of the Universe with its shape — flat. Second, it generated the tiny initial seeds which eventually grew into the galaxies that we observe in the Universe today (and the stars and planets within them, of course).

    By the time WMAP released its first results in 2003, a series of earlier experiments (including MAXIMA and BOOMERanG, which I had the privilege of being part of) had gone much of the way toward this standard model. Indeed, about ten years ago one of my Imperial colleagues, Carlo Contaldi, and I wanted to make that comparison explicit, so we used what were then considered fancy Bayesian sampling techniques to combine the data from balloons and ground-based telescopes (which are collectively known as “sub-orbital” experiments) and compare the results to WMAP. We got a plot like the following (which we never published), showing the main quantity that these CMB experiments measure, called the power spectrum (which I’ve discussed in a little more detail here). The horizontal axis corresponds to the size of structures in the map (actually, its inverse, so smaller is to the right) and the vertical axis to how large the signal is on those scales.

    Grand unified spectrum

    As you can see, the suborbital experiments, en masse, had data at least as good as WMAP on most scales except the very largest (leftmost; this is because you really do need a satellite to see the entire sky) and indeed were able to probe smaller scales than WMAP (to the right). Since then, I’ve had the further privilege of being part of the Planck Satellite team, whose work has superseded all of these, giving much more precise measurements over all of these scales.

    Am I jealous? Ok, a little bit.

    But it’s also true, perhaps for entirely sociological reasons, that the community is more apt to trust results from a single, monolithic, very expensive satellite than an ensemble of results from a heterogeneous set of balloons and telescopes, run on (comparative!) shoestrings. On the other hand, the overall agreement amongst those experiments, and between them and WMAP, is remarkable.

    And that agreement remains remarkable, even if much of the effort of the cosmology community is devoted to understanding the small but significant differences that remain, especially between one monolithic and expensive satellite (WMAP) and another (Planck). Indeed, those “real and serious” (to quote myself) differences would be hard to see even if I plotted them on the same graph. But since both are ostensibly measuring exactly the same thing (the CMB sky), any differences — even those much smaller than the error bars — must be accounted for, and almost certainly boil down to differences in the analyses or misunderstanding of each team’s own data. Somewhat more interesting are differences between CMB results and measurements of cosmology from other, very different, methods, but that’s a story for another day.

    by Andrew at December 15, 2017 08:11 PM

    December 14, 2017

    Robert Helling - atdotde

    What are the odds?
    It's the time of year, you give out special problems in your classes. So this is mine for the blog. It is motivated by this picture of the home secretaries of the German federal states after their annual meeting as well as some recent discussions on Facebook:
    I would like to call it Summers' problem:

    Let's have two real random variables $M$ and $F$ that are drawn according to two probability distributions $\rho_{M/F}(x)$ (for starters you may assume both to be Gaussians, but possibly with different means and variances). Take $N$ draws from each and order the $2N$ results. What is the probability that the $k$ largest ones are all from $M$ rather than $F$? Express your results in terms of the $\rho_{M/F}(x)$. We are also interested in asymptotic results for $N$ large and $k$ fixed as well as $N$ and $k$ large but $k/N$ fixed.
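For readers who want to experiment before attacking the integral, here is a rough Monte Carlo sketch of the setup in the Gaussian case. It is not a solution to the problem, and the function name, defaults and trial count are illustrative assumptions. One sanity check is safe to state: if both distributions are identical, every set of $k$ draws is equally likely to be the top $k$, so the probability is $\binom{N}{k}/\binom{2N}{k}$.

```python
import random
from math import comb


def prob_top_k_all_m(n, k, mean_m=0.0, sd_m=1.0, mean_f=0.0, sd_f=1.0,
                     trials=20000, seed=0):
    """Monte Carlo estimate of the probability that the k largest of 2n draws
    (n Gaussian draws from M, n from F) all come from M."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Tag each draw with the distribution it came from.
        draws = [(rng.gauss(mean_m, sd_m), "M") for _ in range(n)]
        draws += [(rng.gauss(mean_f, sd_f), "F") for _ in range(n)]
        draws.sort(reverse=True)  # largest values first
        if all(label == "M" for _, label in draws[:k]):
            hits += 1
    return hits / trials


if __name__ == "__main__":
    n, k = 5, 2
    print("simulated, identical distributions:", prob_top_k_all_m(n, k))
    print("exact, identical distributions:    ", comb(n, k) / comb(2 * n, k))
    # Shifting the mean of M slightly changes the estimate.
    print("simulated, mean of M shifted by 0.5:", prob_top_k_all_m(n, k, mean_m=0.5))
```

Varying `mean_m`, `sd_m` and `k` gives a quick feel for how sensitive the all-$M$ outcome is to small differences between the two distributions.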

    Last bonus question: How many of the people that say that they hire only based on merit and end up with an all male board realise that by this they say that women are not as good by quite a margin?

    by Robert Helling at December 14, 2017 08:58 AM

    November 30, 2017

    Axel Maas - Looking Inside the Standard Model

    Reaching closure – completing a review
    I did not publish anything here within the last few months, as the review I am writing took up much more time than expected. A lot of interesting project developments also happened during this time. I will write on them as well later, so that nobody will miss out on the insights we gained and the fun we had with them.

    But now, I want to write about how the review comes along. It has now grown into a veritable document of almost 120 pages. And actually most of it is text and formulas, with only very few figures. This makes for a lot of content. Right now, it has reached the status of a release candidate 2. This means I have distributed it to many of my colleagues to comment on it. I also used the draft as lecture notes for a lecture on its contents at a winter school in Odense/Denmark (where I actually wrote this blog entry). Why? Because I wanted to have feedback. What can be understood, and what may I have misunderstood? After all, this review not only looks at my own research. Rather, it compiles knowledge from more than a hundred scientists over 45 years. In fact, some of the results I write about were obtained before I was born. In particular, I could have overlooked results. With by now dozens of new papers per day, this can easily happen. I have collected more than 330 relevant articles, which I refer to in the review.

    And, of course, I could have misunderstood other people’s results or made mistakes. This needs to be avoided in a review as far as possible.

    Indeed, I had many discussions by now on various aspects of the research I review. I got comments and was challenged. In the end, there was always either a conclusion or the insight that some points, believed to be clear, are not as entirely clear as it seemed. There are always more loopholes, more subtleties, than one anticipates. By this, the review became better, and could collect more insights from many brilliant scientists. And likewise I myself learned a lot.

    In the end, I learned two very important lessons about the physics I review.

    The first is that many more things are connected than I expected. Some issues, which looked to me like a parenthetical remark in the beginning, first became remarks in more than one place and ultimately became an issue of their own.

    The second is that the standard model of particle physics is even more special and more balanced than I thought. I was never really thinking that the standard model is so terribly special. Just one theory among many which happen to fit experiments. But really it is an extremely finely adjusted machinery. Every cog in it is important, and even slight changes will make everything fall apart. All the elements are in constant connection with each other, and influence each other.

    Does this mean anything? Good question. Perhaps it is a sign of an underlying ordering principle. But if it is, I cannot see it (yet?). Perhaps this is just an expression of how a law of nature must be – perfectly balanced. At any rate, it gave me a new perspective of what the standard model is.

    So, as I anticipated, writing this review gave me a whole new perspective and a lot of insights. Partly by formulating questions and answers more precisely. But, and probably more importantly, I had to explain it to others, and to either successfully defend or adapt it or even correct it.

    In addition, two of the most important lessons about understanding physics I learned were the following:

    One: Take your theory seriously. Do not take a shortcut or use some experience. Literally understand what it means and only then start to interpret.

    Two: Pose your questions (and answers) clearly. Every statement should have a well-defined meaning. Never be vague when you want to make a scientific statement. Always be able to answer the question “what do you mean by this?” with a precise definition. This seems obvious, but is something you tend to be cavalier about. Don’t.

    So, writing a review not only helps in summarizing knowledge. It also helps to understand this knowledge and realize its implications. And, probably fortunately, it poses new questions. What they are, and what we do about them, is something I will write about in the future.

    So, how does it proceed now? In two weeks I have to deliver the review to the journal which mandated it. At the same time (watch my Twitter account) it will become available on the preprint server, the standard repository of all elementary particle physics knowledge. Then you can see for yourself what I wrote, and wrote about.

    by Axel Maas at November 30, 2017 05:15 PM

    November 24, 2017

    Sean Carroll - Preposterous Universe


    This year we give thanks for a simple but profound principle of statistical mechanics that extends the famous Second Law of Thermodynamics: the Jarzynski Equality. (We’ve previously given thanks for the Standard Model Lagrangian, Hubble’s Law, the Spin-Statistics Theorem, conservation of momentum, effective field theory, the error bar, gauge symmetry, Landauer’s Principle, the Fourier Transform, Riemannian Geometry, and the speed of light.)

    The Second Law says that entropy increases in closed systems. But really it says that entropy usually increases; thermodynamics is the limit of statistical mechanics, and in the real world there can be rare but inevitable fluctuations around the typical behavior. The Jarzynski Equality is a way of quantifying such fluctuations, which is increasingly important in the modern world of nanoscale science and biophysics.

    Our story begins, as so many thermodynamic tales tend to do, with manipulating a piston containing a certain amount of gas. The gas is of course made of a number of jiggling particles (atoms and molecules). All of those jiggling particles contain energy, and we call the total amount of that energy the internal energy U of the gas. Let’s imagine the whole thing is embedded in an environment (a “heat bath”) at temperature T. That means that the gas inside the piston starts at temperature T, and after we manipulate it a bit and let it settle down, it will relax back to T by exchanging heat with the environment as necessary.

    Finally, let’s divide the internal energy into “useful energy” and “useless energy.” The useful energy, known to the cognoscenti as the (Helmholtz) free energy and denoted by F, is the amount of energy potentially available to do useful work. For example, the pressure in our piston may be quite high, and we could release it to push a lever or something. But there is also useless energy, which is just the entropy S of the system times the temperature T. That expresses the fact that once energy is in a highly-entropic form, there’s nothing useful we can do with it any more. So the total internal energy is the free energy plus the useless energy,

    U = F + TS. \qquad \qquad (1)

    Our piston starts in a boring equilibrium configuration a, but we’re not going to let it just sit there. Instead, we’re going to push in the piston, decreasing the volume inside, ending up in configuration b. This squeezes the gas together, and we expect that the total amount of energy will go up. It will typically cost us energy to do this, of course, and we refer to that energy as the work W_{ab} we do when we push the piston from a to b.

    Remember that when we’re done pushing, the system might have heated up a bit, but we let it exchange heat Q with the environment to return to the temperature T. So three things happen when we do our work on the piston: (1) the free energy of the system changes; (2) the entropy changes, and therefore the useless energy; and (3) heat is exchanged with the environment. In total we have

    W_{ab} = \Delta F_{ab} + T\Delta S_{ab} - Q_{ab}.\qquad \qquad (2)

    (There is no ΔT, because T is the temperature of the environment, which stays fixed.) The Second Law of Thermodynamics says that entropy increases (or stays constant) in closed systems. Our system isn’t closed, since it might leak heat to the environment. But really the Second Law says that the last two terms on the right-hand side of this equation add up to a non-negative number; in other words, the increase in entropy will at least compensate for the loss of heat. (Alternatively, you can lower the entropy of a bottle of champagne by putting it in a refrigerator and letting it cool down; no laws of physics are violated.) One way of stating the Second Law for situations such as this is therefore

    W_{ab} \geq \Delta F_{ab}. \qquad \qquad (3)

    The work we do on the system is greater than or equal to the change in free energy from beginning to end. We can make this inequality into an equality if we act as efficiently as possible, minimizing the entropy/heat production: that’s an adiabatic process, and in practical terms amounts to moving the piston as gradually as possible, rather than giving it a sudden jolt. That’s the limit in which the process is reversible: we can get the same energy out as we put in, just by going backwards.
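
    Spelled out, the step from (2) to (3) is a one-line rearrangement. The block below only restates the equations above in the same notation:

```latex
W_{ab} - \Delta F_{ab} = T\Delta S_{ab} - Q_{ab} \geq 0
\quad\Longrightarrow\quad
W_{ab} \geq \Delta F_{ab},
```

    with equality exactly when T\Delta S_{ab} = Q_{ab}, which is the reversible limit just described.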

    Awesome. But the language we’re speaking here is that of classical thermodynamics, which we all know is the limit of statistical mechanics when we have many particles. Let’s be a little more modern and open-minded, and take seriously the fact that our gas is actually a collection of particles in random motion. Because of that randomness, there will be fluctuations over and above the “typical” behavior we’ve been describing. Maybe, just by chance, all of the gas molecules happen to be moving away from our piston just as we move it, so we don’t have to do any work at all; alternatively, maybe there are more than the usual number of molecules hitting the piston, so we have to do more work than usual. The Jarzynski Equality, derived 20 years ago by Christopher Jarzynski, is a way of saying something about those fluctuations.

    One simple way of taking our thermodynamic version of the Second Law (3) and making it still hold true in a world of fluctuations is simply to say that it holds true on average. To denote an average over all possible things that could be happening in our system, we write angle brackets \langle \cdots \rangle around the quantity in question. So a more precise statement would be that the average work we do is greater than or equal to the change in free energy:

    \displaystyle \left\langle W_{ab}\right\rangle \geq \Delta F_{ab}. \qquad \qquad (4)

    (We don’t need angle brackets around ΔF, because F is determined completely by the equilibrium properties of the initial and final states a and b; it doesn’t fluctuate.) Let me multiply both sides by -1, which means we need to flip the inequality sign to go the other way around:

    \displaystyle -\left\langle W_{ab}\right\rangle \leq -\Delta F_{ab}. \qquad \qquad (5)

    Next I will exponentiate both sides of the inequality. Note that this keeps the inequality sign going the same way, because the exponential is a monotonically increasing function; if x is less than y, we know that e^x is less than e^y.

    \displaystyle e^{-\left\langle W_{ab}\right\rangle} \leq e^{-\Delta F_{ab}}. \qquad\qquad (6)

    (More typically we will see the exponents divided by kT, where k is Boltzmann’s constant, but for simplicity I’m using units where kT = 1.)

    Jarzynski’s equality is the following remarkable statement: in equation (6), if we exchange the exponential of the average work e^{-\langle W\rangle} for the average of the exponential of the work \langle e^{-W}\rangle, we get a precise equality, not merely an inequality:

    \displaystyle \left\langle e^{-W_{ab}}\right\rangle = e^{-\Delta F_{ab}}. \qquad\qquad (7)

    That’s the Jarzynski Equality: the average, over many trials, of the exponential of minus the work done, is equal to the exponential of minus the free-energy difference between the initial and final states. It’s a stronger statement than the Second Law, just because it’s an equality rather than an inequality.

    In fact, we can derive the Second Law from the Jarzynski equality, using a math trick known as Jensen’s inequality. For our purposes, this says that the exponential of an average is less than or equal to the average of an exponential, e^{\langle x\rangle} \leq \langle e^x \rangle. Thus we immediately get

    \displaystyle e^{-\left\langle W_{ab}\right\rangle} \leq \left\langle e^{-W_{ab}}\right\rangle = e^{-\Delta F_{ab}}, \qquad\qquad (8)

    as we had before. Then just take the log of both sides to get \langle W_{ab}\rangle \geq \Delta F_{ab}, which is one way of writing the Second Law.

    So what does it mean? As we said, because of fluctuations, the work we needed to do on the piston will sometimes be a bit less than or a bit greater than the average, and the Second Law says that the average will be greater than the difference in free energies from beginning to end. Jarzynski’s Equality says there is a quantity, the exponential of minus the work, that averages out to be exactly the exponential of minus the free-energy difference. The function e^{-W} is convex and decreasing as a function of W. A fluctuation where W is lower than average, therefore, contributes a greater shift to the average of e^{-W} than a corresponding fluctuation where W is higher than average. To satisfy the Jarzynski Equality, we must have more fluctuations upward in W than downward in W, by a precise amount. So on average, we’ll need to do more work than the difference in free energies, as the Second Law implies.
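
    A quick numerical experiment can make this concrete. The sketch below is not from the original post: it assumes, purely for illustration, that the work values are drawn from a Gaussian distribution (in units where kT = 1), with a hypothetical mean and spread. For a Gaussian the average of e^{-W} is known in closed form, so the free-energy difference recovered from the Jarzynski average can be compared against mean minus half the variance.

```python
import math
import random


def jarzynski_free_energy(work_samples):
    """Estimate Delta F from work samples via <exp(-W)> = exp(-Delta F), with kT = 1."""
    mean_exp = sum(math.exp(-w) for w in work_samples) / len(work_samples)
    return -math.log(mean_exp)


# Assumed toy model (not from the post): work values drawn from a Gaussian.
MEAN_WORK = 2.0   # hypothetical average work <W>, in units of kT
SIGMA_WORK = 1.0  # hypothetical spread of the work fluctuations

rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [rng.gauss(MEAN_WORK, SIGMA_WORK) for _ in range(200_000)]

delta_f_estimate = jarzynski_free_energy(samples)
# For a Gaussian, <exp(-W)> = exp(-mean + sigma^2 / 2) exactly.
delta_f_exact = MEAN_WORK - SIGMA_WORK**2 / 2
average_work = sum(samples) / len(samples)

print(f"Delta F from Jarzynski average: {delta_f_estimate:.3f}")
print(f"Delta F expected for Gaussian:  {delta_f_exact:.3f}")
print(f"Average work <W>:               {average_work:.3f}")
```

    In this toy model the average work sits above the free-energy difference by half the variance of the fluctuations, which is the Second Law inequality (4) in miniature. Real work distributions need not be Gaussian, and the exponential average is then dominated by rare low-work trials, so far more samples may be needed.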

    It’s a remarkable thing, really. Much of conventional thermodynamics deals with inequalities, with equality being achieved only in adiabatic processes happening close to equilibrium. The Jarzynski Equality is fully non-equilibrium, achieving equality no matter how dramatically we push around our piston. It tells us not only about the average behavior of statistical systems, but about the full ensemble of possibilities for individual trajectories around that average.

    The Jarzynski Equality has launched a mini-revolution in nonequilibrium statistical mechanics, the news of which hasn’t quite trickled to the outside world as yet. It’s one of a number of relations, collectively known as “fluctuation theorems,” which also include the Crooks Fluctuation Theorem, not to mention our own Bayesian Second Law of Thermodynamics. As our technological and experimental capabilities reach down to scales where the fluctuations become important, our theoretical toolbox has to keep pace. And that’s happening: the Jarzynski equality isn’t just imagination, it’s been experimentally tested and verified. (Of course, I remain just a poor theorist myself, so if you want to understand this image from the experimental paper, you’ll have to talk to someone who knows more about Raman spectroscopy than I do.)

    by Sean Carroll at November 24, 2017 02:04 AM

    November 09, 2017

    Robert Helling - atdotde

    Why is there a supercontinent cycle?
    One of the most influential books of my early childhood was my "Kinderatlas".
    There were many things to learn about the world (maps were actually only the last third of the book) and for example I blame my fascination for scuba diving on this book. Also last year, when we visited the Mont-Doré in Auvergne and I had to explain how volcanos are formed to my kids to make them forget how many stairs were still ahead of them to the summit, I did that while mentally picturing the pages in that book about plate tectonics.

    But there is one thing about tectonics that has been bothering me for a long time and that I still haven't found a good explanation for (or at least an acknowledgement that there is something to explain): Since the days of Alfred Wegener we know that the jigsaw puzzle pieces of the continents fit in a way that geologists believe that some hundred million years ago they were all connected as a supercontinent Pangea.
    [Image: Pangea animation. Original upload by en:User:Tbower, USGS animation A08, Public Domain.]

    In fact, that was only the last in a series of supercontinents, that keep forming and breaking up in the "supercontinent cycle".
    [Image credit: SimplisticReps, own work, CC BY-SA 4.0.]

    So here is the question: I am happy with the idea of several (say $N$) plates, roughly containing a continent each, that are floating around on the magma, driven by all kinds of convection processes in the liquid part of the earth. They are moving around in a pattern that looks to me to be pretty chaotic (in the non-technical sense) and of course for random motion you would expect that from time to time two of those collide and then maybe stick for a while.

    Then it would be possible that also a third plate collides with the two but that would be a coincidence (like two random lines typically intersect but if you have three lines they would typically intersect in pairs but typically not in a triple intersection). But to form a supercontinent, you need all $N$ plates to miraculously collide at the same time. This order-$N$ process seems to be highly unlikely if the motion is random, let alone the fact that it seems to repeat. So this motion cannot be random (yes, Sabine, this is a naturalness argument). This needs an explanation.
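
    To get a feeling for how quickly such a coincidence becomes unlikely, here is a deliberately crude toy model, not a geophysical one. It assumes (my assumption, for illustration only) that each of the $N$ plates is a point placed independently and uniformly on a circle, and asks how often all of them land within one half of the circle. For this toy problem the exact answer is known to be $N/2^{N-1}$, and the Monte Carlo estimate can be compared against it.

```python
import math
import random


def all_within_semicircle(angles):
    """True if every angle (radians on a circle) fits inside some half-circle."""
    ordered = sorted(angles)
    # Gaps between neighbouring points, plus the wrap-around gap.
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    gaps.append(2 * math.pi - (ordered[-1] - ordered[0]))
    # The points fit in a half-circle exactly when some empty gap spans at least pi.
    return max(gaps) >= math.pi


def clustering_probability(n_plates, trials=20_000, seed=0):
    """Monte Carlo estimate of the chance that n random points share a half-circle."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        angles = [rng.uniform(0.0, 2 * math.pi) for _ in range(n_plates)]
        if all_within_semicircle(angles):
            hits += 1
    return hits / trials


for n in (2, 4, 7, 10):
    exact = n / 2 ** (n - 1)  # known result for independent uniform points on a circle
    print(n, clustering_probability(n), exact)
```

    Even this generous criterion (merely sharing a half-circle, not actually touching) falls off roughly like $2^{-N}$, which is the flavour of the argument above. Real plates are extended, interact, and move continuously, so the numbers themselves should not be taken seriously.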

    So, why, every few hundred million years, do all the land masses of the earth assemble on one side of the earth?

    One explanation could, for example, be that during those times the center of mass of the earth is not in the symmetry center, so the water of the oceans flows to one side of the earth and reveals the seabed on the opposite side. Then you would have essentially one big island. But this seems not to be the case as the continents (those parts that are above sea-level) appear to be stable on much longer time scales. It is not that the seabed comes up on one side and the land on the other goes under water but the land masses actually move around to meet on one side.

    I have already asked this question whenever I ran into people with a geosciences education, but it is still open (and I have to admit that in a non-zero number of cases I failed to even make clear why an $N$-body collision needs an explanation). But I am sure you, my readers, know the answer or, even better, can come up with one.

    by Robert Helling at November 09, 2017 09:35 AM

    October 24, 2017

    Andrew Jaffe - Leaves on the Line

    The Chandrasekhar Mass and the Hubble Constant


    The first direct detection of gravitational waves was announced in February of 2016 by the LIGO team, after decades of planning, building and refining their beautiful experiment. Since that time, the US-based LIGO has been joined by the European Virgo gravitational wave telescope (and more are planned around the globe).

    The first four events that the teams announced were from the spiralling in and eventual mergers of pairs of black holes, with masses ranging from about seven to about forty times the mass of the sun. These masses are perhaps a bit higher than we expect to be typical, which might raise intriguing questions about how such black holes were formed and evolved, although even comparing the results to the predictions is a hard problem depending on the details of the statistical properties of the detectors and the astrophysical models for the evolution of black holes and the stars from which (we think) they formed.

    Last week, the teams announced the detection of a very different kind of event, the collision of two neutron stars, each about 1.4 times the mass of the sun. Neutron stars are one possible end state of the evolution of a star, when its atoms are no longer able to withstand the pressure of the gravity trying to force them together. This was first understood by S Chandrasekhar in the early years of the 20th Century, who realised that there was a limit to the mass of a star held up simply by the quantum-mechanical repulsion of the electrons at the outskirts of the atoms making up the star. When you surpass this mass, known, appropriately enough, as the Chandrasekhar mass, the star will collapse in upon itself, combining the electrons and protons into neutrons and likely releasing a vast amount of energy in the form of a supernova explosion. After the explosion, the remnant is likely to be a dense ball of neutrons, whose properties are actually determined fairly precisely by similar physics to that of the Chandrasekhar limit (discussed for this case by Oppenheimer, Volkoff and Tolman), giving us the magic 1.4 solar mass number.

    (Last week also coincidentally would have seen Chandrasekhar’s 107th birthday, and Google chose to illustrate their home page with an animation in his honour for the occasion. I was a graduate student at the University of Chicago, where Chandra, as he was known, spent most of his career. Most of us students were far too intimidated to interact with him, although it was always seen as an auspicious occasion when you spotted him around the halls of the Astronomy and Astrophysics Center.)

    This process can therefore make a single 1.4 solar-mass neutron star, and we can imagine that in some rare cases we can end up with two neutron stars orbiting one another. Indeed, the fact that LIGO saw one, but only one, such event during its year-and-a-half run allows the teams to constrain how often that happens, albeit with very large error bars, between 320 and 4740 events per cubic gigaparsec per year; a cubic gigaparsec is about 3 billion light-years on each side, so these are rare events indeed. These results and many other scientific inferences from this single amazing observation are reported in the teams’ overview paper.

    A series of other papers discuss those results in more detail, covering everything from the physics of neutron stars to limits on departures from Einstein’s theory of gravity (for more on some of these other topics, see this blog, or this story from the NY Times). For me as a cosmologist, the most exciting of the results was the use of the event as a “standard siren”, an object whose gravitational wave properties are well-enough understood that we can deduce the distance to the object from the LIGO results alone. Although the idea came from Bernard Schutz in 1986, the term “standard siren” was coined somewhat later (by Sean Carroll) in analogy to the (heretofore?) more common cosmological standard candles and standard rulers: objects whose intrinsic brightness or size is known and so whose distances can be measured by observations of their apparent brightness or size, just as you can roughly deduce how far away a light bulb is by how bright it appears, or how far away a familiar object or person is by how big it looks.

    Gravitational wave events are standard sirens because our understanding of relativity is good enough that an observation of the shape of the gravitational wave pattern as a function of time can tell us the properties of its source. Knowing that, we also then know the amplitude of that pattern when it was released. Over the time since then, as the gravitational waves have travelled across the Universe toward us, the amplitude has gone down (further objects sound quieter); the expansion of the Universe also causes the frequency of the waves to decrease; this is the cosmological redshift that we observe in the spectra of distant objects’ light.

    Unlike LIGO’s previous detections of binary-black-hole mergers, this new observation of a binary-neutron-star merger was also seen in photons: first as a gamma-ray burst, and then as a “nova”: a new dot of light in the sky. Indeed, the observation of the afterglow of the merger by teams of literally thousands of astronomers in gamma and x-rays, optical and infrared light, and in the radio, is one of the more amazing pieces of academic teamwork I have seen.

    And these observations allowed the teams to identify the host galaxy of the original neutron stars, and to measure the redshift of its light (the lengthening of the light’s wavelength due to the movement of the galaxy away from us). It is most likely a previously unexceptional galaxy called NGC 4993, with a redshift z=0.009, putting it about 40 megaparsecs away, relatively close on cosmological scales.

    But this means that we can measure all of the factors in one of the most celebrated equations in cosmology, Hubble’s law: cz=H₀d, where c is the speed of light, z is the redshift just mentioned, and d is the distance measured from the gravitational wave burst itself. This just leaves H₀, the famous Hubble Constant, giving the current rate of expansion of the Universe, usually measured in kilometres per second per megaparsec. The old-fashioned way to measure this quantity is via the so-called cosmic distance ladder, bootstrapping up from nearby objects of known distance to more distant ones whose properties can only be calibrated by comparison with those more nearby. But errors accumulate in this process and we can be susceptible to the weakest rung on the ladder (see recent work by some of my colleagues trying to formalise this process). Alternately, we can use data from cosmic microwave background (CMB) experiments like the Planck Satellite (see here for lots of discussion on this blog); the typical size of the CMB pattern on the sky is something very like a standard ruler. Unfortunately, it, too, needs to be calibrated, implicitly by other aspects of the CMB pattern itself, and so ends up being a somewhat indirect measurement. Currently, the best cosmic-distance-ladder measurement gives something like 73.24 ± 1.74 km/sec/Mpc whereas Planck gives 67.81 ± 0.92 km/sec/Mpc; these numbers disagree by “a few sigma”, enough that it is hard to explain as simply a statistical fluctuation.
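
    As a back-of-the-envelope check on the numbers quoted so far, the following sketch plugs the rounded values from this post (z = 0.009, d of about 40 Mpc) into Hubble’s law, and expresses the distance-ladder versus Planck disagreement in units of the combined error. The function names are mine, and treating the two errors as independent and Gaussian is a simplifying assumption; the teams’ actual analysis is considerably more careful.

```python
import math

SPEED_OF_LIGHT_KM_S = 299_792.458  # km/s


def hubble_constant(redshift, distance_mpc):
    """H0 in km/s/Mpc from Hubble's law c z = H0 d (valid for small z)."""
    return SPEED_OF_LIGHT_KM_S * redshift / distance_mpc


def tension_in_sigma(value_a, error_a, value_b, error_b):
    """Difference between two measurements in units of their combined error.

    Assumes independent Gaussian errors, added in quadrature.
    """
    return abs(value_a - value_b) / math.hypot(error_a, error_b)


# Rounded inputs quoted in the text, not the values used in the paper.
h0_rough = hubble_constant(redshift=0.009, distance_mpc=40.0)
tension = tension_in_sigma(73.24, 1.74, 67.81, 0.92)

print(f"H0 from z = 0.009 and d = 40 Mpc: {h0_rough:.1f} km/s/Mpc")
print(f"Distance ladder vs Planck: {tension:.1f} sigma")
```

    The rough estimate lands in the high 60s, and the two established measurements differ by a little under three combined standard deviations, which is what “a few sigma” means here.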

    Unfortunately, the new LIGO results do not solve the problem. Because we cannot observe the inclination of the neutron-star binary (i.e., the orientation of its orbit), this blows up the error on the distance to the object, due to the Bayesian marginalisation over this unknown parameter (just as the Planck measurement requires marginalization over all of the other cosmological parameters to fully calibrate the results). Because the host galaxy is relatively nearby, the teams must also account for the fact that the redshift includes the effect not only of the cosmological expansion but also the movement of galaxies with respect to one another due to the pull of gravity on relatively large scales; this so-called peculiar velocity has to be modelled which adds further to the errors.

    This procedure gives a final measurement of 70.0^{+12.0}_{-8.0} km/sec/Mpc, with the full shape of the probability curve shown in the Figure, taken directly from the paper. Both the Planck and distance-ladder results are consistent with these rather large error bars. But this is calculated from a single object; as more of these events are seen these error bars will go down, typically by something like the square root of the number of events, so it might not be too long before this is the best way to measure the Hubble Constant.

    [Figure: probability curve for H₀ from the gravitational-wave measurement, taken from the paper.]
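
    To put a rough number on “not too long”, here is a small sketch of the square-root scaling. It assumes, as simplifications of my own, that every future event is as informative as this one, that the asymmetric error bar can be replaced by its average of about 10 km/sec/Mpc, and that errors shrink exactly as one over the square root of the number of events.

```python
import math


def events_needed(single_event_error, target_error):
    """Number of events required if the error shrinks as 1/sqrt(N)."""
    return math.ceil((single_event_error / target_error) ** 2)


# Assumption: symmetrise the quoted +12 / -8 error bar to roughly 10 km/s/Mpc.
SINGLE_EVENT_ERROR = (12.0 + 8.0) / 2

to_match_ladder = events_needed(SINGLE_EVENT_ERROR, 1.74)  # distance-ladder error
to_match_planck = events_needed(SINGLE_EVENT_ERROR, 0.92)  # Planck error

print(f"Events to match the distance ladder: {to_match_ladder}")
print(f"Events to match Planck:              {to_match_planck}")
```

    Under these assumptions a few dozen comparable events would rival the distance ladder, and somewhat over a hundred would rival Planck. Events without an identified host galaxy, or at larger distances, would contribute less.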

    [Apologies: too long, too technical, and written late at night while trying to get my wonderful not-quite-three-week-old daughter to sleep through the night.]

    by Andrew at October 24, 2017 10:44 AM

    October 17, 2017

    Matt Strassler - Of Particular Significance

    The Significance of Yesterday’s Gravitational Wave Announcement: an FAQ

    Yesterday’s post on the results from the LIGO/VIRGO network of gravitational wave detectors was aimed at getting information out, rather than providing the pedagogical backdrop.  Today I’m following up with a post that attempts to answer some of the questions that my readers and my personal friends asked me.  Some wanted to understand better how to visualize what had happened, while others wanted more clarity on why the discovery was so important.  So I’ve put together a post which (1) explains what neutron stars and black holes are and what their mergers are like, (2) clarifies why yesterday’s announcement was important (and there were many reasons, which is why it’s hard to reduce it all to a single soundbite), and (3) answers some miscellaneous questions at the end.

    First, a disclaimer: I am *not* an expert in the very complex subject of neutron star mergers and the resulting explosions, called kilonovas.  These are much more complicated than black hole mergers.  I am still learning some of the details.  Hopefully I’ve avoided errors, but you’ll notice a few places where I don’t know the answers … yet.  Perhaps my more expert colleagues will help me fill in the gaps over time.

    Please, if you spot any errors, don’t hesitate to comment!!  And feel free to ask additional questions whose answers I can add to the list.


    What are neutron stars and black holes, and how are they related?

    Every atom is made from a tiny atomic nucleus, made of neutrons and protons (which are very similar), and loosely surrounded by electrons. Most of an atom is empty space, so it can, under extreme circumstances, be crushed — but only if every electron and proton convert to a neutron (which remains behind) and a neutrino (which heads off into outer space.) When a giant star runs out of fuel, the pressure from its furnace turns off, and it collapses inward under its own weight, creating just those extraordinary conditions in which the matter can be crushed. Thus: a star’s interior, with a mass one to several times the Sun’s mass, is all turned into a several-mile(kilometer)-wide ball of neutrons — the number of neutrons approaching a 1 with 57 zeroes after it.
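
    The “1 with 57 zeroes” figure is easy to check by dividing the mass of the Sun by the mass of a single neutron. The sketch below does just that, with standard rounded values for the two masses; the function name is mine, and the nuclear binding energy is ignored.

```python
SOLAR_MASS_KG = 1.989e30     # mass of the Sun
NEUTRON_MASS_KG = 1.675e-27  # mass of one neutron


def neutron_count(mass_in_solar_masses):
    """Rough number of neutrons in a ball of the given mass (binding energy ignored)."""
    return mass_in_solar_masses * SOLAR_MASS_KG / NEUTRON_MASS_KG


for solar_masses in (1.0, 1.4, 2.0):
    print(f"{solar_masses} solar masses -> {neutron_count(solar_masses):.2e} neutrons")
```

    One solar mass already corresponds to a bit more than 10^57 neutrons, so a ball of one to several solar masses is indeed in that range.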

    If the star is big but not too big, the neutron ball stiffens and holds its shape, and the star explodes outward, blowing itself to pieces in what is called a core-collapse supernova. The ball of neutrons remains behind; this is what we call a neutron star. It’s a ball of the densest material that we know can exist in the universe: a pure atomic nucleus many miles (kilometers) across. It has a very hard surface; if you tried to go inside a neutron star, your experience would be a lot worse than running into a closed door at a hundred miles per hour.

    If the star is very big indeed, the neutron ball that forms may immediately (or soon) collapse under its own weight, forming a black hole. A supernova may or may not result in this case; the star might just disappear. A black hole is very, very different from a neutron star. Black holes are what’s left when matter collapses irretrievably upon itself under the pull of gravity, shrinking down endlessly. While a neutron star has a surface that you could smash your head on, a black hole has no surface — it has an edge that is simply a point of no return, called a horizon. In Einstein’s theory, you can just go right through, as if passing through an open door. You won’t even notice the moment you go in. [Note: this is true in Einstein’s theory. But there is a big controversy as to whether the combination of Einstein’s theory with quantum physics changes the horizon into something novel and dangerous to those who enter; this is known as the firewall controversy, and would take us too far afield into speculation.]  But once you pass through that door, you can never return.

    Black holes can form in other ways too, but not those that we’re observing with the LIGO/VIRGO detectors.

    Why are their mergers the best sources for gravitational waves?

    One of the easiest and most obvious ways to make gravitational waves is to have two objects orbiting each other.  If you put your two fists in a pool of water and move them around each other, you’ll get a pattern of water waves spiraling outward; this is in rough (very rough!) analogy to what happens with two orbiting objects, although, since the objects are moving in space, the waves aren’t in a material like water.  They are waves in space itself.

    To get powerful gravitational waves, you want objects each with a very big mass that are orbiting around each other at very high speed. To get the fast motion, you need the force of gravity between the two objects to be strong; and to get gravity to be as strong as possible, you need the two objects to be as close as possible (since, as Isaac Newton already knew, gravity between two objects grows stronger when the distance between them shrinks.) But if the objects are large, they can’t get too close; they will bump into each other and merge long before their orbit can become fast enough. So to get a really fast orbit, you need two relatively small objects, each with a relatively big mass — what scientists refer to as compact objects. Neutron stars and black holes are the most compact objects we know about. Fortunately, they do indeed often travel in orbiting pairs, and do sometimes, for a very brief period before they merge, orbit rapidly enough to produce gravitational waves that LIGO and VIRGO can observe.

    Why do we find these objects in pairs in the first place?

    Stars very often travel in pairs… they are called binary stars. They can start their lives in pairs, forming together in large gas clouds, or even if they begin solitary, they can end up pairing up if they live in large densely packed communities of stars where it is common for multiple stars to pass nearby. Perhaps surprisingly, their pairing can survive the collapse and explosion of either star, leaving two black holes, two neutron stars, or one of each in orbit around one another.

    What happens when these objects merge?

    Not surprisingly, there are three classes of mergers which can be detected: two black holes merging, two neutron stars merging, and a neutron star merging with a black hole. The first class was observed in 2015 (and announced in 2016), the second was announced yesterday, and it’s a matter of time before the third class is observed. The two objects may orbit each other for billions of years, very slowly radiating gravitational waves (an effect observed in the 70’s, leading to a Nobel Prize) and gradually coming closer and closer together. Only in the last day of their lives do their orbits really start to speed up. And just before these objects merge, they begin to orbit each other once per second, then ten times per second, then a hundred times per second. Visualize that if you can: objects a few dozen miles (kilometers) across, a few miles (kilometers) apart, each with the mass of the Sun or greater, orbiting each other 100 times each second. It’s truly mind-boggling — a spinning dumbbell beyond the imagination of even the greatest minds of the 19th century. I don’t know any scientist who isn’t awed by this vision. It all sounds like science fiction. But it’s not.

    How do we know this isn’t science fiction?

    We know, if we believe Einstein’s theory of gravity (and I’ll give you a very good reason to believe in it in just a moment.) Einstein’s theory predicts that such a rapidly spinning, large-mass dumbbell formed by two orbiting compact objects will produce a telltale pattern of ripples in space itself: gravitational waves. That pattern is both complicated and precisely predicted. In the case of black holes, the predictions go right up to and past the moment of merger, to the ringing of the larger black hole that forms in the merger. In the case of neutron stars, the instants just before, during and after the merger are more complex and we can’t yet be confident we understand them, but during tens of seconds before the merger Einstein’s theory is very precise about what to expect. The theory further predicts how those ripples will cross the vast distances from where they were created to the location of the Earth, and how they will appear in the LIGO/VIRGO network of three gravitational wave detectors. The prediction of what to expect at LIGO/VIRGO thus involves not just one prediction but many: the theory is used to predict the existence and properties of black holes and of neutron stars, the detailed features of their mergers, the precise patterns of the resulting gravitational waves, and how those gravitational waves cross space. That LIGO/VIRGO have detected the telltale patterns of these gravitational waves, and that these wave patterns agree with Einstein’s theory in every detail, is the strongest evidence ever obtained that there is nothing wrong with Einstein’s theory when used in these combined contexts.  That then in turn gives us confidence that our interpretation of the LIGO/VIRGO results is correct, confirming that black holes and neutron stars really exist and really merge.
    (Notice the reasoning is slightly circular… but that’s how scientific knowledge proceeds, as a set of detailed consistency checks that gradually and eventually become so tightly interconnected as to be almost impossible to unwind.  Scientific reasoning is not deductive; it is inductive.  We do it not because it is logically ironclad but because it works so incredibly well, as witnessed by the computer, and its screen, that I’m using to write this, and the wired and wireless internet and computer disk that will be used to transmit and store it.)


    What makes it difficult to explain the significance of yesterday’s announcement is that it consists of many important results piled up together, rather than a simple takeaway that can be reduced to a single soundbite. (That was also true of the black hole mergers announcement back in 2016, which is why I wrote a long post about it.)

    So here is a list of important things we learned.  No one of them, by itself, is earth-shattering, but each one is profound, and taken together they form a major event in scientific history.

    First confirmed observation of a merger of two neutron stars: We’ve known these mergers must occur, but there’s nothing like being sure. And since these things are too far away and too small to see in a telescope, the only way to be sure these mergers occur, and to learn more details about them, is with gravitational waves.  We expect to see many more of these mergers in coming years as gravitational wave astronomy increases in its sensitivity, and we will learn more and more about them.

    New information about the properties of neutron stars: Neutron stars were proposed almost a hundred years ago and were confirmed to exist in the 60’s and 70’s.  But their precise details aren’t known; we believe they are like a giant atomic nucleus, but they’re so vastly larger than ordinary atomic nuclei that we can’t be sure we understand all of their internal properties, and there are debates in the scientific community that can’t be easily answered… until, perhaps, now.

    From the detailed pattern of the gravitational waves of this one neutron star merger, scientists have already learned two things. First, we confirm that Einstein’s theory correctly predicts the basic pattern of gravitational waves from orbiting neutron stars, as it does for orbiting and merging black holes. Unlike black holes, however, there are more questions about what happens to neutron stars when they merge. The question of what happened to this pair after they merged is still open — did they form a neutron star, an unstable neutron star that, slowing its spin, eventually collapsed into a black hole, or a black hole straightaway?

    But something important was already learned about the internal properties of neutron stars. The stresses of being whipped around at such incredible speeds would tear you and me apart, and would even tear the Earth apart. We know neutron stars are much tougher than ordinary rock, but how much more? If they were too flimsy, they’d have broken apart at some point during LIGO/VIRGO’s observations, and the simple pattern of gravitational waves that was expected would have suddenly become much more complicated. That didn’t happen until perhaps just before the merger.   So scientists can use the simplicity of the pattern of gravitational waves to infer some new things about how stiff and strong neutron stars are.  More mergers will improve our understanding.  Again, there is no other simple way to obtain this information.

    First visual observation of an event that produces both immense gravitational waves and bright electromagnetic waves: Black hole mergers aren’t expected to create a brilliant light display, because, as I mentioned above, they’re more like open doors to an invisible playground than they are like rocks, so they merge rather quietly, without a big bright and hot smash-up.  But neutron stars are big balls of stuff, and so the smash-up can indeed create lots of heat and light of all sorts, just as you might naively expect.  By “light” I mean not just visible light but all forms of electromagnetic waves, at all wavelengths (and therefore at all frequencies.)  Scientists divide up the range of electromagnetic waves into categories. These categories are radio waves, microwaves, infrared light, visible light, ultraviolet light, X-rays, and gamma rays, listed from lowest frequency and largest wavelength to highest frequency and smallest wavelength.  (Note that these categories and the dividing lines between them are completely arbitrary, but the divisions are useful for various scientific purposes.  The only fundamental difference between yellow light, a radio wave, and a gamma ray is the wavelength and frequency; otherwise they’re exactly the same type of thing, a wave in the electric and magnetic fields.)

    So if and when two neutron stars merge, we expect both gravitational waves and electromagnetic waves, the latter of many different frequencies created by many different effects that can arise when two huge balls of neutrons collide.  But just because we expect them doesn’t mean they’re easy to see.  These mergers are pretty rare — perhaps one every hundred thousand years in each big galaxy like our own — so the ones we find using LIGO/VIRGO will generally be very far away.  If the light show is too dim, none of our telescopes will be able to see it.

    But this light show was plenty bright.  Gamma ray detectors out in space detected it instantly, confirming that the gravitational waves from the two neutron stars led to a collision and merger that produced very high frequency light.  Already, that’s a first.  It’s as though one had seen lightning for years but never heard thunder; or as though one had observed the waves from hurricanes for years but never observed one in the sky.  Seeing both allows us a whole new set of perspectives; one plus one is often much more than two.

    Over time — hours and days — effects were seen in visible light, ultraviolet light, infrared light, X-rays and radio waves.  Some were seen earlier than others, which itself is a story, but each one contributes to our understanding of what these mergers are actually like.

    Confirmation of the best guess concerning the origin of “short” gamma ray bursts:  For many years, bursts of gamma rays have been observed in the sky.  Among them, there seems to be a class of bursts that are shorter than most, typically lasting just a couple of seconds.  They come from all across the sky, indicating that they come from distant intergalactic space, presumably from distant galaxies.  Among other explanations, the most popular hypothesis concerning these short gamma-ray bursts has been that they come from merging neutron stars.  The only way to confirm this hypothesis is with the observation of the gravitational waves from such a merger.  That test has now been passed; it appears that the hypothesis is correct.  That in turn means that we have, for the first time, both a good explanation of these short gamma ray bursts and, because we know how often we observe these bursts, a good estimate as to how often neutron stars merge in the universe.

    First distance measurement to a source using both a gravitational wave measure and a redshift in electromagnetic waves, allowing a new calibration of the distance scale of the universe and of its expansion rate:  The pattern over time of the gravitational waves from a merger of two black holes or neutron stars is complex enough to reveal many things about the merging objects, including a rough estimate of their masses and the orientation of the spinning pair relative to the Earth.  The overall strength of the waves, combined with the knowledge of the masses, reveals how far the pair is from the Earth.  That by itself is nice, but the real win comes when the discovery of the object using visible light, or in fact any light with frequency below gamma-rays, can be made.  In this case, the galaxy that contains the neutron stars can be determined.

    Once we know the host galaxy, we can do something really important.  We can, by looking at the starlight, determine how rapidly the galaxy is moving away from us.  For distant galaxies, the speed at which the galaxy recedes should be related to its distance because the universe is expanding.

    How rapidly the universe is expanding has been recently measured with remarkable precision, but the problem is that there are two different methods for making the measurement, and they disagree.   This disagreement is one of the most important problems for our understanding of the universe.  Maybe one of the measurement methods is flawed, or maybe — and this would be much more interesting — the universe simply doesn’t behave the way we think it does.

    What gravitational waves do is give us a third method: the gravitational waves directly provide the distance to the galaxy, and the electromagnetic waves directly provide the speed of recession.  There is no other way to make this type of joint measurement directly for distant galaxies.  The method is not accurate enough to be useful in just one merger, but once dozens of mergers have been observed, the average result will provide important new information about the universe’s expansion.  When combined with the other methods, it may help resolve this all-important puzzle.
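
    Why dozens of mergers help can be sketched with the usual rule for averaging independent measurements. This is only an illustration under stated assumptions: each merger is taken to give an independent estimate of the expansion rate with the same Gaussian uncertainty, and the per-event uncertainty of 10 km/s/Mpc is a placeholder value, not a number from the LIGO/VIRGO analysis.

```python
import math


def combined_uncertainty(per_event_sigma, n_events):
    """Uncertainty on the average of n_events independent measurements,
    each with the same Gaussian uncertainty per_event_sigma."""
    if n_events < 1:
        raise ValueError("need at least one event")
    return per_event_sigma / math.sqrt(n_events)


# Assumed (hypothetical) single-merger uncertainty, in km/s/Mpc.
PER_EVENT_SIGMA = 10.0

for n in (1, 10, 25, 50):
    sigma = combined_uncertainty(PER_EVENT_SIGMA, n)
    print(f"{n:3d} mergers -> uncertainty on the average ~ {sigma:.1f} km/s/Mpc")
```

    The uncertainty shrinks only as the square root of the number of events, which is why one merger is not enough but a few dozen could start to matter. Real events differ in distance and orientation, so their uncertainties would not be equal as assumed here.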

    Best test so far of Einstein’s prediction that the speed of light and the speed of gravitational waves are identical: Since gamma rays from the merger and the peak of the gravitational waves arrived within two seconds of one another after traveling 130 million years — that is, about 4 thousand million million seconds — we can say that the speed of light and the speed of gravitational waves are both equal to the cosmic speed limit to within one part in 2 thousand million million.  Such a precise test requires the combination of gravitational wave and gamma ray observations.
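
    The arithmetic behind this bound is short enough to check directly. The sketch below uses only the two numbers quoted above (a two-second arrival gap and 130 million years of travel); the variable names are mine, and it assumes both signals left the source at essentially the same moment, which is itself an approximation.

```python
# Length of a Julian year in seconds.
SECONDS_PER_YEAR = 365.25 * 24 * 3600

travel_time_years = 130e6      # travel time quoted in the text
arrival_gap_seconds = 2.0      # gamma rays vs. gravitational-wave peak

travel_time_seconds = travel_time_years * SECONDS_PER_YEAR

# If the two speeds differed by a fraction x, the arrival times would
# differ by roughly x times the total travel time.
fractional_bound = arrival_gap_seconds / travel_time_seconds

print(f"travel time      ~ {travel_time_seconds:.2e} s")
print(f"speed difference < 1 part in {1 / fractional_bound:.1e}")
```

    Any delay between the merger and the emission of the gamma rays would loosen or shift this estimate somewhat, so it is best read as an order-of-magnitude statement.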

    Efficient production of heavy elements confirmed:  It’s long been said that we are star-stuff, or stardust, and it’s been clear for a long time that it’s true.  But there’s been a puzzle when one looks into the details.  While it’s known that all the chemical elements from hydrogen up to iron are formed inside of stars, and can be blasted into space in supernova explosions to drift around and eventually form planets, moons, and humans, it hasn’t been quite as clear how the other elements with heavier atoms — atoms such as iodine, cesium, gold, lead, bismuth, uranium and so on — predominantly formed.  Yes they can be formed in supernovas, but not so easily; and there seem to be more atoms of heavy elements around the universe than supernovas can explain.  There are many supernovas in the history of the universe, but the efficiency for producing heavy chemical elements is just too low.

    It was proposed some time ago that the mergers of neutron stars might be a suitable place to produce these heavy elements.  Even though these mergers are rare, they might be much more efficient, because the nuclei of heavy elements contain lots of neutrons and, not surprisingly, a collision of two neutron stars would produce lots of neutrons in its debris, suitable perhaps for making these nuclei.   A key indication that this is going on would be the following: if a neutron star merger could be identified using gravitational waves, and if its location could be determined using telescopes, then one would observe a pattern of light that would be characteristic of what is now called a “kilonova” explosion.   Warning: I don’t yet know much about kilonovas and I may be leaving out important details. A kilonova is powered by the process of forming heavy elements; most of the nuclei produced are initially radioactive — i.e., unstable — and they break down by emitting high energy particles, including the particles of light (called photons) which are in the gamma ray and X-ray categories.  The resulting characteristic glow would be expected to have a pattern of a certain type: it would be initially bright but would dim rapidly in visible light, with a long afterglow in infrared light.  The reasons for this are complex, so let me set them aside for now.  The important point is that this pattern was observed, confirming that a kilonova of this type occurred, and thus that, in this neutron star merger, enormous amounts of heavy elements were indeed produced.  So we now have a lot of evidence, for the first time, that almost all the heavy chemical elements on and around our planet were formed in neutron star mergers.  Again, we could not know this if we did not know that this was a neutron star merger, and that information comes only from the gravitational wave observation.


    Did the merger of these two neutron stars result in a new black hole, a larger neutron star, or an unstable rapidly spinning neutron star that later collapsed into a black hole?

    We don’t yet know, and maybe we won’t know.  Some scientists involved appear to be leaning toward the possibility that a black hole was formed, but others seem to say the jury is out.  I’m not sure what additional information can be obtained over time about this.

    If the two neutron stars formed a black hole, why was there a kilonova?  Why wasn’t everything sucked into the black hole?

    Black holes aren’t vacuum cleaners; they pull things in via gravity just the same way that the Earth and Sun do, and don’t suck things in some unusual way.  The only crucial thing about a black hole is that once you go in you can’t come out.  But just as when trying to avoid hitting the Earth or Sun, you can avoid falling in if you orbit fast enough or if you’re flung outward before you reach the edge.

    The point in a neutron star merger is that the forces at the moment of merger are so intense that one or both neutron stars are partially ripped apart.  The material that is thrown outward in all directions, at an immense speed, somehow creates the bright, hot flash of gamma rays and eventually the kilonova glow from the newly formed atomic nuclei.  Those details I don’t yet understand, but I know they have been carefully studied both with approximate equations and in computer simulations such as this one and this one.  However, the accuracy of the simulations can only be confirmed through the detailed studies of a merger, such as the one just announced.  It seems, from the data we’ve seen, that the simulations did a fairly good job.  I’m sure they will be improved once they are compared with the recent data.




    by Matt Strassler at October 17, 2017 04:03 PM

    October 16, 2017

    Sean Carroll - Preposterous Universe

    Standard Sirens

    Everyone is rightly excited about the latest gravitational-wave discovery. The LIGO observatory, recently joined by its European partner VIRGO, had previously seen gravitational waves from coalescing black holes. Which is super-awesome, but also a bit lonely — black holes are black, so we detect the gravitational waves and little else. Since our current gravitational-wave observatories aren’t very good at pinpointing source locations on the sky, we’ve been completely unable to say which galaxy, for example, the events originated in.

    This has changed now, as we’ve launched the era of “multi-messenger astronomy,” detecting both gravitational and electromagnetic radiation from a single source. The event was the merger of two neutron stars, rather than black holes, and all that matter coming together in a giant conflagration lit up the sky in a large number of wavelengths simultaneously.

    Look at all those different observatories, and all those wavelengths of electromagnetic radiation! Radio, infrared, optical, ultraviolet, X-ray, and gamma-ray — soup to nuts, astronomically speaking.

    A lot of cutting-edge science will come out of this, see e.g. this main science paper. Apparently some folks are very excited by the fact that the event produced an amount of gold equal to several times the mass of the Earth. But it’s my blog, so let me highlight the aspect of personal relevance to me: using “standard sirens” to measure the expansion of the universe.

    We’re already pretty good at measuring the expansion of the universe, using something called the cosmic distance ladder. You build up distance measures step by step, determining the distance to nearby stars, then to more distant clusters, and so forth. Works well, but of course is subject to accumulated errors along the way. This new kind of gravitational-wave observation is something else entirely, allowing us to completely jump over the distance ladder and obtain an independent measurement of the distance to cosmological objects. See this LIGO explainer.

    The simultaneous observation of gravitational and electromagnetic waves is crucial to this idea. You’re trying to compare two things: the distance to an object, and the apparent velocity with which it is moving away from us. Usually velocity is the easy part: you measure the redshift of light, which is easy to do when you have an electromagnetic spectrum of an object. But with gravitational waves alone, you can’t do it — there isn’t enough structure in the spectrum to measure a redshift. That’s why the exploding neutron stars were so crucial; in this event, GW170817, we can for the first time determine the precise redshift of a distant gravitational-wave source.

    Measuring the distance is the tricky part, and this is where gravitational waves offer a new technique. The favorite conventional strategy is to identify “standard candles” — objects for which you have a reason to believe you know their intrinsic brightness, so that by comparing to the brightness you actually observe you can figure out the distance. To discover the acceleration of the universe, for example,  astronomers used Type Ia supernovae as standard candles.

    Gravitational waves don’t quite give you standard candles; every one will generally have a different intrinsic gravitational “luminosity” (the amount of energy emitted). But by looking at the precise way in which the source evolves — the characteristic “chirp” waveform in gravitational waves as the two objects rapidly spiral together — we can work out precisely what that total luminosity actually is. Here’s the chirp for GW170817, compared to the other sources we’ve discovered — much more data, almost a full minute!

    So we have both distance and redshift, without using the conventional distance ladder at all! This is important for all sorts of reasons. An independent way of getting at cosmic distances will allow us to measure properties of the dark energy, for example. You might also have heard that there is a discrepancy between different ways of measuring the Hubble constant, which either means someone is making a tiny mistake or there is something dramatically wrong with the way we think about the universe. Having an independent check will be crucial in sorting this out. Just from this one event, we are able to say that the Hubble constant is 70 kilometers per second per megaparsec, albeit with large error bars (+12, -8 km/s/Mpc). That will get much better as we collect more events.
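
    To make the relation concrete, here is a minimal sketch of Hubble's law, v = H0 × d, run in both directions. The Hubble constant and its error bars are the ones quoted above; the distance of roughly 130 million light years is an assumption taken from general coverage of this event rather than from the measurement paper, and the function names are purely illustrative.

```python
LIGHT_YEARS_PER_MPC = 3.2616e6  # one megaparsec in light years


def recession_velocity_km_s(h0_km_s_mpc, distance_mpc):
    """Hubble's law: velocity (km/s) of a galaxy at the given distance (Mpc)."""
    return h0_km_s_mpc * distance_mpc


def hubble_constant(velocity_km_s, distance_mpc):
    """Invert Hubble's law: H0 (km/s/Mpc) from a velocity and a distance."""
    return velocity_km_s / distance_mpc


# Assumed distance to the source: about 130 million light years.
distance_mpc = 130e6 / LIGHT_YEARS_PER_MPC

# Central value and error bars quoted in the text: 70 (+12, -8) km/s/Mpc.
for h0 in (70.0 - 8.0, 70.0, 70.0 + 12.0):
    v = recession_velocity_km_s(h0, distance_mpc)
    print(f"H0 = {h0:4.0f} km/s/Mpc -> recession velocity ~ {v:5.0f} km/s")
```

    In the actual measurement the logic runs the other way: the gravitational waves supply the distance, the host galaxy's redshift supplies the velocity, and their ratio is the Hubble constant. The galaxy's own motion relative to the overall expansion also has to be accounted for, which this sketch ignores.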

    So here is my (infinitesimally tiny) role in this exciting story. The idea of using gravitational-wave sources as standard sirens was put forward by Bernard Schutz all the way back in 1986. But it’s been developed substantially since then, especially by my friends Daniel Holz and Scott Hughes. Years ago Daniel told me about the idea, as he and Scott were writing one of the early papers. My immediate response was “Well, you have to call these things `standard sirens.'” And so a useful label was born.

    Sadly for my share of the glory, my Caltech colleague Sterl Phinney also suggested the name simultaneously, as the acknowledgments to the paper testify. That’s okay; when one’s contribution is this extremely small, sharing it doesn’t seem so bad.

    By contrast, the glory attaching to the physicists and astronomers who pulled off this observation, and the many others who have contributed to the theoretical understanding behind it, is substantial indeed. Congratulations to all of the hard-working people who have truly opened a new window on how we look at our universe.

    by Sean Carroll at October 16, 2017 03:52 PM

    Matt Strassler - Of Particular Significance

    A Scientific Breakthrough! Combining Gravitational and Electromagnetic Waves

    Gravitational waves are now the most important new tool in the astronomer’s toolbox.  Already they’ve been used to confirm that large black holes — with masses ten or more times that of the Sun — and mergers of these large black holes to form even larger ones, are not uncommon in the universe.   Today it goes a big step further.

    It’s long been known that neutron stars, remnants of collapsed stars that have exploded as supernovas, are common in the universe.  And it’s been known almost as long that sometimes neutron stars travel in pairs.  (In fact that’s how gravitational waves were first discovered, indirectly, back in the 1970s.)  Stars often form in pairs, and sometimes both stars explode as supernovas, leaving their neutron star relics in orbit around one another.  Neutron stars are small — just ten or so kilometers (miles) across.  According to Einstein’s theory of gravity, a pair of stars should gradually lose energy by emitting gravitational waves into space, and slowly but surely the two objects should spiral in on one another.   Eventually, after many millions or even billions of years, they collide and merge into a larger neutron star, or into a black hole.  This collision does two things.

    1. It makes some kind of brilliant flash of light — electromagnetic waves — whose details are only guessed at.  Some of those electromagnetic waves will be in the form of visible light, while much of it will be in invisible forms, such as gamma rays.
    2. It makes gravitational waves, whose details are easier to calculate and which are therefore distinctive, but couldn’t have been detected until LIGO and VIRGO started taking data, LIGO over the last couple of years, VIRGO over the last couple of months.

    It’s possible that we’ve seen the light from neutron star mergers before, but no one could be sure.  Wouldn’t it be great, then, if we could see gravitational waves AND electromagnetic waves from a neutron star merger?  It would be a little like seeing the flash and hearing the sound from fireworks — seeing and hearing is better than either one separately, with each one clarifying the other.  (Caution: scientists are often speaking as if detecting gravitational waves is like “hearing”.  This is only an analogy, and a vague one!  It’s not at all the same as acoustic waves that we can hear with our ears, for many reasons… so please don’t take it too literally.)  If we could do both, we could learn about neutron stars and their properties in an entirely new way.

    Today, we learned that this has happened.  LIGO, with the world’s first two gravitational observatories, detected the waves from two merging neutron stars, 130 million light years from Earth, on August 17th.  (Neutron star mergers last much longer than black hole mergers, so the two are easy to distinguish; and this one was so close, relatively speaking, that it was seen for a long while.)  VIRGO, with the third detector, allows scientists to triangulate and determine roughly where mergers have occurred.  They saw only a very weak signal, but that was extremely important, because it told the scientists that the merger must have occurred in a small region of the sky where VIRGO has a relative blind spot.  That told scientists where to look.

    The merger was detected for more than a full minute… to be compared with black holes whose mergers can be detected for less than a second.  It’s not exactly clear yet what happened at the end, however!  Did the merged neutron stars form a black hole or a neutron star?  The jury is out.
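
    The difference in duration follows from how quickly light and heavy binaries sweep through the detectors' frequency band. The sketch below uses the standard leading-order (Newtonian quadrupole) formula for the time remaining before merger, τ = (5/256) (π f)^(-8/3) (G M_c / c³)^(-5/3), where M_c is the so-called chirp mass. The specific masses (1.4 solar masses per neutron star, 30 per black hole) and the 30 Hz starting frequency are illustrative assumptions, not values from the announcement.

```python
import math

G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
C = 2.998e8        # speed of light, m/s
M_SUN = 1.989e30   # solar mass, kg


def chirp_mass(m1, m2):
    """Chirp mass, in the same units as m1 and m2."""
    return (m1 * m2) ** 0.6 / (m1 + m2) ** 0.2


def time_to_merger(m1_solar, m2_solar, f_gw_hz):
    """Leading-order time until merger (seconds) for a circular binary
    currently emitting gravitational waves at frequency f_gw_hz."""
    mc_seconds = G * chirp_mass(m1_solar, m2_solar) * M_SUN / C**3
    return (5.0 / 256.0) * (math.pi * f_gw_hz) ** (-8.0 / 3.0) * mc_seconds ** (-5.0 / 3.0)


F_START = 30.0  # Hz; assumed frequency at which the signal becomes visible

ns_seconds = time_to_merger(1.4, 1.4, F_START)    # two neutron stars
bh_seconds = time_to_merger(30.0, 30.0, F_START)  # two large black holes

print(f"neutron stars: ~{ns_seconds:.0f} s in band")
print(f"black holes:   ~{bh_seconds:.2f} s in band")
```

    With these assumptions the neutron star pair stays in band for of order a minute while the black hole pair lasts a fraction of a second, in line with the comparison above. The real signal is also shaped by detector sensitivity and by corrections to this leading-order formula, so the numbers are indicative only.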

    At almost exactly the moment at which the gravitational waves reached their peak, a blast of gamma rays — electromagnetic waves of very high frequencies — was detected by a different scientific team, the one from FERMI. FERMI detects gamma rays from the distant universe every day, and a two-second gamma-ray-burst is not unusual.  And INTEGRAL, another gamma ray experiment, also detected it.   The teams communicated within minutes.   The FERMI and INTEGRAL gamma ray detectors can only indicate the rough region of the sky from which their gamma rays originate, and LIGO/VIRGO together also only give a rough region.  But the scientists saw those regions overlapped.  The evidence was clear.  And with that, astronomy entered a new, highly anticipated phase.

    Already this was a huge discovery.  Brief gamma-ray bursts have been a mystery for years.  One of the best guesses as to their origin has been neutron star mergers.  Now the mystery is solved; that guess is apparently correct. (Or is it?  Probably, but the gamma ray discovery is surprisingly dim, given how close it is.  So there are still questions to ask.)

    Also confirmed by the fact that these signals arrived within a couple of seconds of one another, after traveling for over 100 million years from the same source, is that, indeed, the speed of light and the speed of gravitational waves are exactly the same — both of them equal to the cosmic speed limit, just as Einstein’s theory of gravity predicts.

    Next, these teams quickly told their astronomer friends to train their telescopes in the general area of the source. Dozens of telescopes, from every continent and from space, and looking for electromagnetic waves at a huge range of frequencies, pointed in that rough direction and scanned for anything unusual.  (A big challenge: the object was near the Sun in the sky, so it could be viewed in darkness only for an hour each night!) Light was detected!  At all frequencies!  The object was very bright, making it easy to find the galaxy in which the merger took place.  The brilliant glow was seen in gamma rays, ultraviolet light, infrared light, X-rays, and radio.  (Neutrinos, particles that can serve as another way to observe distant explosions, were not detected this time.)

    And with so much information, so much can be learned!

    Most important, perhaps, is this: from the pattern of the spectrum of light, the conjecture seems to be confirmed that the mergers of neutron stars are important sources, perhaps the dominant one, for many of the heavy chemical elements — iodine, iridium, cesium, gold, platinum, and so on — that are forged in the intense heat of these collisions.  It used to be thought that the same supernovas that form neutron stars in the first place were the most likely source.  But now it seems that this second stage of neutron star life — merger, rather than birth — is just as important.  That’s fascinating, because neutron star mergers are much more rare than the supernovas that form them.  There’s a supernova in our Milky Way galaxy every century or so, but it’s tens of millennia or more between these “kilonovas”, created in neutron star mergers.

    If there’s anything disappointing about this news, it’s this: almost everything that was observed by all these different experiments was predicted in advance.  Sometimes it’s more important and useful when some of your predictions fail completely, because then you realize how much you have to learn.  Apparently our understanding of gravity, of neutron stars, and of their mergers, and of all sorts of sources of electromagnetic radiation that are produced in those mergers, is even better than we might have thought. But fortunately there are a few new puzzles.  The X-rays were late; the gamma rays were dim… we’ll hear more about this shortly, as NASA is holding a second news conference.

    Some highlights from the second news conference:

    • New information about neutron star interiors, which affects how large they are and therefore how exactly they merge, has been obtained
    • The first ever visual-light image of a gravitational wave source, from the Swope telescope, at the outskirts of a distant galaxy; the galaxy’s center is the blob of light, and the arrow points to the explosion.

    • The theoretical calculations for a kilonova explosion suggest that debris from the blast should rather quickly block the visual light, so the explosion dims quickly in visible light — but infrared light lasts much longer.  The observations by the visible and infrared light telescopes confirm this aspect of the theory; and you can see evidence for that in the picture above, where four days later the bright spot is both much dimmer and much redder than when it was discovered.
    • Estimate: the total mass of the gold and platinum produced in this explosion is vastly larger than the mass of the Earth.
    • Estimate: these neutron stars were formed about 10 or so billion years ago.  They’ve been orbiting each other for most of the universe’s history, and ended their lives just 130 million years ago, creating the blast we’ve so recently detected.
    • Big Puzzle: all of the previous gamma-ray bursts seen up to now have always shone in ultraviolet light and X-rays as well as gamma rays.   But X-rays didn’t show up this time, at least not initially.  This was a big surprise.  It took 9 days for the Chandra telescope to observe X-rays, too faint for any other X-ray telescope.  Does this mean that the two neutron stars created a black hole, which then created a jet of matter that points not quite directly at us but off-axis, and shines by illuminating the matter in interstellar space?  This had been suggested as a possibility twenty years ago, but this is the first time there’s been any evidence for it.
    • One more surprise: it took 16 days for radio waves from the source to be discovered, with the Very Large Array, the most powerful existing radio telescope.  The radio emission has been growing brighter since then!  As with the X-rays, this seems also to support the idea of an off-axis jet.
    • Nothing quite like this gamma-ray burst has been seen — or rather, recognized — before.  When a gamma ray burst doesn’t have an X-ray component showing up right away, it simply looks odd and a bit mysterious.  It’s harder to observe than most bursts, because without a jet pointing right at us, its afterglow fades quickly.  Moreover, a jet pointing at us is bright, so it blinds us to the more detailed and subtle features of the kilonova.  But this time, LIGO/VIRGO told scientists that “Yes, this is a neutron star merger”, leading to detailed study from all electromagnetic frequencies, including patient study over many days of the X-rays and radio.  In other cases those observations would have stopped after just a short time, and the whole story couldn’t have been properly interpreted.



    by Matt Strassler at October 16, 2017 03:10 PM