Particle Physics Planet

April 22, 2018

Christian P. Robert - xi'an's og

look, look, confidence! [book review]

As it happens, I recently bought [with Amazon Associate earnings] a (used) copy of Confidence, Likelihood, Probability (Statistical Inference with Confidence Distributions), by Tore Schweder and Nils Hjort, to try to understand this confusing notion of confidence distributions. (And hence did not get the book from CUP or anyone else towards purposely writing a review. Or a ½-review like the one below.)

“Fisher squared the circle and obtained a posterior without a prior.” (p.419)

Now that I have gone through a few chapters, I am no less confused about the point of this notion. Which seems to rely on the availability of confidence intervals. Exact or asymptotic ones. The authors plainly recognise (p.61) that a confidence distribution is neither a posterior distribution nor a fiducial distribution, hence cutting off any possible Bayesian usage of the approach. Which seems right in that there is no coherence behind the construct, meaning for instance that there is no joint distribution corresponding to the resulting marginals. Or even a specific dominating measure in the parameter space. (Always go looking for the dominating measure!) As usual with frequentist procedures, there is always a feeling of arbitrariness in the resolution, as for instance in the Neyman-Scott problem (p.112) where the profile likelihood and the deviance do not work, but considering directly the distribution of the (inconsistent) MLE of the variance “saves the day”, which sounds a bit like starting from the solution. Another statistical freak, the Fieller-Creasy problem (p.116), remains a freak in this context as it does not seem to allow for a confidence distribution. I also notice an ambivalence in the discourse of the authors, namely that they claim confidence distributions stand both outside a probabilisation of the parameter and inside it, “producing distributions for parameters of interest given the data (…) with fewer philosophical and interpretational obstacles” (p.428).
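For readers who, like me, keep reaching for a concrete case, the canonical confidence distribution for a Normal mean with known variance can be sketched in a few lines (my own illustration, not code from the book; the function names are mine):

```python
# A minimal sketch (mine, not the book's): the textbook confidence
# distribution for a Normal(theta, sigma^2) mean with sigma known,
#   C(theta | data) = Phi( sqrt(n) * (theta - xbar) / sigma ),
# whose alpha-quantiles reproduce the usual one-sided confidence limits.
import math
from statistics import NormalDist

def confidence_distribution(theta, xbar, sigma, n):
    """Confidence attached to the event {true mean <= theta}."""
    return NormalDist().cdf(math.sqrt(n) * (theta - xbar) / sigma)

def confidence_limit(alpha, xbar, sigma, n):
    """alpha-quantile of C, i.e. the upper alpha-confidence limit."""
    return xbar + NormalDist().inv_cdf(alpha) * sigma / math.sqrt(n)

# With xbar = 1.2, sigma = 2, n = 25, the 0.975-quantile of C equals
# xbar + 1.96 * sigma / sqrt(n), the upper end of the usual 95% interval.
upper = confidence_limit(0.975, xbar=1.2, sigma=2.0, n=25)
```

The resulting distribution lives on the parameter space but is to be read as a stack of confidence statements rather than as a posterior, which is the very distinction the authors insist on (p.61).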

“Bias is particularly difficult to discuss for Bayesian methods, and seems not to be a worry for most Bayesian statisticians.” (p.10)

The discussions as to whether or not confidence distributions form a synthesis of Bayesianism and frequentism always fall short of being convincing, the choice of (or the dependence on) a prior distribution appearing to the authors as a failure of the former approach. Or as unnecessarily complicated when there are nuisance parameters. They apparently miss the (high) degree of subjectivity involved in creating the confidence procedures themselves. Chapter 1 contains a section on “Why not go Bayesian?” that starts from Chris Sims’ Nobel Lecture on the appeal of Bayesian methods and goes [softly] rampaging through each item. One point (3) is recurrent in many criticisms of B and I always wonder whether or not it is tongue-in-cheek-y… Namely the fact that parameters of a model are rarely if ever stochastic. This is a misrepresentation of the use of prior and posterior distributions, which are in fact summaries of information cum uncertainty about a true, fixed parameter. Refusing, as the book does, to endow posteriors with an epistemic meaning (except for “Bayesians of the Lindley breed”, p.419) is thus most curious. (The debate is repeated in the final(e) chapter as “why the world need not be Bayesian after all”.)

“To obtain frequentist unbiasedness, the Bayesian will have to choose her prior with unbiasedness in mind. Is she then a Bayesian?” (p.430)

A general puzzling feature of the book is that notions are not always immediately defined, but rather discussed and illustrated first. As for instance the central notion of fiducial probability (Section 1.7, then Chapter 6), maybe because Fisher himself did not have a general principle to advance. The construction of a confidence distribution most often keeps a measure of mystery (and arbitrariness), outside the rather stylised setting of exponential families and sufficient (conditionally so) statistics. (Incidentally, our 2012 ABC survey is [kindly] quoted in relation with approximate sufficiency (p.180), while it does not sound particularly related to this part of the book. Now, is there an ABC version of confidence distributions? Or an ABC derivation?) This is not to imply that the book is uninteresting! I found reading it quite entertaining, with many humorous and tongue-in-cheek remarks, like “From Fraser (1961a) and until Fraser (2011), and hopefully even further” (p.92), and great datasets. (Including one entitled Pornoscope, which is about drosophila mating.) And also datasets with lesser greatness, like the 3000 mink whales that were killed for Example 8.5, where the authors if not the whales “are saved by a large and informative dataset”… (Whaling is a recurrent [national?] theme throughout the book, along with sport statistics usually involving Norway!)

Miscellanea: The interest of the authors in the topic is credited to bowhead whales, more precisely to Adrian Raftery’s geometric merging (or melding) of two priors and to the resulting Borel paradox (xiii). A proposal I remember Adrian presenting in Luminy, presumably in 1994. Or maybe in Aussois the year after. The book also repeats Don Fraser’s notion that the likelihood is a sufficient statistic, a point that still bothers me. (On the side, I realised while reading Confidence, &tc., that ABC cannot comply with the likelihood principle.) To end on a French nitpicking note (!), Quenouille is typ(o)ed Quenoille in the main text, the references and the index. (Blame the .bib file!)

by xi'an at April 22, 2018 10:18 PM

Lubos Motl - string vacua and pheno

Brian Keating's Nobel prize obsession surprised me
Brian Keating will release his first book, "Losing the Nobel Prize", on April 24th. I don't own it and I haven't read it. But I was still intrigued by some of the discussions about it.

Backreaction wrote a review and Keating responded.

I used to think that the title was just a trick to emphasize the importance of Keating's work: he has done work that could have led to a Nobel prize but Nature wasn't generous enough, or so it has seemed for some 3 years. But the two articles linked to in the previous paragraph suggest that Keating is much more obsessed with the Nobel prize. That's ironic because the book seems to say that Keating is not obsessed, and he doesn't even want such a lame prize, but it's his colleagues, the spherical bastards, who are obsessed. ;-)

OK, let me start to react to basic statements by Keating and Hossenfelder. First, Keating designed BICEP1 and lots of us were very excited about BICEP2, an upgraded version of that gadget. It could have seen the primordial gravitational waves. Even though I had theoretical prejudices leading me to believe that those waves should be weak enough so that they shouldn't have been seen, I was impressed by the actual graphs and claims by the BICEP2 collaboration and willing to believe that they really found the waves and proved us wrong (by "us", I mean people around the Weak Gravity Conjecture and related schools of thought).

Keating has clearly designed a nice gadget and he deserves to be considered a top professional in his field. Because that gadget hasn't made a breakthrough that we would still believe to be real and solid, Keating hasn't won any major prize that also requires some collaboration of Mother Nature. He's still a top professional who rightfully earns a regular salary for that work and skills but his big lottery ticket hasn't won so he wasn't given a Nobel prize, an extraordinary donation.

During the excitement about BICEP2, if you had told me that Keating was this obsessed with the Nobel prize, I would probably have been more skeptical about the claims than I was. From my perspective, this obsession looks like a warning. If you really want a Nobel prize, it's natural to make the arguments in favor of your discovery look a little bit clearer than what follows from your cold hard data. I don't really claim that Keating has committed such an "improvement" but I do claim that the expectation value of the "improvement", as I would have estimated it had I known about his Nobel prize obsession, would be positive and significant.

Keating seems to combine comments about his particular work with some more general criticism of the Nobel prize. Only 1/4 of the Nobel prize winners in physics are theorists; the rest are experimenters and observational people. Keating says that the fraction of theorists should be higher. I agree. He also says that experimenters shouldn't be getting Nobel prizes for things that some theorists outlined before them. I have mixed feelings about that claim – on some days, I would subscribe to that, on others, I wouldn't.

Hossenfelder seems upset about that very statement:
You read that right. No Nobel for the Higgs, no Nobel for B-modes, and no Nobel for a direct discovery of dark matter (should it ever happen), because someone predicted that.
Ms Hossenfelder must have missed it but one of these experimental discoveries has been made, that of the Higgs boson, and the experimenters indeed didn't get any Nobel prize. The 2013 Nobel prize went to Higgs and Englert, two of the theorists who discovered the mechanism and (perhaps) the particle theoretically. There have been several reasons why the experimenters haven't received the award (yet?): the CERN teams are too large, too many people could be said to deserve it (Alfred Nobel's limit is 3 – well, his will actually said 1 but soon afterwards, the number was tripled and another change would seem too radical now). But I think that Keating's thinking has also played a role. CERN has really done something straightforward. They knew what they should see. In my opinion, this makes the contribution by the experimenters less groundbreaking.

In 2017, Weiss, Thorne, and Barish got their experimental Nobel prize for something that was predicted by theorists – such as Albert Einstein – namely the gravitational waves. But if you look at the justification, they got the prize both for LIGO and for the discovery of the gravitational waves. So they were the "first men" who created LIGO and/or made it very powerful. It seems to me that no one who has done something this groundbreaking in particle physics experimentation was a visible member of the teams that discovered the Higgs boson. That discovery was made by a gradual improvement of the collider technology – by a large collective of people.

I think that if the primordial gravitational waves had been discovered by BICEP2 and the discovery had been confirmed and withstood the tests, Keating would both deserve the Nobel prize and he would get the Nobel prize. Now, some theorists have predicted strong enough primordial gravitational waves. But these waves may also be weak or non-existent. The difference from the Higgs boson is that the Higgs boson was really agreed by good particle physicists to be necessarily there: it was the unique player that makes the \(W_L W_L\to W_L W_L\) scattering unitary. On the other hand, there's no such uniqueness in the case of the primordial gravitational waves and their strength (and similarly in the problem of the identity of the dark matter). When the answers aren't agreed to be unique by the theorists, the experimenters play a much bigger role and they arguably deserve the Nobel prize.

Some people are very upset when Keating (or I) point out that the confirmation of a theory by an experiment – when the experimenter already knows what to look for – is less spectacular. For example:
naivetheorist said: Keating writes: "I am advocating that more theorists should win it, and experimentalists should not win it if they/we merely confirm a theory". Merely? That's an incredibly condescending attitude. Keating's rather lame response affirms my decision to cancel my order for his book.
The attitude may look condescending but there are very good reasons for this "condescension". The Nobel prize simply is meant to reward the original contribution and when someone is just confirming the work (theory) by someone else, this work is more derivative even if the first guy is a theorist and the second guy is an experimenter. It's great that experimenters are confirming or refuting hypotheses formulated by the theorists. But that's merely the scientific "business as usual". Prizes such as the Nobel prize are given for something extraordinary that isn't just "business as usual". One needs to be the really first person to do something – and luck or Nature's cooperation is often needed.

Keating seems to propose boycotts of, or lawsuits against, the Nobel prize. I don't get these comments. The inventor of some explosives got rich and created a system in which his money is invested and some fraction is paid to people who are chosen as worthy of the award by a committee that Nobel envisioned in his will. It's a private activity. Well, one that has become globally famous, but the global fame is a consequence of the fame of Nobel himself and the winners (plus the money that attracts human eyes), not something that defines the award. Just because the award is followed by many people in the world doesn't mean that these people have the right to change the rules. After all, it's not their money.

As I said, the Nobel prize could be "better" according to many of us – and a higher percentage of the theorists could be a part of this "improvement". But this discussion is detached from reality. The Nobel prize is whatever it is. Alfred Nobel was a very practical person – explosives are rather practical compounds – and I believe that if he knew the whole list of the winners of his physics prize, he would be surprised by the high percentage of nerds and pure theorists. And maybe he would find it OK. And maybe he would want to increase the number of theorists, too. We don't know. But the prize has some traditional rules and expectations. Theorists only get their prizes for theories that have been experimentally verified – like Higgs and Englert.

The original BICEP2 claims about the very strong gravitational waves seem largely discredited now. This simple fact seems much more important for the question whether BICEP2 should be awarded a Nobel prize or not than some proposals to increase the number of theorists or reduce the number of experimental winners who just confirm predictions by theorists.

Concerning the obsession with the Nobel prize, well, I think it's normal for people who get close enough to be eligible to think about the prize. Some of the fathers of QCD knew that they deserved the prize and they were patiently waiting for some 30 years. The winners get some money directly, some extra money indirectly, and they may enjoy life more than they did previously.

I think that the people who work on hep-th and ambitious hep-ph – like string theory and particle physics beyond the Standard Model – must know that according to the current scheme of things, the Nobel prize for their work is unlikely. But that doesn't mean that their work isn't the most valuable thing done in science. The best things in hep-th almost certainly are the most valuable part of science. But things are just arranged in such a way that authors of such ground-breaking theoretical papers haven't gotten a Nobel prize and they're expected not to get it soon, either.

Is that such a problem? I don't think so. The Nobel prize is a distinguished award and – with the exception of the Nobel prize in peace and perhaps literature – it keeps on rewarding people who have done something important and who are usually very smart, too. But the precise criteria that decide who is rewarded are a bit more subtle – the physics prize isn't meant to reward people who are smart and/or made a deep contribution, without additional adjectives. The contributions must be confirmed experimentally because that's how "physics" is defined in the Nobel prize context. So there are rather good reasons why even Stephen Hawking hasn't ever received a Nobel prize although most quantum gravity theorists – and most formal theoretical particle physicists – would agree that his contributions to physics have been greater than those of the average Nobel prize winner. But the Hawking radiation hasn't really been seen. For me, the observation is a formality – I have no real doubts about the existence of the Hawking radiation and other things – but I have no trouble respecting the rules of the game in which these formalities decide about the prize. These are just the rules of the Nobel prize – and those ultimately reflect the rules of the scientific method.

By the way, I think that many people who have been doing similar things as your humble correspondent are often reminded that "they wanted a Nobel prize". It's possible that as a kid, I independently talked about such things as well but in the end, I think that the obsession with the Nobel prize has primarily been widespread in my (or our) environment, not in my own thinking. The real excitement that underlay some of my important ideas – and even the hopes that one can get much further with these ideas – has had virtually nothing to do with the Nobel prize for over 20 years.

If you look at it rationally, the Nobel prize is just an honor. I actually think that my opinions about these matters – including the importance of the Nobel prize – were largely shaped by Feynman's view above since the moment when I read "Surely You're Joking, Mr. Feynman!" for the first time. And I was 17. Well, the Nobel prize is still a better honor than almost all others. After all, e.g. Richard Feynman, who didn't like honors, was one of those who got that particular honor. ;-) But it's unwise to be obsessed with the selection process and generic winners of that prize. In the end, the decision is one made by a smart but imperfect committee, and the prize primarily affects the winners only.

by Luboš Motl at April 22, 2018 04:31 PM

April 21, 2018

Peter Coles - In the Dark

Cardiff Bound

Just time for a quick post using the airport WiFi to fill some time before my flight leaves from Dublin Airport. Once again on a Saturday morning I was up at 5am to get the 6am bus here from Maynooth. The journey back to Cardiff is far from arduous, but I won't be sorry when I no longer have to do it every week. Fortunately, term is coming to an end and after teaching finishes I won't be dictated to by the timetables of Cardiff and Maynooth Universities. And after July I won't have to do the trip at all!

This morning a large group – I believe the correct collective noun is a murder – of crows gathered to give the bus a sendoff. I did think of Hitchcock’s The Birds but the birds in this case were more interested in rummaging through the rubbish bin than attacking any of us waiting for the bus. Incidentally, it was the anniversary of Daphne Du Maurier’s death on 19th April; she wrote the short story on which that film was based.

Anyway, it’s a lovely sunny morning. Yesterday was a nice day too, both in terms of weather and other things. In the afternoon there was a staff barbecue and an awards ceremony at Maynooth University. There was a big crowd already there when I arrived, a bit late because I’d been at a seminar. Standing at the back I couldn’t really hear the speeches. I didn’t win any awards, of course, but I did get a glass of wine and a beefburger.

On my way home I bumped into the President, Philip Nolan, who is the equivalent of a Vice-Chancellor. To my surprise he mentioned a point I had raised in a recent Faculty meeting about the possibility of Maynooth signing up to the San Francisco Declaration on Research Assessment (DORA). To my even greater surprise he went on to say that this was going to be in the University’s strategic plan. Good news!

Anyway, I’d better make my way to the gate.  Have a nice day!


by telescoper at April 21, 2018 06:50 AM

Clifford V. Johnson - Asymptotia

Take Your Pick

I'll be at two festivals this weekend, which I admit seems a bit over-ambitious! Let me tell you a little about both.

One event is the Los Angeles Times Festival of Books, which you've read me talking about many times over the years (that's a photo from 2015 above). It's the largest such festival in the USA, and is a wonderful celebration of books and related things. It is on Saturday and Sunday.

The other event is the San Diego Comic Fest, which also runs through the weekend (although it starts Friday). Don't mix this up with ComicCon (although there are connections between the two if you care to dig a little to find out).

As I write this post I'm actually basking in the sun as I ride on the train (the Pacific Surfliner) from Los Angeles to San Diego, as tomorrow I'll be giving a talk at the comics fest. Here are the details:
[...] Click to continue reading this post

The post Take Your Pick appeared first on Asymptotia.

by Clifford at April 21, 2018 02:14 AM

April 20, 2018

Emily Lakdawalla - The Planetary Society Blog

OSIRIS-REx shows us space isn't entirely empty
What a cool photo of OSIRIS-REx's sample return capsule! But wait, what's that black dot near the top?

April 20, 2018 07:07 PM

Peter Coles - In the Dark

Pictures from Post-Planck Cosmology in Pune

Regular readers of this blog (Sid and Doris Bonkers) will know that last year I went to the Inter-University Centre for Astronomy and Astrophysics in Pune (India) for a conference on `Post-Planck Cosmology’. Well, I recently received a copy of the official conference photograph, which I thought I’d share:

There is also an online collection of pictures taken during the talks, from which I have taken the liberty of extracting this picture of me during my talk:

I think this picture has a lot of potential for a caption competition, so please feel free to suggest captions through the comments block!

by telescoper at April 20, 2018 02:01 PM

Emily Lakdawalla - The Planetary Society Blog

The Opportunity selfie: 5000 Sols in the making
A personal story recounts how a NASA team used a microscopic imager to take a selfie of the Opportunity rover.

April 20, 2018 11:00 AM

Peter Coles - In the Dark

Melancholy – Johnny Dodds

Well, it’s fine and sunny today and if the weather doesn’t put a spring in your step, hopefully this will. It’s a lovely old tune and something of a jazz standard called Melancholy, but this is very probably the least melancholy version of it you’ll ever hear. On top of that it’s quite an interesting piece of jazz history, as it features legendary clarinet player Johnny Dodds (who played in King Oliver’s Creole Jazz Band and later in the Hot Fives and Hot Sevens with Louis Armstrong in the 1920s) as did pianist Lil Hardin, but the rest of the band is from a younger generation, especially Charlie Shavers on trumpet and Teddy Bunn (a much underrated guitarist). The rhythm section has a definite taste of the Swing Era rather than New Orleans, but the main thing about this is how well the different styles blend together. Enjoy!

by telescoper at April 20, 2018 09:58 AM

Jester - Resonaances

Massive Gravity, or You Only Live Twice
Proving Einstein wrong is the ultimate ambition of every crackpot and physicist alike. In particular, Einstein's theory of gravitation – general relativity – has been a victim of constant harassment. That said, it is trivial to modify gravity at large energies (short distances), for example by embedding it in string theory, but it is notoriously difficult to change its long-distance behavior. At the same time, motivations to keep trying go beyond intellectual gymnastics. For example, the accelerated expansion of the universe may be a manifestation of modified gravity (rather than of a small cosmological constant).

In Einstein's general relativity, gravitational interactions are mediated by a massless spin-2 particle – the so-called graviton. This is what gives it its hallmark properties: the long range and the universality. One obvious way to screw with Einstein is to add mass to the graviton, as entertained already in 1939 by Fierz and Pauli. The Particle Data Group quotes the constraint m ≤ 6*10^−32 eV, so we are talking about a De Broglie wavelength comparable to the size of the observable universe. Yet even that teeny mass may cause massive troubles. In 1970 the Fierz-Pauli theory was killed by the van Dam-Veltman-Zakharov (vDVZ) discontinuity. The problem stems from the fact that a massive spin-2 particle has 5 polarization states (0, ±1, ±2), unlike a massless one which has only two (±2). It turns out that the polarization-0 state couples to matter with similar strength to the usual polarization ±2 modes, even in the limit where the mass goes to zero, and thus mediates an additional force which differs from the usual gravity. One finds that, in massive gravity, light bending would be 25% smaller, in conflict with the very precise observations of stars' deflection around the Sun. vDV concluded that "the graviton has rigorously zero mass". Dead for the first time...

The second coming was heralded soon after by Vainshtein, who noticed that the troublesome polarization-0 mode can be shut off in the proximity of stars and planets. This can happen in the presence of graviton self-interactions of a certain type. Technically, what happens is that the polarization-0 mode develops a background value around massive sources which, through the derivative self-interactions, renormalizes its kinetic term and effectively diminishes its interaction strength with matter. See here for a nice review and more technical details. Thanks to the Vainshtein mechanism, the usual predictions of general relativity are recovered around large massive sources, which is exactly where we can best measure gravitational effects. The possible self-interactions leading to a healthy theory without ghosts have been classified, and go under the name of dRGT massive gravity.

There is however one inevitable consequence of the Vainshtein mechanism. The graviton self-interaction strength grows with energy, and at some point becomes inconsistent with the unitarity limits that every quantum theory should obey. This means that massive gravity is necessarily an effective theory with a limited validity range and has to be replaced by a more fundamental theory at some cutoff scale 𝞚. This is of course nothing new for gravity: the usual Einstein gravity is also an effective theory valid at most up to the Planck scale MPl~10^19 GeV. But for massive gravity the cutoff depends on the graviton mass and is much smaller for realistic theories. At best, 𝞚max = (m^2 MPl)^1/3, which for the maximal allowed graviton mass is of order 10^−12 eV.
So the massive gravity theory in its usual form cannot be used at distance scales shorter than ~300 km. For particle physicists that would be a disaster, but for cosmologists this is fine, as one can still predict the behavior of galaxies, stars, and planets. While the theory certainly cannot be used to describe the results of table top experiments,  it is relevant for the  movement of celestial bodies in the Solar System. Indeed, lunar laser ranging experiments or precision studies of Jupiter's orbit are interesting probes of the graviton mass.
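As a sanity check on these numbers (my own back-of-envelope arithmetic, not the post's), one can convert a cutoff of the form (m² MPl)^(1/3) into a length scale with ħc:

```python
# Back-of-envelope (mine): a cutoff of the form (m^2 * M_Pl)^(1/3),
# converted to a length with hbar*c, lands at the ~100 km scale quoted above.
m = 1e-32          # graviton mass near the experimental limit [eV]
M_Pl = 1.22e28     # Planck mass [eV]
hbar_c = 1.97e-7   # conversion factor [eV * m]

Lambda3 = (m**2 * M_Pl) ** (1.0 / 3.0)  # ~1e-12 eV
length = hbar_c / Lambda3               # ~2e5 m, same order as the quoted ~300 km
```

The agreement is order-of-magnitude only, as expected for estimates that drop O(1) factors.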

Now comes the latest twist in the story. Some time ago this paper showed that not everything is allowed in effective theories. Assuming the full theory is unitary, causal and local implies non-trivial constraints on the possible interactions in the low-energy effective theory. These techniques are suitable to constrain, via dispersion relations, derivative interactions of the kind required by the Vainshtein mechanism. Applying them to dRGT gravity one finds that it is inconsistent to assume the theory is valid all the way up to 𝞚max. Instead, it must be replaced by a more fundamental theory already at a much lower cutoff scale, parameterized as 𝞚 = g*^1/3 𝞚max (the parameter g* is interpreted as the coupling strength of the more fundamental theory). The allowed parameter space in the g*-m plane is shown in this plot:

Massive gravity must live in the lower left corner, outside the gray area excluded theoretically and where the graviton mass satisfies the experimental upper limit m~10^−32 eV. This implies g* ≼ 10^-10, and thus the validity range of the theory is some 3 orders of magnitude lower than 𝞚max. In other words, massive gravity is not a consistent effective theory at distance scales below ~1 million km, and thus cannot be used to describe the motion of falling apples, GPS satellites or even the Moon. In this sense, it's not much of a competition to, say, Newton. Dead for the second time.
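The jump from ~300 km to ~1 million km follows from simple arithmetic under the stated parameterization (again my own illustration, not the paper's code):

```python
# Back-of-envelope (mine), following the parameterization
# Lambda = g_*^(1/3) * Lambda_max with g_* <~ 1e-10: the true cutoff sits
# ~3.3 orders of magnitude below Lambda_max, stretching the breakdown
# length from ~300 km to order a million km.
g_star = 1e-10
suppression = g_star ** (1.0 / 3.0)      # ~4.6e-4
length_max_km = 300.0                    # breakdown length at Lambda_max
length_km = length_max_km / suppression  # ~6.5e5 km
```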

Is this the end of the story? For the third coming we would need a more general theory with additional light particles beyond the massive graviton, which is consistent theoretically in a larger energy range, realizes the Vainshtein mechanism, and is in agreement with the current experimental observations. This is hard but not impossible to imagine. Whatever the outcome, what I like in this story is the role of theory in driving the progress, which is rarely seen these days. In the process, we have understood a lot of interesting physics whose relevance goes well beyond one specific theory. So the trip was certainly worth it, even if we find ourselves back at the departure point.

by Mad Hatter at April 20, 2018 08:11 AM

April 19, 2018

Emily Lakdawalla - The Planetary Society Blog

Funpost! Can you name a space vehicle for each letter of the alphabet?
There are NASA choices for all letters except Y and Z; for those two, you'll need to go international.

April 19, 2018 11:00 AM

Peter Coles - In the Dark

The Parnell Connection

Charles Stewart Parnell (1846-1891)

Taking a short breather and a cup of coffee in between this morning’s lecture and a forthcoming computer lab session I thought I’d do a quick post following on from a comment on yesterday’s post about an O-level History paper.

I was an undergraduate student at Magdalene College, Cambridge, which just happens to be where 19th century Irish nationalist politician Charles Stewart Parnell (above) studied, although I hasten to add that we weren’t contemporaries. There is an annual Parnell Lecture at Magdalene in his honour; an annual Coles lecture is yet to be established. Parnell is widely remembered here in Ireland too: there is, for example, a handsome Georgian square in Dublin named after him.

Parnell was one of the most charismatic, capable and influential Parliamentarians of his era, and led the Irish Parliamentary Party at the forefront of moves for Home Rule for Ireland. He also had a splendid beard. His career was cut short by scandal in the form of an adulterous relationship with Kitty (Katherine) O’Shea, whom her husband divorced in 1889 naming Parnell in the case, and whom Parnell married after the divorce. (Kitty, that is, not her husband.) They were not to enjoy life together for long, however, as Parnell died of pneumonia in 1891, in the arms of his wife at their home in Brighton (Hove, actually).


by telescoper at April 19, 2018 09:46 AM

April 18, 2018

ZapperZ - Physics and Physicists

Forum with Congressman and Physicist Bill Foster
This is the talk given by Congressman Bill Foster, the only physicist left in the US Congress, at this year's APS March meeting.

I have been in attendance at one of Bill Foster's talks before, at the 2011 TIPP conference in Chicago. You may read my "live" reporting of that talk back then, and also a follow-up post on it.


by ZapperZ at April 18, 2018 02:20 PM

Emily Lakdawalla - The Planetary Society Blog

Recap: Breakthrough Discuss 2018
If you had a spaceship and could take it anywhere in the solar system to search for life, where would you go?

April 18, 2018 11:00 AM

Peter Coles - In the Dark

An O-Level History Examination from 1979

I have in the past posted a few examples of the O- and A-level examinations I took when I was at school. These have been mainly science and mathematics papers as those are relevant to the area of higher education in which I work, and I thought they might be of interest to students past and present.

A few people have emailed me recently to ask if I could share any other examinations, so here are the two History papers I took for O-level in June/July 1979. Can that really have been almost 40 years ago?

These were Papers 5 and 12 out of an unknown number of possible papers chosen by schools. My school taught us exclusively about British and European history from the mid-19th to early 20th centuries; you will observe that in both cases `history’ was deemed to have ended in 1914. It’s possible that some of the other papers paid more attention to the wider world.

I have no idea what modern GCSE history examinations look like, but I’d be interested in any comments from people who do about the style and content!

by telescoper at April 18, 2018 10:06 AM

John Baez - Azimuth

Applied Category Theory at NIST (Part 2)

Here are links to the slides and videos for most of the talks from this workshop:

Applied Category Theory: Bridging Theory & Practice, March 15–16, 2018, NIST, Gaithersburg, Maryland, USA. Organized by Spencer Breiner and Eswaran Subrahmanian.

They give a pretty good picture of what went on. Spencer Breiner put them up here; what follows is just a copy of what’s on his site.

Unfortunately, the end of Dusko Pavlovic’s talk, as well as the whole of Ryan Wisnesky’s and Steve Huntsman’s, was lost due to a technical error. You can also find a YouTube playlist with all of the videos here.

Introduction to NIST:

Ram Sriram – NIST and Category Theory


Spencer Breiner – Introduction

Invited talks:

Bob Coecke – From quantum foundations to cognition via pictures


Dusko Pavlovic – Security Science in string diagrams (partial video)


John Baez – Compositional design and tasking of networks (part 1)


John Foley – Compositional design and tasking of networks (part 2)


David Spivak – A higher-order temporal logic for dynamical systems


Lightning Round Talks:

Ryan Wisnesky – Categorical databases (no video)

Steve Huntsman – Towards an operad of goals (no video)


Bill Regli – Disrupting interoperability (no slides)


Evan Patterson – Applied category theory in data science


Brendan Fong – Data structures for network languages


Stephane Dugowson – A short introduction to a general theory of interactivity


Michael Robinson – Sheaf methods for inference


Cliff Joslyn – Seeking a categorical systems theory via the category of hypergraphs


Emilie Purvine – A category-theoretical investigation of the type hierarchy for heterogeneous sensor integration


Helle Hvid Hansen – Long-term values in Markov decision processes, corecursively


Alberto Speranzon – Localization and planning for autonomous systems via (co)homology computation


Josh Tan – Indicator frameworks (no slides)

Breakout round report

by John Baez at April 18, 2018 05:47 AM

Emily Lakdawalla - The Planetary Society Blog

Curiosity Update, sols 1972-2026: Completing the Vera Rubin Ridge Walkabout
The Curiosity team has completed its initial survey of the top of Vera Rubin Ridge, and is ready to make another attempt at drilling after the rock at Lake Orcadie proved to be too hard.

April 18, 2018 12:34 AM

April 17, 2018

ZapperZ - Physics and Physicists

The Friedmann Equation
Astrophysicist Ethan Siegel picked the Friedmann equation as the "most important" equation in the universe.

The first Friedmann equation describes how, based on what is in the universe, its expansion rate will change over time. If you want to know where the Universe came from and where it's headed, all you need to measure is how it is expanding today and what is in it. This equation allows you to predict the rest!
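For reference, the first Friedmann equation that the quote describes can be written as follows (a standard textbook form, with scale factor $a$, mass density $\rho$, spatial curvature $k$ and cosmological constant $\Lambda$; conventions vary between texts):

```latex
H^2 \;\equiv\; \left(\frac{\dot{a}}{a}\right)^{2}
  \;=\; \frac{8\pi G}{3}\,\rho \;-\; \frac{k c^{2}}{a^{2}} \;+\; \frac{\Lambda c^{2}}{3}
```

Measuring the expansion rate $H$ today and the contents ($\rho$, $k$, $\Lambda$) is, in this sense, enough to run the equation forward or backward in time.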

I don't have a pick for the "most important equation" in the universe, mainly because I don't know the criteria for picking such a thing. And oftentimes, people confuse "interesting" with "important", which need not coincide.

It's still fun to read what other physicists think is the most important equation, even if I don't necessarily agree with their picks.


by ZapperZ at April 17, 2018 04:28 PM

Lubos Motl - string vacua and pheno

Einstein's amateur popularizer in Florida sketched 10D (stringy) spacetime in 1928
Thanks to Willie Soon, Paul Halpern.

St Petersburg Times, Sunday, November 11th, 1928
Guest blog by John Nations, 3141 Twenty-sixth avenue South, City (St. Petersburg), Nov. 9, 1928

Mr Nations played with glimpses of string theory in 1928 and in that year, Lonnie Johnson recorded "Playing with the strings" about that achievement.

Open forum (on the right side from the picture)

Editor The Times:

A lot of people believe that Einstein is as transparent as boiler iron, one able authority estimating roughly that at least eight people in the world understand him.

This should not be considered a disparagement. Those who understand Einstein can easily vindicate themselves by explaining him in "street" terms to those who avoid the subject for the sake of two things, honesty and delicacy. Those who admit that they understand Einstein might choose to tell just what would happen if an Antares should derail and go through the curve where space "curves around." It is beyond the small comprehensive powers of a large group, just what would happen to that great orb should it become entangled in a void of nothingness that isn't even space. When Mr. Einstein declares that space is not infinite but curves around, that settles it for those with broad vision, but not for the great masses who insist upon speculating upon what exists just outside the "curve" where space is claimed to stop.

And as to time being the fourth dimension, a lot of ignorant folks might say that it is as good name for it as anything, but they might also ask about the ninth dimension or the tenth—not yet being reconciled to the fact that there has to be a fourth dimension tucked away somewhere in time, space or music, and figuring that since there is bound to be a fourth dimension there is bound to be a sixteenth dimension, since one is quite as reasonable as the other to their small conception.

[Bold face added by LM for emphasis.]

A concise explanation of Einstein's theory of relativity would doubtless be appreciated by thousands of people, but anyone attempting an explanation should refrain from Einsteinian phraseology—the big crowd doesn't understand that. For instance, in attempting to explain the location and predicament of Antares should that orb break jail and plunge through Einstein's "curve 'around'" it would not be advisable to say: Function measured in speed, amplitude, frequency, infrequency etc.; nor that Antares bumped into fourth dimension and rebounded like a hailstone off a greenhouse.

Maurice Ravel's "Bolero" premiered in 1928. Bolero is from Spanish "bula" (ball, whirling motion).

That might all be to the point but so many could not understand. It would be more tenable, less abstruse, if explained in terms indigenous of the ignorant. Many of the ignorant persist in the belief that time and space are brothers and infinite, and when they are told that either space or time is limited they are sure to ask about what is outside of space or, after time ceases how long such a condition can prevail—it is very difficult to explain those simple little details so that the average man can grasp your meaning.

It is easy to state that Einstein is simple and clear and unerring, and not so difficult to explain him in terms that you do not understand yourself—that is the usual way it is done. It sometimes scares the crowd and makes them envious of your deep insight, but when a poor, dumb fellow who has been too weak to grasp the impossibility of attaching a meaning to your baffling claims, asks you some of these simple questions about what happens after time ends or outside the domains of space, it is comfortable to have a long list of scrambled, incoherent words already prepared to smother him or he will cause you trouble.

3141 Twenty-sixth avenue South, City, Nov. 9, 1928


\[
\left(\beta mc^2 + c\left(\sum_{n=1}^{3}\alpha_n p_n\right)\right) \psi(x,t) = i \hbar \frac{\partial\psi(x,t)}{\partial t}
\]

The (original) Dirac equation above was also published in 1928. Too bad Mr Dirac didn't cooperate with Mr Nations. They could have obtained string theory (or the superstring) 40 or 45 years earlier.

LM: Thanks for the nice contribution, John, and sorry for the delay before I published this guest blog. I guess that you are already dead by now and your house seems to be replaced by a highway. But if I understood you well, you recommend popularizers of relativity to start with plain English but switch to the fancy technical language as soon as the audience starts to ask something, even if the speaker doesn't understand the meaning of these fancy words, just to reduce the annoying questions. Clever. ;-)

Concerning the 9 spatial or 10 spacetime dimensions of string theory, it seems that you (or the people who annoyed you with obvious questions) found them as straightforward and as valid as the curved and possibly compact topology of the spacetime according to general relativity. It was a great guess. Indeed, when combined with the entanglement of quantum mechanics, when music of string theory and perhaps Antares are allowed, and when the Woit of nothingness is eliminated, Einstein's general relativity implies string theory with its 10-dimensional spacetime. Did you have a proof or did you guess? I am asking because even now, almost 90 years after your letter, I only have a partial proof of your statement.

Thanks, I will probably need a truly compact spacetime with closed time-like curves to get the answers from you.

by Luboš Motl at April 17, 2018 04:24 PM

CERN Bulletin

Crèche and School of the Staff Association: a programme for children from 2 to 4 years old

Find out how children between the ages of two to four and their families can benefit from the Staff Association Crèche and School programme.

Classes for children aged between two and four years have been set up to ensure the initial transition from home to school is as smooth as possible. Children attend mornings only and are welcomed by the same group of teaching staff throughout the week, allowing the children to establish a link between their home life and the crèche/school.

The challenge of these classes is to offer the children a happy environment where they can feel emotionally secure, and where the teaching staff create a harmonious atmosphere allowing them to learn by moving, manipulating, exchanging, making mistakes… playing. Creativity takes a central role, as it enables children to express themselves and improves their ability to handle their emotions constructively.

According to Albert Einstein, ‘Play is the highest form of research’.

Building a sense of self-confidence is essential for a child’s development. High self-esteem will enable the child to develop a comfortable relationship with others, and at the same time, the child must learn to play and work cooperatively with others within the class.

The teaching staff are there to offer support to the children and to ensure that any conflicts are resolved through negotiation and compromise.

One of the strengths and particularities of this class is the fact that the children are multilingual. More often than not, the children are from non-francophone families, and with the right support they begin to assimilate the French language, which steadily becomes an integral part of their environment. The youngest members of the class will also develop their communication skills, from non-verbal communication to speech.

Taking responsibility, helping others and self-help are the key aspects of this class for pre-schoolers, which will give them the foundations required to understand the needs of others.

April 17, 2018 11:04 AM

CERN Bulletin

What’s that you say? ‘Le Jardin des Particules’ (The Particle Garden)?

A few weeks ago, the Staff Association launched a competition to find a new name for the Staff Association Nursery and School, currently known under the acronym EVEE (‘Espace de Vie Enfantine et École’). The goal of the competition was to find a memorable, well-liked name, which also defines the professional, dynamic and playful identity of the establishment.

Many thanks to the many who took part in the competition.

The jury, which was composed of seven people (Members of the staff association, the Headmistress of the school and representatives of teaching members of staff as well as a representative of parents) initially selected four entries.

On 21 March, the jury met for a second time and unanimously selected the winning name ‘Le Jardin des Particules’ (The Particle Garden).

Why was this the winner?

* ‘Jardin’ or garden refers to the historical name of the establishment, formerly known as ‘le jardin d’enfants’, thereby maintaining a certain continuity. It also makes reference to the superb garden surrounding the school, a real plus point for the establishment. The grounds surrounding the school are an essential component of the educational programme and are highly appreciated by the parents.

* The word ‘particules’ (particles) can be associated with both the children and CERN.
This word evokes the frenetic and dynamic movement of children, and their interactions with each other and their environment. Each particle’s specificity mirrors the child’s individuality and suggests the promise of a being in the making.
There is also a parallel between the diversity of elementary particles and the multicultural origins of the school’s pupils and families.
‘Particules’ is also a nod to the ‘Esplanade des particules’ neighbouring the school and a close link with CERN, the world-renowned Organization for Particle Physics.

A prize-giving ceremony will be held soon to reveal the winners’ names.

Many THANKS to all who took part and who contributed to finding a new name for the CERN Staff Association Nursery and School.

Welcome and long live the ‘Jardin des Particules’!

April 17, 2018 10:04 AM

April 16, 2018

CERN Bulletin

24 April 2018: Ordinary General Assembly of the Staff Association!

In accordance with the Statutes of the Staff Association, an Ordinary General Assembly is organised once a year (Article IV.2.1).

Draft agenda:

  1. Adoption of the agenda;
  2. Approval of the minutes of the General Assembly of 29 June 2017;
  3. Presentation and approval of the 2017 activity report;
  4. Presentation and approval of the 2017 financial report;
  5. Presentation and approval of the auditors’ report;
  6. 2018 programme of work;
  7. Presentation and approval of the 2018 budget;
  8. Approval of the 2019 membership fee;
  9. Election of the members of the Electoral Commission;
  10. Election of the auditors;
  11. Presentation and approval of the principle of creating a foundation to take over the activities of the Crèche and School of the CERN Staff Association;
  12. Miscellaneous

We remind you of Article IV.3.4 of the Statutes of the Association, which states:

“Once the agenda has been exhausted, members may put other questions up for discussion with the consent of the Assembly, but only questions on the agenda may be the subject of decisions. The Assembly may, however, instruct the Council to study any question it deems useful.”

April 16, 2018 04:04 PM

CERN Bulletin

CERN Relay Race 2018

The CERN Running Club, in collaboration with the Staff Association, is happy to announce the 2018 edition of the relay race. It will take place on Thursday, 24 May and will consist, as every year, of a loop around the CERN Meyrin site run in teams of 6. It is a fun event, and you do not have to run fast to enjoy it.

Registrations will be open from 1 May to 22 May on the Running Club website. All information concerning the race and registration is available there too:

A video of the previous edition is also available here:

As every year, there will be entertainment starting at noon on the lawn in front of Restaurant 1, and many CERN associations and clubs will have information stands. The Running Club’s partners, namely Berthie Sport, Interfon and Uniqa, will also participate in the event.

April 16, 2018 02:04 PM

CERN Bulletin


Maxence Piquet – Beyond his borders

Maxence Piquet

From 2 to 11 May 2018 | CERN Meyrin, Main Building

An exhibition of paintings by the self-taught artist Maxence Piquet (artist’s signature M-P), in a variety of techniques (acrylic, oil, charcoal, collage...) and on a variety of supports.

An art that is often raw and sometimes provocative, with the expressionist and cubist touches that are the main origins of his art.
Works that are often lively and colourful...

This exhibition is his first outside his native Lorraine, and aims to bring his art to the widest possible audience.

For more information and access requests: | Tel: 022 766 37 38

April 16, 2018 02:04 PM

April 13, 2018

ZapperZ - Physics and Physicists

An Overview of CLIC at CERN
This is a lesser-known effort at CERN among the general public, and yet it may have one of the most significant impacts to come out of this high-energy physics lab.

CLIC, the Compact Linear Collider research project at CERN, has been studying accelerator science for many years. It is one of a few prominent centres of research on accelerator physics in the world. Both it and many other accelerator research centres are making advances in accelerator science that have direct benefits and applications for the general public.

So my intention in highlighting this article is not simply for you to learn what the people at CLIC do; some of the description may even be beyond your understanding. What you should focus on is all the applications, already in use or possible in the near future, of the advances made in this area of physics and engineering. These applications are not just within physics and engineering.

Unfortunately, as I've stated a few times in this blog, funding for accelerator science is often tied to funding in high energy physics, and for the US, the funding profile in this sector has been abysmal. So while accelerator science is actually independent of HEP, its funding has gone downhill with HEP funding over the last few years, especially after the shutdown of the Tevatron at Fermilab.

Whether you support funding, or increased funding, of this area of study is a different matter, but you should at least be aware of what you are supporting or not supporting, and not simply make a decision based on ignorance of what it is and what its implications can be.


by ZapperZ at April 13, 2018 02:10 PM

John Baez - Azimuth

Applied Category Theory 2018 Schedule

Here’s the schedule of the ACT2018 workshop:

Click to enlarge!

They put me on last, either because my talk will be so boring that it’s okay everyone will have left, or because my talk will be so exciting that nobody will want to leave. I haven’t dared ask the organizers which one.

On the other hand, they’ve put me on first for the “school” which occurs one week before the workshop. Here’s the schedule for the ACT 2018 Adjoint School:

by John Baez at April 13, 2018 01:29 AM

April 12, 2018

The n-Category Cafe

Torsion: Graph Magnitude Homology Meets Combinatorial Topology

As I explained in my previous post, magnitude homology categorifies the magnitude of graphs. There are two questions that will jump out to seasoned students of homology.

  • Are there two graphs which have the same magnitude but different homology groups?
  • Is there a graph with torsion in its homology groups?

Both of these were stated as open questions by Richard Hepworth and me in our paper as we were unable to answer them, despite thinking about them a fair bit. However, recently both of these have been answered positively!

The first question has been answered positively by Yuzhou Gu in a reply to my post. Well, it is essentially answered, in the sense that he has given two graphs, both of whose magnitudes we know (provably); we know (provably) the magnitude homology groups of one of them, and we can compute the magnitude homology of the other using James Cranch’s and my SageMath software. So it just requires verification that the program’s result is correct! I have no doubt that it is, though.

The second question on the existence of torsion is what I want to concentrate on in this post. This question has been answered positively by Ryuki Kaneta and Masahiko Yoshinaga in their paper

It is a consequence of what they prove in their paper that the graph below has $2$-torsion in its magnitude homology; SageMath has drawn it as a directed graph, but you can ignore the arrows. (Click on it to see a bigger version.)


In their paper they prove that if you have a finite triangulation $T$ of an $m$-dimensional manifold $M$ then you can construct a graph $G(T)$ so that the reduced homology groups of $M$ embed in the magnitude homology groups of $G(T)$:

$$\widetilde{\mathrm{H}}_i(M)\hookrightarrow MH_{i+2,\,m+2}(G(T)) \qquad \text{for } 0\le i \le m.$$

Following the suggestion in their paper, I’ve taken a minimal triangulation $T_0$ of the real projective plane $\mathbb{R}P^2$ and used that to construct the above graph. As we know $\mathrm{H}_1(\mathbb{R}P^2)=\mathbb{Z}/2\mathbb{Z}$, we know that there is $2$-torsion in $MH_{3,4}(G(T_0))$.

In the rest of this post I’ll explain the construction of the graph and show explicitly how to give a $2$-torsion class in $MH_{3,4}(G(T_0))$. I’ll illustrate the theory of Kaneta and Yoshinaga by working through a specific example. Barycentric subdivision plays a key role!

The minimal triangulation of the projective plane

We are going to construct our graph from the minimal triangulation of $\mathbb{R}P^2$, so let’s have a look at that first. We want to see how the $2$-torsion in the homology of $\mathbb{R}P^2$ can be expressed using this triangulation, as we will need that later for the $2$-torsion in the graph magnitude homology.

The real projective plane can be thought of as the two-sphere quotiented out by the antipodal map. The antipodal map acts on the icosahedral triangulation of the two-sphere, so quotienting the icosahedron by the antipodal map gives us a triangulation $T_0$ of $\mathbb{R}P^2$, which is in fact the triangulation with the fewest simplices. Here is a picture of it.


I’ve numbered the vertices and we will label each simplex by its vertices, so $(0, 1, 2)$ is a 2-simplex you can see in the triangulation above. The label for a simplex will have the vertices in linear order.

Let’s recall how we get the $2$-torsion element in homology. If we take the boundary $\partial$ of the ten 2-simplices with the orientations drawn above then we get

$$(3,4)+(4,5)-(3,5) \;+\; (3,4)+(4,5)-(3,5).$$

(We write $-(3,5)$ rather than $(5,3)$ because of the orientation conventions, which I will gloss over. You can figure them out if you’re interested/concerned. I think they are right!)

As $\partial^2=0$, this boundary chain is a cycle. To get homology we quotient cycles out by boundaries, so this cycle is trivial in homology, which means

$$2\,[(3,4)+(4,5)-(3,5)] = 0 \in \mathrm{H}_1^{\mathrm{simp}}(T_0).$$

Thus $[(3,4)+(4,5)-(3,5)]$ is a $2$-torsion element. Off the top of my head I can’t think of a nice argument showing that this is non-trivial in homology, but hopefully someone can provide one in the comments! Anyway, we’re going to need this later.
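As a quick sanity check on the $2$-torsion, one can compare mod-2 and rational homology computationally. The plain-Python sketch below uses an assumed standard facet list for the six-vertex triangulation on vertices $0$–$5$ (it need not match the labelling in the figure) and computes $\dim_{\mathbb{F}_2} H_1(T_0;\mathbb{F}_2)$ by Gaussian elimination: it comes out as $1$, even though $H_1(\mathbb{R}P^2;\mathbb{Q})=0$, and by universal coefficients that forces $2$-torsion in the integral $H_1$.

```python
from itertools import combinations

# Assumed facet list for the minimal (6-vertex) triangulation of RP^2;
# any minimal triangulation is isomorphic to this one.
facets = [(0,1,2),(0,1,5),(0,2,4),(0,3,4),(0,3,5),
          (1,2,3),(1,3,4),(1,4,5),(2,3,5),(2,4,5)]

verts = sorted({v for f in facets for v in f})
edges = sorted({e for f in facets for e in combinations(f, 2)})

def rank_mod2(rows):
    """Rank of a 0/1 matrix over F_2 by Gaussian elimination."""
    rows = [r[:] for r in rows]
    rank = 0
    for col in range(len(rows[0])):
        piv = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if piv is None:
            continue
        rows[rank], rows[piv] = rows[piv], rows[rank]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                rows[i] = [(a + b) % 2 for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

# Boundary matrices over F_2: d1 (edges -> vertices), d2 (triangles -> edges).
d1 = [[1 if v in e else 0 for v in verts] for e in edges]
d2 = [[1 if set(e) <= set(f) else 0 for e in edges] for f in facets]

b1 = len(edges) - rank_mod2(d1) - rank_mod2(d2)  # dim H_1(T_0; F_2)
print(b1)  # 1
```

Since the rational first Betti number of $\mathbb{R}P^2$ is zero, this nonzero mod-2 Betti number is exactly the shadow of the torsion class above.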

Constructing the graph and the magnitude cycles

Following Kaneta and Yoshinaga, we are now going to use the above triangulation $T_0$ to build our graph $G(T_0)$. We take the simplices of the triangulation as the nodes of the graph and add an edge $\sigma \to \tau$ if $\sigma$ is a facet of $\tau$; remember that a facet is a face of maximal dimension. We then add top and bottom nodes, so that $\mathrm{bottom}$ has an arrow to each $0$-simplex and $\mathrm{top}$ has an arrow from each $2$-dimensional simplex. Here is the graph again. You should be able to see the six vertices, fifteen edges and ten faces of the original triangulation.


A more sophisticated way of saying what we’ve done is the following. The simplices in the triangulation form a poset, the face poset, with the ordering ‘is a face of’. We add top and bottom elements to that poset. We then take the Hasse diagram, which is the graph that has the elements of the poset as its nodes, with an edge $x\to y$ if $x \lt y$ but there is no $z$ with $x \lt z \lt y$. Clearly this process gives us a graph $G(T)$ from any triangulation $T$ of a manifold.

We can obtain the graph <semantics>G(T 0)<annotation encoding="application/x-tex">G({T_0})</annotation></semantics> in SageMath with a couple of commands:

triangulation = simplicial_complexes.RealProjectivePlane()
poset = triangulation.face_poset().with_bounds()
graph = poset.hasse_diagram()
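If you don’t have Sage to hand, the same Hasse-diagram construction can be sketched in plain Python. The facet list below is an assumed standard labelling of the minimal triangulation (not taken from the post); the node and arc counts are labelling-independent.

```python
from itertools import combinations

# Assumed facet list for the minimal (6-vertex) triangulation of RP^2.
facets = [(0,1,2),(0,1,5),(0,2,4),(0,3,4),(0,3,5),
          (1,2,3),(1,3,4),(1,4,5),(2,3,5),(2,4,5)]

# All simplices of the triangulation: vertices, edges and triangles.
simplices = {s for f in facets for k in (1, 2, 3) for s in combinations(f, k)}

# Hasse diagram of the face poset with bottom/top adjoined:
# sigma -> tau whenever sigma is a codimension-1 face (facet) of tau.
nodes = sorted(simplices) + ['bottom', 'top']
arcs  = [('bottom', v) for v in simplices if len(v) == 1]
arcs += [(s, t) for s in simplices for t in simplices
         if len(t) == len(s) + 1 and set(s) < set(t)]
arcs += [(f, 'top') for f in facets]

# 6 + 15 + 10 simplices plus the two bounds; 6 + 30 + 30 + 10 incidences.
print(len(nodes), len(arcs))  # 33 76
```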

For what we do next it doesn’t matter if we stick with this directed graph or take the associated undirected graph as we are going to be forced to only consider upward pointing edges.

The magnitude chains for this graph

In the previous post I explained that for a finite graph $G$ the magnitude chain groups are defined as follows.

A chain generator is a tuple of the form $c=(x_0, x_1,\dots, x_{k-1},x_k)$, where each $x_i$ is a node of the graph $G$ and $x_{i-1}\ne x_{i}$. The degree is $\mathrm{deg}(c)=k$ and the length is $\mathrm{len}(c)=\sum_i \mathrm{d}(x_{i-1}, x_i)$.

The face map $\partial_i$, for $i=1,\dots,k-1$, is defined by:

$$\partial_{i}(x_0,\ldots,x_k) = \begin{cases} (x_0,\ldots,\widehat{x_i},\ldots,x_k) & \text{if}\,\, x_{i-1} \lt x_{i} \lt x_{i+1}, \\ 0 & \text{otherwise}, \end{cases}$$

where $x_{i-1} \lt x_{i} \lt x_{i+1}$ means that $x_i$ lies on a shortest path between $x_{i-1}$ and $x_{i+1}$, i.e., $\mathrm{d}(x_{i-1},x_i)+\mathrm{d}(x_i,x_{i+1})=\mathrm{d}(x_{i-1},x_{i+1})$.
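Concretely, the face map is just a shortest-path test. Here is a minimal plain-Python sketch; the 4-cycle graph and all names are illustrative, not from the post.

```python
from collections import deque

def bfs_dist(adj, source):
    """Shortest-path distances from source in an unweighted graph."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def face(i, c, d):
    """Magnitude face map: delete x_i when it lies on a shortest path
    from x_{i-1} to x_{i+1}; otherwise the generator maps to 0 (None)."""
    if d(c[i-1], c[i]) + d(c[i], c[i+1]) == d(c[i-1], c[i+1]):
        return c[:i] + c[i+1:]
    return None

# Illustrative 4-cycle a-b-c-d-a.
adj = {'a': 'bd', 'b': 'ac', 'c': 'bd', 'd': 'ac'}
dist = {u: bfs_dist(adj, u) for u in adj}
d = lambda u, v: dist[u][v]

print(face(1, ('a', 'b', 'c'), d))  # ('a', 'c'): b lies on a shortest a-c path
print(face(1, ('a', 'b', 'a'), d))  # None: deleting b does not preserve length
```

Note that the surviving faces keep both the endpoints and the total length of the generator, which is exactly why the complex splits as described next.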

Neither the length nor the endpoints of chains are altered by the face maps, so the magnitude complex splits up into subcomplexes with specified endpoints. So if we define

<semantics>MC k,l y,z(G)=(y,x 1,,x k1,z)|chain generator of length l.<annotation encoding="application/x-tex"> MC^{y,z}_{k,l}(G)=\left\langle (y, x_1,\dots, x_{k-1},z) \,\, |\,\, \text{chain generator of length }\,\,l\right\rangle. </annotation></semantics>

Then the magnitude chain complex splits as

<semantics>MC *,*(G)= y,zG lMC *,l y,z(G)<annotation encoding="application/x-tex"> MC_{\ast,\ast}(G)=\bigoplus_{y,z\in G}\bigoplus_l MC^{y,z}_{\ast,l}(G) </annotation></semantics>

We will concentrate on the subcomplex of length-four chains from the bottom element to the top element in our graph (here, four is the dimension of $\mathbb{R}P^2$ plus two). Writing $\mathrm{b}$ and $\mathrm{t}$ for the bottom and top elements, we consider the magnitude chain complex $MC^{\mathrm{b},\mathrm{t}}_{\ast,4}(G(T_0))$. We will see that the homology of this is isomorphic to $\widetilde{\mathrm{H}}_{\ast+2}(\mathbb{R}P^2)$, and so we get the embedding $\widetilde{\mathrm{H}}_{\ast}(\mathbb{R}P^2) \hookrightarrow MH_{\ast+2,4}(G(T_0))$.

Looking at our graph, it is easy to see that a length-four chain must be of the form

$$(\mathrm{b}, \sigma_1, \dots, \sigma_{k-1}, \mathrm{t})$$

where $\sigma_1 \subset \dots \subset \sigma_{k-1}$ is a sequence of simplices, each of which is a face of the following one; in other words, it is a flag of simplices in our original triangulation. Such a flag can be a full flag like $(0) \subset (0,1) \subset (0,1,2)$, in which each simplex is a facet of the following one, or it can be a partial flag like $(0) \subset (0,1,2)$.

Looking at the formula for the face maps you can see that given a generator corresponding to a flag of simplices, each facet of the generator corresponds to a flag with one of the simplices removed.

Those of you familiar with combinatorial topology might have spotted the connection with the barycentric subdivision construction. We will look a little more closely at these flags now.

Barycentric subdivision

One of the things you usually learn in a first course on algebraic topology is that if you have a triangulation of a space then you can form a finer triangulation – the barycentric subdivision – in the following way. Take the midpoint of each simplex in the original triangulation and use these as the vertices of the new triangulation, decomposing each of the old $n$-simplices into $(n+1)!$ new $n$-simplices in ‘the obvious fashion’.

For instance, if we take a triangle in our triangulation then each $1$-simplex gets split into $2$ new $1$-simplices and the $2$-simplex gets split into $6$ new $2$-simplices.
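The count of new simplices can be checked by enumerating flags directly. In this sketch (the helper names are my own), maximal flags of an $n$-simplex are built up one dimension at a time, and there are $(n+1)!$ of them:

```python
from itertools import combinations

def faces(vertices):
    """All nonempty faces of the simplex on `vertices`, as sorted tuples."""
    vs = tuple(vertices)
    return [c for k in range(1, len(vs) + 1) for c in combinations(vs, k)]

def full_flags(vertices):
    """All chains sigma_0 ⊂ sigma_1 ⊂ ... ⊂ sigma_n with dim(sigma_i) = i."""
    n = len(vertices)
    flags = [[f] for f in faces(vertices) if len(f) == 1]
    for k in range(2, n + 1):
        flags = [fl + [f]
                 for fl in flags
                 for f in faces(vertices)
                 if len(f) == k and set(fl[-1]) <= set(f)]
    return flags

print(len(full_flags((0, 1, 2))))     # 6  = 3!
print(len(full_flags((0, 1, 2, 3))))  # 24 = 4!
```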


Each new vertex can be labelled by the $n$-simplex that it is the midpoint of. What then is ‘the obvious fashion’ for creating the new simplices? Well, each new $n$-simplex in the subdivision corresponds to a flag of old simplices $\sigma_0 \subset \sigma_1 \subset \dots \subset \sigma_n$. We can picture the $2$-simplex corresponding to $(0) \subset (0,1) \subset (0,1,2)$ and the $1$-simplex corresponding to $(0) \subset (0,1,2)$ as follows.


It ought to be clear that for the $n$-simplex in the subdivision corresponding to a flag $\sigma_0 \subset \sigma_1 \subset \dots \subset \sigma_n$, each facet corresponds to a flag in which one of the old simplices has been removed. So for $(0) \subset (0,1) \subset (0,1,2)$ the facets correspond to $(0,1) \subset (0,1,2)$, $(0) \subset (0,1,2)$ and $(0) \subset (0,1)$.
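This facet rule is simple enough to state in one line of code; the following hypothetical helper reproduces the three facets listed above:

```python
def flag_facets(flag):
    """Each facet of the simplex for a flag drops one simplex from the flag."""
    return [flag[:i] + flag[i + 1:] for i in range(len(flag))]

print(flag_facets(((0,), (0, 1), (0, 1, 2))))
# [((0, 1), (0, 1, 2)), ((0,), (0, 1, 2)), ((0,), (0, 1))]
```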

So the barycentric subdivision $T'$, as a simplicial complex, is isomorphic to the simplicial complex of flags, where an $n$-simplex is a flag $\sigma_0 \subset \sigma_1 \subset \dots \subset \sigma_n$ of simplices of the original triangulation $T$. Well, there’s a subtlety here in that we shouldn’t forget the empty flag! The flags naturally form an augmented simplicial complex, meaning that there is a unique simplex in degree $-1$, the empty flag, which is a facet of every $0$-simplex.

So the augmented chain complex $\widetilde{\mathrm{C}}^{\mathrm{simp}}_\ast(T')$, which is obtained from the usual chain complex by sticking a unique generator in degree $-1$, is isomorphic to the complex of flags. The homology of the augmented chain complex gives the reduced homology $\widetilde{\mathrm{H}}^{\mathrm{simp}}_\ast(T') \cong \widetilde{\mathrm{H}}_\ast(M)$, which in practice means you kill off a copy of $\mathbb{Z}$ in degree zero from the usual homology.

The important thing is that we are seeing here precisely the same structure that we saw in the magnitude chain group $MC^{\mathrm{b},\mathrm{t}}_{\ast,l}(G)$. We should now make this precise.

Synthesis of the two sides: the theorem

I’ve hopefully given the impression that a complex of flags of simplices is isomorphic to both the augmented chain complex of the barycentric subdivision of a triangulation and to a subcomplex of the magnitude chain complex of the graph of the triangulation. Let’s give a proper statement of this now. Kaneta and Yoshinaga have a more general statement in their paper, involving ranked posets rather than just triangulations, but for the purpose of finding torsion, the following will suffice.

Theorem. Suppose that $T$ is a finite triangulation of an $m$-manifold $M$, and that $T'$ is its barycentric subdivision, with $G(T)$ the graph obtained as above from the poset structure. Then the isomorphism of (augmented) chain groups, for $k \ge -1$,

$$\begin{aligned} \widetilde{\mathrm{C}}^{\mathrm{simp}}_k(T') &\xrightarrow{\sim} MC^{\mathrm{b},\mathrm{t}}_{k+2,m+2}(G(T)); \\ \sigma_0 \subset \sigma_1 \subset \dots \subset \sigma_k &\mapsto (\mathrm{b}, \sigma_0, \dots, \sigma_k, \mathrm{t}) \end{aligned}$$

commutes up to sign with the differentials. Thus it induces an isomorphism of homology groups

$$\widetilde{\mathrm{H}}^{\mathrm{simp}}_\ast(T') \xrightarrow{\simeq} MH^{\mathrm{b},\mathrm{t}}_{\ast+2,m+2}(G(T)).$$

As a corollary we get that the homology of <semantics>M<annotation encoding="application/x-tex">M</annotation></semantics> embeds in the magnitude homology of the graph.

Corollary. With $M$, $T$, $T'$ and $G(T)$ as in the above theorem, the homology of $M$ embeds in the magnitude homology of the graph $G(T)$ via the following sequence of isomorphisms and embeddings:

$$\widetilde{\mathrm{H}}_\ast(M) \cong \widetilde{\mathrm{H}}^{\mathrm{simp}}_\ast(T) \xrightarrow{\simeq} \widetilde{\mathrm{H}}^{\mathrm{simp}}_\ast(T') \xrightarrow{\simeq} MH^{\mathrm{b},\mathrm{t}}_{\ast+2,m+2}(G(T)) \hookrightarrow MH_{\ast+2,m+2}(G(T)).$$

The payoff: a torsion element in the homology of our graph

We saw earlier on in this post that

$$[(3,4) + (4,5) - (3,5)] \in \mathrm{H}^{\mathrm{simp}}_1(T_0)$$

is the non-trivial $2$-torsion element. We can follow this element through the sequence of maps above. We just need to note that the map from $1$-chains on $T$ to $1$-chains on $T'$ rewrites each edge as the (signed) sum of its two half-edges, namely

$$\begin{aligned} \mathrm{C}^{\mathrm{simp}}_1(T) &\to \mathrm{C}^{\mathrm{simp}}_1(T') \\ (a,b) &\mapsto \bigl((a) \subset (a,b)\bigr) - \bigl((b) \subset (a,b)\bigr). \end{aligned}$$
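This map on $1$-chains can be sketched as follows, representing a chain as a dictionary from edges to coefficients (a representation chosen purely for illustration, not taken from the post):

```python
def subdivide_1chain(chain):
    """Rewrite each edge (a, b) as ((a) ⊂ (a, b)) - ((b) ⊂ (a, b)),
    the signed sum of its two half-edges in the barycentric subdivision."""
    out = {}
    for (a, b), coeff in chain.items():
        for flag, sign in ((((a,), (a, b)), 1), (((b,), (a, b)), -1)):
            out[flag] = out.get(flag, 0) + sign * coeff
    return out

# The torsion class [(3,4) + (4,5) - (3,5)] from the post:
z = {(3, 4): 1, (4, 5): 1, (3, 5): -1}
print(subdivide_1chain(z))
```

The six signed half-edges produced here are exactly the six terms of the torsion element displayed below.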

Then we get our non-trivial $2$-torsion element in $MH_{3,4}(G(T_0))$ to be

$$\begin{split} [(\mathrm{b}, (3), (3,4), \mathrm{t}) &- (\mathrm{b}, (4), (3,4), \mathrm{t}) + (\mathrm{b}, (4), (4,5), \mathrm{t}) \\ &- (\mathrm{b}, (5), (4,5), \mathrm{t}) + (\mathrm{b}, (5), (3,5), \mathrm{t}) - (\mathrm{b}, (3), (3,5), \mathrm{t})]. \end{split}$$

Thus we have a graph with torsion in its magnitude homology groups!

by willerton ( at April 12, 2018 04:46 AM

April 10, 2018

ZapperZ - Physics and Physicists

What Astronomers Wish You Know About Dark Matter And Dark Energy
If you do a search of this blog, you will encounter numerous entries on both "dark matter" and "dark energy". It is something I've covered quite often, mainly because it is still an ongoing and active research area in astrophysics/astronomy/cosmology. Even high-energy physics/elementary particle physics is getting into the picture with particle astronomy.

In this article, Ethan Siegel gives you a condensed version of what "dark matter" and "dark energy" are, and what you need to know about them. But more importantly, if you think that you can discard them, you need to do more than just say that they are not needed.

It wasn't always apparent that this would be the solution, but this one solution works for literally all the observations. When someone puts forth the hypothesis that "dark matter and/or dark energy doesn't exist," the onus is on them to answer the implicit question, "okay, then what replaces General Relativity as your theory of gravity to explain the entire Universe?" As gravitational wave astronomy has further confirmed Einstein's greatest theory even more spectacularly, even many of the fringe alternatives to General Relativity have fallen away. The way it stands now, there are no theories that exist that successfully do away with dark matter and dark energy and still explain everything that we see. Until there are, there are no real alternatives to the modern picture that deserve to be taken seriously

It might not feel right to you, in your gut, that 95% of the Universe would be dark. It might not seem like it's a reasonable possibility when all you'd need to do, in principle, is to replace your underlying laws with new ones. But until those laws are found, and it hasn't even been shown that they could mathematically exist, you absolutely have to go with the description of the Universe that all the evidence points to. Anything else is simply an unscientific conclusion.


by ZapperZ ( at April 10, 2018 10:20 PM

Tommaso Dorigo - Scientificblogging

Interpreting The Predictions Of Deep Neural Networks
CERN has equipped itself with an inter-experimental working group on Machine Learning for a couple of years now. Besides organizing monthly meetings and other activities fostering the dissemination of knowledge and active research on the topic, the group holds a yearly meeting at CERN where, along with interesting presentations on advances and summaries, there are tutorials to teach participants the use of the fast-growing arsenal of tools that any machine-learning enthusiast these days should master.

read more

by Tommaso Dorigo at April 10, 2018 02:45 PM

April 09, 2018

Clifford V. Johnson - Asymptotia

North Carolina Science Fair!

Tomorrow I'll be giving a talk at the North Carolina Science Festival! The talk will be about black holes, time, space, movies, and books, held inside the historic Morehead planetarium. I'll sign the book for you after if you want.
Here is a link to the event.

(On the plane over, rather than doing my usual sketch-a-face-from-a-magazine exercise you might be familiar with from earlier posts, e.g. here, I made some new sketches for use in the talk. One of those is above.)

-cvj Click to continue reading this post

The post North Carolina Science Fair! appeared first on Asymptotia.

by Clifford at April 09, 2018 10:15 PM

Jester - Resonaances

Per kaons ad astra
NA62 is a precision experiment at CERN. From their name you wouldn't suspect that they're doing anything noteworthy: the collaboration was running in the contest for the most unimaginative name, only narrowly losing to CMS... NA62 employs an intense beam of charged kaons to search for the very rare decay K+ → 𝝿+ 𝜈 𝜈. The Standard Model predicts the branching fraction BR(K+ → 𝝿+ 𝜈 𝜈) = 8.4x10^-11 with a small, 10% theoretical uncertainty (precious stuff in the flavor business). The previous measurement by the BNL-E949 experiment reported BR(K+ → 𝝿+ 𝜈 𝜈) = (1.7 ± 1.1)x10^-10, consistent with the Standard Model, but still leaving room for large deviations. NA62 is expected to pinpoint the decay and measure the branching fraction with a 10% accuracy, thus severely constraining new physics contributions. The wires, pipes, and gory details of the analysis were nicely summarized by Tommaso. Let me jump directly to explaining what it is good for from the theory point of view.

To this end it is useful to adopt the effective theory perspective. At a more fundamental level, the decay occurs due to the strange antiquark inside the kaon undergoing the transformation s̄ → d̄ 𝜈 𝜈̄. In the Standard Model, the amplitude for that process is dominated by one-loop diagrams with W/Z bosons and heavy quarks. But kaons live at low energies and do not really see the fine details of the loop amplitude. Instead, they effectively see the 4-fermion contact interaction:
The mass scale suppressing this interaction is quite large, more than 1000 times larger than the W boson mass, which is due to the loop factor and small CKM matrix elements entering the amplitude. The strong suppression is the reason why the K+ → 𝝿+ 𝜈 𝜈  decay is so rare in the first place. The corollary is that even a small new physics effect inducing that effective interaction may dramatically change the branching fraction. Even a particle with a mass as large as 1 PeV coupled to the quarks and leptons with order one strength could produce an observable shift of the decay rate.  In this sense, NA62 is a microscope probing physics down to 10^-20 cm  distances, or up to PeV energies, well beyond the reach of the LHC or other colliders in this century. If the new particle is lighter, say order TeV mass, NA62 can be sensitive to a tiny milli-coupling of that particle to quarks and leptons.

So, from a model-independent perspective, the advantages of studying the K+ → 𝝿+ 𝜈 𝜈 decay are quite clear. A less trivial question is what the future NA62 measurements can teach us about our cherished models of new physics. One interesting application is in the industry of explaining the apparent violation of lepton flavor universality in B → K l+ l- and B → D l 𝜈 decays. Those anomalies involve the 3rd generation bottom quark, thus a priori they do not need to have anything to do with kaon decays. However, many of the existing models introduce flavor symmetries controlling the couplings of the new particles to matter (instead of just ad-hoc interactions to address the anomalies). The flavor symmetries may then relate the couplings of different quark generations, and thus predict correlations between new physics contributions to B meson and to kaon decays. One nice example is illustrated in this plot:

The observable RD(*) parametrizes the preference for B → D 𝜏 𝜈 over similar decays with electrons and muons, and its measurement by the BaBar collaboration deviates from the Standard Model prediction by roughly 3 sigma. The plot shows that, in a model based on U(2)xU(2) flavor symmetry, a significant contribution to RD(*) generically implies a large enhancement of BR(K+ → 𝝿+ 𝜈 𝜈), unless the model parameters are tuned to avoid that. The anomalies in the B → K(*) 𝜇 𝜇 decays can also be correlated with large effects in K+ → 𝝿+ 𝜈 𝜈, see here for an example. Finally, in the presence of new light invisible particles, such as axions, the NA62 observations can be polluted by exotic decay channels, such as K+ → 𝝿+ axion.

The  K+ → 𝝿+ 𝜈 𝜈 decay is by no means the magic bullet that will inevitably break the Standard Model.  It should be seen as one piece of a larger puzzle that may or may not provide crucial hints about new physics. For the moment, NA62 has analyzed only a small batch of data collected in 2016, and their error bars are still larger than those of BNL-E949. That should change soon when the 2017  dataset is analyzed. More data will be acquired this year, with 20 signal events expected  before the long LHC shutdown. Simultaneously, another experiment called KOTO studies an even more rare process where neutral kaons undergo the CP-violating decay KL → 𝝿0 𝜈 𝜈,  which probes the imaginary part of the effective operator written above. As I wrote recently, my feeling is that low-energy precision experiments are currently our best hope for a better understanding of fundamental interactions, and I'm glad to see a good pace of progress on this front.

by Mad Hatter ( at April 09, 2018 08:33 PM

ZapperZ - Physics and Physicists

Another "Unconventional" Superconductor?
This is definitely exciting news, because if verified, this will truly open up a whole new phase space for superconductivity.

An advance publication has appeared reporting the discovery of high-spin-state quasiparticles that are involved in superconductivity.[1] This occurs in the topological semimetal YPtBi.

Until now, superconductivity was understood to occur due to quasiparticles of spin 1/2 forming pairs called Cooper pairs, which can have a total spin of either 0 (singlet state) or 1 (triplet state). This new superconductor seems to be formed by quasiparticles having spin 3/2! The resulting Cooper pairs may have a total spin of 3 or 2.

It turns out that, based on their measurements, the pairing symmetry appears to be predominantly in the spin-3 state, with a sub-dominant spin-0 (singlet) component.

If you want to know how a quasiparticle here could be in a spin-3/2 state, then you need to recall the spin-orbit coupling that we all learned about in intro QM classes, and read the article.

This is utterly fascinating. Just when you think you can't be surprised anymore by the phenomenon of superconductivity, along comes one!


[1] H. Kim et al., Sci. Adv.2018;4

by ZapperZ ( at April 09, 2018 03:42 PM

April 07, 2018

John Baez - Azimuth

Applied Category Theory Course: Ordered Sets

My applied category theory course based on Fong and Spivak’s book Seven Sketches is going well. Over 250 people have registered for the course, which allows them to ask questions and discuss things. But even if you don’t register you can read my “lectures”.

Here are all the lectures on Chapter 1, which is about adjoint functors between posets, and how they interact with meets and joins. We study the applications to logic – both classical logic based on subsets, and the nonstandard version of logic based on partitions. And we show how this math can be used to understand “generative effects”: situations where the whole is more than the sum of its parts!

Lecture 1 – Introduction
Lecture 2 – What is Applied Category Theory?
Lecture 3 – Chapter 1: Preorders
Lecture 4 – Chapter 1: Galois Connections
Lecture 5 – Chapter 1: Galois Connections
Lecture 6 – Chapter 1: Computing Adjoints
Lecture 7 – Chapter 1: Logic
Lecture 8 – Chapter 1: The Logic of Subsets
Lecture 9 – Chapter 1: Adjoints and the Logic of Subsets
Lecture 10 – Chapter 1: The Logic of Partitions
Lecture 11 – Chapter 1: The Poset of Partitions
Lecture 12 – Chapter 1: Generative Effects
Lecture 13 – Chapter 1: Pulling Back Partitions
Lecture 14 – Chapter 1: Adjoints, Joins and Meets
Lecture 15 – Chapter 1: Preserving Joins and Meets
Lecture 16 – Chapter 1: The Adjoint Functor Theorem for Posets
Lecture 17 – Chapter 1: The Grand Synthesis

If you want to discuss these things, please visit the Azimuth Forum and register! Use your full real name as your username, with no spaces, and use a real working email address. If you don’t, I won’t be able to register you. Your email address will be kept confidential.

I’m finding this course a great excuse to put my thoughts about category theory into a more organized form, and it’s displaced most of the time I used to spend on Google+ and Twitter. That’s what I wanted: the conversations in the course are more interesting!

by John Baez at April 07, 2018 03:08 PM

April 06, 2018

Tommaso Dorigo - Scientificblogging

Machine Learning For Phenomenology
These days the use of machine learning is exploding, as problems which can be solved more effectively with it are ubiquitous, and the construction of deep neural networks or similar advanced tools is within reach of sixth graders. So it is not surprising to see theoretical physicists joining the fun. If you think that the work of a particle theorist is too abstract to benefit from ML applications, you had better think again.

read more

by Tommaso Dorigo at April 06, 2018 11:41 AM

April 05, 2018

The n-Category Cafe

Magnitude Homology Reading Seminar, II

guest post by Scott Balchin

Following on from Simon’s introductory post, this is the second installment regarding the reading group at Sheffield on magnitude homology, and the first installment which looks at the paper of Leinster and Shulman. In this post, we will be discussing the concept of magnitude for enriched categories.

The idea of magnitude is to capture the essence of the size of a (finite) enriched category. By changing the ambient enrichment, this magnitude carries different meanings. For example, when we enrich over the monoidal category $[0,\infty]$ we capture metric space invariants, while changing the enrichment to $\{\text{true}, \text{false}\}$ we capture poset invariants.

We will introduce the concept of magnitude via the use of zeta functions of enriched categories, which depend on the choice of a size function for the underlying enriching category. Then, we describe magnitude in a more general way using the theory of weightings. The latter will have the advantage that it is invariant under equivalence of categories, a highly desirable property.

What is presented here is taken almost verbatim from Section 2 of Leinster and Shulman’s Magnitude homology of enriched categories and metric spaces. It is, however, enhanced using comments from various other papers and, of course, multiple $n$-Café posts.

Sizes on monoidal categories

Recall that:

  • A symmetric monoidal category consists of a triple $(\mathbf{V}, \otimes, I)$ where $\mathbf{V}$ is a category, and $\otimes$ is a symmetric bifunctor $\mathbf{V} \times \mathbf{V} \to \mathbf{V}$ with identity object $I$.
  • A semiring (or rig) $\mathbb{K}$ is a ring without additive inverses.

Definition: A size is a function $\# \colon \mathrm{ob}(\mathbf{V}) \to \mathbb{K}$ such that:

  1. $\#$ is invariant under isomorphism: $a \cong b \Rightarrow \#a = \#b$.
  2. $\#$ is multiplicative: $\#(I) = 1$ and $\#(a \otimes b) = \#a \cdot \#b$.

Example: Let $(\mathbf{V}, \otimes, I) = (\mathrm{FinSet}, \times, \{\star\})$ and $\mathbb{K} = \mathbb{N}$. Then we can take $\#$ to be the cardinality. Note that $\mathbb{N}$ is the initial object in the category of semirings, and therefore, having defined a size valued in $\mathbb{N}$, we can define a size valued in any other semiring $S$ by composing with the unique map $\phi \colon \mathbb{N} \to S$, $\phi(1) = 1_S$.

Example: Let $(\mathbf{V}, \otimes, I) = ([0,\infty], +, 0)$. Here $[0,\infty]$ is the category whose objects are the non-negative reals together with $\infty$, where there is a morphism $a \to b$ if and only if $a \geq b$, and the monoidal structure is addition $+$. We take $\mathbb{K} = \mathbb{R}$ and set $\#a = e^{-a}$. The choice of $e$ here is arbitrary: we could take any positive real number $q$, since $q^a = e^{-ta}$ for $t = -\ln q$.
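Both examples of sizes are trivial to express in code; this sketch (the function names are my own, for illustration only) checks multiplicativity numerically in each case:

```python
import math

# Cardinality size on finite sets, with cartesian product as the tensor,
# and the size a ↦ e^{-a} on ([0, ∞], +, 0).

def size_finset(s):
    return len(s)

def size_metric(a, t=1.0):
    """e^{-t a}; t = 1 recovers the size #a = e^{-a}."""
    return math.exp(-t * a)

# Multiplicativity #(a ⊗ b) = #a · #b holds in both cases:
A, B = {1, 2}, {"x", "y", "z"}
product = {(a, b) for a in A for b in B}
assert size_finset(product) == size_finset(A) * size_finset(B)
assert math.isclose(size_metric(2 + 3), size_metric(2) * size_metric(3))
print(size_finset(product), size_metric(5))
```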

Let $\mathbf{V}$ be essentially small, and let $\mathbb{K} = \mathbb{N}[\mathrm{ob}(\mathbf{V})/{\cong}]$ be the monoid semiring of the monoid of isomorphism classes of objects of $\mathbf{V}$. This is the universal example, in that any other size on $\mathbf{V}$ factors uniquely through it.

For example, if $\mathbf{V} = [0,\infty]$ as before, then the elements of this universal semiring are formal $\mathbb{N}$-linear combinations of numbers in $[0,\infty]$, and are therefore of the form

$$a_1[\ell_1] + a_2[\ell_2] + \cdots + a_n[\ell_n].$$

Since multiplication in $\mathbb{K}$ is defined via $[\ell_1] \cdot [\ell_2] = [\ell_1 + \ell_2]$, it makes more sense to write $[\ell]$ as $q^{\ell}$ for a formal variable $q$. Therefore we can see the elements of $\mathbb{K}$ represented as generalised polynomials

$$a_1 q^{\ell_1} + a_2 q^{\ell_2} + \cdots + a_n q^{\ell_n}$$

where $\ell_i \in [0,\infty]$. We write this semiring of generalised polynomials as $\mathbb{N}[q^{[0,\infty]}]$.

We can now compare this universal size construction with the previous example of <semantics>#a=e a<annotation encoding="application/x-tex">\operatorname{\#}a = e^{-a}</annotation></semantics>. There is an evaluation map <semantics>[q [0,]]<annotation encoding="application/x-tex">\mathbb{N}[q^{[0,\infty ]}] \to \mathbb{R}</annotation></semantics> that substitutes <semantics>e 1<annotation encoding="application/x-tex">e^{-1}</annotation></semantics> (or any other positive real number) for <semantics>q<annotation encoding="application/x-tex">q</annotation></semantics>. Therefore, the universal size valued in <semantics>[q [0,]]<annotation encoding="application/x-tex">\mathbb{N}[q^{[0,\infty ]}]</annotation></semantics> contains all of the information of the sizes <semantics>ae ta<annotation encoding="application/x-tex">a\mapsto e^{-t a}</annotation></semantics> for all values of <semantics>t<annotation encoding="application/x-tex">t</annotation></semantics>.
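To make the universal semiring and its evaluation maps concrete, here is a small Python sketch (my own illustration, not from the post; representing a generalised polynomial as a dict from exponents to coefficients is just one convenient encoding) of $\mathbb{N}[q^{[0,\infty]}]$ together with the substitution $q = e^{-t}$:

```python
from collections import defaultdict
import math

# A generalised polynomial in N[q^[0,inf]] as {exponent: coefficient}.
def poly_mul(p, r):
    """Multiply using q^a * q^b = q^(a+b)."""
    out = defaultdict(int)
    for a, ca in p.items():
        for b, cb in r.items():
            out[a + b] += ca * cb
    return dict(out)

def poly_add(p, r):
    out = defaultdict(int)
    for a, c in list(p.items()) + list(r.items()):
        out[a] += c
    return dict(out)

def evaluate(p, t):
    """The evaluation map N[q^[0,inf]] -> R substituting q = e^{-t}."""
    return sum(c * math.exp(-t * a) for a, c in p.items())

# Sizes of the objects 1 and 2 of ([0,inf], +, 0): #1 = q^1, #2 = q^2.
p = poly_mul({1.0: 1}, {2.0: 1})   # q^1 * q^2 = q^3, mirroring #(1 + 2)
print(p)                            # {3.0: 1}
print(evaluate(p, 1.0))             # e^{-3}
```

Evaluating the same universal element at different values of $t$ recovers each of the sizes $a \mapsto e^{-ta}$, which is the sense in which the universal size contains all of that information.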

Here are some further examples of sizes associated to other symmetric monoidal categories. However, we will not be considering any of these in the rest of this post.


  • <semantics>(V,,I)=(sSet,,Δ[0])<annotation encoding="application/x-tex">(\mathbf{V},\otimes ,I) = (\mathbf{sSet},\otimes ,\Delta [0])</annotation></semantics>. We can take <semantics>#<annotation encoding="application/x-tex">\operatorname{\#}</annotation></semantics> to be the Euler characteristic of the realisation of the simplicial set.
  • $(\mathbf{V},\otimes,I) = (\mathbf{FDVect},\otimes,\mathbb{C})$. We can take $\#$ to be the dimension of the vector space.
  • $(\mathbf{V},\otimes,I) = ([0,\infty],\max,0)$. This is the same category as above, but with the monoidal structure changed from addition to maximum. Categories enriched over this are ultrametric spaces, such as the $p$-adic numbers. In this case $\#$ cannot be $e^{-a}$; instead it needs to be some form of indicator function. We (arbitrarily) choose the interval $[0,1]$ and say that $\#a = 1$ if $a \leq 1$, and $\#a = 0$ otherwise.
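The constraint behind that last bullet is monoidality of the size, $\#(a \otimes b) = \#a \cdot \#b$, and it is easy to check by brute force. The following sketch (my own, not from the post) verifies that the indicator of $[0,1]$ is monoidal for $\max$ while $e^{-a}$ is not:

```python
import itertools
import math

# Monoidality of a size requires #(a ⊗ b) = #a · #b.  For ([0,∞], max, 0)
# the exponential e^{-a} fails this, but the indicator of [0,1] works,
# because max(a, b) <= 1 exactly when a <= 1 and b <= 1.
def size_indicator(a):
    return 1 if a <= 1 else 0

samples = [0, 0.5, 1, 1.5, 3, math.inf]
# The indicator is monoidal with respect to max:
assert all(size_indicator(max(a, b)) == size_indicator(a) * size_indicator(b)
           for a, b in itertools.product(samples, repeat=2))
# The exponential is not, e.g. at a = b = 1: e^{-1} != e^{-1} * e^{-1}.
assert not math.isclose(math.exp(-max(1, 1)), math.exp(-1) * math.exp(-1))
print("indicator size is monoidal for max")
```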

Enriched categories

Many people think that an enriched category is a category in which the hom-sets have extra structure. Whilst such a thing is usually an enriched category, the notion of enriched category is much more encompassing than that. The homs in an enriched category might not be sets, they might just be objects in some abstract category, so they might not even have elements, as we will see in the metric space example below.

Definition: For $(\mathbf{V},\otimes,I)$ a monoidal category, a category enriched over $\mathbf{V}$ (or, a $\mathbf{V}$-category) $X$ consists of a set of objects $\operatorname{ob}(X)$ such that the following hold:

  1. for each pair $a,b \in \operatorname{ob}(X)$ there is a specified object $X(a,b) \in \mathbf{V}$ called the hom-object;
  2. for each triple <semantics>a,b,cob(X)<annotation encoding="application/x-tex">a,b,c \in \operatorname{ob}(X)</annotation></semantics> there is a specified morphism <semantics>X(a,b)X(b,c)X(a,c)<annotation encoding="application/x-tex">X(a,b)\otimes X(b,c)\to X(a,c)</annotation></semantics> in <semantics>V<annotation encoding="application/x-tex">\mathbf{V}</annotation></semantics> called composition;
  3. for each object $a \in \operatorname{ob}(X)$ there is a specified morphism $\mathrm{id}_a \colon I \to X(a,a)$ in $\mathbf{V}$ called the identity.

These are required to satisfy associativity and identity axioms which we won’t go into here; see the nLab for the details.

Example: If <semantics>(V,,I)=(FinSet,×,{})<annotation encoding="application/x-tex">(\mathbf{V},{\otimes}, I)=(\text {FinSet},\times, \{\star\})</annotation></semantics> then a <semantics>V<annotation encoding="application/x-tex">\mathbf{V}</annotation></semantics>-category is precisely a small category with finite hom-sets.

Example: If $(\mathbf{V},\otimes,I) = ([0,\infty],+,0)$ then a $\mathbf{V}$-category is an extended quasi-pseudo metric space (in the sense of Lawvere). The various adjectives here mean the following:

  • pseudo: $d(x,y) = 0$ does not imply $x = y$.
  • quasi: <semantics>d(x,y)<annotation encoding="application/x-tex">d(x,y)</annotation></semantics> is not necessarily equal to <semantics>d(y,x)<annotation encoding="application/x-tex">d(y,x)</annotation></semantics>.
  • extended: <semantics>d(x,y)<annotation encoding="application/x-tex">d(x,y)</annotation></semantics> is allowed to be <semantics><annotation encoding="application/x-tex">\infty </annotation></semantics>.

Why does this enrichment give us something like a metric space? For each pair of objects $x,y$, write the hom-object $X(x,y) \in [0,\infty]$ as $d(x,y)$. The identity axiom of the enrichment tells us that for each object $x \in X$ there is a morphism $0 \to d(x,x)$ in $[0,\infty]$, i.e. $0 \geq d(x,x)$; since $d(x,x) \geq 0$ automatically, we get $d(x,x) = 0$. Composition tells us that for every triple of objects $x,y,z \in X$ there is a morphism $d(x,y) + d(y,z) \to d(x,z)$, i.e. $d(x,y) + d(y,z) \geq d(x,z)$, which is the triangle inequality.
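As a sanity check, the two enriched-category axioms for $[0,\infty]$-categories can be verified mechanically. Here is a small sketch (my own illustration) that tests exactly the identity and composition conditions, and nothing more, on a finite table of distances:

```python
import math

def is_lawvere_metric(d):
    """Check the two enriched-category axioms for a distance table
    d[(x, y)]: identity d(x, x) = 0, and composition
    d(x, y) + d(y, z) >= d(x, z).  Symmetry and "d(x, y) = 0 implies
    x = y" are deliberately NOT required (quasi, pseudo); infinite
    distances are allowed (extended)."""
    points = {x for x, _ in d}
    return (all(d[(x, x)] == 0 for x in points) and
            all(d[(x, y)] + d[(y, z)] >= d[(x, z)]
                for x in points for y in points for z in points))

# An asymmetric (quasi) example with an infinite (extended) distance:
d = {('a', 'a'): 0, ('b', 'b'): 0, ('a', 'b'): 1, ('b', 'a'): math.inf}
print(is_lawvere_metric(d))   # True
```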

Magnitude of finite enriched categories

For us, a square matrix will be one whose rows and columns are indexed by the same finite set (this means that we do not impose an ordering on the rows and columns). In particular, there is a category whose objects are finite sets, and whose morphisms <semantics>AB<annotation encoding="application/x-tex">A \to B</annotation></semantics> are functions <semantics>A×B𝕂<annotation encoding="application/x-tex">A \times B \to \mathbb{K}</annotation></semantics> with composition by matrix multiplication. The square matrices that we are interested in are the endomorphisms of this category. Note that this latter description illuminates what we mean by such a square matrix being invertible.

Definition: Let <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics> be a <semantics>V<annotation encoding="application/x-tex">\mathbf{V}</annotation></semantics>-category with finitely many objects, where we denote the hom-object by <semantics>X(x,y)<annotation encoding="application/x-tex">X(x,y)</annotation></semantics>. Then its zeta function is the <semantics>ob(X)×ob(X)<annotation encoding="application/x-tex">\text {ob}(X) \times \text {ob}(X)</annotation></semantics> matrix over <semantics>𝕂<annotation encoding="application/x-tex">\mathbb{K}</annotation></semantics> such that

<semantics>Z X,𝕂(x,y)=#(X(x,y)).<annotation encoding="application/x-tex"> Z_{X,\mathbb{K}}(x,y) = \operatorname{\#}(X(x,y)). </annotation></semantics>

Our notation is slightly different here to the usual, in that we wish to explicitly keep track of the semiring <semantics>𝕂<annotation encoding="application/x-tex">\mathbb{K}</annotation></semantics>.

Definition: We say that $X$ has Möbius inversion (with respect to $\mathbb{K}$ and $\#$) if $Z_{X,\mathbb{K}}$ is invertible over $\mathbb{K}$. If $X$ has Möbius inversion, then its magnitude $\operatorname{Mag}_{\mathbb{K}}(X)$ is the sum of all the entries of $Z_{X,\mathbb{K}}^{-1}$.
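Once the zeta matrix is written down, magnitude is straightforward to compute by machine. A minimal sketch (my own, using exact rational arithmetic so that invertibility over $\mathbb{Q}$ is decided exactly), applied to the zeta matrix of the category $b_1 \leftarrow a \rightarrow b_2$ with objects ordered $(a, b_1, b_2)$:

```python
from fractions import Fraction

def invert(M):
    """Invert a square matrix over Q by Gauss-Jordan elimination."""
    n = len(M)
    A = [[Fraction(M[i][j]) for j in range(n)] +
         [Fraction(int(i == j)) for j in range(n)] for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [x / p for x in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                f = A[r][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]
    return [row[n:] for row in A]

def magnitude(Z):
    """Sum of all the entries of Z^{-1}."""
    return sum(sum(row) for row in invert(Z))

# Zeta matrix of b1 <- a -> b2, objects ordered (a, b1, b2):
Z = [[1, 1, 1],
     [0, 1, 0],
     [0, 0, 1]]
print(magnitude(Z))   # 1
```

If the matrix is not invertible (for instance when two objects are isomorphic and two rows coincide), the pivot search fails, matching the fact that such a category has no Möbius inversion.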

Example: Let <semantics>V=FinSet<annotation encoding="application/x-tex">\mathbf{V} = \text {FinSet}</annotation></semantics>, <semantics>𝕂=<annotation encoding="application/x-tex">\mathbb{K} = \mathbb{Q}</annotation></semantics> and <semantics>#<annotation encoding="application/x-tex">#</annotation></semantics> be the cardinality. We take <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics> to be any finite category which is skeletal (i.e., isomorphic objects are necessarily equal) and contains no nonidentity endomorphisms. Then <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics> has Möbius inversion, and its magnitude is equal to the Euler characteristic of the geometric realisation of its nerve.

Note that if $X$ were not skeletal, then $Z_{X,\mathbb{K}}$ would have two identical rows and its determinant would be zero. This leads to the observation that the magnitude of a category is not invariant under equivalence of categories.

Let us expand a bit on the last comment made in the example above. Let $X$ be a category of the above form. Given $a,b \in X$, an $n$-path from $a$ to $b$ is a diagram

$$ a = a_0 \xrightarrow{f_1} a_1 \xrightarrow{f_2} \cdots \xrightarrow{f_n} a_n = b. $$

Such a path is a circuit if <semantics>a=b<annotation encoding="application/x-tex">a=b</annotation></semantics>, and non-degenerate if no <semantics>f i<annotation encoding="application/x-tex">f_{i}</annotation></semantics> is the identity. Then for our particular <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics>, we have

$$ Z_{X,\mathbb{Q}}^{-1}(a,b) = \sum_{n \geq 0} (-1)^{n} \,\bigl|\{\text{non-degenerate } n\text{-paths from } a \text{ to } b\}\bigr| \in \mathbb{Z} $$

Now, we note that for our choice of <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics>, the nerve contains only finitely many non-degenerate simplices and we get that

$$ \chi(|NX|) = \sum_{n \geq 0} (-1)^{n} \,\bigl|\{\text{non-degenerate } n\text{-simplices in } NX\}\bigr| $$

The claimed result then follows, since the non-degenerate $n$-simplices of $NX$ are precisely the non-degenerate $n$-paths in $X$, so summing the previous formula over all pairs $(a,b)$ gives $\chi(|NX|)$.

Example: Remember from above that if <semantics>V=[0,]<annotation encoding="application/x-tex">\mathbf{V}=[0,\infty ]</annotation></semantics> then a <semantics>V<annotation encoding="application/x-tex">\mathbf{V}</annotation></semantics>-category is an extended quasi-pseudo metric space. With the family of <semantics><annotation encoding="application/x-tex">\mathbb{R}</annotation></semantics>-valued size functions <semantics>e td<annotation encoding="application/x-tex">e^{-t d}</annotation></semantics>, the resulting magnitude of an (extended quasi-pseudo-)metric space is an object of interest and has been studied extensively.

A matrix is more likely to be invertible over a field or a ring than over a semiring. So if $\mathbb{K}$ is given as a semiring, it makes sense to complete it universally to a ring or a field. The universal semirings above can be completed to rings by allowing integer coefficients instead of natural number coefficients. These rings need not be integral domains; in particular, $\mathbb{Z}[q^{[0,\infty]}]$ contains zero divisors:

<semantics>q (1q )=q q +=q q =0.<annotation encoding="application/x-tex"> q^{\infty }(1-q^{\infty }) = q^{\infty }- q^{\infty +\infty } = q^{\infty }- q^{\infty }= 0. </annotation></semantics>

However, by omitting $\infty$ (and so only caring about quasi-pseudo metrics) we do get an integral domain $\mathbb{Z}[q^{[0,\infty)}]$. Its field of fractions, written $\mathbb{Q}(q^{[0,\infty)})$ (or, more suggestively, $\mathbb{Q}(q^{\mathbb{R}})$), consists of generalised rational functions

<semantics>a 1q 1+a 2q 2++a nq nb 1q k 1+b 2q k 2++b mq k m<annotation encoding="application/x-tex"> \frac{a_{1}q^{\ell _{1}} + a_{2}q^{\ell _{2}} + \cdots + a_{n}q^{\ell _{n}}}{b_{1}q^{k_{1}} + b_{2}q^{k_{2}} + \cdots + b_{m}q^{k_{m}}} </annotation></semantics> with <semantics>a i,b j<annotation encoding="application/x-tex">a_{i},b_{j} \in \mathbb{Q}</annotation></semantics> and <semantics> i,k j<annotation encoding="application/x-tex">\ell _{i},k_{j} \in \mathbb{R}</annotation></semantics>.

Theorem: Any finite quasi-metric space (i.e., a finite skeletal <semantics>[0,)<annotation encoding="application/x-tex">[0,\infty )</annotation></semantics>-category) has Möbius inversion over <semantics>(q )<annotation encoding="application/x-tex">\mathbb{Q}(q^{\mathbb{R}})</annotation></semantics>.

To prove this, we make the field $\mathbb{Q}(q^{\mathbb{R}})$ ordered by inheriting the order from $\mathbb{Q}$ and declaring the variable $q$ to be a positive infinitesimal: we order the generalised polynomials lexicographically on their coefficients, starting from the most negative exponent of $q$.

The condition $d(x,x) = 0$ of a metric space gives us that the diagonal entries of $Z_{X,\mathbb{Q}(q^{\mathbb{R}})}$ are all $q^{0} = 1$. The skeletal condition ($d(x,y) \gt 0$ for $x \neq y$) means that each off-diagonal entry is $q^{d(x,y)}$, which is infinitesimal since $d(x,y) \gt 0$. Expanding the determinant of $Z_{X,\mathbb{Q}(q^{\mathbb{R}})}$ over permutations, the identity permutation contributes the product of the diagonal entries, namely $1$, and every other permutation contributes an infinitesimal; the determinant is therefore $1$ plus a finite sum of infinitesimals, which is positive and in particular non-zero. Hence $Z_{X,\mathbb{Q}(q^{\mathbb{R}})}$ is indeed invertible, and the theorem is proved.
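For a concrete instance, one can specialise $q$ to $e^{-t}$ and compute numerically. For the symmetric two-point space with $d(x,y) = d(y,x) = d$, the zeta matrix is $\begin{pmatrix}1 & z\\ z & 1\end{pmatrix}$ with $z = e^{-td}$, and the sum of the entries of its inverse works out to $2/(1+e^{-td})$. A quick check (my own, not from the post):

```python
import math

def two_point_magnitude(d, t):
    """Magnitude of the symmetric two-point metric space {x, y} with
    d(x, y) = d, using the size #a = e^{-t a}: the sum of the entries
    of the inverse of [[1, z], [z, 1]] where z = e^{-t d}."""
    z = math.exp(-t * d)
    det = 1 - z * z
    inv = [[1 / det, -z / det], [-z / det, 1 / det]]
    return sum(map(sum, inv))

# Closed form: 2 / (1 + e^{-t d}).
assert math.isclose(two_point_magnitude(1.0, 2.0), 2 / (1 + math.exp(-2.0)))
print(two_point_magnitude(1.0, 0.001))   # close to 1: points fuse as t -> 0
print(two_point_magnitude(1.0, 20.0))    # close to 2: points separate as t grows
```

The interpolation between "effectively one point" and "two points" as $t$ varies is one reason the magnitude of metric spaces has been studied so extensively.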

Magnitudes via weightings

We now look at a way of generalising magnitudes using weightings. The advantage of this will be invariance under equivalence, a property highly desirable for any categorical invariant.

Definition: A weighting on a finite <semantics>V<annotation encoding="application/x-tex">\mathbf{V}</annotation></semantics>-category <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics> is a function <semantics>w:ob(X)𝕂<annotation encoding="application/x-tex">w \colon \text {ob}(X) \to \mathbb{K}</annotation></semantics> such that <semantics> y#(X(x,y))w(y)=1<annotation encoding="application/x-tex">\sum _{y} \operatorname{\#}(X(x,y)) \cdot w(y) = 1</annotation></semantics> for all <semantics>xX<annotation encoding="application/x-tex">x \in X</annotation></semantics>. A coweighting on <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics> is a weighting on <semantics>X op<annotation encoding="application/x-tex">X^{op}</annotation></semantics>.

Here are some simple examples.


  • Consider the category (i.e., enriched over $\mathbf{FinSet}$) $b_1 \leftarrow a \rightarrow b_2$. This category carries a unique weighting, given by $w(a) = -1$ and $w(b_i) = 1$.
  • It is possible for a category to have no weightings. For an instance of this see Example 1.11(c) of Leinster’s paper The Euler characteristic of a category.
  • Consider a category with two objects and an isomorphism between them. A weighting is then given by a pair of rational numbers whose sum is 1. Therefore, there are infinitely many weightings on this category.
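A weighting is just a solution of the linear system $Z_{X,\mathbb{K}}\, w = (1,\dots,1)^{T}$, so when the zeta matrix is invertible it can be found by elimination. A sketch (my own; it assumes the solution is unique, and the pivot search fails on examples like the two-isomorphic-objects category above):

```python
from fractions import Fraction

def weighting(Z):
    """Solve Z w = (1, ..., 1) exactly by Gauss-Jordan elimination,
    i.e. find w with sum_y #X(x, y) * w(y) = 1 for every object x."""
    n = len(Z)
    A = [[Fraction(Z[i][j]) for j in range(n)] + [Fraction(1)]
         for i in range(n)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if A[r][col] != 0)
        A[col], A[pivot] = A[pivot], A[col]
        p = A[col][col]
        A[col] = [x / p for x in A[col]]
        for r in range(n):
            if r != col and A[r][col] != 0:
                f = A[r][col]
                A[r] = [x - f * y for x, y in zip(A[r], A[col])]
    return [row[n] for row in A]

# b1 <- a -> b2, objects ordered (a, b1, b2):
Z = [[1, 1, 1], [0, 1, 0], [0, 0, 1]]
print(weighting(Z))   # [-1, 1, 1]: w(a) = -1, w(b_i) = 1, as above
```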

We can relate the notion of weighting to Möbius inversion in the following way.

Theorem: If <semantics>𝕂<annotation encoding="application/x-tex">\mathbb{K}</annotation></semantics> is a field, then a <semantics>V<annotation encoding="application/x-tex">\mathbf{V}</annotation></semantics>-category <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics> has Möbius inversion if and only if it has a unique weighting <semantics>w<annotation encoding="application/x-tex">w</annotation></semantics>, and if and only if it has a unique coweighting <semantics>v<annotation encoding="application/x-tex">v</annotation></semantics>, in which case <semantics>Mag(X)= xw(x)= xv(x)<annotation encoding="application/x-tex">\operatorname{Mag}(X) = \sum _{x} w(x) = \sum _{x} v(x)</annotation></semantics>.


  • The category $b_1 \leftarrow a \rightarrow b_2$ has zeta function (objects ordered $a, b_1, b_2$) $$ Z_{X,\mathbb{Q}} = \begin{pmatrix} 1 & 1 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} $$ This matrix has inverse $$ Z_{X,\mathbb{Q}}^{-1} = \begin{pmatrix} 1 & -1 & -1 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} $$ and therefore its magnitude is $\operatorname{Mag}(X) = 1+1+1-1-1 = 1$. Recalling the weighting on $X$ from above, we see that the sum of the weights is likewise $1+1-1 = 1$.

  • The category $\bullet \cong \bullet$ has zeta function $$ Z_{X,\mathbb{Q}} = \begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix} $$ which clearly has no inverse; therefore the category has no Möbius inversion, and no magnitude in the sense above.

Theorem: If a <semantics>V<annotation encoding="application/x-tex">\mathbf{V}</annotation></semantics>-category <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics> has both a weighting <semantics>w<annotation encoding="application/x-tex">w</annotation></semantics> and a coweighting <semantics>v<annotation encoding="application/x-tex">v</annotation></semantics> then <semantics> xw(x)= xv(x)<annotation encoding="application/x-tex">\sum _{x} w(x) = \sum _{x} v(x)</annotation></semantics>.

Definition: A <semantics>V<annotation encoding="application/x-tex">\mathbf{V}</annotation></semantics>-category has magnitude if it has both a weighting <semantics>w<annotation encoding="application/x-tex">w</annotation></semantics> and a coweighting <semantics>v<annotation encoding="application/x-tex">v</annotation></semantics>, in which case its magnitude is the common value of <semantics> xw(x)<annotation encoding="application/x-tex">\sum _{x} w(x)</annotation></semantics> and <semantics> xv(x)<annotation encoding="application/x-tex">\sum _{x} v(x)</annotation></semantics>.

The advantage of this generalised notion of magnitude is that it is invariant under equivalence of $\mathbf{V}$-enriched categories, unlike the definition involving Möbius inversion.

Theorem: If <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics> and <semantics>X<annotation encoding="application/x-tex">X'</annotation></semantics> are equivalent <semantics>V<annotation encoding="application/x-tex">\mathbf{V}</annotation></semantics>-enriched categories, and <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics> has a weighting, a coweighting, or has magnitude, then so does <semantics>X<annotation encoding="application/x-tex">X'</annotation></semantics>.

We provide a brief explanation of why this is true. Let $F \colon X \to X'$ be an equivalence. Given $a \in X$, write $C_a$ for the number of objects in the isomorphism class of $a$, and similarly $C_{a'}$ for $a' \in X'$. Take a weighting $l$ on $X'$, and set $k(a) = (C_{Fa}/C_a)\, l(Fa)$. Then $k$ is a weighting on $X$.

Theorem: If <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics> and <semantics>X<annotation encoding="application/x-tex">X'</annotation></semantics> are equivalent <semantics>V<annotation encoding="application/x-tex">\mathbf{V}</annotation></semantics>-enriched categories, and both have magnitude, then <semantics>Mag(X)=Mag(X)<annotation encoding="application/x-tex">\operatorname{Mag}(X) = \operatorname{Mag}(X')</annotation></semantics>.

Next time we will be looking at how Hochschild homology comes into the picture.

by willerton at April 05, 2018 04:18 PM

April 02, 2018

The n-Category Cafe

Dynamical Systems and Their Steady States

guest post by Maru Sarazola

Now that we know how to use decorated cospans to represent open networks, the Applied Category Theory Seminar has turned its attention to open reaction networks (aka Petri nets) and the dynamical systems associated to them.

In A Compositional Framework for Reaction Networks (summarized in this very blog by John Baez not too long ago), authors John Baez and Blake Pollard put Fong’s results to good use and define cospan categories <semantics>RxNet<annotation encoding="application/x-tex">\mathbf{RxNet}</annotation></semantics> and <semantics>Dynam<annotation encoding="application/x-tex">\mathbf{Dynam}</annotation></semantics> of (open) reaction networks and (open) dynamical systems. Once this is done, the main goal of the paper is to show that the mapping that associates to an open reaction network its corresponding dynamical system is compositional, as is the mapping that takes an open dynamical system to the relation that holds between its constituents in steady state. In other words, they show that the study of the whole can be done through the study of the parts.

I would like to place the focus on dynamical systems and the study of their steady states, taking a closer look at this correspondence called “black-boxing”, and comparing it to previous related work done by David Spivak.

Baez–Pollard’s approach

The category <semantics>Dynam<annotation encoding="application/x-tex">\mathbf{Dynam}</annotation></semantics> of open dynamical systems

Let’s start by introducing the main players. A dynamical system is usually defined as a manifold $M$ whose points are “states”, together with a smooth vector field on $M$ saying how these states evolve in time. Since the motivation in this paper comes from chemistry, our manifolds will be Euclidean spaces $\mathbb{R}^S$, where $S$ should be thought of as the finite set of species involved, and a vector $c \in \mathbb{R}^S$ gives the concentration of each species. The dynamical system is then a differential equation

<semantics>dc(t)dt=v(c(t))<annotation encoding="application/x-tex">\frac{d c(t)}{d t}=v(c(t))</annotation></semantics>

where <semantics>c: S<annotation encoding="application/x-tex">c:\mathbb{R}\to\mathbb{R}^S</annotation></semantics> gives the concentrations as a function of time, and <semantics>v<annotation encoding="application/x-tex">v</annotation></semantics> is a vector field on <semantics> S<annotation encoding="application/x-tex">\mathbb{R}^S</annotation></semantics>.
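A dynamical system in this sense can be simulated directly. The sketch below (my own, with a made-up one-species vector field; none of this is from the paper) uses naive forward-Euler integration of $dc/dt = v(c)$:

```python
def simulate(v, c0, dt=0.001, steps=10_000):
    """Forward-Euler integration of dc/dt = v(c), where the
    concentration vector c is represented as a plain list indexed
    by species."""
    c = list(c0)
    for _ in range(steps):
        rate = v(c)
        c = [ci + dt * ri for ci, ri in zip(c, rate)]
    return c

# Hypothetical one-species system dc/dt = 1 - c: concentrations relax
# towards c = 1 regardless of the starting point.
print(simulate(lambda c: [1 - c[0]], [5.0]))   # close to [1.0]
```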

Now imagine our motivating chemical system is open; that is, we are allowed to inject molecules of some chosen species, and remove some others. An open dynamical system is a cospan of finite sets $X \xrightarrow{i} S \xleftarrow{o} Y$ together with a vector field $v$ on $\mathbb{R}^S$. The legs of the cospan mark the species that we’re allowed to inject and remove, labelled $i$ for input and $o$ for output.

So, how can we build a category from this? Loosely citing a result of Fong, if the decorations of the cospan (in this case, the vector fields) can be given through a functor <semantics>F:(FinSet,+)(Set,×)<annotation encoding="application/x-tex">F:(\mathbf{FinSet},+)\to(\mathbf{Set},\times )</annotation></semantics> that is lax monoidal, then we can form a category whose objects are finite sets, and whose morphisms are (iso classes of) decorated cospans.

Indeed, this can be done in a very natural way, and therefore gives rise to the category <semantics>Dynam<annotation encoding="application/x-tex">\mathbf{Dynam}</annotation></semantics>, whose morphisms are open dynamical systems.

The black-boxing functor <semantics>:DynamRel<annotation encoding="application/x-tex">\blacksquare :\mathbf{Dynam}\to\mathbf{Rel}</annotation></semantics>

Given a dynamical system, one of the first things we might like to do is to study its fixed points; in our case, study the concentration vectors that remain constant in time. When working with an open dynamical system, it’s clear that the amounts that we choose to inject and remove will alter the change in concentration of our species, and hence it makes sense to consider the following.

For an open dynamical system <semantics>(XiSoY,v)<annotation encoding="application/x-tex">(X\xrightarrow{i} S \xleftarrow{o} Y, v)</annotation></semantics>, together with a constant inflow <semantics>I X<annotation encoding="application/x-tex">I\in\mathbb{R}^X</annotation></semantics> and constant outflow <semantics>O Y<annotation encoding="application/x-tex">O\in\mathbb{R}^Y</annotation></semantics>, a steady state (with inflows <semantics>I<annotation encoding="application/x-tex">I</annotation></semantics> and outflows <semantics>O<annotation encoding="application/x-tex">O</annotation></semantics>) is a constant vector of concentrations <semantics>c S<annotation encoding="application/x-tex">c\in\mathbb{R}^S</annotation></semantics> such that

<semantics>v(c)+i *(I)o *(O)=0<annotation encoding="application/x-tex">v(c)+i_{\ast} (I)-o_{\ast} (O)=0</annotation></semantics>

Here <semantics>i *(I)<annotation encoding="application/x-tex">i_{\ast} (I)</annotation></semantics> is the vector in <semantics> S<annotation encoding="application/x-tex">\mathbb{R}^S</annotation></semantics> given by <semantics>i *(I)(s)= xX:i(x)=sI(x)<annotation encoding="application/x-tex">i_{\ast} (I)(s)=\sum_{x\in X: i(x)=s} I(x)</annotation></semantics>; that is, the inflow concentration of all species as marked by the input leg of the cospan. As the authors concisely put it, “in a steady state, the inflows and outflows conspire to exactly compensate for the reaction velocities”.

Note that the inflow and outflow functions <semantics>I<annotation encoding="application/x-tex">I</annotation></semantics> and <semantics>O<annotation encoding="application/x-tex">O</annotation></semantics> won’t affect any species not marked by the legs of the cospan, and so any steady state <semantics>c<annotation encoding="application/x-tex">c</annotation></semantics> must be such that <semantics>v(c)=0<annotation encoding="application/x-tex">v(c)=0</annotation></semantics> when restricted to these inner species that we can’t reach.
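To make the pushforward and the steady-state equation concrete, here is a minimal numeric sketch in Python. The toy system (one reaction converting species 0 into species 1) and all names in it are invented for illustration:

```python
import numpy as np

def pushforward(leg, vec, n_species):
    """leg_*(vec): for each species s, sum vec(x) over all x with leg(x) = s."""
    out = np.zeros(n_species)
    for x, s in enumerate(leg):
        out[s] += vec[x]
    return out

# Toy open system: species {0, 1}; the input leg touches species 0,
# the output leg touches species 1, and species 0 converts into species 1.
i, o = [0], [1]
v = lambda c: np.array([-c[0], c[0]])   # reaction velocities

I = np.array([2.0])        # constant inflow at the input port
O = np.array([2.0])        # constant outflow at the output port
c = np.array([2.0, 5.0])   # candidate concentration vector

# c is a steady state iff v(c) + i_*(I) - o_*(O) = 0.
residual = v(c) + pushforward(i, I, 2) - pushforward(o, O, 2)
print(residual)  # [0. 0.] -- the inflow exactly feeds the reaction, which feeds the outflow
```

Here the inflow at species 0 exactly balances the reaction velocity, which in turn balances the outflow at species 1, so the concentrations "conspire" into a steady state.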

What we want to do next is build a functor that, given an open dynamical system, records all possible combinations of input concentrations, output concentrations, inflows and outflows that hold in steady state. This process will be called black-boxing, since it discards information that can’t be seen at the inputs and outputs.

The black-boxing functor <semantics>:DynamRel<annotation encoding="application/x-tex">\blacksquare:\mathbf{Dynam}\to \mathbf{Rel}</annotation></semantics> takes a finite set <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics> to the vector space <semantics> X X<annotation encoding="application/x-tex">\mathbb{R}^X\oplus\mathbb{R}^X</annotation></semantics>, and a morphism, that is, an open dynamical system <semantics>f=(XiSoY,v)<annotation encoding="application/x-tex">f=(X\xrightarrow{i} S \xleftarrow{o} Y, v)</annotation></semantics>, to the subset

<semantics>(f) X X Y Y<annotation encoding="application/x-tex">\blacksquare(f)\subseteq\mathbb{R}^X\oplus\mathbb{R}^X\oplus\mathbb{R}^Y\oplus\mathbb{R}^Y</annotation></semantics>

<semantics>(f)={(i *(c),I,o *(c),O):c  is a steady state with inflows  I  and outflows  O}<annotation encoding="application/x-tex">\blacksquare(f)=\{(i^{\ast} (c),I,o^{\ast} (c),O): c &nbsp; \text{ is a steady state with inflows } &nbsp; I &nbsp; \text{ and outflows } &nbsp; O\}</annotation></semantics>

where <semantics>i *(c)<annotation encoding="application/x-tex">i^{\ast} (c)</annotation></semantics> is the vector in <semantics> X<annotation encoding="application/x-tex">\mathbb{R}^X</annotation></semantics> defined by <semantics>i *(c)(x)=c(i(x))<annotation encoding="application/x-tex">i^{\ast} (c) (x)=c(i(x))</annotation></semantics>; that is, the concentration of the input species.

The authors prove that black-boxing is indeed a functor, which implies that if we want to study the steady states of a complex open dynamical system, we can break it up into smaller, simpler pieces and study their steady states. In other words, studying the steady states of a big system, which is given by the composition of smaller systems (as morphisms in the category <semantics>Dynam<annotation encoding="application/x-tex">\mathbf{Dynam}</annotation></semantics>), amounts to studying the steady states of each of the smaller systems, and composing them (as morphisms in <semantics>Rel<annotation encoding="application/x-tex">\mathbf{Rel}</annotation></semantics>).

Spivak’s approach

The category <semantics>𝒲<annotation encoding="application/x-tex">\mathcal{W}</annotation></semantics> of wiring diagrams

Instead of dealing with dynamical systems from the start, Spivak takes a step back and develops a syntax for boxes, which are things that admit inputs and outputs.

Let’s define the category <semantics>𝒲 𝒞<annotation encoding="application/x-tex">\mathcal{W}_\mathcal{C}</annotation></semantics> of <semantics>𝒞<annotation encoding="application/x-tex">\mathcal{C}</annotation></semantics>-boxes and wiring diagrams, for a category <semantics>𝒞<annotation encoding="application/x-tex">\mathcal{C}</annotation></semantics> with finite products. Its objects are pairs

<semantics>X=(X in,X out)<annotation encoding="application/x-tex">X=(X^\text{in},X^\text{out})</annotation></semantics>

where each of these coordinates is a finite product of objects of <semantics>𝒞<annotation encoding="application/x-tex">\mathcal{C}</annotation></semantics>. For example, we interpret the pair <semantics>(A 1×A 2,B 1×B 2×B 3)<annotation encoding="application/x-tex">(A_1\times A_2, B_1\times B_2\times B_3)</annotation></semantics> as a box with input ports <semantics>(a 1,a 2)A 1×A 2<annotation encoding="application/x-tex">(a_1 ,a_2)\in A_1\times A_2</annotation></semantics> and output ports <semantics>(b 1,b 2,b 3)B 1×B 2×B 3<annotation encoding="application/x-tex">(b_1 ,b_2 ,b_3 )\in B_1\times B_2\times B_3</annotation></semantics>.

Its morphisms are wiring diagrams <semantics>φ:XY<annotation encoding="application/x-tex">\varphi:X\to Y</annotation></semantics>, that is, pairs of maps <semantics>(φ in,φ out)<annotation encoding="application/x-tex">(\varphi^\text{in},\varphi^\text{out})</annotation></semantics> which we interpret as a rewiring of the box <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics> inside of the box <semantics>Y<annotation encoding="application/x-tex">Y</annotation></semantics>. The function <semantics>φ in<annotation encoding="application/x-tex">\varphi^\text{in}</annotation></semantics> indicates whether an input port of <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics> should be attached to an input of <semantics>Y<annotation encoding="application/x-tex">Y</annotation></semantics> or to an output of <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics> itself; the function <semantics>φ out<annotation encoding="application/x-tex">\varphi^\text{out}</annotation></semantics> indicates how the outputs of <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics> feed the outputs of <semantics>Y<annotation encoding="application/x-tex">Y</annotation></semantics>. Examples of wirings are

Composition is given by a nesting of wirings.

Given boxes <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics> and <semantics>Y<annotation encoding="application/x-tex">Y</annotation></semantics>, we define their parallel composition by

<semantics>XY=(X in×Y in,X out×Y out)<annotation encoding="application/x-tex">X\boxtimes Y=(X^\text{in}\times Y^\text{in},X^\text{out}\times Y^\text{out})</annotation></semantics>

This gives a monoidal structure to the category <semantics>𝒲 𝒞<annotation encoding="application/x-tex">\mathcal{W}_\mathcal{C}</annotation></semantics>. Parallel composition is true to its name, as illustrated by

The huge advantage of this approach is that one can now fill the boxes with suitable “inhabitants”, and model many different situations that look like wirings at their core. These inhabitants will be given through functors <semantics>𝒲 𝒞Set<annotation encoding="application/x-tex">\mathcal{W}_\mathcal{C}\to\mathbf{Set}</annotation></semantics>, taking a box to the set of its desired interpretations, and giving a meaning to the wiring of boxes.

The functor <semantics>ODS:𝒲 EucSet<annotation encoding="application/x-tex">ODS:\mathcal{W}_{\mathbf{Euc}}\to\mathbf{Set}</annotation></semantics> of open dynamical systems

The first of our inhabitants will be, as you probably guessed by now, open dynamical systems. Here <semantics>𝒞=Euc<annotation encoding="application/x-tex">\mathcal{C}=\mathbf{Euc}</annotation></semantics> is the category of Euclidean spaces <semantics> n<annotation encoding="application/x-tex">\mathbb{R}^n</annotation></semantics> and smooth maps.

From the perspective of Spivak’s paper, an <semantics>( X, Y)<annotation encoding="application/x-tex">(\mathbb{R}^X,\mathbb{R}^Y)</annotation></semantics>-open dynamical system is a 3-tuple <semantics>( S,f dyn,f rdt)<annotation encoding="application/x-tex">(\mathbb{R}^S,f^\text{dyn},f^\text{rdt})</annotation></semantics> where

  • <semantics> S<annotation encoding="application/x-tex">\mathbb{R}^S</annotation></semantics> is the state space

  • <semantics>f dyn: X× S S<annotation encoding="application/x-tex">f^\text{dyn}:\mathbb{R}^X\times\mathbb{R}^S\to\mathbb{R}^S</annotation></semantics> is a vector field parametrized by the inputs <semantics> X<annotation encoding="application/x-tex">\mathbb{R}^X</annotation></semantics>, giving the differential equation of the system

  • <semantics>f rdt: S Y<annotation encoding="application/x-tex">f^\text{rdt}:\mathbb{R}^S\to\mathbb{R}^Y</annotation></semantics> is the readout function at the outputs <semantics> Y<annotation encoding="application/x-tex">\mathbb{R}^Y</annotation></semantics>.

One should notice the similarity with our previously defined dynamical systems, although it’s clear that the two definitions are not equivalent.
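As a rough sketch (the encoding and the particular system below are our own, not Spivak's notation), such a triple of state space, parametrized vector field, and readout can be written directly in Python:

```python
import numpy as np
from dataclasses import dataclass
from typing import Callable

@dataclass
class OpenDynamicalSystem:
    """A box-style open dynamical system: a state space R^S with an
    input-parametrized vector field and a readout map."""
    state_dim: int
    f_dyn: Callable[[np.ndarray, np.ndarray], np.ndarray]  # R^X x R^S -> R^S
    f_rdt: Callable[[np.ndarray], np.ndarray]              # R^S -> R^Y

# A one-dimensional system relaxing toward its input, reading out twice its state.
sys = OpenDynamicalSystem(
    state_dim=1,
    f_dyn=lambda I, c: I - c,   # dc/dt = I - c
    f_rdt=lambda c: 2 * c,
)

# Euler integration drives the state toward the steady state c = I.
c = np.array([0.0])
for _ in range(1000):
    c = c + 0.01 * sys.f_dyn(np.array([3.0]), c)
print(sys.f_rdt(c))  # approaches [6.]
```

Note how the inputs enter through the vector field while the outputs are only read off the state, in contrast with the cospan picture, where inflows and outflows enter the dynamics symmetrically.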

The functor <semantics>ODS:𝒲 EucSet<annotation encoding="application/x-tex">ODS:\mathcal{W}_{\mathbf{Euc}}\to\mathbf{Set}</annotation></semantics> exhibiting dynamical systems as inhabitants of input-output boxes, takes a box <semantics>X=(X in,X out)<annotation encoding="application/x-tex">X=(X^\text{in},X^\text{out})</annotation></semantics> to the set of all <semantics>( X in, X out)<annotation encoding="application/x-tex">(\mathbb{R}^{X^\text{in}},\mathbb{R}^{X^\text{out}})</annotation></semantics>-dynamical systems

<semantics>ODS(X)={( S,f dyn: X in× S S,f rdt: S X out)}<annotation encoding="application/x-tex">ODS(X)=\{(\mathbb{R}^S,f^\text{dyn}:\mathbb{R}^{X^\text{in}}\times\mathbb{R}^S\to\mathbb{R}^S,f^\text{rdt}:\mathbb{R}^S\to\mathbb{R}^{X^\text{out}})\}</annotation></semantics>

You can surely figure out how <semantics>ODS<annotation encoding="application/x-tex">ODS</annotation></semantics> acts on wirings by drawing a picture and doing a bit of careful bookkeeping.

Note that there’s a natural notion of parallel composition of two dynamical systems, which amounts to carrying out the processes indicated by the two dynamical systems in parallel. Spivak shows that <semantics>ODS<annotation encoding="application/x-tex">ODS</annotation></semantics> is a functor, and, furthermore, that

<semantics>ODS(XY)ODS(X)ODS(Y)<annotation encoding="application/x-tex">ODS(X\boxtimes Y)\simeq ODS(X)\boxtimes ODS(Y)</annotation></semantics>

The functor <semantics>Mat:𝒲 𝒞Set<annotation encoding="application/x-tex">Mat:\mathcal{W}_{\mathcal{C}}\to\mathbf{Set}</annotation></semantics> of <semantics>Set<annotation encoding="application/x-tex">\mathbf{Set}</annotation></semantics>-matrices

Our second inhabitants will be given by matrices of sets. For objects <semantics>X,Y<annotation encoding="application/x-tex">X,Y</annotation></semantics>, an <semantics>(X,Y)<annotation encoding="application/x-tex">(X,Y)</annotation></semantics>-matrix of sets is a function <semantics>M<annotation encoding="application/x-tex">M</annotation></semantics> that assigns to each pair <semantics>(x,y)<annotation encoding="application/x-tex">(x,y)</annotation></semantics> a set <semantics>M x,y<annotation encoding="application/x-tex">M_{x,y}</annotation></semantics>. In other words, it is a matrix indexed by <semantics>X×Y<annotation encoding="application/x-tex">X\times Y</annotation></semantics> that, instead of coefficients, has sets in each position.

The functor <semantics>Mat:𝒲 𝒞Set<annotation encoding="application/x-tex">Mat:\mathcal{W}_{\mathcal{C}}\to\mathbf{Set}</annotation></semantics> exhibiting <semantics>Set<annotation encoding="application/x-tex">\mathbf{Set}</annotation></semantics>-matrices as inhabitants of input-output boxes, takes a box <semantics>X=(X in,X out)<annotation encoding="application/x-tex">X=(X^\text{in},X^\text{out})</annotation></semantics> to the set of all <semantics>(X in,X out)<annotation encoding="application/x-tex">(X^\text{in},X^\text{out})</annotation></semantics>-matrices of sets

<semantics>Mat(X)={{M i,j} X in×X out:M i,j  is a set}<annotation encoding="application/x-tex">Mat(X)=\{\{M_{i,j}\}_{X^\text{in}\times X^\text{out}} : M_{i,j} &nbsp; \text{ is a set}\}</annotation></semantics>

Once again, it’s not too hard to figure out how <semantics>Mat<annotation encoding="application/x-tex">Mat</annotation></semantics> should act on wirings.

Like before, there’s a notion of parallel composition of two matrices of sets, and the author shows that <semantics>Mat<annotation encoding="application/x-tex">Mat</annotation></semantics> is a functor such that

<semantics>Mat(XY)Mat(X)Mat(Y)<annotation encoding="application/x-tex">Mat(X\boxtimes Y)\simeq Mat(X)\boxtimes Mat(Y)</annotation></semantics>
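A small sketch of how such matrices and their parallel composition might be encoded, assuming (as the isomorphism above suggests) that parallel composition pairs up entries componentwise as products of sets; the encoding as Python dicts is our own:

```python
from itertools import product

def parallel(M, N):
    """(M ⊠ N)[(x1, x2), (y1, y2)] = M[x1, y1] × N[x2, y2]:
    each entry of the composite is a product of sets."""
    return {
        ((x1, x2), (y1, y2)): set(product(M[x1, y1], N[x2, y2]))
        for (x1, y1) in M
        for (x2, y2) in N
    }

# Two tiny matrices of sets, each indexed by a one-element input and output.
M = {(0, 0): {"a", "b"}}
N = {(0, 0): {"p"}}
composite = parallel(M, N)
print(composite[(0, 0), (0, 0)])
```

The single entry of the composite is the two-element set of pairs, one pair for each way of choosing an element from the corresponding entries of the factors.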

The steady-state natural transformation <semantics>Stst:ODSMat<annotation encoding="application/x-tex">Stst:ODS\to Mat</annotation></semantics>

Finally, we explain how to use all this to study steady states of dynamical systems.

Given an <semantics>( X, Y)<annotation encoding="application/x-tex">(\mathbb{R}^X,\mathbb{R}^Y)</annotation></semantics>-dynamical system <semantics>f=( S,f dyn,f rdt)<annotation encoding="application/x-tex">f=(\mathbb{R}^S,f^\text{dyn},f^\text{rdt})</annotation></semantics> and an element <semantics>(I,O) X× Y<annotation encoding="application/x-tex">(I,O)\in\mathbb{R}^X\times\mathbb{R}^Y</annotation></semantics>, an <semantics>(I,O)<annotation encoding="application/x-tex">(I,O)</annotation></semantics>-steady state is a state <semantics>c S<annotation encoding="application/x-tex">c\in\mathbb{R}^S</annotation></semantics> such that

<semantics>f dyn(I,c)=0   and   f rdt(c)=O<annotation encoding="application/x-tex">f^\text{dyn}(I,c)=0 &nbsp; &nbsp; \text{ and } &nbsp; &nbsp; f^\text{rdt}(c)=O</annotation></semantics>

Since dynamical systems are encoded by the functor <semantics>ODS<annotation encoding="application/x-tex">ODS</annotation></semantics>, it makes sense to study steady states through a natural transformation out of <semantics>ODS<annotation encoding="application/x-tex">ODS</annotation></semantics>. We define <semantics>Stst:ODSMat<annotation encoding="application/x-tex">Stst:ODS\to Mat</annotation></semantics> as the transformation that assigns to each box <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics>, the function

<semantics>Stst X:ODS(X)Mat(X)<annotation encoding="application/x-tex">Stst_X:ODS(X)\longrightarrow Mat(X)</annotation></semantics>

taking a dynamical system <semantics>( S,f dyn,f rdt)<annotation encoding="application/x-tex">(\mathbb{R}^S,f^\text{dyn},f^\text{rdt})</annotation></semantics> to its matrix of steady states

<semantics>M I,O={c S:f dyn(I,c)=0, f rdt(c)=O}<annotation encoding="application/x-tex">M_{I,O}=\{c\in\mathbb{R}^S : f^\text{dyn}(I,c)=0, &nbsp; f^\text{rdt}(c)=O\}</annotation></semantics>

where <semantics>(I,O) X in× X out<annotation encoding="application/x-tex">(I,O)\in \mathbb{R}^{X^\text{in}}\times \mathbb{R}^{X^\text{out}}</annotation></semantics>. The author proceeds to show that <semantics>Stst<annotation encoding="application/x-tex">Stst</annotation></semantics> is a monoidal natural transformation.
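To see the definition in action, here is a small sketch where the steady-state matrix entries are found by a finite search over candidate states, a stand-in for solving the equations exactly; the system and names are invented:

```python
def steady_states(f_dyn, f_rdt, candidates, I, O, tol=1e-9):
    """The matrix entry M[I, O]: candidate states c with f_dyn(I, c) = 0
    and f_rdt(c) = O (finite search in place of exact solving)."""
    return {c for c in candidates
            if abs(f_dyn(I, c)) < tol and abs(f_rdt(c) - O) < tol}

f_dyn = lambda I, c: I - c   # dc/dt = I - c
f_rdt = lambda c: 2 * c      # readout doubles the state

candidates = [x / 10 for x in range(0, 51)]   # grid of candidate states
print(steady_states(f_dyn, f_rdt, candidates, I=3.0, O=6.0))  # {3.0}
print(steady_states(f_dyn, f_rdt, candidates, I=3.0, O=5.0))  # set()
```

For this system the entry at (I, O) is the singleton {I} exactly when O = 2I and empty otherwise, matching the matrix-of-sets picture: most entries are empty, and the populated ones record which states witness that input-output pair.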

Is it possible to use this machinery to draw the same conclusion as before, that is, that the steady states of a composition of systems come from the composition of the steady states of the parts?

Indeed, it is! Given two boxes <semantics>X 1<annotation encoding="application/x-tex">X_1</annotation></semantics> and <semantics>X 2<annotation encoding="application/x-tex">X_2</annotation></semantics>, we recover the usual notion of (serial) composition by first setting them in parallel <semantics>X 1X 2<annotation encoding="application/x-tex">X_1 \boxtimes X_2</annotation></semantics>,

and wiring this by <semantics>φ:X 1X 2Y<annotation encoding="application/x-tex">\varphi:X_1 \boxtimes X_2\to Y</annotation></semantics> as follows:

The fact that <semantics>Stst<annotation encoding="application/x-tex">Stst</annotation></semantics> is a monoidal natural transformation, combined with the facts that the functors <semantics>ODS<annotation encoding="application/x-tex">ODS</annotation></semantics> and <semantics>Mat<annotation encoding="application/x-tex">Mat</annotation></semantics> respect parallel composition, allows us to write the following diagram, where both squares are commutative

Then, chasing the diagram along the top and left sides gives the steady states of the serial composition of the dynamical systems <semantics>X 1<annotation encoding="application/x-tex">X_1</annotation></semantics> and <semantics>X 2<annotation encoding="application/x-tex">X_2</annotation></semantics>, while chasing it along the right and bottom sides gives the composition of the steady states of <semantics>X 1<annotation encoding="application/x-tex">X_1</annotation></semantics> and of <semantics>X 2<annotation encoding="application/x-tex">X_2</annotation></semantics>, and the two must agree.

The two approaches, side by side

So how are these two perspectives related? Looking at the definitions we can immediately see that Spivak’s approach has a broader scope than Baez and Pollard’s, so it’s apparent that his results won’t be implied by theirs.

For the converse direction, recall that in the first paper, a dynamical system is given by a decorated cospan <semantics>f=(XiSoY,v)<annotation encoding="application/x-tex">f=(X\xrightarrow{i} S \xleftarrow{o} Y, v)</annotation></semantics>, and a steady state with inflows <semantics>I<annotation encoding="application/x-tex">I</annotation></semantics> and outflows <semantics>O<annotation encoding="application/x-tex">O</annotation></semantics> is a constant vector of concentrations <semantics>c S<annotation encoding="application/x-tex">c\in\mathbb{R}^S</annotation></semantics> such that

<semantics>v(c)+i *(I)o *(O)=0<annotation encoding="application/x-tex">v(c)+i_{\ast} (I)-o_{\ast} (O)=0</annotation></semantics>

Thus, studying the steady states for this cospan system corresponds to studying the box system

<semantics>f=( S,f dyn: X× S S,f rdt: S Y)<annotation encoding="application/x-tex">f=(\mathbb{R}^S, f^\text{dyn}:\mathbb{R}^X\times\mathbb{R}^S\to\mathbb{R}^S, f^\text{rdt}:\mathbb{R}^S\to\mathbb{R}^Y)</annotation></semantics>

with dynamics given by <semantics>f dyn(I,c)=v(c)+i *(I)o *(f rdt(c))<annotation encoding="application/x-tex">f^\text{dyn}(I,c)=v(c)+i_{\ast} (I)-o_{\ast} (f^\text{rdt}(c))</annotation></semantics>, since its <semantics>(I,O)<annotation encoding="application/x-tex">(I,O)</annotation></semantics>-steady states are vectors <semantics>c S<annotation encoding="application/x-tex">c\in\mathbb{R}^S</annotation></semantics> such that

<semantics>f dyn(I,c)=0   and   f rdt(c)=O<annotation encoding="application/x-tex">f^\text{dyn}(I,c)=0 &nbsp; &nbsp; \text{ and } &nbsp; &nbsp; f^\text{rdt}(c)=O</annotation></semantics>
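Carrying out this translation on a tiny invented system (two species, input leg at species 0, output leg at species 1, the readout taken to be the restriction to the output ports) gives a quick numeric check; all names below are our own:

```python
import numpy as np

def pushforward(leg, vec, n_species):
    """leg_*(vec): sum vec over the preimage of each species."""
    out = np.zeros(n_species)
    for x, s in enumerate(leg):
        out[s] += vec[x]
    return out

def pullback(leg, c):
    """leg^*(c): read off the concentration at each port."""
    return np.array([c[s] for s in leg])

i, o = [0], [1]                          # cospan legs
v = lambda c: np.array([-c[0], c[0]])    # species 0 converts into species 1

f_rdt = lambda c: pullback(o, c)         # report output concentrations
f_dyn = lambda I, c: v(c) + pushforward(i, I, 2) - pushforward(o, f_rdt(c), 2)

I, c = np.array([2.0]), np.array([2.0, 2.0])
print(f_dyn(I, c))   # [0. 0.] -- c is an (I, O)-steady state with O = f_rdt(c)
print(f_rdt(c))      # [2.]
```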

Thus, the study of the steady states of a given cospan dynamical system can be done just as well by looking at it as a box dynamical system and running it through Spivak’s machinery. However, setting two such box systems in serial composition will not yield the box system representing the composition of the cospan systems as one would (naively?) hope, so it doesn’t seem that Spivak’s compositional results will imply those of Baez and Pollard.

This is a bit disconcerting, but rather than discouraging, I believe it should be seen as an invitation to delve into the semantics of open dynamical systems and find the right perspective, one that subsumes both of the approaches presented here.

by john at April 02, 2018 08:11 PM

The n-Category Cafe

Linguistics Using Category Theory

guest post by Cory Griffith and Jade Master

Most recently, the Applied Category Theory Seminar took a step into linguistics by discussing the 2010 paper Mathematical Foundations for a Compositional Distributional Model of Meaning, by Bob Coecke, Mehrnoosh Sadrzadeh, and Stephen Clark.

Here is a summary and discussion of that paper.

In recent years, well-known advances in AI, such as the development of AlphaGo and the ongoing development of self-driving cars, have sparked interest in the general idea of machines examining and trying to understand complex data. In particular, a variety of accounts of successes in natural language processing (NLP) have reached wide audiences (see, for example, The Great AI Awakening).

One key tool for NLP practitioners is the concept of distributional semantics. There is a saying due to Firth that is so often repeated in NLP papers and presentations that even mentioning its ubiquity has become a cliche:

“You shall know a word by the company it keeps.”

The idea is that if we want to know if two words have similar meanings, we should examine the words they are used in conjunction with, and in some way measure how much overlap there is. While direct ancestry of this concept can be traced at least back to Wittgenstein, and the idea of characterizing an object by its relationship with other objects is one category theorists are already fond of, distributional semantics is distinguished by its essentially statistical methods. The variations are endless and complex, but in the cases relevant to our discussion, one starts with a corpus, a suitable way of determining what the context of a word is (simply being nearby, having a grammatical relationship, being in the same corpus at all, etc.) and ends up with a vector space in which the words in the corpus each specify a point. The distance between vectors (for an appropriate definition of distance) then corresponds to relationships in meaning, often in surprising ways. The creators of the GloVe algorithm give the example of a vector space in which <semantics>kingman+woman=queen<annotation encoding="application/x-tex">king - man + woman = queen</annotation></semantics>.
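The analogy can be illustrated with a toy computation. The vectors below are hand-made along two invented axes (roughly "royalty" and "gender"), not learned embeddings from any real corpus:

```python
import numpy as np

# Hand-made toy word vectors -- purely illustrative, not trained embeddings.
vecs = {
    "king":  np.array([0.9,  0.8]),
    "queen": np.array([0.9, -0.8]),
    "man":   np.array([0.1,  0.8]),
    "woman": np.array([0.1, -0.8]),
}

def cosine(u, v):
    """Cosine similarity, the usual notion of closeness for word vectors."""
    return u @ v / (np.linalg.norm(u) * np.linalg.norm(v))

# king - man + woman lands closest to queen.
target = vecs["king"] - vecs["man"] + vecs["woman"]
best = max(vecs, key=lambda w: cosine(vecs[w], target))
print(best)  # queen
```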

There is also a “top down,” relatively syntax oriented analysis of meaning called categorial grammar. Categorial grammar has no accepted formal definition, but the underlying philosophy, called the principle of compositionality, is this: a meaningful sentence is composed of parts, each of which itself has a meaning. To determine the meaning of the sentence as a whole, we may combine the meanings of the constituent parts according to rules which are specified by the syntax of the sentence. Mathematically, this amounts to constructing some algebraic structure which represents grammatical rules. When this algebraic structure is a category, we call it a grammar category.

The Paper


Pregroups are the algebraic structure that this paper uses to model grammar. A pregroup P is a type of partially ordered monoid. Writing <semantics>xy<annotation encoding="application/x-tex">x \to y</annotation></semantics> to specify that <semantics>xy<annotation encoding="application/x-tex">x \leq y</annotation></semantics> in the order relation, we require the following additional property: for each <semantics>pP<annotation encoding="application/x-tex">p \in P</annotation></semantics>, there exists a left adjoint <semantics>p l<annotation encoding="application/x-tex">p^l</annotation></semantics> and a right adjoint <semantics>p r<annotation encoding="application/x-tex">p^r</annotation></semantics>, such that <semantics>p lp1pp l<annotation encoding="application/x-tex">p^l p \to 1 \to p p^l</annotation></semantics> and <semantics>pp r1p rp<annotation encoding="application/x-tex">p p^r \to 1 \to p^r p</annotation></semantics>. Since pregroups are partial orders, we can regard them as categories. The monoid multiplication and adjoints then upgrade the category of a pregroup to a compact closed category. The equations referenced above are exactly the snake equations.

We can define a pregroup generated by a set <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics> by freely adding adjoints, units and counits to the free monoid on <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics>. Our grammar categories will be constructed as follows: take certain symbols, such as <semantics>n<annotation encoding="application/x-tex">n</annotation></semantics> for noun and <semantics>s<annotation encoding="application/x-tex">s</annotation></semantics> for sentence, to be primitive. We call these “word classes.” Generate a pregroup from them. The morphisms in the resulting category represent “grammatical reductions” of strings of word classes, with a particular string being deemed “grammatical” if it reduces to the word class <semantics>s<annotation encoding="application/x-tex">s</annotation></semantics>. For example, construct the pregroup <semantics>Preg({n,s})<annotation encoding="application/x-tex">Preg( \{n,s\})</annotation></semantics> generated by <semantics>n<annotation encoding="application/x-tex">n</annotation></semantics> and <semantics>s<annotation encoding="application/x-tex">s</annotation></semantics>. A transitive verb can be thought of as accepting two nouns, one on the left and one on the right, and returning a sentence. Using the powerful graphical language for compact closed categories, we can represent this as

Using the adjunctions, we can turn the two inputs into outputs to get

Therefore the type of a verb is <semantics>n rsn l<annotation encoding="application/x-tex">n^r s n^l</annotation></semantics>. Multiplying this on the left and right by <semantics>n<annotation encoding="application/x-tex">n</annotation></semantics> allows us to apply the counits of <semantics>n<annotation encoding="application/x-tex">n</annotation></semantics> to reduce <semantics>n(n rsn l)n<annotation encoding="application/x-tex">n \cdot (n^r s n^l) \cdot n</annotation></semantics> to the type <semantics>s<annotation encoding="application/x-tex">s</annotation></semantics>, as witnessed by

Let <semantics>(FVect,,)<annotation encoding="application/x-tex">(\mathbf{FVect},\otimes, \mathbb{R})</annotation></semantics> be the symmetric monoidal category of finite dimensional vector spaces and linear transformations with the standard tensor product. Since any vector space we use in our applications will always come equipped with a basis, these vector spaces are all endowed with an inner product. Note that <semantics>FVect<annotation encoding="application/x-tex">\mathbf{FVect}</annotation></semantics> has a compact closed structure. The unit is the diagonal

<semantics>η l=η r: VV 1 ie ie i<annotation encoding="application/x-tex">\begin{array}{cccc} \eta_l = \eta_r \colon & \mathbb{R} & \to &V \otimes V \\ &1 &\mapsto & \sum_i \overrightarrow{e_i} \otimes \overrightarrow{e_i} \end{array}</annotation></semantics>

and the counit is a linear extension of the inner product

<semantics>ϵ l=ϵ r: VV i,jc ijv iw j i,jc ijv i,w j.<annotation encoding="application/x-tex">\begin{array}{cccc} \epsilon^l = \epsilon^r \colon &V \otimes V &\to& \mathbb{R} \\ & \sum_{i,j} c_{i j} \vec{v_{i}} \otimes \vec{w_j} &\mapsto& \sum_{i,j} c_{i j} \langle \vec{v_i}, \vec{w_j} \rangle. \end{array} </annotation></semantics>
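In coordinates these two maps are easy to check numerically. A small sketch, encoding an element of V ⊗ V as its matrix of coefficients:

```python
import numpy as np

dim = 3
basis = np.eye(dim)  # orthonormal basis e_1, ..., e_dim of V

def eta(scalar):
    """R -> V ⊗ V: sends 1 to sum_i e_i ⊗ e_i, i.e. the identity
    coefficient matrix."""
    return scalar * np.eye(dim)

def epsilon(tensor):
    """V ⊗ V -> R: sends sum_ij c_ij e_i ⊗ e_j to sum_ij c_ij <e_i, e_j>,
    i.e. the trace of the coefficient matrix."""
    return np.trace(tensor)

print(epsilon(eta(1.0)))  # 3.0, the dimension of V

# A snake identity in coordinates: sum_i <v, e_i> e_i recovers v.
v = np.array([1.0, 2.0, 3.0])
recovered = sum((v @ basis[i]) * basis[i] for i in range(dim))
print(recovered)  # [1. 2. 3.]
```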

The Model of Meaning

Let <semantics>(P,)<annotation encoding="application/x-tex">(P, \cdot)</annotation></semantics> be a pregroup. The ingenious idea that the authors of this paper had was to combine categorial grammar with distributional semantics. We can rephrase their construction in more general terms by using a compact closed functor

<semantics>F:(P,)(FVect,,).<annotation encoding="application/x-tex">F \colon (P, \cdot) \to (\mathbf{FVect}, \otimes, \mathbb{R}) .</annotation></semantics>

Unpacking this a bit, we assign each word class a vector space whose basis is a chosen finite set of context words. To each type reduction in <semantics>P<annotation encoding="application/x-tex">P</annotation></semantics>, we assign a linear transformation. Because <semantics>F<annotation encoding="application/x-tex">F</annotation></semantics> is strictly monoidal, a string of word classes <semantics>p 1p 2p n<annotation encoding="application/x-tex">p_1 p_2 \cdots p_n</annotation></semantics> maps to a tensor product of vector spaces <semantics>V 1V 2V n<annotation encoding="application/x-tex">V_1 \otimes V_2 \otimes \cdots \otimes V_n</annotation></semantics>.

To compute the meaning of a string of words you must:

  1. Assign to each word a string of symbols <semantics>p 1p 2p n<annotation encoding="application/x-tex">p_1 p_2 \cdots p_n</annotation></semantics> according to the grammatical types of the word and your choice of pregroup formalism. This is nontrivial. For example, many nouns can also be used as adjectives.

  2. Compute the correlations between each word in your string and the context words of the chosen vector space (see the example below) to get a vector <semantics>v 1v nV 1V n<annotation encoding="application/x-tex">v_1 \otimes \cdots \otimes v_n \in V_1 \otimes \cdots \otimes V_n</annotation></semantics>,

  3. choose a type reduction <semantics>f:p 1p 2p nq 1q 2q n<annotation encoding="application/x-tex">f \colon p_1 p_2 \cdots p_n \to q_1 q_2 \cdots q_n</annotation></semantics> in your grammar category (there may not always be a unique type reduction) and,

  4. apply <semantics>F(f)<annotation encoding="application/x-tex">F(f)</annotation></semantics> to your vector <semantics>v 1v n<annotation encoding="application/x-tex">v_1 \otimes \cdots \otimes v_n</annotation></semantics>.

  5. You now have a vector in whatever space you reduced to. This is the “meaning” of the string of words, according to your model.

This sweeps some things under the rug, because A. Preller proved that strict monoidal functors from a pregroup to <semantics>FVect<annotation encoding="application/x-tex">\mathbf{FVect}</annotation></semantics> actually force the relevant spaces to have dimension at most one. So for each word type, the best we can do is one context word. This is bad news, but the good news is that this problem disappears when more complicated grammar categories are used. In Lambek vs. Lambek, monoidal bi-closed categories are used, which allow for this functorial description. So even though we are not really dealing with a functor when the domain is a pregroup, it is a functor in spirit, and thinking of it this way will allow for generalization into more complicated models.

An Example

As before, we use the pregroup <semantics>Preg({n,s})<annotation encoding="application/x-tex">Preg(\{n,s\})</annotation></semantics>. The nouns that we are interested in are

<semantics>{Maria,John,Cynthia}<annotation encoding="application/x-tex"> \{ Maria, John, Cynthia \}</annotation></semantics>

These nouns form the basis vectors of our noun space. In the order they are listed, they can be represented as

<semantics>[1 0 0],[0 1 0],[0 0 1].<annotation encoding="application/x-tex"> \begin{bmatrix} 1 \\ 0 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 1 \\ 0 \end{bmatrix}, \begin{bmatrix} 0 \\ 0 \\ 1 \end{bmatrix}. </annotation></semantics>

The “sentence space” <semantics>F(s)<annotation encoding="application/x-tex">F(s)</annotation></semantics> is taken to be a one-dimensional space in which <semantics>0<annotation encoding="application/x-tex">0</annotation></semantics> corresponds to false and the basis vector <semantics>1 S<annotation encoding="application/x-tex">1_S</annotation></semantics> corresponds to true. As before, transitive verbs have type <semantics>n rsn l<annotation encoding="application/x-tex">n^r s n^l</annotation></semantics>, so using our functor <semantics>F<annotation encoding="application/x-tex">F</annotation></semantics>, verbs will live in the vector space <semantics>NSN<annotation encoding="application/x-tex">N \otimes S \otimes N</annotation></semantics>. In particular, the verb “like” can be expressed uniquely as a linear combination of its basis elements. With knowledge of who likes whom, we can encode this information into a matrix where the <semantics>ij<annotation encoding="application/x-tex">ij</annotation></semantics>-th entry corresponds to the coefficient in front of <semantics>v i1 Sv j<annotation encoding="application/x-tex">v_i \otimes 1_S \otimes v_j</annotation></semantics>. Specifically, we have

<semantics>[1 0 1 1 1 0 1 0 1].<annotation encoding="application/x-tex"> \begin{bmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix}. </annotation></semantics>

The <semantics>ij<annotation encoding="application/x-tex">ij</annotation></semantics>-th entry is <semantics>1<annotation encoding="application/x-tex">1</annotation></semantics> if person <semantics>i<annotation encoding="application/x-tex">i</annotation></semantics> likes person <semantics>j<annotation encoding="application/x-tex">j</annotation></semantics> and <semantics>0<annotation encoding="application/x-tex">0</annotation></semantics> otherwise. To compute the meaning of the sentence “Maria likes Cynthia”, you compute the matrix product

<semantics>[1 0 0][1 0 1 1 1 0 1 0 1][0 0 1]=1<annotation encoding="application/x-tex"> \begin{bmatrix} 1 & 0 & 0 \end{bmatrix} \begin{bmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 1 & 0 & 1 \end{bmatrix} \begin{bmatrix} 0\\ 0\\ 1 \end{bmatrix} =1 </annotation></semantics>

This means that the sentence “Maria likes Cynthia” is true.
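This computation is easy to reproduce with NumPy (a small sketch of ours, not code from the paper):

```python
import numpy as np

# basis order: Maria, John, Cynthia
maria   = np.array([1, 0, 0])
cynthia = np.array([0, 0, 1])

# the verb "like" as an element of N ⊗ S ⊗ N with S one-dimensional:
# the (i, j) entry is 1 exactly when person i likes person j
likes = np.array([[1, 0, 1],
                  [1, 1, 0],
                  [1, 0, 1]])

# the type reduction n · (n^r s n^l) · n → s becomes a matrix product
meaning = maria @ likes @ cynthia
print(meaning)  # 1, so "Maria likes Cynthia" is true
```

Since the sentence space is one-dimensional, the result is a single number: 1 for true, 0 for false.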

Food for Thought

As we said above, this model does not always give a unique meaning to a string of words, because at various points there are choices that need to be made. For example, the phrase “squad helps dog bite victim” has a different meaning depending on whether you take “bite” to be a verb or a noun. Also, if you reduce “dog bite victim” before applying it to the verb, you will get a different meaning than if you reduce “squad helps dog” and apply it to the verb “bite”. On the one hand, this is a good thing, because those sentences should have different meanings. On the other hand, the presence of choices makes it harder to use this model in a practical algorithm.

Some questions arose which we did not have a clear way to address. Tensor products of spaces of high dimension quickly achieve staggering dimensionality; can this be addressed? How would one actually fit empirical data into this model? The “likes” example, which required us to know exactly who likes whom, illustrates the potentially inaccessible information that seems to be necessary to assign vectors to words in a way compatible with the formalism. Admittedly, this is a necessary consequence of the fact that the evaluation is of the truth or falsity of the statement, but the issue also arises in more general cases. Can this be resolved? In the paper, the authors are concerned with determining the meaning of grammatical sentences (although we can just as easily use non-grammatical strings of words), so that the computed meaning is always a vector in the sentence space <semantics>F(s)<annotation encoding="application/x-tex">F(s)</annotation></semantics>. What are the useful choices of structure for the sentence space?

This paper was not without precedent: suggestions and models related to its concepts had been floating around beforehand, and they can be helpful in understanding the development of the central ideas. For example, Aerts and Gabora proposed elaborating on vector space models of meaning, incidentally using tensors as part of an elaborate quantum mechanical framework. Notably, they claimed their formalism solved the “pet fish” problem: English speakers rate goldfish as very poor representatives of fish as a whole, and of pets as a whole, but consider goldfish to be excellent representatives of “pet fish.” Existing compositional descriptions of meaning struggled with this. In The Harmonic Mind, first published in 2005, Smolensky and Legendre argued for the use of tensor products in marrying linear algebra and formal grammar models of meaning. Mathematical Foundations for a Compositional Distributional Model of Meaning represents a crystallization of all this into a novel and exciting construction, which continues to be widely cited and discussed.

We would like to thank Martha Lewis, Brendan Fong, Nina Otter, and the other participants in the seminar.

by john ( at April 02, 2018 07:59 PM

Tommaso Dorigo - Scientificblogging

The Magical Caves Of Frasassi

While spending a few vacation days on a trip around central Italy I made a stop in a place in the Appennini mountains, to visit some incredible caves. The caves of Frasassi were discovered in September 1971 by a few young speleologists, who had been tipped off by locals about the existence, atop a mountain near their village, of a hole in the ground, which emitted a strong draft wind - the unmistakable sign of underground hollows.

read more

by Tommaso Dorigo at April 02, 2018 03:53 PM

Jester - Resonaances

Singularity is now
Artificial intelligence (AI) is entering our lives. It's been 20 years now since the watershed moment of Deep Blue versus Garry Kasparov. Today, people study the games of AlphaGo against itself to get a glimpse of what a superior intelligence would be like. But at the same time AI is getting better at copying human behavior. Many Apple users have got emotionally attached to Siri. Computers have not only learnt to drive cars, but also not to slow down when a pedestrian is crossing the road. The progress is clearly visible to the blogging community. Bots commenting under my posts have evolved well past the !!!buy!!!viagra!!!cialis!!!hot!!!naked!!! sort of thing. Now they refer to the topic of the post, drop an informed comment, an interesting remark, or a relevant question, before pasting a link to a revenge porn website. Sometimes it's really a pity to delete those comments, as they can be more to-the-point than those written by human readers.

AI is also entering the field of science at an accelerated pace, and particle physics is, as usual, in the avant-garde. It's not a secret that physics analyses for the LHC papers (even if finally signed by 1000s of humans) are in reality performed by neural networks, which are just beefed-up versions of Alexa developed at CERN. The hottest topic in experimental high-energy physics is now machine learning, where computers teach humans the optimal way of clustering jets, or telling quarks from gluons. The question is when, not if, AI will become sophisticated enough to perform the creative work of theoreticians.

It seems that the answer is now.

Some of you might have noticed a certain Alan Irvine, affiliated with the Los Alamos National Laboratory, regularly posting on arXiv single-author theoretical papers on fashionable topics such as the ATLAS diphoton excess, the LHCb B-meson anomalies, the DAMPE spectral feature, etc. Many of us have received emails from this author requesting citations. Recently I got one myself; it seemed overly polite, but otherwise it didn't differ in relevance or substance from other similar requests. During the last two and a half years, A. Irvine has accumulated a decent h-index of 18. His papers have been submitted to prestigious journals in the field, such as PRL, JHEP, or PRD, and some of them were even accepted after revisions. The scandal broke out a week ago when a JHEP editor noticed that an extensive revision, together with a long cover letter, was submitted within 10 seconds of receiving the referee's comments. Upon investigation, it turned out that A. Irvine never worked in Los Alamos, nobody in the field has ever met him in person, and the IP from which the paper was submitted was that of the well-known Ragnarok Thor server. A closer analysis of his past papers showed that, although linguistically and logically correct, they were merely a compilation of equations and text from the previous literature without any original addition.

Incidentally, arXiv administrators have been aware that, for a few years now, all source files in the daily hep-ph listings have been downloaded for an unknown purpose by automated bots. When you have eliminated the impossible, whatever remains, however improbable, must be the truth. There is no doubt that A. Irvine is an AI bot that was trained on the real hep-ph input to produce genuine-looking particle theory papers.

The works of A. Irvine have been quietly removed from arXiv and journals, but difficult questions remain. What was the purpose of it? Was it a spoof? A parody? A social experiment? A Facebook research project? A Russian provocation?  And how could it pass unnoticed for so long within  the theoretical particle community?  What's most troubling is that, if there was one, there can easily be more. Which other papers on arXiv are written by AI? How can we recognize them?  Should we even try, or maybe the dam is already broken and we have to accept the inevitable?  Is Résonaances written by a real person? How can you be sure that you are real?

Update: obviously, this post is an April Fools' prank. It is absolutely unthinkable that the creative process of writing modern particle theory papers can ever be automatized. Also, the neural network referred to in the LHC papers is nothing like Alexa; it's simply a codename for PhD students.  Finally, I assure you that Résonaances is written by a hum 00105e0 e6b0 343b 9c74 0804 e7bc 0804 e7d5 0804 [core dump]

by Mad Hatter ( at April 02, 2018 08:26 AM

April 01, 2018

Lubos Motl - string vacua and pheno

Stephen Hawking writes a post-mortem paper
Stephen Hawking had a funeral in Cambridge yesterday. Some 500 people attended. I think that the family members were wise not to completely destroy the body because it could also include the soul. Hours later, the decision already produced its fruits.

Stephen Hawking just posted a new paper to the arXiv:
Imaginary time as a path to resurrection (screenshot)
It's just five pages long but it's using some very hard mathematics so I haven't had the time to fully comprehend it yet.

The abstract looks simple and intriguing, however:
We exploit the machinery of imaginary time to circumvent any particular point on the temporal real axis. The methodology may also be considered a refined realization of the process called "the resurrection" by the laymen. We describe a successful experiment in which a lifetime started on the anniversary of Galileo Galilei's death was interrupted on Albert Einstein's birthday and, through the complex plane, continued on the anniversary of the resurrection of Jesus Christ.
Is there a reader who understands the contours in the complex plane well enough?

I have some doubts about the applicability of the method. He could have easily continued himself through the contour. After all, the same trick was already performed by Hawking's Jewish colleague 1985 or 1988 years ago. And no one wants to live forever, anyway. But a more important question is: May it be applied to objects such as food? If you like a particular fried chicken from Kentucky, can you make sure that you eat it as many times as you want?

OK, Hawking managed to be resurrected and write a paper again. But can he walk again? Only if he came again, we could admit that his derivation allows the second coming of Stephen Hawking.

by Luboš Motl ( at April 01, 2018 05:52 AM

March 31, 2018

Lubos Motl - string vacua and pheno

Boyle-Finn-Turok anti-universe paper is plain moronic
I usually capitalize the universe but in this whole blog post, I chose to be compatible with them

Neil Turok has been among the men who have spent years hyping theories about "cyclic universes" and related laymen's ideas that have never explained anything in physics and that are really not capable of explaining anything, for very good reasons. So this enterprise became a religious activity of a sort – he needs to publish new papers just to create the illusion that the previous papers weren't a stupid waste of time.

The latest, 5-page-long paper by Boyle, Finn, and Turok is called
CPT symmetric universe.
The birth of a universe from nothing is bad, they effectively claim. Instead, the universe should be pair-created: it's the universe and the anti-universe that are created out of nothing, which is nicer. The pair, universe plus anti-universe, is supposed to preserve CPT. The anti-universe is interpreted as the "universe before the Big Bang", they propose.

Concerning the anti-universe, I can never fail to quote a favorite joke of mine which I learned from my diploma adviser in Prague. Aside from the universe, there also exists the anti-universe where everything is anti-. For example, the hardest science over there is anti-physics and it's researched by anti-Semites. ;-)

OK, let us return to the Turok et al. paper.

First, if they want to really preserve the CPT, then the "anti-universe" must be a precise copy of the "universe". So it's just a redundancy, a mirror image added to the history of the "universe", and there's no extra information in it. So it should be removed. There is no analogy with particles and antiparticles because when a particle pair is created, the two members of the pair may undergo different fates.

Clearly, they must mean an "approximate anti-universe" which is just macroscopically similar but whose events aren't exact mirror images of the events in our "universe". So CPT is still violated, although in some macroscopic perspective, it's approximately preserved.

Second, there's a question whether the "anti-universe" should be drawn as some history for \(t\gt 0\), as the normal "universe", or as \(t\lt 0\). They explicitly propose the latter. But that can't be consistent with the claim that CPT is saved from spontaneous breaking. Why? Because the second law of thermodynamics (not to mention related consequences of the arrow of time) demands the entropy to increase with time.

So either they will have the entropy that is an even function of time\[

S(-t) = S(t)

\] in which case the entropy is decreasing at \(t\lt 0\) and they violate the second law of thermodynamics. Or\[

S(-t) \neq S(t)

\] in which case the evolution of the "anti-universe" violates the CPT and their claim about the preservation of CPT is wrong. You just can't obey the second law and "spontaneously unbroken CPT" simultaneously!

They included one short paragraph about the second law of thermodynamics:
Also note that density perturbations grow as we get further from the bang in either direction, and hence the thermodynamic arrow of time points away from the bang in both directions (to the future and past).
This paragraph automatically includes a violation of the second law of thermodynamics. To allow the rules of thermodynamics to be "reversed" in the "anti-universe" is really silly because the future must be defined as the side of the temporal axis where the entropy is higher than in the past. So if you have two parts of the universe where the entropy grows as you get further away from the Big Bang, then the correct way to draw these "two universes" on the \(t\)-axis is to draw both of them as branches at \(t\gt 0\).

That picture means that at \(t=0\), two universes are just created out of nothing and they may be considered anti-objects of each other. Well, it's different from how they want to spin it. On top of that, if some universal laws govern the pair creation, then the two universes are precise "anti-objects" of each other. They must be perfectly entangled and they're just two copies of the same universe.

At any rate, the events in the "anti-universe" cannot possibly affect us because that "anti-universe" is just a parallel universe that is forever detached from ours. If you draw the "anti-universe" as a branch at \(t\gt 0\), it's clear that this branch in no meaningful way occurred "before our time", so the events in that branch can't be considered causes of anything we may observe.

To generalize the prefix "anti-" from antiparticles and anti-Semites to the universes could have been a good idea a priori. But if you look at the basic possibilities, you can see that it is actually not a good idea. While it's very useful to talk about antiparticles, antibranes, anti-instantons, and other things, it's not useful to talk about anti-universes.

A reason why "anti-" is silly for the whole universe is simply the fact that, as Richard Feynman and John Wheeler figured out, an antiparticle is a particle moving backwards in time (perhaps one with the negative energy, as it happens in the context of the Feynman diagrams). So to talk about "anti-", one needs a pre-existing notion of time and its arrow (which is needed either directly, for the directions of time, or indirectly, for the signs of energy, or both).

So only things that may be embedded within a spacetime with a well-defined arrow of time – e.g. particles and branes – may be associated with their anti-objects. The whole universe itself isn't one of these things because the universe doesn't exist as an object embedded within (another?) universe with a well-defined arrow of time.

But none of these things is really understood by most of the laymen. Turok et al. wrote a confusing package of some basic exciting notions – anti-stuff and Big Bang – and tons of stupid people will buy this piece of pop science regardless of the fact that it's complete junk.

Even if I return to the more acceptable notion of "history before the Big Bang", I think that the room for any meaningful physics of that kind is extremely limited. In other words, I am almost certain that all these ideas must be wrong. There's always a problem with the entropy before the Big Bang that should be even lower than the entropy at the moment of the Big Bang – and the latter is rather low if not zero. (Cyclic universes run into a conflict with the second law of thermodynamics, too.)

And then there's a problem with any predictive consequences of the pre-Big-Bang stage. One reason is that during the Big Bang, the spacetime curvature is huge, perhaps Planckian. That means that all gadgets – including all measurement apparatuses – break there. Because physical quantities are only meaningful to the extent to which they're measurable by apparatuses, I think it's right to say that even the continuity or predictability of the observables during the Big Bang era disappears. If all clocks break etc., the time itself becomes meaningless in the vicinity of the Big Bang, and that's why you shouldn't ask how the post-Big-Bang time is connected to another branch of time. It's a physically meaningless question.

A related fact is that the events are in the regime of "extreme quantum gravity" or "string theory" in the vicinity of the Big Bang. Quantum coherence and phases of the amplitudes matter a lot. It seems that all the people are imagining some simple union of classical geometries. That's almost certainly an inadequate description in that extreme epoch, in one way or another.

Stephen Hawking and Jim Hartle have proposed a much more promising point to address the initial conditions of the whole universe: their Hartle-Hawking state, the (especially initial) wave function of the universe. Whatever happens in the very early moments of the universe and/or "before that moment" should be incorporated in a state that may be defined a little bit after the Big Bang when the state of the world is no longer "insanely curved" and otherwise problematic.

I think that even if our Big Bang were just an event in some longer sequence of events within eternal inflation or something like that, there should exist a calculation of the probabilities that are relevant for our universe that simply "integrates out" all the conceivable pre-history. The Feynman path integral has the nice property that you may get the resulting probability amplitudes directly, through an explicit formula.

It only makes sense to talk about a "history in time" if the "time" is an uncontroversial continuation of the time that we actually observe in our world. If there is any other information or quantum information affecting us, that information and its impact on us should be converted to the variables that are compatible with the existence of time or spacetime of our type. In particular, all pre-Big-Bang influences should be convertible to a Hartle-Hawking state at the beginning of our time right after the Big Bang.

I think it's almost rigorously provable that no other treatment of the hypothetical pre-Big-Bang evolution of the universe may have defensible and calculable consequences for our post-Big-Bang evolution. But even if you doubt this ambitious statement of mine, it's a fact that no convincing explanation of any observable fact about the universe using a pre-Big-Bang (or cyclic) concepts has been proposed in the literature so far.

by Luboš Motl ( at March 31, 2018 05:45 PM

March 29, 2018

The n-Category Cafe

On the Magnitude Function of Domains in Euclidean Space, II

joint post with Heiko Gimperlein and Magnus Goffeng.

In the previous post, On the Magnitude Function of Domains in Euclidean Space, I, Heiko and Magnus explained the main theorem in their paper

(Remember that here a domain <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics> in <semantics>R n<annotation encoding="application/x-tex">\mathbb{R}^n</annotation></semantics> means a subset equal to the closure of its interior.)

The main theorem involves the asymptotic behaviour of the magnitude function <semantics> X(R)<annotation encoding="application/x-tex">\mathcal{M}_X(R)</annotation></semantics> as <semantics>R<annotation encoding="application/x-tex">R\to\infty</annotation></semantics> and also the continuation of the magnitude function to a meromorphic function on the complex numbers.

In this post we have tried to tease out some of the analytical ideas that Heiko and Magnus use in the proof of their main theorem.

Heiko and Magnus build on the work of Mark Meckes, Juan Antonio Barceló and Tony Carbery and give a recipe for calculating the magnitude function of a compact domain <semantics>X n<annotation encoding="application/x-tex">X\subset \mathbb{R}^n</annotation></semantics> (for <semantics>n=2m1<annotation encoding="application/x-tex">n=2m-1</annotation></semantics> an odd integer): find a solution to a differential equation subject to boundary conditions which involve certain derivatives of the function at the boundary <semantics>X<annotation encoding="application/x-tex">\partial X</annotation></semantics>, and then integrate certain other derivatives of the solution over the boundary.

In this context, switching from one set of derivatives at the boundary to another set of derivatives involves what analysts call a Dirichlet to Neumann operator. In order to understand the magnitude function it turns out that it suffices to consider this Dirichlet to Neumann operator (which is actually parametrized by the scale factor in the magnitude function). Heavy machinery of semiclassical analysis can then be employed to prove properties of this parameter-dependent operator and hence of the magnitude function.

We hope that some of this is explained below!

[Remember that throughout this post, <semantics>n=2m1<annotation encoding="application/x-tex">n=2m-1</annotation></semantics> is an odd positive integer.]

The work of Meckes and Barceló-Carbery

As a reader of this blog you might well know that magnitude of finite metric spaces is usually defined using weightings. Mark Meckes showed that the natural extension of magnitude to infinite subsets of Euclidean space can be defined using potential functions.

Before going anywhere, however, recall that the Laplacian operator <semantics>Δ<annotation encoding="application/x-tex">\Delta</annotation></semantics> is the differential operator on functions on <semantics> n<annotation encoding="application/x-tex">\mathbb{R}^n</annotation></semantics> given by <semantics>Δf= i=1 n 2fx i 2<annotation encoding="application/x-tex">\Delta f=\sum_{i=1}^n \frac{\partial ^2f}{\partial x_i^2}</annotation></semantics>.

Now, for <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics> a compact subset of <semantics> n<annotation encoding="application/x-tex">\mathbb{R}^n</annotation></semantics> with smooth boundary, a potential function <semantics>h<annotation encoding="application/x-tex">h</annotation></semantics> for <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics> is a function <semantics>h: n<annotation encoding="application/x-tex">h\colon \mathbb{R}^n\to \mathbb{R}</annotation></semantics> with properties including the following:

  • <semantics>h=1<annotation encoding="application/x-tex">h= 1</annotation></semantics> on <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics>;
  • <semantics>(idΔ) mh=0<annotation encoding="application/x-tex">(\id- \Delta)^m h = 0</annotation></semantics> (weakly) on <semantics> nX<annotation encoding="application/x-tex">\mathbb{R}^n\setminus X</annotation></semantics>;
  • <semantics>h<annotation encoding="application/x-tex">h</annotation></semantics> is <semantics>m1<annotation encoding="application/x-tex">m-1</annotation></semantics> times differentiable on <semantics> n<annotation encoding="application/x-tex">\mathbb{R}^n</annotation></semantics>, and the <semantics>m<annotation encoding="application/x-tex">m</annotation></semantics>th derivative exists in an <semantics>L 2<annotation encoding="application/x-tex">L^2</annotation></semantics>-sense;
  • <semantics>h(x)0<annotation encoding="application/x-tex">h(x)\to 0</annotation></semantics> as <semantics>x<annotation encoding="application/x-tex">x\to \infty</annotation></semantics>.

You can see an example below in the next section.

Barceló and Carbery built on the results of Meckes to show that for a compact convex domain <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics> in <semantics> n<annotation encoding="application/x-tex">\mathbb{R}^n</annotation></semantics> with smooth boundary the following recipe can be used to calculate the magnitude of <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics>.

First define <semantics>𝒟 i<annotation encoding="application/x-tex">\mathcal{D}^i</annotation></semantics> to be the order <semantics>i<annotation encoding="application/x-tex">i</annotation></semantics> differential operator on the boundary by

<semantics>𝒟 2j=Δ j,𝒟 2j+1=νΔ j,<annotation encoding="application/x-tex"> \mathcal{D}^{2j}= \Delta^{j},\,\,\,\,\, \mathcal{D}^{2j+1}=\textstyle\frac{\partial}{\partial \nu}\Delta^{j}, </annotation></semantics>

where <semantics>ν<annotation encoding="application/x-tex">\textstyle\frac{\partial}{\partial \nu}</annotation></semantics> means the derivative in the normal direction to the boundary.

The Barceló-Carbery Recipe for Magnitude. Suppose <semantics>X n<annotation encoding="application/x-tex">X\subset \mathbb{R}^n</annotation></semantics> is a compact domain with smooth boundary.

  1. Find a solution <semantics>u: nX<annotation encoding="application/x-tex">u\colon \mathbb{R}^n\setminus X\to \mathbb{R}</annotation></semantics> with <semantics>u(x)0<annotation encoding="application/x-tex">u(x)\to 0</annotation></semantics> as <semantics>x<annotation encoding="application/x-tex">x\to \infty</annotation></semantics> of the differential equation <semantics>(IdΔ) mu=0on nX<annotation encoding="application/x-tex"> (Id-\Delta)^m u=0\,\,\,\, \text{on}\,\,\mathbb{R}^n\setminus X </annotation></semantics> subject to the boundary conditions <semantics>u=1,𝒟 1(u)=0,𝒟 2(u)=0,,𝒟 m1(u)=0,onX.<annotation encoding="application/x-tex"> u=1,\, \mathcal{D}^1(u) =0,\, \mathcal{D}^2(u) =0,\, \dots, \mathcal{D}^{m-1}(u) =0, \,\,\,\, \text{on}\,\, \partial X. </annotation></semantics>

  2. The magnitude is then calculated by <semantics>mag(X)=1n!ω n(vol(X)+ m/2<jm(1) j(mj) X𝒟 2j1(u)ds).<annotation encoding="application/x-tex"> \text{mag}(X)=\frac{1}{n!\,\omega_n}\left(\text{vol}(X)+\sum_{m/2\lt j\le m}(-1)^j\binom{m}{j}\int_{\partial X} \mathcal{D}^{2j-1}(u)\,\mathrm{d}{s}\right). </annotation></semantics>

Barceló and Carbery actually stated their result for convex domains, but if we assume smoothness of the boundary then we can drop the convexity assumption.

The potential function <semantics>h<annotation encoding="application/x-tex">h</annotation></semantics> of <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics> is obtained from the <semantics>u<annotation encoding="application/x-tex">u</annotation></semantics> in the recipe by extending <semantics>u<annotation encoding="application/x-tex">u</annotation></semantics> to all of <semantics> n<annotation encoding="application/x-tex">\mathbb{R}^n</annotation></semantics>, taking it to be <semantics>1<annotation encoding="application/x-tex">1</annotation></semantics> on <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics>.

Let’s have a look at a simple example that we’ll return to through this post.

A one-dimensional example

Consider the union of two intervals in the line: <semantics>X:=[a 1,a 2][a 3,a 4]<annotation encoding="application/x-tex">X:=[a_1, a_2]\cup[a_3, a_4]\subset \mathbb{R}</annotation></semantics> for <semantics>a 1<a 2<a 3<a 4<annotation encoding="application/x-tex">a_1\lt a_2\lt a_3 \lt a_4</annotation></semantics>. The differential equation to solve in the Barceló-Carbery recipe is then

<semantics>uu=0on X=(,a 1][a 2,a 3][a 4,),<annotation encoding="application/x-tex"> u-u''=0 \,\,\,\, \text{on }\,\,\mathbb{R}\setminus X=(-\infty, a_1]\cup [a_2,a_3]\cup [a_4,\infty), </annotation></semantics>

and the boundary conditions are

<semantics>u(x)=1for x{a 1,a 2,a 3,a 4}.<annotation encoding="application/x-tex"> u(x)=1\,\,\,\,\text{for }\,\,x\in \{a_1, a_2, a_3, a_4\}. </annotation></semantics>

This is easy to solve by hand and you find the solution

<semantics>u(x)={e (a 1x) x(,a 1], (e (xa 2)+e (a 3x))/(e (a 3a 2)+1) x[a 2,a 3], e (xa 4) x[a 4,).<annotation encoding="application/x-tex"> u(x)=\begin{cases} e^{-(a_1-x)} & x\in (-\infty, a_1], \\ (e^{-(x-a_2)}+e^{-(a_3-x)})/(e^{-(a_3-a_2)}+1) & x\in [a_2, a_3], \\ e^{-(x-a_4)} & x\in [a_4, \infty). \end{cases} </annotation></semantics>

Here is the graph of the potential function.

The graph of the potential function

Now according to Barceló and Carbery’s recipe we can calculate the magnitude as

<semantics>mag(X) =12(vol(X)(u(a 1)+u(a 2)u(a 3)+u(a 4)))<annotation encoding="application/x-tex"> \begin{aligned} \mathrm{mag}(X) & =\frac{1}{2}\left(\mathrm{vol}(X)-(-u'(a_1)+ u'(a_2)-u'(a_3)+u'(a_4))\right) \end{aligned} </annotation></semantics>

But it is easy to compute from the formula for <semantics>u<annotation encoding="application/x-tex">u</annotation></semantics> above that

<semantics>u(a 1)=1=u(a 4),u(a 2)=tanh(a 3a 22)=u(a 3)<annotation encoding="application/x-tex"> u'(a_1)=1=-u'(a_4), \quad -u'(a_2)=\tanh\left(\frac{a_3-a_2}{2}\right)=u'(a_3) </annotation></semantics>

and so

<semantics>mag(X) =12(a 2a 1+a 4a 3)+1+tanh(a 3a 22)<annotation encoding="application/x-tex"> \begin{aligned} \mathrm{mag}(X)&=\tfrac{1}{2}(a_2-a_1+a_4-a_3)+1 +\tanh(\frac{a_3-a_2}{2}) \end{aligned} </annotation></semantics>

which we can write as

<semantics>mag(X) =12vol(X)+χ(X)2exp(a 3a 2)+1.<annotation encoding="application/x-tex"> \begin{aligned} \mathrm{mag}(X) &= \tfrac{1}{2}\mathrm{vol}(X)+\chi(X)-\frac{2}{\exp(a_3-a_2)+1}. \end{aligned} </annotation></semantics>
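As a quick numerical sanity check (the snippet and its function names are mine, not from the post), the two closed-form expressions above can be compared:

```python
import math

def mag_two_intervals(a1, a2, a3, a4):
    """Magnitude of X = [a1,a2] ∪ [a3,a4] via (1/2)vol(X) + 1 + tanh((a3-a2)/2)."""
    vol = (a2 - a1) + (a4 - a3)
    return 0.5 * vol + 1 + math.tanh((a3 - a2) / 2)

def mag_via_euler_char(a1, a2, a3, a4):
    """The same quantity written as (1/2)vol(X) + χ(X) - 2/(exp(a3-a2)+1)."""
    vol = (a2 - a1) + (a4 - a3)
    chi = 2  # Euler characteristic: two connected components
    return 0.5 * vol + chi - 2 / (math.exp(a3 - a2) + 1)

# The two expressions agree because tanh(d/2) = 1 - 2/(e^d + 1):
a = (0.0, 1.0, 2.5, 4.0)
assert abs(mag_two_intervals(*a) - mag_via_euler_char(*a)) < 1e-12
```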

A key point to note here is that to calculate the magnitude we don’t actually need to know the whole potential function <semantics>u<annotation encoding="application/x-tex">u</annotation></semantics>; we only need to know certain of its derivatives at the boundary. So we start with a differential equation, specify sufficiently many derivatives at the boundary to give a unique solution, and then find the values of other derivatives at the boundary. This process is well studied in the area of boundary value problems and is embedded in the notion of the Dirichlet to Neumann operator, which we now look at.

The Dirichlet to Neumann operator

As you surely know, when solving a differential equation, you impose boundary conditions in order to pin down the solution. You might impose different boundary conditions in different situations. For instance, the classical Dirichlet boundary conditions for a problem of second order fix the value of the function on the boundary, whereas the classical Neumann boundary conditions fix the normal derivative of the function on the boundary.

For calculating the magnitude, the boundary value problem is of order <semantics>2m<annotation encoding="application/x-tex">2m</annotation></semantics>, not of order <semantics>2<annotation encoding="application/x-tex">2</annotation></semantics>. We think of the boundary conditions <semantics>f=1,𝒟 1(f)=0,𝒟 2(f)=0,,𝒟 m1(f)=0<annotation encoding="application/x-tex">f=1,\, \mathcal{D}^1(f) =0, \, \mathcal{D}^2(f) =0, \,\dots, \mathcal{D}^{m-1}(f) =0</annotation></semantics>, which involve derivatives of order <semantics>0<annotation encoding="application/x-tex">0</annotation></semantics> up to <semantics>m1<annotation encoding="application/x-tex">m-1</annotation></semantics>, as analogues of Dirichlet boundary conditions. To compute the magnitude we need to determine the derivatives of the solution of order <semantics>m<annotation encoding="application/x-tex">m</annotation></semantics> up to <semantics>2m1<annotation encoding="application/x-tex">2m-1</annotation></semantics>, which we think of as analogues of Neumann boundary conditions.

Given Dirichlet boundary conditions we want to determine the corresponding Neumann boundary conditions. Let’s think what this means.

If you have a differential equation <semantics>Lf=0<annotation encoding="application/x-tex">L f =0</annotation></semantics> on a domain <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics> then specifying the boundary condition means imposing a set of equations of the form

<semantics>(δ 1f)(x)=v 1(x),,(δ pf)(x)=v p(x),for all xX<annotation encoding="application/x-tex"> (\delta_1f)(x)=v_1(x),\,\,\dots,\,\,(\delta_p f)(x)=v_p(x), \,\, \text{for all }\,\,x\in \partial X </annotation></semantics>

where each <semantics>δ i<annotation encoding="application/x-tex">\delta_i</annotation></semantics> is a differential operator on the boundary <semantics>X<annotation encoding="application/x-tex">\partial X</annotation></semantics> and each <semantics>v iFun(X)<annotation encoding="application/x-tex">v_i\in \mathrm{Fun}(\partial X)</annotation></semantics> belongs to a suitable space of functions on the boundary. (We will avoid technicalities and complicated notation by using <semantics>Fun(X)<annotation encoding="application/x-tex">Fun(\partial X)</annotation></semantics> to stand for some space of functions, which might vary depending on context.)

When the boundary conditions give a unique solution to the differential equation – such as in the Barceló-Carbery Recipe – then for any other set <semantics>{δ˜ j} j=1 q<annotation encoding="application/x-tex">\{\tilde {\delta}_j\}_{j=1}^q</annotation></semantics> of differential operators on the boundary there is a map, the Dirichlet to Neumann operator, between tuples of functions on the boundary:

<semantics>Λ: i=1 pFun(X) j=1 qFun(X); i=1 pv i j=1 qδ˜ ju v,<annotation encoding="application/x-tex"> \begin{aligned} \Lambda\colon \bigoplus_{i=1}^p \mathrm{Fun}(\partial X) &\to \bigoplus_{j=1}^q \mathrm{Fun}(\partial X); \\ \bigoplus_{i=1}^p v_i &\mapsto\bigoplus_{j=1}^q {\tilde\delta}_j u_{\mathbf{v}}, \end{aligned} </annotation></semantics>

where <semantics>u v<annotation encoding="application/x-tex">u_{\mathbf{v}}</annotation></semantics> is the unique solution to <semantics>Lu=0<annotation encoding="application/x-tex">L u=0</annotation></semantics> subject to the boundary conditions <semantics>δ iu=v i<annotation encoding="application/x-tex">\delta_i u=v_i</annotation></semantics>, <semantics>i=1,,p<annotation encoding="application/x-tex">i=1,\ldots, p</annotation></semantics>.

This Dirichlet to Neumann operator will be a key ingredient in our approach to the parameter-dependent boundary problem for the magnitude function below.

In our toy one-dimensional example we have a single differential operator <semantics>δ 1=id<annotation encoding="application/x-tex">\delta_1= \mathrm{id}</annotation></semantics> and <semantics>v 11<annotation encoding="application/x-tex">v_1\equiv 1</annotation></semantics>, so this is a classical Dirichlet boundary condition; and we have <semantics>δ˜ 1<annotation encoding="application/x-tex">\tilde\delta_1</annotation></semantics> being the normal derivative to the boundary and therefore this is a classical Neumann boundary condition.

Let’s see what this operator <semantics>Λ<annotation encoding="application/x-tex">\Lambda</annotation></semantics> is in this example.

The Dirichlet to Neumann operator in our toy example

The boundary of <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics> consists of the four points <semantics>{a 1,a 2,a 3,a 4}<annotation encoding="application/x-tex">\{a_1, a_2, a_3, a_4\}</annotation></semantics>, and we can identify the space of functions on the boundary <semantics>Fun(X)<annotation encoding="application/x-tex">\mathrm{Fun}(\partial X)</annotation></semantics> with <semantics> 4<annotation encoding="application/x-tex">\mathbb{C}^4</annotation></semantics>. We define the linear map <semantics>Λ:Fun(X)Fun(X)<annotation encoding="application/x-tex">\Lambda\colon\mathrm{Fun}(\partial X)\to \mathrm{Fun}(\partial X)</annotation></semantics>, ie. <semantics>Λ: 4 4<annotation encoding="application/x-tex">\Lambda\colon\mathbb{C}^4\to \mathbb{C}^4</annotation></semantics> as

<semantics>Λ((z 1 z 2 z 3 z 4))(u(a 1) u(a 2) u(a 3) u(a 4)),<annotation encoding="application/x-tex"> \Lambda\left(\begin{pmatrix} z_1\\ z_2\\ z_3\\ z_4\end{pmatrix} \right) \coloneqq \begin{pmatrix} u'(a_1)\\ -u'(a_2)\\ u'(a_3)\\ -u'(a_4) \end{pmatrix} \, , </annotation></semantics>

where the function <semantics>u<annotation encoding="application/x-tex">u</annotation></semantics> solves the boundary value problem

<semantics>u=uinX and(u(a 1) u(a 2) u(a 3) u(a 4))=(z 1 z 2 z 3 z 4).<annotation encoding="application/x-tex"> {u'}'=u\,\,\,\,\text{in}\,\, \mathbb{R}\setminus X\,\,\text{ and}\,\,\,\, \begin{pmatrix} u(a_1)\\ u(a_2)\\ u(a_3)\\ u(a_4)\end{pmatrix} = \begin{pmatrix} z_1\\ z_2\\ z_3\\ z_4\end{pmatrix}\, . </annotation></semantics>

It is not difficult to compute that

<semantics>Λ=(1 0 0 0 0 coth(a 3a 2) csch(a 3a 2) 0 0 csch(a 3a 2) coth(a 3a 2) 0 0 0 0 1).<annotation encoding="application/x-tex"> \Lambda= \begin{pmatrix} 1&0&0&0\\ 0& \coth(a_3-a_2)&-\mathrm{csch}(a_3-a_2)&0\\ 0& -\mathrm{csch}(a_3-a_2)&\coth(a_3-a_2)&0\\ 0&0&0& 1 \end{pmatrix}\, . </annotation></semantics>

Using the Barceló-Carbery recipe we have that the magnitude can be obtained from the sum of the entries of the matrix of <semantics>Λ<annotation encoding="application/x-tex">\Lambda</annotation></semantics>: writing <semantics>1=(1,1,1,1) T<annotation encoding="application/x-tex">\vec{1}=(1,1,1,1)^T</annotation></semantics>,

<semantics>mag(X)=vol(X)2+121,Λ1 4=vol(X)2+1+tanh(a 3a 22),<annotation encoding="application/x-tex"> \mathrm{mag}(X)=\frac{\mathrm{vol}(X)}{2}+\frac{1}{2}\langle \vec{1},\Lambda\vec{1}\rangle_{\mathbb{C}^4}=\frac{\mathrm{vol}(X)}{2}+1+\tanh\left(\frac{a_3-a_2}{2}\right)\, , </annotation></semantics>

as we had before.
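The matrix form makes this easy to check by machine; here is a small sketch (my code, with my function names) that builds the matrix of Λ and recovers the magnitude from the sum of its entries:

```python
import math

def dtn_matrix(a1, a2, a3, a4):
    """Matrix of the Dirichlet to Neumann operator for X = [a1,a2] ∪ [a3,a4]."""
    d = a3 - a2
    c, s = 1 / math.tanh(d), 1 / math.sinh(d)  # coth(d), csch(d)
    return [[1, 0,  0, 0],
            [0, c, -s, 0],
            [0, -s, c, 0],
            [0, 0,  0, 1]]

def mag_from_dtn(a1, a2, a3, a4):
    """mag(X) = vol(X)/2 + (1/2) <1, Λ 1>, i.e. half the sum of all entries."""
    vol = (a2 - a1) + (a4 - a3)
    L = dtn_matrix(a1, a2, a3, a4)
    return 0.5 * vol + 0.5 * sum(sum(row) for row in L)

# Agrees with the boundary-derivative formula, since coth(d) - csch(d) = tanh(d/2):
a = (0.0, 1.0, 2.5, 4.0)
assert abs(mag_from_dtn(*a) - (0.5 * 2.5 + 1 + math.tanh(0.75))) < 1e-12
```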

Of course, in this case we did calculate the potential function in order to calculate <semantics>Λ<annotation encoding="application/x-tex">\Lambda</annotation></semantics>. For domains in higher dimensions it is rarely possible to compute the potential function explicitly. This is the reason to bring in heavier guns from global analysis, which allow us to study the operator <semantics>Λ<annotation encoding="application/x-tex">\Lambda</annotation></semantics> without explicitly solving the boundary value problem.

What we are really interested in is the magnitude function, its meromorphicity and its asymptotic behaviour, so we need to study the above boundary value problems with a parameter representing the scale factor.

Introducing a parameter

Remember that the magnitude function, <semantics> X<annotation encoding="application/x-tex">\mathcal{M}_X</annotation></semantics>, is defined in terms of the magnitude of the dilates of <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics>, ie. <semantics> X(R)mag(RX)<annotation encoding="application/x-tex">\mathcal{M}_X(R)\coloneqq\mathrm{mag}(R\cdot X)</annotation></semantics> for <semantics>R>0<annotation encoding="application/x-tex">R\gt 0</annotation></semantics>, where <semantics>RX<annotation encoding="application/x-tex">R\cdot X</annotation></semantics> is the same space <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics> but with the metric scaled up by a factor of <semantics>R<annotation encoding="application/x-tex">R</annotation></semantics>.

The Barceló-Carbery recipe for the magnitude from above can be generalized to include the scale factor <semantics>R<annotation encoding="application/x-tex">R</annotation></semantics> and in such a way so that it is on an equal footing with the derivatives, essentially by replacing <semantics>(IdΔ)<annotation encoding="application/x-tex">(Id-\Delta)</annotation></semantics> with <semantics>(R 2Δ)<annotation encoding="application/x-tex">(R^2-\Delta)</annotation></semantics>. This approach is well studied in the literature on parameter-dependent pseudo-differential operators.

First define <semantics>𝔻 R i<annotation encoding="application/x-tex">\mathbb{D}_R^i</annotation></semantics> to be the order <semantics>i<annotation encoding="application/x-tex">i</annotation></semantics> differential operator on the boundary <semantics>X<annotation encoding="application/x-tex">\partial X</annotation></semantics> given by <semantics>𝔻 R 2j=(R 2Δ) j,𝔻 R 2j+1=ν(R 2Δ) j.<annotation encoding="application/x-tex"> \mathbb{D}_R^{2j}= (R^2-\Delta)^{j},\,\,\,\, \mathbb{D}_R^{2j+1} = \textstyle\frac{\partial}{\partial \nu}(R^2-\Delta)^{j}. </annotation></semantics>

The Gimperlein-Goffeng Recipe for the Magnitude Function. Suppose that <semantics>X n<annotation encoding="application/x-tex">X\subset \mathbb{R}^n</annotation></semantics> is a compact domain with smooth boundary.

  1. Find a solution <semantics>u R: nX<annotation encoding="application/x-tex">u_R\colon \mathbb{R}^n\setminus X\to \mathbb{R}</annotation></semantics> with <semantics>u R(x)0<annotation encoding="application/x-tex">u_R(x)\to 0</annotation></semantics> as <semantics>x<annotation encoding="application/x-tex">x\to \infty</annotation></semantics> of the differential equation <semantics>(R 2Δ) mu=0on nX<annotation encoding="application/x-tex"> (R^2-\Delta)^m u=0\,\,\,\,\text{on }\,\,\mathbb{R}^n\setminus X </annotation></semantics> subject to the boundary conditions on <semantics>𝔻 R 0(u),,𝔻 R m1(u)<annotation encoding="application/x-tex">\mathbb{D}_R^0(u),\dots, \mathbb{D}_R^{m-1}(u)</annotation></semantics>: <semantics>𝔻 R 2i(u)=R 2i,𝔻 R 2i+1(u)=0,on X.<annotation encoding="application/x-tex"> \mathbb{D}_R^{2i}(u) =R^{2 i},\,\, \mathbb{D}_R^{2i+1}(u) =0, \,\,\,\, \text{on } \partial X. </annotation></semantics>

  2. The magnitude is then calculated by <semantics>mag(RX)=1n!ω n(vol(X)R n m/2<jmR n2j X𝔻 R 2j1u RdS).<annotation encoding="application/x-tex"> \text{mag}(R\cdot X)=\frac{1}{n!\,\omega_n}\left(\text{vol}(X)R^n-\sum_{m/2\lt j\le m} R^{n-2j}\int_{\partial X} \mathbb{D}_R^{2j-1}u_R \,\mathrm{d}{S}\right). </annotation></semantics>

The eagle-eyed amongst you will notice that setting <semantics>R=1<annotation encoding="application/x-tex">R=1</annotation></semantics> does not immediately recover the Barceló-Carbery recipe. However, you can recover that with some algebraic manipulation and binomial identities.

Again we can use the Dirichlet to Neumann operator, but note that this will depend on a parameter <semantics>R<annotation encoding="application/x-tex">R</annotation></semantics>. We think of <semantics>Λ(R)<annotation encoding="application/x-tex">\Lambda(R)</annotation></semantics> as an operator-valued function of the scaling parameter <semantics>R<annotation encoding="application/x-tex">R</annotation></semantics>. If we start with the values of the differential operators <semantics>𝔻 R 0(u),,𝔻 R m1(u)<annotation encoding="application/x-tex">\mathbb{D}_R^0(u),\dots, \mathbb{D}_R^{m-1}(u)</annotation></semantics> on the boundary it should return the values of the operators <semantics>𝔻 R m(u),𝔻 R m+2(u),,𝔻 R n(u)<annotation encoding="application/x-tex">\mathbb{D}_R^{m'}(u),\,\,\mathbb{D}_R^{m'+2}(u),\,\,\dots, \mathbb{D}_R^{n}(u)</annotation></semantics>, where <semantics>m=m<annotation encoding="application/x-tex">m'=m</annotation></semantics> if <semantics>m<annotation encoding="application/x-tex">m</annotation></semantics> is odd and <semantics>m=m+1<annotation encoding="application/x-tex">m'=m+1</annotation></semantics> if <semantics>m<annotation encoding="application/x-tex">m</annotation></semantics> is even. By the formula above we can use this operator to calculate the magnitude function.

Let’s look at the case of our toy example again.

The parameter-dependent operator in the toy example

In our running example of two disjoint intervals on the real line, <semantics>X:=[a 1,a 2][a 3,a 4]<annotation encoding="application/x-tex">X:=[a_1, a_2]\cup[a_3, a_4]\subset \mathbb{R}</annotation></semantics>, you can calculate to find

<semantics>Λ(R)=R(1 0 0 0 0 coth(R(a 3a 2)) csch(R(a 3a 2)) 0 0 csch(R(a 3a 2)) coth(R(a 3a 2)) 0 0 0 0 1).<annotation encoding="application/x-tex"> \Lambda(R)= R\begin{pmatrix} 1&0&0&0\\ 0& \coth(R(a_3-a_2))&-\mathrm{csch}(R(a_3-a_2))&0\\ 0& -\mathrm{csch}(R(a_3-a_2))&\coth(R(a_3-a_2))&0\\ 0&0&0& 1 \end{pmatrix}\,. </annotation></semantics>

Again, writing <semantics>1=(1,1,1,1) T<annotation encoding="application/x-tex">\vec{1}=(1,1,1,1)^T</annotation></semantics>, we compute the magnitude, using the Gimperlein-Goffeng recipe, as the sum of all the entries:

<semantics> X(R)=vol(X)2R+12R1,Λ(R)1 4=vol(X)2R+1+tanh(R(a 3a 2)2).<annotation encoding="application/x-tex"> \mathcal{M}_X(R)=\frac{\mathrm{vol}(X)}{2}R+\frac{1}{2R}\langle \vec{1},\Lambda(R)\vec{1}\rangle_{\mathbb{C}^4}=\frac{\mathrm{vol}(X)}{2}R+1+\tanh\left(\frac{R(a_3-a_2)}{2}\right). </annotation></semantics>

It is worth noting that the operator <semantics>Λ(R)<annotation encoding="application/x-tex">\Lambda(R)</annotation></semantics> depends meromorphically on <semantics>R<annotation encoding="application/x-tex">R\in \mathbb{C}</annotation></semantics>, rather than just being defined for <semantics>R>0<annotation encoding="application/x-tex">R\gt 0</annotation></semantics>, and that <semantics>Λ(R)<annotation encoding="application/x-tex">\Lambda(R)</annotation></semantics> has an asymptotic expansion as <semantics>Re(R)<annotation encoding="application/x-tex">\mathrm{Re}(R)\to \infty</annotation></semantics>. Therefore, the same holds for <semantics> X(R)<annotation encoding="application/x-tex">\mathcal{M}_X(R)</annotation></semantics>.
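In this toy example the behaviour of the magnitude function can be inspected directly; in the sketch below (my code, under the formula for the two-interval example above) the large-R limit recovers vol(X)/2 · R + χ(X), and the small-R limit gives 1, the magnitude of a point:

```python
import math

def magnitude_function(R, a1, a2, a3, a4):
    """M_X(R) for X = [a1,a2] ∪ [a3,a4], via the scaled Dirichlet to Neumann
    operator: <1, Λ(R) 1> = 2R(1 + tanh(R(a3-a2)/2))."""
    vol = (a2 - a1) + (a4 - a3)
    return 0.5 * vol * R + 1 + math.tanh(R * (a3 - a2) / 2)

# Large R: M_X(R) ≈ vol(X)/2 * R + χ(X) with χ(X) = 2 (two components).
# Small R: the two intervals effectively merge to a point and M_X(R) → 1.
a = (0.0, 1.0, 2.5, 4.0)
assert abs(magnitude_function(50.0, *a) - (0.5 * 2.5 * 50.0 + 2)) < 1e-9
assert abs(magnitude_function(1e-9, *a) - 1.0) < 1e-6
```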

Proving the main theorem!

As described in the previous post, the main theorem of the paper is about a meromorphic extension of the magnitude function and about the asymptotic behaviour of the magnitude function <semantics> X(R)<annotation encoding="application/x-tex">\mathcal{M}_X(R)</annotation></semantics> as <semantics>R<annotation encoding="application/x-tex">R\to \infty</annotation></semantics>. As we’ve seen above, the magnitude function can be calculated from the parameter-dependent Dirichlet to Neumann operator <semantics>Λ(R)<annotation encoding="application/x-tex">\Lambda(R)</annotation></semantics>. Now heavy machinery from geometric and semiclassical analysis – such as the meromorphic Fredholm theorem and parameter-dependent pseudo-differential operators – can be used to extend <semantics>Λ(R)<annotation encoding="application/x-tex">\Lambda(R)</annotation></semantics> to a meromorphic operator-valued function and to study its asymptotic expansion as <semantics>R<annotation encoding="application/x-tex">R \to \infty</annotation></semantics>. The properties of the magnitude function then follow.

That is probably enough for now, but in the next post, there should be a slightly less trivial example and some thoughts and comments of a more general nature.

by willerton ( at March 29, 2018 08:58 PM

Robert Helling - atdotde

Machine Learning for Physics?!?
Today was the last day of a nice workshop here at the Arnold Sommerfeld Center organised by Thomas Grimm and Sven Krippendorf on the use of Big Data and Machine Learning in string theory. While the former (at this workshop mainly in the form of developments following Kreuzer/Skarke and taking it further for F-theory constructions, orbifolds and the like) appears to be quite advanced as of today, the latter is still in its very early days. At best.

I got the impression that many physicists who have not yet spent much time with this expect deep learning, and in particular deep neural networks, to be some kind of silver bullet that can answer all kinds of questions that humans have not been able to answer despite considerable effort. I think this hope is at best premature. Looking at the (admittedly impressive) examples where it works (playing Go, classifying images, speech recognition, event filtering at the LHC), these seem to be problems where humans have at least a rough idea how to solve them (if it is not something that humans do every day anyway, like understanding text), and also roughly how one would code it, but that are too messy or vague to be treated by a traditional program.

So, during some of the less entertaining talks I sat down and thought about problems where I would expect neural networks to perform badly. If this approach fails even in simple cases that are fully under control, one should maybe curb the expectations for the more complex cases that one would love to have the answer for. In the case of the workshop, that would be guessing some topological (discrete) data (that depends very discontinuously on the model parameters). Here a simple problem would be a 2-torus wrapped by two 1-branes, where the computer is supposed to compute the number of matter generations arising from open strings at the intersections, i.e. given two branes (in terms of their slopes w.r.t. the cycles of the torus), how often do they intersect? Of course these numbers depend sensitively on the slope (as a real number), since for rational slopes \(p/q\) and \(m/n\) the intersection number is the absolute value of \(pn-qm\). My guess would be that this is almost impossible to get right for a neural network, let alone the much more complicated variants of this simple problem.
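This toy problem fits in a few lines of code (the function name is mine), and it makes the discontinuity vivid: slopes that are numerically very close can have wildly different intersection numbers.

```python
def intersection_number(p, q, m, n):
    """Number of intersections of two 1-branes on a 2-torus with rational
    slopes p/q and m/n (both in lowest terms): |p*n - q*m|."""
    return abs(p * n - q * m)

# Slopes 0.5 and 0.499 intersect only twice...
assert intersection_number(1, 2, 499, 1000) == 2
# ...but slopes 0.499 and 0.501, just as close in value, intersect 2000 times.
assert intersection_number(499, 1000, 501, 1000) == 2000
```

A network trained on sampled (slope, slope) pairs has to learn a function that is this sensitive to the denominators, not just the real values of the slopes.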

Related, but with the possibility for nicer pictures, is the following: Can a neural network learn the shape of the Mandelbrot set? Let me remind those of you who cannot remember the 80ies anymore: for a complex number \(c\) you recursively apply the function
\(f_c(z)= z^2 +c\)
starting from 0 and ask if this stays bounded (a quick check shows that once you are outside \(|z| < 2\) you cannot avoid running off to infinity). You color the point \(c\) in the complex plane according to the number of times you have to apply \(f_c\) to 0 to leave this circle. I decided to do this for complex numbers \(x+iy\) in the rectangle -0.74
I have written a small Mathematica program to compute this image. Built into Mathematica is also a neural network: you can feed training data to the function Predict[]; for me these were 1,000,000 points in this rectangle together with the number of steps it takes to leave the 2-ball. Then Mathematica thinks for about 24 hours and spits out a predictor function. Then you can plot this as well:
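For reference, the escape-time function used to generate such training data can be sketched in a few lines of Python (his actual program was in Mathematica; this is my sketch, not his code):

```python
def escape_time(c, max_iter=100):
    """Number of iterations of f_c(z) = z^2 + c, starting at z = 0,
    before |z| exceeds 2; returns max_iter if the orbit stays bounded."""
    z = 0j
    for k in range(max_iter):
        if abs(z) > 2:
            return k
        z = z * z + c
    return max_iter

# c = 0 never escapes; c = 1 gives the orbit 0, 1, 2, 5, ... and escapes fast.
assert escape_time(0j) == 100
assert escape_time(1 + 0j) == 3
```

Sampling this function on a grid of \(c\) values gives exactly the kind of (input, label) pairs one would feed to a regressor like Predict[].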

There is some similarity but clearly it has no idea about the fractal nature of the Mandelbrot set. If you really believe in magic powers of neural networks, you might even hope that once it learned the function for this rectangle one could extrapolate to outside this rectangle. Well, at least in this case, this hope is not justified: The neural network thinks the correct continuation looks like this:
Ehm. No.

All this of course comes with the caveat that I am no expert on neural networks and did not attempt anything to tune the result; I only took the neural network function built into Mathematica. Maybe, with a bit of coding and TensorFlow, one can do much better. But on the other hand, this is a simple two-dimensional problem. At least for traditional approaches, it should be much simpler than the other, much higher dimensional, problems the physicists are really interested in.

by Robert Helling ( at March 29, 2018 07:35 PM

Axel Maas - Looking Inside the Standard Model

Asking questions leads to a change of mind
In this entry, I would like to digress a bit from my usual discussion of our physics research subject. Rather, I would like to talk a bit about how I do this kind of research. There is a twofold motivation for me to do this.

One is that I am currently teaching, together with somebody from the philosophy department, a course on the philosophy of science in physics. It came as a surprise to me that one thing the students of philosophy are interested in is how I think: what are the objects, or subjects, and how do I connect them when doing research, or even when I just think about a physics theory. The other is the review I have recently written. Both topics may seem unrelated at first, but there is a deep connection. It is less about what I have written in the review, and rather about what led me up to this point. This requires some historical digression into my own research.

In the very beginning, I started out doing research on the strong interactions. One of the features of the strong interactions is that the supposed elementary particles, quarks and gluons, are never seen separately, but only in combinations as hadrons. This is a phenomenon which is called confinement. It is always somehow presented as a mystery. And as such, it is interesting. Thus, one question in my early research was how to understand this phenomenon.

Doing that I came across an interesting result from the 1970ies. It appears that an, at first sight completely unrelated, effect is very intimately related to confinement. At least in some theories. This is the Brout-Englert-Higgs effect. However, we seem to observe the particles responsible for and affected by the Higgs effect. And indeed, at that time, I was still thinking that the particles affected by the Brout-Englert-Higgs effect, especially the Higgs and the W and Z bosons, are just ordinary, observable particles. When one reads my first paper of this time on the Higgs, this is quite obvious. But then there were the results of the 1970ies. They stated that, on a very formal level, there should be no difference between confinement and the Brout-Englert-Higgs effect, in a very definite way.

Now the implications of that seriously sparked my interest. But I thought this would help me to understand confinement, as it was still very ingrained in me that confinement is a particular feature of the strong interactions. The mathematical connection I just took as a curiosity. And so I started to do extensive numerical simulations of the situation.

But while trying to do so, things which did not add up started to accumulate. This is probably most evident in a conference proceeding where I tried to make sense of something which, with hindsight, could never be interpreted in the way I did there. I still tried to press the result into the scheme of thinking that the Higgs and the W/Z are physical particles which we observe in experiment, as this is the standard lore. But the data would not fit this picture, and the more and better data I gathered, the more conflicted the results became. At some point, it was clear that something was amiss.

At that point, I had two options: either keep with the concepts of confinement and the Brout-Englert-Higgs effect as they had been since the 1960ies, or take the data seriously and assume that these conceptions were wrong. It probably signifies my difficulties that it took me more than a year to come to terms with the results. In the end, the decisive point was that, as a theoretician, I needed to take my theory seriously, no matter the results. There is no way around it. And if it gave a prediction which did not fit my view of the experiments, then necessarily either my view or the theory was incorrect. The latter seemed more improbable than the former, as the theory fits experiment very well. So, finally, I found an explanation which was consistent. And this explanation accepted the curious mathematical statement from the 1970ies that confinement and the Brout-Englert-Higgs effect are qualitatively the same, but not quantitatively. And thus the conclusion was that what we observe are not really the Higgs and the W/Z bosons, but rather some interesting composite objects, just like hadrons, which due to a quirk of the theory behave almost as if they were the elementary particles.

This was still a very challenging thought to me. After all, it was quite contradictory to the usual notions. Thus, it came as a very great relief when, during a trip a couple of months later, someone pointed me to a few papers from the early 1980ies, almost forgotten by most, which gave, for a completely different reason, the same answer. Together with my own observation, this made everything click, and things started to fit together - the 1970ies curiosity, the standard notions, my data. I published that in mid-2012, even though it still lacked some more systematic backing. But it still required shifting my thinking from agreement to real understanding. That came in the years that followed.

The important click was to recognize that confinement and the Brout-Englert-Higgs effect are, just as pointed out mathematically in the 1970ies, really just two faces of the same underlying phenomenon. On a very abstract level, essentially all the particles which make up the standard model are really just a means to an end. What we observe are objects which are described by them, but which they are not themselves. They emerge, just like hadrons emerge in the strong interaction, but with very different technical details. This is actually very deeply connected with the concept of gauge symmetry, but this becomes technical quickly. Of course, since this is fundamentally different from the usual way of thinking, it required confirmation. So we went ahead, made predictions which could distinguish between the standard way of thinking and this way of thinking, and tested them. And it came out as we predicted. So it seems we are on the right track. All the details - all the if, how, and why, and all the technicalities and math - you can find in the review.

To now come full circle to the starting point: what happened in my mind during this decade was that the way I thought about the physical theory I tried to describe, the standard model, changed. In the beginning I was thinking in terms of particles and their interactions. Now, very much motivated by gauge symmetry, and, not incidentally, by its deeper conceptual challenges, I think differently. I no longer think of the elementary particles as entities in themselves, but rather as auxiliary building blocks of actually experimentally accessible quantities. The standard 'small-ball' analogy went fully away, and there formed, well, hard to say, a new class of entities, which does not necessarily have any analogy. Perhaps the best analogy is that of, no, I really do not know how to phrase it. Perhaps at a later time I will come across something. Right now, it is more math than words.

This also transformed the way I think about the original problem, confinement. I am curious where this, and all the rest, will lead. For now, the next step will be to go ahead from simulations, and see whether we can find some way to test this actually in experiment. We have some ideas, but in the end, it may be that present experiments will not be sensitive enough. Stay tuned.

by Axel Maas ( at March 29, 2018 01:09 PM

Lubos Motl - string vacua and pheno

Dark matter probably exists because in a galaxy, it doesn't
Galaxies seem to rotate much like the vinyl records – the angular speed or at least the normal speed is pretty much independent from the star's distance from the galactic center. That's very different from planets in the Solar System where the speed of Mercury vastly exceeds that of Neptune, the most distant planet (apologies to the dwarf planet Pluto and its fanatical fans).

Even with a more realistic, non-central distribution of the matter, one can determine that it indeed means that the motion of stars in galaxies is different from what is predicted by Einstein's equations (which predict the dependence closer to the Solar System) – assuming that only the visible mass (shining stars and gas) is substituted as \(T_{\mu\nu}\), the stress energy tensor, on the right hand side.

But the discrepancy may be fixed. You just add another term to the tensor so that the total one is equal to whatever Einstein's curvature tensor you determine from the spacetime geometry. (I can't guarantee that the total or partial stress-energy tensor will always obey energy conditions and other inequalities expected from general physical considerations but it seems to be so in reality.)

Zwicky figured out his colleagues were spherical bastards – which are defined as people who are bastards independently of the direction from which you observe them.

For almost a century, it's been a dominant explanation – Fritz Zwicky gave the first similar arguments (based on the virial theorem) that dark matter had to exist. It's not shocking that some matter in the Universe is invisible or different from the hydrogen in stars. It may be a new elementary particle such as a WIMP, SIMP, or an axion, or some MACHO or a small black hole, or something else. Some options are being eliminated by terrestrial direct search experiments but lots of them remain viable.

The other explanation is that there's no extra dark term in \(T_{\mu\nu}\). Instead, this school of thought says, Einstein's equations aren't quite right. They have to be modified – even their Newton's limit has to be modified. We call these explanations MOND – Modified Newton's Dynamics.

In recent years, I have been bombarded – and the public has been bombarded – by the hype produced by the MOND movement. But the evidence remained as weak as it was decades ago. MOND prevents you from deciding about the amount of visible matter and dark matter separately, so it is bound to be more predictive than dark matter theories. That's a double-edged sword. If the MOND predictions worked, the theory would be more predictive, and would have passed a tougher test. It would therefore be more strongly confirmed than dark matter that may be adjusted.

But that outcome depends on the assumption that MOND really passes the tests. Well, it seems OK for some large classes of galaxies, but not for larger structures and for some special galaxies, especially small ones.

There are observations, like those of the Bullet Cluster above, that seem to imply that the visible matter and dark matter may be separated – like the red and blue spots are separated on the picture above. (Or as a soul that flies away from your body according to some religious believers.)

Well, this observation has been around for quite some time. I find it obvious that if a rational person accepts that the observation above shows that dark matter and visible matter may be separated by a distance comparable to a galactic radius, they should also accept that the two may sometimes get separated even more than that – and then they have enough velocity to escape from each other.

If that's so, there should be galaxies that contain only visible matter – and galaxies that contain only dark matter (though the latter are harder to see because everything is dark in them). And indeed, a dozen authors led by Pieter van Dokkum claim exactly that observation in Nature:
A galaxy lacking dark matter (full text at arXiv)
As Joshua Sokol summarized in Quanta Magazine, the galaxy NGC 1052–DF2, a half-transparent smear of light 65 million light-years away in the constellation Cetus, hosts some 200 million suns’ worth of stars, and negligible amounts of gas and dust. And that’s it. The need for dark matter over there is zero. It's just not there. If it's not there, it proves that the dark-matter-like phenomena may be separated from the visible matter; the visible matter may be "purified". That apparently implies that MOND – which predicts modified laws for every galaxy – must be wrong, and dark matter theories, which allow dark matter to be separated from visible matter, are the only remaining option.

Some people protest that it shouldn't happen according to theories of structure formation (i.e. the science about the conception and birth of the earliest galaxies). Give me a break. The claim that dark matter should "always" sit underneath the visible matter may be at most an approximate rule of thumb. Moreover, we no longer live in the era of structure formation, dear dinosaurs' ancestors. So the dark matter could have escaped from that small galaxy and many others. Once Gagarin was able to fly in orbit around the Earth, it was only a bit harder for Armstrong and Aldrin (and the janitors who helped them to get there, including a building-sized computer that was as strong as 0.001% of your iPhone) to escape from Earth's gravitational grip and get to the Moon. The first and second cosmic velocities only differ by a factor of \(\sqrt{2}\). If you can get away by one Earth radius, you may escape completely, too.
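The \(\sqrt{2}\) is elementary to check: the circular-orbit speed at radius \(r\) is \(\sqrt{GM/r}\), while the escape speed is \(\sqrt{2GM/r}\). A few lines with standard constants (my sketch):

```python
import math

G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
M_EARTH = 5.972e24  # Earth's mass, kg
R_EARTH = 6.371e6   # Earth's radius, m

v_orbit = math.sqrt(G * M_EARTH / R_EARTH)       # low circular orbit (Gagarin)
v_escape = math.sqrt(2 * G * M_EARTH / R_EARTH)  # escape speed (Apollo)

print(f"orbital: {v_orbit/1e3:.1f} km/s, escape: {v_escape/1e3:.1f} km/s")
print(f"ratio: {v_escape/v_orbit:.4f}")  # exactly sqrt(2) ~ 1.4142
```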

That's why I still find it much more likely than 50% – and the probability has surely moved a bit closer to 100% today – that dark matter of some kind is the right explanation of all these phenomena. That's why I think that her MOND activities are just another reason to consider Sabine Hossenfelder a crackpot, and Erik Verlinde a guy who became a fringe pop-science theorist. That's why I remain unexcited by the frequent e-mails I receive from a David, claiming that a guy named Milgrom is a Copernicus of our century or something like that. Most likely, all these people are just elaborating upon a concept that has been ruled out and that is being ever more clearly falsified.

It's possible that we will never observe dark matter "more directly" than by its impact on the motion of stars and dust in the galaxies. That scenario is totally consistent and shouldn't be considered an argument against dark matter.

By the way, a new paper claims some 4.2-sigma disagreement between GR and the August 2017 merger from LIGO. I don't plan to study the paper beyond the abstract because it talks about "quantum black holes". I don't believe that quantum mechanics is relevant for the explanation of these large objects observed by LIGO, so these people probably don't have a clue what they're doing. They're the same people who have written about some crazy LIGO echoes in late 2016 – I have probably dedicated much more time to them than they will ever deserve.

by Luboš Motl ( at March 29, 2018 12:48 PM

March 28, 2018

Marco Frasca - The Gauge Connection

Paper with a proof of confinement has been accepted

Recently, I wrote a paper together with Masud Chaichian (see here) containing a mathematical proof of confinement of a non-Abelian gauge theory based on the Kugo-Ojima criterion. This paper underwent an extended review by several colleagues well before its submission. One of them was Taichiro Kugo, one of the discoverers of the confinement criterion, who helped a lot to improve the paper and clarify some points. Then, after a review round of about two months, the paper was accepted by Physics Letters B, one of the most important journals in particle physics.

This paper contains the exact beta function of a Yang-Mills theory. It confirms that confinement arises from the combination of the running coupling and the propagator. This idea has circulated in a number of papers in recent years; it emerged as soon as people realized, after extended studies on the lattice, that the propagator by itself was not enough to guarantee confinement.

It is interesting to point out that confinement is rooted in BRST invariance and asymptotic freedom. The Kugo-Ojima confinement criterion makes it possible to close the argument in a rigorous way, yielding the exact beta function of the theory.

by mfrasca at March 28, 2018 09:34 AM

March 26, 2018

John Baez - Azimuth

Applied Category Theory Course

It just became a lot easier to learn about applied category theory, thanks to this free book:

• Brendan Fong and David Spivak, Seven Sketches in Compositionality: An Invitation to Applied Category Theory.

I’ve started an informal online course based on this book on the Azimuth Forum. I’m getting pretty sick of the superficial quality of my interactions on social media. This could be a way to do something more interesting.

The idea is that you can read chapters of this book, discuss them, try the exercises in the book, ask and answer questions, and maybe team up to create software that implements some of the ideas. I’ll try to keep things moving forward. For example, I’ll explain some stuff and try to help answer questions that people are stuck on. I may also give some talks or run discussions on Google Hangouts or similar software—but only when I have time: I’m more of a text-based guy. I may get really busy some times, and leave the rest of you alone for a while. But I like writing about math for at least 15 minutes a day, and more when I have time. Furthermore, I’m obsessed with applied category theory and plan to stay that way for at least a few more years.

If this sounds interesting, let me know here—and please visit the Azimuth Forum and register! Use your full real name as your username, with no spaces. I will add spaces and that will become your username. Use a real working email address. If you don’t, the registration process may not work.

Over 70 people have registered so far, so this process will take a while.

The main advantage of the Forum over this blog is that you can initiate new threads and edit your comments. Like here, you can write equations in LaTeX. Like here, that ability is severely limited: for example you can’t define macros, and you can’t use TikZ. (Maybe someone could fix that.) But equations are better typeset over there—and more importantly, the ability to edit comments makes it a lot easier to correct errors in your LaTeX.

Please let me know what you think.

What follows is the preface to Fong and Spivak’s book, just so you can get an idea of what it’s like.


Category theory is becoming a central hub for all of pure mathematics. It is unmatched in its ability to organize and layer abstractions, to find commonalities between structures of all sorts, and to facilitate communication between different mathematical communities. But it has also been branching out into science, informatics, and industry. We believe that it has the potential to be a major cohesive force in the world, building rigorous bridges between disparate worlds, both theoretical and practical. The motto at MIT is mens et manus, Latin for mind and hand. We believe that category theory—and pure math in general—has stayed in the realm of mind for too long; it is ripe to be brought to hand.

Purpose and audience

The purpose of this book is to offer a self-contained tour of applied category theory. It is an invitation to discover advanced topics in category theory through concrete real-world examples. Rather than try to give a comprehensive treatment of these topics—which include adjoint functors, enriched categories, proarrow equipments, toposes, and much more—we merely provide a taste. We want to give readers some insight into how it feels to work with these structures as well as some ideas about how they might show up in practice.

The audience for this book is quite diverse: anyone who finds the above description intriguing. This could include a motivated high school student who hasn’t seen calculus yet but has loved reading a weird book on mathematical logic they found at the library. Or a machine learning researcher who wants to understand what vector spaces, design theory, and dynamical systems could possibly have in common. Or a pure mathematician who wants to imagine what sorts of applications their work might have. Or a recently-retired programmer who’s always had an eerie feeling that category theory is what they’ve been looking for to tie it all together, but who’s found the usual books on the subject impenetrable.

For example, we find it something of a travesty that in 2018 there seems to be no introductory material available on monoidal categories. Even beautiful modern introductions to category theory, e.g. by Riehl or Leinster, do not include anything on this rather central topic. The basic idea is certainly not too abstract; modern human intuition seems to include a pre-theoretical understanding of monoidal categories that is just waiting to be formalized. Is there anyone who wouldn’t correctly understand the basic idea being communicated in the following diagram?

Many applied category theory topics seem to take monoidal categories as their jumping off point. So one aim of this book is to provide a reference—even if unconventional—for this important topic.

We hope this book inspires both new visions and new questions. We intend it to be self-contained in the sense that it is approachable with minimal prerequisites, but not in the sense that the complete story is told here. On the contrary, we hope that readers use this as an invitation to further reading, to orient themselves in what is becoming a large literature, and to discover new applications for themselves.

This book is, unashamedly, our take on the subject. While the abstract structures we explore are important to any category theorist, the specific topics have simply been chosen to our personal taste. Our examples are ones that we find simple but powerful, concrete but representative, entertaining but in a way that feels important and expansive at the same time. We hope our readers will enjoy themselves and learn a lot in the process.

How to read this book

The basic idea of category theory—which threads through every chapter—is that if one pays careful attention to structures and coherence, the resulting systems will be extremely reliable and interoperable. For example, a category involves several structures: a collection of objects, a collection of morphisms relating objects, and a formula for combining any chain of morphisms into a morphism. But these structures need to cohere or work together in a simple commonsense way: a chain of chains is a chain, so combining a chain of chains should be the same as combining the chain. That’s it!
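The "structures plus coherence" slogan can be made concrete with a toy example (in Python; not from the book, which uses no code): take objects to be types and morphisms to be functions, so that "combining a chain" is function composition, and the coherence law is just associativity.

```python
# Toy category: objects are Python types, morphisms are functions between
# them, and "combining a chain" is function composition.

def compose(f, g):
    """Return the composite morphism: first apply f, then g."""
    return lambda x: g(f(x))

def f(x): return x + 1      # three composable morphisms int -> int
def g(x): return 2 * x
def h(x): return x - 3

# Coherence: a chain of chains is a chain, so the two ways of combining
# the chain f, g, h must agree (associativity of composition).
left = compose(compose(f, g), h)
right = compose(f, compose(g, h))
assert all(left(x) == right(x) for x in range(-5, 6))

# Identity morphisms are the empty chains: composing with them changes nothing.
def identity(x): return x
assert compose(identity, f)(10) == f(10) == compose(f, identity)(10)
print("associativity and identity laws hold on the sample inputs")
```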

We will see structures and coherence come up in pretty much every definition we give: “here are some things and here are how they fit together.” We ask the reader to be on the lookout for structures and coherence as they read the book, and to realize that as we layer abstraction on abstraction, it is the coherence that makes everything function like a well-oiled machine.

Each chapter in this book is motivated by a real-world topic, such as electrical circuits, control theory, cascade failures, information integration, and hybrid systems. These motivations lead us into and through various sorts of category-theoretic concepts.

We generally have one motivating idea and one category-theoretic purpose per chapter, and this forms the title of the chapter, e.g. Chapter 4 is “Collaborative design: profunctors, categorification, and monoidal categories.” In many math books, the difficulty is roughly a monotonically-increasing function of the page number. In this book, this occurs in each chapter, but not so much in the book as a whole. The chapters start out fairly easy and progress in difficulty.

The upshot is that if you find the end of a chapter very difficult, hope is certainly not lost: you can start on the next one and make good progress. This format lends itself to giving you a first taste now, but also leaving open the opportunity for you to come back at a later date and get more deeply into it. But by all means, if you have the gumption to work through each chapter to its end, we very much encourage that!

We include many exercises throughout the text. Usually these exercises are fairly straightforward; the only thing they demand is that the reader’s mind changes state from passive to active, rereads the previous paragraphs with intent, and puts the pieces together. A reader becomes a student when they work the exercises; until then they are more of a tourist, riding on a bus and listening off and on to the tour guide. Hey, there’s nothing wrong with that, but we do encourage you to get off the bus and make contact with the natives as often as you can.

by John Baez at March 26, 2018 12:55 AM

March 21, 2018

Jester - Resonaances

21cm to dark matter
The EDGES discovery of the 21cm absorption line at the cosmic dawn has been widely discussed on blogs and in popular press. Quite deservedly so.  The observation opens a new window on the epoch when the universe as we know it was just beginning. We expect a treasure trove of information about the standard processes happening in the early universe, as well as novel constraints on hypothetical particles that might have been present then. It is not a very long shot to speculate that, if confirmed, the EDGES discovery will be awarded a Nobel prize. On the other hand, the bold claim bundled with their experimental result -  that the unexpectedly large strength of the signal is an indication of interaction between the ordinary matter and cold dark matter - is very controversial. 

But before jumping to dark matter it is worth reviewing the standard physics leading to the EDGES signal. In the lowest energy (singlet) state, hydrogen may absorb a photon and jump to a slightly excited (triplet) state which differs from the true ground state just by the arrangement of the proton and electron spins. Such transitions are induced by photons of wavelength 21cm, or frequency 1.4 GHz, or energy 5.9 𝜇eV, and they may routinely occur at the cosmic dawn when Cosmic Microwave Background (CMB) photons of the right energy hit neutral hydrogen atoms hovering in the universe. The evolution of the CMB and hydrogen temperatures is shown in the picture here as a function of the cosmological redshift z (large z is early time, z=0 is today). The CMB temperature, in red, decreases with time as (1+z) due to the expansion of the universe. The hydrogen temperature, in blue, is a bit more tricky. At the recombination time around z=1100 most protons and electrons combine to form neutral atoms; however, a small fraction of free electrons and protons survives. Interactions between the electrons and CMB photons via Compton scattering are strong enough to keep the two (and consequently the hydrogen as well) at equal temperatures for some time. However, around z=200 the CMB and hydrogen temperatures decouple, and the latter subsequently decreases much faster with time, as (1+z)^2. At the cosmic dawn, z~17, the hydrogen gas is already 7 times colder than the CMB, after which light from the first stars heats it up and ionizes it again.
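The factor of 7 can be roughly reproduced with a toy calculation of mine (not from the EDGES paper): treat the thermal decoupling as instantaneous at z=200, let the photons cool as (1+z) and the gas as (1+z)^2 afterwards. This crude model actually overshoots, giving a factor of about 11; the true factor of about 7 arises because the decoupling is gradual rather than sharp.

```python
T_CMB0 = 2.725      # CMB temperature today, K
Z_DEC = 200         # toy redshift of instantaneous gas-CMB thermal decoupling
Z_DAWN = 17         # cosmic dawn

def T_cmb(z):
    # photons cool as (1+z)
    return T_CMB0 * (1 + z)

def T_gas(z, z_dec=Z_DEC):
    # after decoupling the gas cools adiabatically as (1+z)^2,
    # matched to the CMB temperature at z_dec
    return T_cmb(z_dec) * ((1 + z) / (1 + z_dec)) ** 2

ratio = T_cmb(Z_DAWN) / T_gas(Z_DAWN)
print(f"T_CMB(17) = {T_cmb(Z_DAWN):.1f} K, T_gas(17) = {T_gas(Z_DAWN):.1f} K")
print(f"CMB-to-gas temperature ratio ~ {ratio:.0f}")  # ~11 in this crude model
```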

The quantity directly relevant for the 21cm absorption signal is the so-called spin temperature Ts, which is a measure of the relative occupation number of the singlet and triplet hydrogen states. Just before the cosmic dawn, the spin temperature equals the CMB one, and as  a result there is no net absorption or emission of 21cm photons. However, it is believed that the light from the first stars initially lowers the spin temperature down to the hydrogen one. Therefore, there should be absorption of 21cm CMB photons by the hydrogen in the epoch between z~20 and z~15. After taking into account the cosmological redshift, one should now observe a dip in the radio frequencies between 70 and 90 MHz. This is roughly what EDGES finds. The depth of the dip is described by the formula:
T21 ≈ 23 mK × [(1+z)/10]^(1/2) × (1 − TCMB/Ts), up to O(1) cosmological factors. As the spin temperature cannot be lower than that of the hydrogen, the standard physics predicts TCMB/Ts ≤ 7, corresponding to T21 ≥ -0.2K. The surprise is that EDGES observes a deeper dip, T21 ≈ -0.5K, 3.8 astrosigma away from the predicted value, as if TCMB/Ts were of order 15.
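The unit conversions and the redshifted band quoted above are easy to verify (a quick sketch with standard constants):

```python
C = 2.998e8        # speed of light, m/s
H = 6.626e-34      # Planck constant, J s
EV = 1.602e-19     # joules per eV

wavelength = 0.211           # 21cm hyperfine line, m
nu = C / wavelength          # rest-frame frequency, Hz
energy_uev = H * nu / EV * 1e6   # photon energy in micro-eV

print(f"frequency: {nu/1e9:.2f} GHz, energy: {energy_uev:.1f} ueV")

# The cosmic-dawn absorption (z between ~15 and ~20) redshifts the line
# into the low radio band:
for z in (15, 20):
    print(f"z={z}: observed at {nu/(1+z)/1e6:.0f} MHz")
```

The redshifted frequencies come out between roughly 68 and 89 MHz, matching the 70–90 MHz dip quoted above.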

If the EDGES result is taken at face value, it means that TCMB/Ts at the cosmic dawn was much larger than predicted in the standard scenario. Either there was a lot more photon radiation at the relevant wavelengths, or the hydrogen gas was much colder than predicted. Focusing on the latter possibility, one could imagine that the hydrogen was cooled by interactions with cold dark matter made of relatively light (lighter than a GeV) particles. However, this idea is very difficult to realize in practice, because it requires the interaction cross section to be thousands of barns at the relevant epoch! Not the picobarns typical for WIMPs; many orders of magnitude more than the total proton-proton cross section at the LHC. Even in nuclear processes such values are rarely seen. And we are talking here about dark matter, whose trademark is interacting weakly. Obviously, the idea runs into all sorts of constraints that have been laboriously accumulated over the years.
One can try to save this idea with a series of evasive tricks. If the interaction cross section scales as 1/v^4, where v is the relative velocity between the colliding matter and dark matter particles, it could be enhanced at the cosmic dawn, when the typical velocities were at their minimum. The 1/v^4 behavior is not unfamiliar: it is characteristic of electromagnetic forces in the non-relativistic limit. Thus, one could envisage a model where dark matter has a minuscule electric charge, one thousandth or less of that of the proton. This trick buys some mileage, but the obstacles remain enormous. The cross section is still large enough for the dark and ordinary matter to couple strongly during the recombination epoch, contrary to what is concluded from precision observations of the CMB. Therefore the milli-charged particles can constitute only a small fraction of dark matter, less than 1 percent. Finally, one needs to avoid constraints from direct detection, colliders, and emission by stars and supernovae. A plot borrowed from this paper shows that a tiny region of viable parameter space remains around 100 MeV mass and 10^-5 charge, though my guess is that this will also go away upon a more careful analysis.

So, milli-charged dark matter cooling the hydrogen does not stand up to scrutiny as an explanation for the EDGES anomaly. This does not mean that all exotic explanations must be so implausible; better models are being and will be proposed, and one of them could even be correct. For example, models where new particles lead to an injection of additional 21cm photons at early times seem more promising. My bet? Future observations will confirm the 21cm absorption signal, but the amplitude and other features will turn out to be consistent with the standard ΛCDM predictions. Given the number of competing experiments in the starting blocks, the issue should be clarified within the next few years. What is certain is that, this time, we will learn a lot whether or not the anomalous signal persists :)

by Mad Hatter ( at March 21, 2018 03:31 PM

Clifford V. Johnson - Asymptotia

London Event Tomorrow!

Perhaps you were intrigued by the review of The Dialogues, my non-fiction graphic novel about science, in Saturday’s Spectator? Well, I’ll be talking about the book tomorrow (Thursday) at the bookshop Libreria in London at 7:00 pm. Maybe see you there! #thedialoguesbook

-cvj Click to continue reading this post

The post London Event Tomorrow! appeared first on Asymptotia.

by Clifford at March 21, 2018 01:24 PM

March 20, 2018

Marco Frasca - The Gauge Connection

Good news from Moriond

A few days ago, the Rencontres de Moriond 2018 ended, with CERN presenting a wealth of results, including results about the Higgs particle. The direction the two great experiments, ATLAS and CMS, have taken is to improve the measurements of the Standard Model, as no evidence of possible new particles has been seen so far. The studies of the properties of the Higgs particle have also been refined as promised, and the news is really striking.

In a communication to the public (see here), CERN acknowledged, for the first time, a significant discrepancy between the CMS data and the Standard Model for the signal strengths in the Higgs decay channels. They claim a 17% difference. This is what I have advocated for some years and published in reputable journals. I will discuss this below; for now, I would only like to show you the CMS results in the figure below.

ATLAS, for its part, sees a significant discrepancy in the ZZ channel (2\sigma) and a 1\sigma compatibility for the WW channel. Here are their results.

On the left the WW channel is shown and on the right there are the combined \gamma\gamma and ZZ channels.

The reason for the discrepancy, as I have shown in some papers (see here, here and here), is the improper use of perturbation theory to evaluate the Higgs sector. The true propagator of the theory is a sum of Yukawa-like propagators with a harmonic-oscillator spectrum. I solved this sector of the Standard Model exactly. When the full propagator is taken into account, the discrepancy shifts toward an increase of the signal strength. Is it worth a try?

This means that this is not physics beyond the Standard Model but, rather, the Standard Model in its full glory that is teaching something new to us about quantum field theory. Now, we are eager to see the improvements in the data to come with the new run of LHC starting now. In the summer conferences we will have reasons to be excited.

by mfrasca at March 20, 2018 09:17 AM

March 17, 2018

Cormac O’Raifeartaigh - Antimatter (Life in a puzzling universe)

Remembering Stephen Hawking

Like many physicists, I woke to some sad news early last Wednesday morning, and to a phoneful of requests from journalists for a soundbite. In fact, although I bumped into Stephen at various conferences, I only had one significant meeting with him – he was intrigued by my research group’s discovery that Einstein once attempted a steady-state model of the universe. It was a slightly scary but very funny meeting during which his famous sense of humour was fully at play.


Yours truly talking steady-state cosmology with Stephen Hawking

I recalled the incident in a radio interview with RTE Radio 1 on Wednesday. As I say in the piece, the first words that appeared on Stephen’s screen were “I knew..” My heart sank as I assumed he was about to say “I knew about that manuscript“. But when I had recovered sufficiently to look again, what Stephen was actually saying was “I knew ..your father”. Phew! You can find the podcast here.


Hawking in conversation with my late father (LHS) and with Ernest Walton (RHS)

RTE TV had a very nice obituary on the Six One News; I have a cameo appearance a few minutes into the piece here.

In my view, few could question Hawking’s brilliant contributions to physics, or his outstanding contribution to the public awareness of science. His legacy also includes the presence of many brilliant young physicists at the University of Cambridge today. However, as I point out in a letter in today’s Irish Times, had Hawking lived in Ireland, he probably would have found it very difficult to acquire government funding for his work. Indeed, he would have found that research into the workings of the universe does not qualify as one of the “strategic research areas” identified by our national funding body, Science Foundation Ireland. I suspect the letter will provoke an angry response from certain quarters, but it is tragically true.


The above notwithstanding, it’s important not to overstate the importance of one scientist. Indeed, today’s Sunday Times contains a good example of the dangers of science history being written by journalists. Discussing Stephen’s 1974 work on black holes, Bryan Appleyard states “The paper in effect launched the next four decades of cutting edge physics. Odd flowers with odd names bloomed in the garden of cosmic speculation – branes, worldsheets, supersymmetry …. and, strangest of all, the colossal tree of string theory”.

What? String theory, supersymmetry and brane theory are all modern theories of particle physics (the study of the world of the very small). While these theories were used to some extent by Stephen in his research in cosmology (the study of the very large), it is ludicrous to suggest that they were launched by his work.


by cormac at March 17, 2018 08:27 PM

March 16, 2018

Sean Carroll - Preposterous Universe

Stephen Hawking’s Scientific Legacy

Stephen Hawking died Wednesday morning, age 76. Plenty of memories and tributes have been written, including these by me:

I can also point to my Story Collider story from a few years ago, about how I turned down a job offer from Hawking, and eventually took lessons from his way of dealing with the world.

Of course Hawking has been mentioned on this blog many times.

When I started writing the above pieces (mostly yesterday, in a bit of a rush), I stumbled across this article I had written several years ago about Hawking’s scientific legacy. It was solicited by a magazine at a time when Hawking was very ill and people thought he would die relatively quickly — it wasn’t the only time people thought that, only to be proven wrong. I’m pretty sure the article was never printed, and I never got paid for it; so here it is!

(If you’re interested in a much better description of Hawking’s scientific legacy by someone who should know, see this article in The Guardian by Roger Penrose.)

Stephen Hawking’s Scientific Legacy

Stephen Hawking is the rare scientist who is also a celebrity and cultural phenomenon. But he is also the rare cultural phenomenon whose celebrity is entirely deserved. His contributions can be characterized very simply: Hawking contributed more to our understanding of gravity than any physicist since Albert Einstein.

“Gravity” is an important word here. For much of Hawking’s career, theoretical physicists as a community were more interested in particle physics and the other forces of nature — electromagnetism and the strong and weak nuclear forces. “Classical” gravity (ignoring the complications of quantum mechanics) had been figured out by Einstein in his theory of general relativity, and “quantum” gravity (creating a quantum version of general relativity) seemed too hard. By applying his prodigious intellect to the most well-known force of nature, Hawking was able to come up with several results that took the wider community completely by surprise.

By acclamation, Hawking’s most important result is the realization that black holes are not completely black — they give off radiation, just like ordinary objects. Before that famous paper, he proved important theorems about black holes and singularities; afterward, he studied the universe as a whole. In each phase of his career, his contributions were central.

The Classical Period

While working on his Ph.D. thesis in Cambridge in the mid-1960’s, Hawking became interested in the question of the origin and ultimate fate of the universe. The right tool for investigating this problem is general relativity, Einstein’s theory of space, time, and gravity. According to general relativity, what we perceive as “gravity” is a reflection of the curvature of spacetime. By understanding how that curvature is created by matter and energy, we can predict how the universe evolves. This may be thought of as Hawking’s “classical” period, to contrast classical general relativity with his later investigations in quantum field theory and quantum gravity.

Around the same time, Roger Penrose at Oxford had proven a remarkable result: that according to general relativity, under very broad circumstances, space and time would crash in on themselves to form a singularity. If gravity is the curvature of spacetime, a singularity is a moment in time when that curvature becomes infinitely big. This theorem showed that singularities weren’t just curiosities; they are an important feature of general relativity.

Penrose’s result applied to black holes — regions of spacetime where the gravitational field is so strong that even light cannot escape. Inside a black hole, the singularity lurks in the future. Hawking took Penrose’s idea and turned it around, aiming at the past of our universe. He showed that, under similarly general circumstances, space must have come into existence at a singularity: the Big Bang. Modern cosmologists talk (confusingly) about both the Big Bang “model,” which is the very successful theory that describes the evolution of an expanding universe over billions of years, and also the Big Bang “singularity,” which we still don’t claim to understand.

Hawking then turned his own attention to black holes. Another interesting result by Penrose had shown that it’s possible to extract energy from a rotating black hole, essentially by bleeding off its spin until it’s no longer rotating. Hawking was able to demonstrate that, although you can extract energy, the area of the event horizon surrounding the black hole will always increase in any physical process. This “area theorem” was both important in its own right, and also evocative of a completely separate area of physics: thermodynamics, the study of heat.

Thermodynamics obeys a set of famous laws. For example, the first law tells us that energy is conserved, while the second law tells us that entropy — a measure of the disorderliness of the universe — never decreases for an isolated system. Working with James Bardeen and Brandon Carter, Hawking proposed a set of laws for “black hole mechanics,” in close analogy with thermodynamics. Just as in thermodynamics, the first law of black hole mechanics ensures that energy is conserved. The second law is Hawking’s area theorem, that the area of the event horizon never decreases. In other words, the area of the event horizon of a black hole is very analogous to the entropy of a thermodynamic system — they both tend to increase over time.

Black Hole Evaporation

Hawking and his collaborators were justly proud of the laws of black hole mechanics, but they viewed them as simply a formal analogy, not a literal connection between gravity and thermodynamics. In 1972, a graduate student at Princeton University named Jacob Bekenstein suggested that there was more to it than that. Bekenstein, on the basis of some ingenious thought experiments, suggested that the behavior of black holes isn’t simply like thermodynamics, it actually is thermodynamics. In particular, black holes have entropy.
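Bekenstein's suggestion eventually crystallized into a precise formula, S = k_B A c^3 / (4 G ħ), where A is the horizon area. A quick numerical sketch (mine, not part of the article) shows why the number is so striking for a solar-mass black hole:

```python
import math

HBAR = 1.055e-34   # reduced Planck constant, J s
C = 2.998e8        # speed of light, m/s
G = 6.674e-11      # gravitational constant, m^3 kg^-1 s^-2
K_B = 1.381e-23    # Boltzmann constant, J/K
M_SUN = 1.989e30   # solar mass, kg

M = M_SUN
r_s = 2 * G * M / C**2                    # Schwarzschild radius, m
area = 4 * math.pi * r_s**2               # horizon area, m^2
S_over_kB = area * C**3 / (4 * G * HBAR)  # Bekenstein-Hawking entropy in units of k_B

print(f"horizon radius ~ {r_s/1e3:.1f} km")
print(f"entropy ~ {S_over_kB:.1e} k_B")   # ~1e77 k_B
```

About 10^77 in units of Boltzmann's constant: vastly more entropy than the star that collapsed to form the hole, which is part of why the idea initially met resistance.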

Like many bold ideas, this one was met with resistance from experts — and at this point, Stephen Hawking was the world’s expert on black holes. Hawking was certainly skeptical, and for good reason. If black hole mechanics is really just a form of thermodynamics, that means black holes have a temperature. And objects that have a temperature emit radiation — the famous “black body radiation” that played a central role in the development of quantum mechanics. So if Bekenstein were right, it would seemingly imply that black holes weren’t really black (although Bekenstein himself didn’t quite go that far).

To address this problem seriously, you need to look beyond general relativity itself, since Einstein’s theory is purely “classical” — it doesn’t incorporate the insights of quantum mechanics. Hawking knew that Russian physicists Alexander Starobinsky and Yakov Zel’dovich had investigated quantum effects in the vicinity of black holes, and had predicted a phenomenon called “superradiance.” Just as Penrose had shown that you could extract energy from a spinning black hole, Starobinsky and Zel’dovich showed that rotating black holes could emit radiation spontaneously via quantum mechanics. Hawking himself was not an expert in the techniques of quantum field theory, which at the time were the province of particle physicists rather than general relativists. But he was a quick study, and threw himself into the difficult task of understanding the quantum aspects of black holes, so that he could find Bekenstein’s mistake.

Instead, he surprised himself, and in the process turned theoretical physics on its head. What Hawking eventually discovered was that Bekenstein was right — black holes do have entropy — and that the extraordinary implications of this idea were actually true — black holes are not completely black. These days we refer to the “Bekenstein-Hawking entropy” of black holes, which emit “Hawking radiation” at their “Hawking temperature.”
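The dictionary Hawking’s calculation established can be written down explicitly (these are the standard formulas with all constants restored, not quoted in the post itself):

```latex
T_H = \frac{\hbar c^3}{8\pi G M k_B}, \qquad
S_{BH} = \frac{k_B\, c^3 A}{4 G \hbar}
```

Note two striking features: the temperature is inversely proportional to the mass, so small black holes are hot; and the entropy is proportional to the horizon area rather than the enclosed volume, a clue that many take as pointing toward holography.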

There is a nice hand-waving way of understanding Hawking radiation. Quantum mechanics says (among other things) that you can’t pin a system down to a definite classical state; there is always some intrinsic uncertainty in what you will see when you look at it. This is even true for empty space itself — when you look closely enough, what you thought was empty space is really alive with “virtual particles,” constantly popping in and out of existence. Hawking showed that, in the vicinity of a black hole, a pair of virtual particles can be split apart, one falling into the hole and the other escaping as radiation. Amazingly, the infalling particle has a negative energy as measured by an observer outside. The result is that the radiation gradually takes mass away from the black hole — it evaporates.
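The upshot can be made quantitative with the standard formula for the Hawking temperature, and (ignoring order-one factors that depend on which particle species are emitted) the standard dimensional estimate for the evaporation time. The following back-of-the-envelope sketch is mine, not from the post:

```python
# Back-of-the-envelope Hawking temperature and evaporation time
# for a black hole of mass M (SI units throughout).
import math

hbar = 1.054571817e-34    # reduced Planck constant, J s
c = 2.99792458e8          # speed of light, m/s
G = 6.67430e-11           # Newton's constant, m^3 kg^-1 s^-2
k_B = 1.380649e-23        # Boltzmann constant, J/K
M_SUN = 1.989e30          # solar mass, kg
S_PER_YR = 3.156e7        # seconds per year

def hawking_temperature(M):
    """T_H = hbar c^3 / (8 pi G M k_B): hotter for smaller holes."""
    return hbar * c**3 / (8 * math.pi * G * M * k_B)

def evaporation_time(M):
    """t ~ 5120 pi G^2 M^3 / (hbar c^4), in seconds,
    ignoring order-one species-dependent factors."""
    return 5120 * math.pi * G**2 * M**3 / (hbar * c**4)

T = hawking_temperature(M_SUN)                  # ~6e-8 K
t_yr = evaporation_time(M_SUN) / S_PER_YR       # ~2e67 years
```

A solar-mass black hole comes out around sixty billionths of a kelvin, far colder than the cosmic microwave background, which is why evaporation is utterly negligible for astrophysical black holes today.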

Hawking’s result had obvious and profound implications for how we think about black holes. Instead of being a cosmic dead end, where matter and energy disappear forever, they are dynamical objects that will eventually evaporate completely. But more importantly for theoretical physics, this discovery raised a question to which we still don’t know the answer: when matter falls into a black hole, and then the black hole radiates away, where does the information go?

If you take an encyclopedia and toss it into a fire, you might think the information contained inside is lost forever. But according to the laws of quantum mechanics, it isn’t really lost at all; if you were able to capture every bit of light and ash that emerged from the fire, in principle you could exactly reconstruct everything that went into it, even the print on the book pages. But black holes, if Hawking’s result is taken at face value, seem to destroy information, at least from the perspective of the outside world. This conundrum is the “black hole information loss puzzle,” and has been nagging at physicists for decades.

In recent years, progress in understanding quantum gravity (at a purely thought-experiment level) has convinced more people that the information really is preserved. In 1997 Hawking made a bet with American physicists Kip Thorne and John Preskill; Hawking and Thorne said that information was destroyed, Preskill said that somehow it was preserved. In 2004 Hawking conceded his end of the bet, admitting that black holes don’t destroy information. Thorne, for his part, has not conceded, and Preskill himself thinks the concession was premature. Black hole radiation and entropy continue to be central guiding principles in our search for a better understanding of quantum gravity.

Quantum Cosmology

Hawking’s work on black hole radiation relied on a mixture of quantum and classical ideas. In his model, the black hole itself was treated classically, according to the rules of general relativity; meanwhile, the virtual particles near the black hole were treated using the rules of quantum mechanics. The ultimate goal of many theoretical physicists is to construct a true theory of quantum gravity, in which spacetime itself would be part of the quantum system.

If there is one place where quantum mechanics and gravity both play a central role, it’s at the origin of the universe itself. And it’s to this question, unsurprisingly, that Hawking devoted the latter part of his career. In doing so, he established the agenda for physicists’ ambitious project of understanding where our universe came from.

In quantum mechanics, a system doesn’t have a definite position or velocity; its state is described by a “wave function,” which tells us the probability that we would measure a particular position or velocity if we were to observe the system. In 1983, Hawking and James Hartle published a paper entitled simply “Wave Function of the Universe.” They proposed a simple procedure from which — in principle! — the state of the entire universe could be calculated. We don’t know whether the Hartle-Hawking wave function is actually the correct description of the universe. Indeed, because we don’t actually have a full theory of quantum gravity, we don’t even know whether their procedure is sensible. But their paper showed that we could talk about the very beginning of the universe in a scientific way.

Studying the origin of the universe offers the prospect of connecting quantum gravity to observable features of the universe. Cosmologists believe that tiny variations in the density of matter from very early times gradually grew into the distribution of stars and galaxies we observe today. A complete theory of the origin of the universe might be able to predict these variations, and carrying out this program is a major occupation of physicists today. Hawking made a number of contributions to this program, both from his wave function of the universe and in the context of the “inflationary universe” model proposed by Alan Guth.

Simply talking about the origin of the universe is a provocative step. It raises the prospect that science might be able to provide a complete and self-contained description of reality — a prospect that stretches beyond science, into the realms of philosophy and theology. Hawking, always provocative, never shied away from these implications. He was fond of recalling a cosmology conference hosted by the Vatican, at which Pope John Paul II allegedly told the assembled scientists not to inquire into the origin of the universe, “because that was the moment of creation and therefore the work of God.” Admonitions of this sort didn’t slow Hawking down; he lived his life in a tireless pursuit of the most fundamental questions science could tackle.


by Sean Carroll at March 16, 2018 11:23 PM

Ben Still - Neutrino Blog

Particle Physics Brick by Brick
It has been a very long time since I last posted and I apologise for that. I have been working the LEGO analogy, as described in the pentaquark series and elsewhere, into a book. The book is called Particle Physics Brick by Brick and the aim is to stretch the LEGO analogy to breaking point while covering as much of the standard model of particle physics as possible. I have had enormous fun writing it and I hope that you will enjoy it as much if you choose to buy it.

It has been available in the UK since September 2017 and you can buy it from Foyles / Waterstones / Blackwell's / AmazonUK, where it is receiving ★★★★★ reviews.

It is released in the US this Wednesday 21st March 2018 and you can buy it from all good book stores.

I just wanted to share a few reviews of the book as well because it makes me happy!

Spend a few hours perusing these pages and you'll be in a much better frame of mind to understand your place in the cosmos... The astronomically large objects of the universe are no easier to grasp than the atomically small particles of matter. That's where Ben Still comes in, carrying a box of Legos. A British physicist with a knack for explaining abstract concepts... He starts by matching the weird properties and interactions described by the Standard Model of particle physics with the perfectly ordinary blocks of a collection of Legos. Quarks and leptons, gluons and charms are assigned to various colors and combinations of plastic bricks. Once you've got that system in mind, hang on: Still races off to illustrate the Big Bang, the birth of stars, electromagnetism and all matter of fantastical-sounding phenomenon, like mesons and beta decay. "Given enough plastic bricks, the rules in this book and enough time," Still concludes, "one might imagine that a plastic Universe could be built by us, brick by brick." Remember that the next time you accidentally step on one barefoot.--Ron Charles, The Washington Post

Complex topics explained simply. An excellent book. I am Head of Physics at a school and have just ordered 60 copies of this for our L6th students for summer reading before studying the topic on particle physics early next year. Highly recommended. - Ben ★★★★★ AmazonUK

This is a gem of a pop science book. It's beautifully illustrated and very eloquently explains the fundamentals of particle physics without hitting you over the head with quantum field theory and Lagrangian dynamics. The author has done an exceptional job. This is a must have for all students and academics of both physics and applied maths! - Jamie ★★★★★ AmazonUK

by Ben at March 16, 2018 09:32 PM

March 15, 2018

Jester - Resonaances

Where were we?
Last time this blog was active, particle physics was entering a sharp curve. That the infamous 750 GeV resonance had petered out was not a big deal in itself - one expects these things to happen every now and then.  But the lack of any new physics at the LHC when it had already collected a significant chunk of data was a reason to worry. We know that we don't know everything yet about the fundamental interactions, and that there is a deeper layer of reality that needs to be uncovered (at least to explain dark matter, neutrino masses, baryogenesis, inflation, and physics at energies above the Planck scale). For a hundred years, increasing the energy of particle collisions has been the best way to increase our understanding of the basic constituents of nature. However, with nothing at the LHC and the next higher energy collider decades away, a feeling was growing that the progress might stall.

In this respect, nothing much has changed during the time when the blog was dormant, except that these sentiments are now firmly established. Crisis is no longer a whispered word, but it's openly discussed in corridors, on blogs, on arXiv, and in color magazines.  The clear message from the LHC is that the dominant paradigms about the physics at the weak scale were completely misguided. The Standard Model seems to be a perfect effective theory at least up to a few TeV, and there is no indication at what energy scale new particles have to show up. While everyone goes through the five stages of grief at their own pace, my impression is that most are already well past the denial. The open question is what should be the next steps to make sure that exploration of fundamental interactions will not halt. 

One possible reaction to a crisis is more of the same.  Historically, such an approach has often been efficient, for example it worked for a long time in the case of the Soviet economy. In our case one could easily go on with more models, more epicycles, more parameter space,  more speculations.  But the driving force for all these SusyWarpedCompositeStringBlackHairyHole enterprise has always been the (small but still) possibility of being vindicated by the LHC. Without serious prospects of experimental verification, model building is reduced to intellectual gymnastics that can hardly stir imagination.  Thus the business-as-usual is not an option in the long run: it couldn't elicit any enthusiasm among the physicists or the public,  it wouldn't attract new bright students, and thus it would be a straight path to irrelevance.

So, particle physics has to change. On the experimental side we will inevitably see, just for economical reasons, less focus on high-energy colliders and more on smaller experiments. Theoretical particle physics will also have to evolve to remain relevant.  Certainly, the emphasis needs to be shifted away from empty speculations in favor of more solid research. I don't pretend to know all the answers or have a clear vision of the optimal strategy, but I see three promising directions.

One is astrophysics, where there are much better prospects of experimental progress. The cosmos is a natural collider that is constantly testing fundamental interactions independently of current fashions or funding agencies. This gives us an opportunity to learn more about dark matter and neutrinos, and also about various hypothetical particles like axions or milli-charged matter. The most recent story of the 21cm absorption signal shows that there are still treasure troves of data waiting for us out there. Moreover, new observational windows keep opening up, as recently illustrated by the nascent gravitational wave astronomy. This avenue is of course a no-brainer, long explored by particle theorists, but I expect it will further gain in importance in the coming years.

Another direction is precision physics. This, also, has been an integral part of particle physics research for quite some time, but it should grow in relevance. The point is that one can probe very heavy particles, often beyond the reach of present colliders,  by precisely measuring low-energy observables. In the most spectacular example, studying proton decay may give insight into new particles with masses of order 10^16 GeV - unlikely to be ever attainable directly. There is a whole array of observables that can probe new physics well beyond the direct LHC reach: a myriad of rare flavor processes, electric dipole moments of the electron and neutron, atomic parity violation, neutrino scattering,  and so on. This road may be long and tedious but it is bound to succeed: at some point some experiment somewhere must observe a phenomenon that does not fit into the Standard Model. If we're very lucky, it  may be that the anomalies currently observed by the LHCb in certain rare B-meson decays are already the first harbingers of a breakdown of the Standard Model at higher energies.
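As a rough illustration of why proton decay probes such enormous scales: dimension-six baryon-number-violating operators suppressed by a scale Λ give a decay rate of order Γ ~ m_p⁵/Λ⁴ on dimensional grounds. The sketch below is mine, not from the post; the Super-Kamiokande lifetime bound used and the neglect of order-one couplings are my assumptions:

```python
# Estimate the mass scale Lambda probed by the proton-lifetime bound,
# assuming Gamma ~ m_p^5 / Lambda^4 (order-one couplings ignored).
# Natural units; seconds converted to GeV^-1 via 1/hbar.
TAU_LIMIT_YR = 1.6e34      # assumed Super-K bound on p -> e+ pi0, years
M_P = 0.938                # proton mass, GeV
S_PER_YR = 3.156e7         # seconds per year
GEV_INV_PER_S = 1.519e24   # 1 second expressed in GeV^-1

tau_gev = TAU_LIMIT_YR * S_PER_YR * GEV_INV_PER_S   # lifetime in GeV^-1
Lambda = (tau_gev * M_P**5) ** 0.25                 # ~3e16 GeV
```

Even with all the crudeness of the estimate, Λ comes out around 10¹⁶ GeV, which is why a tabletop-scale statement like "protons live longer than 10³⁴ years" constrains physics hopelessly beyond any collider.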

Finally, I should mention formal theoretical developments. The naturalness problem of the cosmological constant and of the Higgs mass may suggest some fundamental misunderstanding of quantum field theory on our part. Perhaps this should not be too surprising.  In many ways we have reached an amazing proficiency in QFT when applied to certain precision observables or even to LHC processes. Yet at the same time QFT is often used and taught in the same way as magic in Hogwarts: mechanically,  blindly following prescriptions from old dusty books, without a deeper understanding of the sense and meaning.  Recent years have seen a brisk development of alternative approaches: a revival of the old S-matrix techniques, new amplitude calculation methods based on recursion relations, but also complete reformulations of the QFT basics demoting the sacred cows like fields, Lagrangians, and gauge symmetry. Theory alone rarely leads to progress, but it may help to make more sense of the data we already have. Could better understanding or complete reformulating of QFT bring new answers to the old questions? I think that is  not impossible. 

All in all, there are good reasons to worry, but also tons of new data in store and lots of fascinating questions to answer. How will the B-meson anomalies pan out? What shall we do after we hit the neutrino floor? Will the 21cm observations allow us to understand what dark matter is? Will China build a 100 TeV collider? Or maybe a radio telescope on the Moon instead? Are experimentalists still needed now that we have machine learning? How will physics change with the centre of gravity moving to Asia? I will tell you my take on these and other questions and highlight old and new ideas that could help us understand nature better. Let's see how far I'll get this time ;)

by Mad Hatter at March 15, 2018 10:43 PM

Tommaso Dorigo - Scientificblogging

Some Notes On Jester's Take On The Future Of HEP
I am very glad to observe that Adam Falkowski has resumed his blogging activities (for how long, it's too early to say). He published the other day a blog entry titled "Where were we", in which he offers his view of the present status of things in HEP and the directions he foresees for the field.
I was about to leave a comment there, but since I am a very discontinuous blog reader (you either write or read, in this business - no time for both things together) I feared I would then miss any reply or ensuing discussion. Not that I mean to say anything controversial or flippant; on the contrary, I mostly agree with Adam's assessment of the situation. With some distinguos.


by Tommaso Dorigo at March 15, 2018 11:23 AM

March 14, 2018

Tommaso Dorigo - Scientificblogging

RIP Stephen Hawking
I do not keep crocodiles[*] in my drawer, so this short piece will have to do today.... Stephen Hawking, the world-renowned British cosmologist, passed away yesterday, and with him we lost not only a bright thinker and all-round scientist, but also a person who inspired two or three generations of students and researchers, thanks to his will to live and take part in active research in spite of the difficulties he had to face, which he always managed to take with irony. Confined to a wheelchair by ALS, and incapable of even speaking without electronic assistance, he always displayed uncommon sharpness and wit.


by Tommaso Dorigo at March 14, 2018 11:20 AM

March 12, 2018

Clifford V. Johnson - Asymptotia

Signing Times

Well, @WalterIsaacson was signing at the same table as me at #sxsw so we got to catch up between doing our penmanship. Excited to read his Leonardo book. And he’s put #thedialoguesbook on his reading list! #graphicnovel A post shared by Clifford Johnson (@asymptotia) on Mar 11, 2018 at 1:38pm …

The post Signing Times appeared first on Asymptotia.

by Clifford at March 12, 2018 10:59 PM

March 11, 2018

John Baez - Azimuth

Hypergraph Categories of Cospans


Two students in the Applied Category Theory 2018 school wrote a blog article about Brendan Fong’s theory of decorated cospans:

• Jonathan Lorand and Fabrizio Genovese, Hypergraph categories of cospans, The n-Category Café, 28 February 2018.

Jonathan Lorand is a math grad student at the University of Zurich working on symplectic and Poisson geometry with Alberto Cattaneo. Fabrizio Genovese is a grad student in computer science at the University of Oxford, working with Bob Coecke and Dan Marsden on categorical quantum mechanics, quantum field theory and the like.

Brendan was my student, so it’s nice to see newer students writing a clear summary of some of his thesis work, namely this paper:

• Brendan Fong, Decorated cospans, Theory and Applications of Categories 30 (2015), 1096–1120.

I wrote a summary of it myself, so I won’t repeat it here:

• John Baez, Decorated cospans, Azimuth, 1 May 2015.

What’s especially interesting to me is that both Jonathan and Fabrizio know some mathematical physics, and they’re part of a group who will be working with me on some problems as part of the Applied Category Theory 2018 school! Brendan and Blake Pollard and I used symplectic geometry and decorated cospans to study the black-boxing of electrical circuits and Markov processes… maybe we should try to go further with that project!

by John Baez at March 11, 2018 09:14 PM

March 08, 2018

Clifford V. Johnson - Asymptotia

An Exhibit!

There’s actually an exhibit of process art for my book in the Fine Arts library at USC! Maybe of interest. There will be a companion exhibit about graphic novels over in the science and engineering library. Opening shortly.

The post An Exhibit! appeared first on Asymptotia.

by Clifford at March 08, 2018 06:07 AM

March 02, 2018

Cormac O’Raifeartaigh - Antimatter (Life in a puzzling universe)

Snowbound academics are better academics

Like most people in Ireland, I am working at home today. We got quite a dump of snow in the last two days, and there is no question of going anywhere until the roads clear. Worse, our college closed quite abruptly and I was caught on the hop – there are a lot of things (flash drives, books and papers) sitting smugly in my office that I need for my usual research.


The college on Monday evening

That said, I must admit I’m finding it all quite refreshing. For the first time in years, I have time to read interesting things in my daily email; all those postings from academic listings that I never seem to get time to read normally. I’m enjoying it so much, I wonder how much stuff I miss the rest of the time.


The view from my window as I write this

This morning, I thoroughly enjoyed a paper by Nicholas Campion on the representation of astronomy and cosmology in the works of William Shakespeare. I’ve often wondered about this as Shakespeare lived long enough to know of Galileo’s ground-breaking astronomical observations. However, anyone expecting coded references to new ideas about the universe in Shakespeare’s sonnets and plays will be disappointed; apparently he mainly sticks to classical ideas, with a few vague references to the changing order.

I’m also reading about early attempts to measure the parallax of light from a comet, especially by the great Danish astronomer Tycho Brahe. This paper comes courtesy of the History of Astronomy Discussion Group listings, a really useful resource for anyone interested in the history of astronomy.

While I’m reading all this, I’m also trying to keep abreast of a thoroughly modern debate taking place worldwide, concerning the veracity of an exciting new result in cosmology on the formation of the first stars. It seems a group studying the cosmic microwave background think they have found evidence of a signal representing the absorption of radiation from the first stars. This is exciting enough if correct, but the dramatic part is that the signal is much larger than expected, and one explanation is that this effect may be due to the presence of Dark Matter.

If true, the result would be a major step in our understanding of the formation of stars,  plus a major step in the demonstration of the existence of Dark Matter. However, it’s early days – there are many possible sources of a spurious signal and signals that are larger than expected have a poor history in modern physics! There is a nice article on this in The Guardian, and you can see some of the debate on Peter Coles’s blog In the Dark.  Right or wrong, it’s a good example of how scientific discovery works – if the team can show they have taken all possible spurious results into account, and if other groups find the same result, skepticism will soon be converted into excited acceptance.

All in all, a great day so far. My only concern is that this is the way academia should be – with our day-to-day commitments in teaching and research, it’s easy to forget there is a larger academic world out there.


Of course, the best part is the walk into the village when it finally stops chucking down. Can’t believe my local pub is open!


Dunmore East in the snow today


by cormac at March 02, 2018 01:44 PM

March 01, 2018

Sean Carroll - Preposterous Universe

Dark Matter and the Earliest Stars

So here’s something intriguing: an observational signature from the very first stars in the universe, which formed about 180 million years after the Big Bang (a little over one percent of the current age of the universe). This is exciting all by itself, and well worthy of our attention; getting data about the earliest generation of stars is notoriously difficult, and any morsel of information we can scrounge up is very helpful in putting together a picture of how the universe evolved from a relatively smooth plasma to the lumpy riot of stars and galaxies we see today. (Pop-level writeups at The Guardian and Science News, plus a helpful Twitter thread from Emma Chapman.)

But the intrigue gets kicked up a notch by an additional feature of the new results: the data imply that the cosmic gas surrounding these early stars is quite a bit cooler than we expected. What’s more, there’s a provocative explanation for why this might be the case: the gas might be cooled by interacting with dark matter. That’s quite a bit more speculative, of course, but sensible enough (and grounded in data) that it’s worth taking the possibility seriously.

[Update: skepticism has already been raised about the result. See this comment by Tim Brandt below.]

Illustration: NR Fuller, National Science Foundation

Let’s think about the stars first. We’re not seeing them directly; what we’re actually looking at is the cosmic microwave background (CMB) radiation, from about 380,000 years after the Big Bang. That radiation passes through the cosmic gas spread throughout the universe, occasionally getting absorbed. But when stars first start shining, they can very gently excite the gas around them (the 21cm hyperfine transition, for you experts), which in turn can affect the wavelength of radiation that gets absorbed. This shows up as a tiny distortion in the spectrum of the CMB itself. It’s that distortion which has now been observed, and the exact wavelength at which the distortion appears lets us work out the time at which those earliest stars began to shine.
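That wavelength-to-time conversion can be sketched numerically. The 78 MHz dip frequency, the z ≈ 20 onset of absorption, and the cosmological parameters below come from the published EDGES and Planck results rather than from this post, and the age uses a matter-dominated approximation that is good at high redshift:

```python
# Convert an observed 21cm absorption frequency to a redshift, then to an
# approximate cosmic age (matter-dominated approximation, valid for z >> 1).
import math

NU_REST = 1420.405752e6            # rest-frame 21cm frequency, Hz
H0 = 67.7 * 1000 / 3.0857e22       # Hubble constant in s^-1 (67.7 km/s/Mpc, assumed)
OMEGA_M = 0.31                     # matter density parameter (assumed)
S_PER_YR = 3.156e7                 # seconds per year

def redshift(nu_obs_hz):
    """Redshift at which emitted 21cm radiation is observed at nu_obs."""
    return NU_REST / nu_obs_hz - 1.0

def age_at(z):
    """t(z) ~ (2/3) H0^-1 Omega_m^-1/2 (1+z)^-3/2, in years."""
    return (2.0 / 3.0) / (H0 * math.sqrt(OMEGA_M)) * (1 + z) ** -1.5 / S_PER_YR

z_dip = redshift(78e6)     # center of the EDGES dip: z ~ 17
t_onset = age_at(20)       # absorption onset near z ~ 20: ~1.8e8 years
```

The onset of the absorption profile at z ≈ 20 lands at roughly 180 million years after the Big Bang, which is where the headline number for the first stars comes from.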

Two cool things about this. First, it’s a tour de force bit of observational cosmology by Judd Bowman and collaborators. Not that collecting the data is hard by modern standards (observing the CMB is something we’re good at), but that the researchers were able to account for all of the different ways such a distortion could be produced other than by the first stars. (Contamination by such “foregrounds” is a notoriously tricky problem in CMB observations…) Second, the experiment itself is totally charming. EDGES (Experiment to Detect Global EoR [Epoch of Reionization] Signature) is a small-table-sized gizmo surrounded by a metal mesh, plopped down in a desert in Western Australia. Three cheers for small science!

But we all knew that the first stars had to be somewhen; it was just a matter of when. The surprise is that the spectral distortion is larger than expected (at 3.8 sigma), a sign that the cosmic gas surrounding the stars is colder than expected (and can therefore absorb more radiation). Why would that be the case? It’s not easy to come up with explanations — there are plenty of ways to heat up gas, but it’s not easy to cool it down.

One bold hypothesis is put forward by Rennan Barkana in a companion paper. One way to cool down gas is to have it interact with something even colder. So maybe — cold dark matter? Barkana runs the numbers, given what we know about the density of dark matter, and finds that we could get the requisite amount of cooling with a relatively light dark-matter particle — less than five times the mass of the proton, well less than expected in typical models of Weakly Interacting Massive Particles. But not completely crazy. And not really constrained by current detection limits from underground experiments, which are generally sensitive to higher masses.

The tricky part is figuring out how the dark matter could interact with the ordinary matter to cool it down. Barkana doesn’t propose any specific model, but looks at interactions that depend sharply on the relative velocity of the particles, as v^{-4}. You might get that, for example, if there was an extremely light (perhaps massless) boson mediating the interaction between dark and ordinary matter. There are already tight limits on such things, but not enough to completely squelch the idea.
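For orientation, a cross section falling as v⁻⁴ is exactly the Rutherford (Coulomb) form, which is what a massless mediator produces. In natural units, with a coupling α and reduced mass μ (symbols chosen here for illustration, not Barkana's notation):

```latex
\frac{d\sigma}{d\Omega} = \frac{\alpha^2}{4\,\mu^2 v^4 \sin^4(\theta/2)}
```

The steep velocity dependence matters physically: the cosmic gas at these early times is moving slowly, so a Coulomb-like dark-ordinary interaction that is negligible today could have been efficient at cooling the gas back then.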

This is all extraordinarily speculative, but worth keeping an eye on. It will be full employment for particle-physics model-builders, who will be tasked with coming up with full theories that predict the right relic abundance of dark matter, have the right velocity-dependent force between dark and ordinary matter, and are compatible with all other known experimental constraints. It’s worth doing, as currently all of our information about dark matter comes from its gravitational interactions, not its interactions directly with ordinary matter. Any tiny hint of that is worth taking very seriously.

But of course it might all go away. More work will be necessary to verify the observations, and to work out the possible theoretical implications. Such is life at the cutting edge of science!

by Sean Carroll at March 01, 2018 12:00 AM

February 08, 2018

Sean Carroll - Preposterous Universe

Why Is There Something, Rather Than Nothing?

A good question!

Or is it?

I’ve talked before about the issue of why the universe exists at all (1, 2), but now I’ve had the opportunity to do a relatively careful job with it, courtesy of Eleanor Knox and Alastair Wilson. They are editing an upcoming volume, the Routledge Companion to the Philosophy of Physics, and asked me to contribute a chapter on this topic. Final edits aren’t done yet, but I’ve decided to put the draft on the arxiv:

Why Is There Something, Rather Than Nothing?
Sean M. Carroll

It seems natural to ask why the universe exists at all. Modern physics suggests that the universe can exist all by itself as a self-contained system, without anything external to create or sustain it. But there might not be an absolute answer to why it exists. I argue that any attempt to account for the existence of something rather than nothing must ultimately bottom out in a set of brute facts; the universe simply is, without ultimate cause or explanation.

As you can see, my basic tack hasn’t changed: this might be the kind of question that doesn’t have a sensible answer. In our everyday lives, it makes sense to ask “why” this or that event occurs, but such questions have answers only because they are embedded in a larger explanatory context. In particular, the world of our everyday experience is an emergent approximation with an extremely strong arrow of time, so we can safely associate “causes” with subsequent “effects.” The universe, considered as all of reality (i.e. let’s include the multiverse, if any), isn’t like that. The right question to ask isn’t “Why did this happen?”, but “Could this have happened in accordance with the laws of physics?” As far as the universe and our current knowledge of the laws of physics is concerned, the answer is a resounding “Yes.” The demand for something more — a reason why the universe exists at all — is a relic piece of metaphysical baggage we would be better off to discard.

This perspective gets pushback from two different sides. On the one hand we have theists, who believe that they can answer why the universe exists, and the answer is God. As we all know, this raises the question of why God exists; but aha, say the theists, that’s different, because God necessarily exists, unlike the universe which could plausibly have not. The problem with that is that nothing exists necessarily, so the move is pretty obviously a cheat. I didn’t have a lot of room in the paper to discuss this in detail (in what after all was meant as a contribution to a volume on the philosophy of physics, not the philosophy of religion), but the basic idea is there. Whether or not you want to invoke God, you will be left with certain features of reality that have to be explained by “and that’s just the way it is.” (Theism could possibly offer a better account of the nature of reality than naturalism — that’s a different question — but it doesn’t let you wiggle out of positing some brute facts about what exists.)

The other side are those scientists who think that modern physics explains why the universe exists. It doesn’t! One purported answer — “because Nothing is unstable” — was never even supposed to explain why the universe exists; it was suggested by Frank Wilczek as a way of explaining why there is more matter than antimatter. But any such line of reasoning has to start by assuming a certain set of laws of physics in the first place. Why is there even a universe that obeys those laws? This, I argue, is not a question to which science is ever going to provide a snappy and convincing answer. The right response is “that’s just the way things are.” It’s up to us as a species to cultivate the intellectual maturity to accept that some questions don’t have the kinds of answers that are designed to make us feel satisfied.

by Sean Carroll at February 08, 2018 05:19 PM

February 07, 2018

Axel Maas - Looking Inside the Standard Model

How large is an elementary particle?
Recently, in the context of a master's thesis, our group has begun to determine the size of the W boson. The natural questions about this project are: Why do you do that? Do we not know it already? And do elementary particles have a size at all?

It is best to answer these questions in reverse order.

So, do elementary particles have a size at all? Well, elementary particles are called elementary as they are the most basic constituents. In our theories today, they start out as pointlike. Only particles made from other particles, so-called bound states like a nucleus or a hadron, have a size. And now comes the but.

First of all, we do not yet know whether our elementary particles are really elementary. They may also be bound states of even more elementary particles. But in experiments we can only determine upper bounds on the size. Making better experiments will reduce this upper bound. Eventually, we may see that a particle previously thought of as point-like has a size. This has happened quite frequently over time, and it has always opened up a new level of elementary particle theory. Therefore measuring the size is important. But for us, as theoreticians, this type of question is only important if we have an idea about what the more elementary particles could be. And while some of our research is going in this direction, this project is not.

The other issue is that quantum effects give all elementary particles an 'apparent' size. This comes about because of how we measure the size of a particle: we shoot some other particle at it and measure how strongly it is deflected. A truly pointlike particle has a very characteristic deflection profile. But quantum effects allow additional particles to be created and destroyed in the vicinity of any particle. In particular, they allow for the existence of another particle of the same type, at least briefly. We cannot distinguish whether we hit the original particle or one of these. Since they are not at the same place as the original particle, their average distance looks like a size. This gives even a pointlike particle an apparent size, which we can measure. In this sense even an elementary particle has a size.

So, how can we then distinguish this size from an actual size of a bound state? We can do this by calculations. We determine the apparent size due to the quantum fluctuations and compare it to the measurement. Deviations indicate an actual size. This is because for a real bound state we can scatter somewhere in its structure, and not only in its core. This difference looks pictorially like this:

So, do we know the size already? Well, as said, we can only determine upper limits. Searching for them is difficult, and often goes via detours. One such detour involves so-called anomalous couplings. Measuring how they depend on energy provides indirect information on the size. There is an active experimental program underway at CERN to do this. The results so far say that the size of the W is below 0.0000000000000001 meters. This seems tiny, but in the world of particle physics this is not that strong a limit.
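As a quick back-of-envelope sketch (not from the post itself; the constants are standard reference values), one can translate that bound into the units particle physicists actually use, which makes the "not that strong a limit" remark concrete:

```python
# Translate the quoted W-size bound into particle-physics units.
# The 1e-16 m bound is from the text; the constants are standard values.
hbar_c_mev_fm = 197.327     # hbar*c in MeV*fm (CODATA)
proton_radius_fm = 0.84     # proton charge radius, approximately, in fm

bound_m = 1e-16             # upper bound on the W size quoted in the text
bound_fm = bound_m * 1e15   # 1 femtometer = 1e-15 m

# A length scale L corresponds to an energy scale E ~ hbar*c / L,
# so a weak length bound translates into a low probed energy scale.
scale_mev = hbar_c_mev_fm / bound_fm

print(f"bound: {bound_fm:.2f} fm (~{bound_fm/proton_radius_fm:.2f} proton radii)")
print(f"corresponding energy scale: ~{scale_mev/1000:.1f} GeV")
```

The bound works out to 0.1 femtometers, a modest fraction of a proton radius, corresponding to an energy scale of only a couple of GeV, which is why such a limit is not considered strong by collider standards.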

And now the interesting question: Why do we do this? As written, we do not want to make the W a bound state of something new. But one of our main research topics is driven by an interesting theoretical structure. If the standard model is taken seriously, the particle which we observe in an experiment and call the W is actually not the W of the underlying theory. Rather, it is a bound state, which is very, very similar to the elementary particle, but actually built from the elementary particles. The difference has been so small that identifying one with the other has been a very good approximation up to today. But with better and better experiments, this may change. Thus, we need to test this.

Because the thing we measure is then a bound state, it should have a (probably tiny) size. This would be a hallmark of this theoretical structure, and a sign that we understood it. If the size is such that it could actually be measured at CERN, then this would be an important test of our theoretical understanding of the standard model.

However, this is not a simple quantity to calculate. Bound states are intrinsically complicated. Thus, we use simulations for this purpose. In fact, we go over the same detour as the experiments, and will determine an anomalous coupling. From this we then infer the size indirectly. In addition, the need to perform efficient simulations forces us to simplify the problem substantially. Hence, we will not get the perfect number. But we may get the order of magnitude, or perhaps be within a factor of two or so. And this is all we currently need to say whether a measurement is possible, or whether it will have to wait for the next generation of experiments. And thus whether we will know within a few years, or within a few decades, whether we understood the theory.

by Axel Maas at February 07, 2018 11:18 AM

February 05, 2018

Matt Strassler - Of Particular Significance

In Memory of Joe Polchinski, the Brane Master

This week, the community of high-energy physicists — of those of us fascinated by particles, fields, strings, black holes, and the universe at large — is mourning the loss of one of the great theoretical physicists of our time, Joe Polchinski. It pains me deeply to write these words.

Everyone who knew him personally will miss his special qualities — his boyish grin, his slightly wicked sense of humor, his charming way of stopping mid-sentence to think deeply, his athleticism and friendly competitiveness. Everyone who knew his research will feel the absence of his particular form of genius, his exceptional insight, his unique combination of abilities, which I’ll try to sketch for you below. Those of us who were lucky enough to know him both personally and scientifically — well, we lose twice.


Polchinski — Joe, to all his colleagues — had one of those brains that works magic, and works magically. Scientific minds are as individual as personalities. Each physicist has a unique combination of talents and skills (and weaknesses); in modern lingo, each of us has a superpower or two. Rarely do you find two scientists who have the same ones.

Joe had several superpowers, and they were really strong. He had a tremendous knack for looking at old problems and seeing them in a new light, often overturning conventional wisdom or restating that wisdom in a new, clearer way. And he had prodigious technical ability, which allowed him to follow difficult calculations all the way to the end, on paths that would have deterred most of us.

One of the greatest privileges of my life was to work with Joe, not once but four times. I think I can best tell you a little about him, and about some of his greatest achievements, through the lens of that unforgettable experience.

[To my colleagues: this post was obviously written in trying circumstances, and it is certainly possible that my memory of distant events is foggy and in error.  I welcome any corrections that you might wish to suggest.]

Our papers between 1999 and 2006 were a sequence of sorts, aimed at understanding more fully the profound connection between quantum field theory — the language of particle physics — and string theory — best-known today as a candidate for a quantum theory of gravity. In each of those papers, as in many thousands of others written after 1995, Joe’s most influential contribution to physics played a central role. This was the discovery of objects known as “D-branes”, which he found in the context of string theory. (The term is a generalization of the word `membrane’.)

I can already hear the polemical haters of string theory screaming at me. ‘A discovery in string theory,’ some will shout, pounding the table, ‘an untested and untestable theory that’s not even wrong, should not be called a discovery in physics.’ Pay them no mind; they’re not even close, as you’ll see by the end of my remarks.

The Great D-scovery

In 1989, Joe, working with two young scientists, Jin Dai and Rob Leigh, was exploring some details of string theory, and carrying out a little mathematical exercise. Normally, in string theory, strings are little lines or loops that are free to move around anywhere they like, much like particles moving around in this room. But in some cases, particles aren’t in fact free to move around; you could, for instance, study particles that are trapped on the surface of a liquid, or trapped in a very thin whisker of metal. With strings, there can be a new type of trapping that particles can’t have — you could perhaps trap one end, or both ends, of the string within a surface, while allowing the middle of the string to move freely. The place where a string’s end may be trapped — whether a point, a line, a surface, or something more exotic in higher dimensions — is what we now call a “D-brane”.  [The `D’ arises for uninteresting technical reasons.]

Joe and his co-workers hit the jackpot, but they didn’t realize it yet. What they discovered, in retrospect, was that D-branes are an automatic feature of string theory. They’re not optional; you can’t choose to study string theories that don’t have them. And they aren’t just surfaces or lines that sit still. They’re physical objects that can roam the world. They have mass and create gravitational effects. They can move around and scatter off each other. They’re just as real, and just as important, as the strings themselves!


Fig. 1: D branes (in green) are physical objects on which a fundamental string (in red) can terminate.

It was as though Joe and his collaborators started off trying to understand why the chicken crossed the road, and ended up discovering the existence of bicycles, cars, trucks, buses, and jet aircraft.  It was that unexpected, and that rich.

And yet, nobody, not even Joe and his colleagues, quite realized what they’d done. Rob Leigh, Joe’s co-author, had the office next to mine for a couple of years, and we wrote five papers together between 1993 and 1995. Yet I think Rob mentioned his work on D-branes to me just once or twice, in passing, and never explained it to me in detail. Their paper had fewer than twenty citations as 1995 began.

In 1995 the understanding of string theory took a huge leap forward. That was the moment when it was realized that all five known types of string theory are different sides of the same die — that there’s really only one string theory.  A flood of papers appeared in which certain black holes, and generalizations of black holes — black strings, black surfaces, and the like — played a central role. The relations among these were fascinating, but often confusing.

And then, on October 5, 1995, a paper appeared that changed the whole discussion, forever. It was Joe, explaining D-branes to those of us who’d barely heard of his earlier work, and showing that many of these black holes, black strings and black surfaces were actually D-branes in disguise. His paper made everything clearer, simpler, and easier to calculate; it was an immediate hit. By the beginning of 1996 it had 50 citations; twelve months later, the citation count was approaching 300.

So what? Great for string theorists, but without any connection to experiment and the real world.  What good is it to the rest of us? Patience. I’m just getting to that.

What’s it Got to Do With Nature?

Our current understanding of the make-up and workings of the universe is in terms of particles. Material objects are made from atoms, themselves made from electrons orbiting a nucleus; and the nucleus is made from neutrons and protons. We learned in the 1970s that protons and neutrons are themselves made from particles called quarks and antiquarks and gluons — specifically, from a “sea” of gluons and a few quark/anti-quark pairs, within which sit three additional quarks with no anti-quark partner… often called the `valence quarks’.  We call protons and neutrons, and all other particles with three valence quarks, `baryons’.  (Note that there are no particles with just one valence quark, or two, or four — all you get is baryons, with three.)

In the 1950s and 1960s, physicists discovered short-lived particles much like protons and neutrons, with a similar sea, but which  contain one valence quark and one valence anti-quark. Particles of this type are referred to as “mesons”.  I’ve sketched a typical meson and a typical baryon in Figure 2.  (The simplest meson is called a “pion”; it’s the most common particle produced in the proton-proton collisions at the Large Hadron Collider.)



Fig. 2: Baryons (such as protons and neutrons) and mesons each contain a sea of gluons and quark-antiquark pairs; baryons have three unpaired “valence” quarks, while mesons have a valence quark and a valence anti-quark.  (What determines whether a quark is valence or sea involves subtle quantum effects, not discussed here.)

But the quark/gluon picture of mesons and baryons, back in the late 1960s, was just an idea, and it was in competition with a proposal that mesons are little strings. These are not, I hasten to add, the “theory of everything” strings that you learn about in Brian Greene’s books, which are a billion billion times smaller than a proton. In a “theory of everything” string theory, often all the types of particles of nature, including electrons, photons and Higgs bosons, are tiny tiny strings. What I’m talking about is a “theory of mesons” string theory, a much less ambitious idea, in which only the mesons are strings.  They’re much larger: just about as long as a proton is wide. That’s small by human standards, but immense compared to theory-of-everything strings.

Why did people think mesons were strings? Because there was experimental evidence for it! (Here’s another example.)  And that evidence didn’t go away after quarks were discovered. Instead, theoretical physicists gradually understood why quarks and gluons might produce mesons that behave a bit like strings. If you spin a meson fast enough (and this can happen by accident in experiments), its valence quark and anti-quark may separate, and the sea of objects between them forms what is called a “flux tube.” See Figure 3. [In certain superconductors, somewhat similar flux tubes can trap magnetic fields.] It’s kind of a thick string rather than a thin one, but still, it shares enough properties with a string in string theory that it can produce experimental results that are similar to string theory’s predictions.


Fig. 3: One reason mesons behave like strings in experiment is that a spinning meson acts like a thick string, with the valence quark and anti-quark at the two ends.

And so, from the mid-1970s onward, people were confident that quantum field theories like the one that describes quarks and gluons can create objects with stringy behavior. A number of physicists — including some of the most famous and respected ones — made a bolder, more ambitious claim: that quantum field theory and string theory are profoundly related, in some fundamental way. But they weren’t able to be precise about it; they had strong evidence, but it wasn’t ever entirely clear or convincing.

In particular, there was an important unresolved puzzle. If mesons are strings, then what are baryons? What are protons and neutrons, with their three valence quarks? What do they look like if you spin them quickly? The sketches people drew looked something like Figure 3. A baryon would perhaps become three joined flux tubes (with one possibly much longer than the other two), each with its own valence quark at the end.  In a stringy cartoon, that baryon would be three strings, each with a free end, with the strings attached to some sort of junction. This junction of three strings was called a “baryon vertex.”  If mesons are little strings, the fundamental objects in a string theory, what is the baryon vertex from the string theory point of view?!  Where is it hiding — what is it made of — in the mathematics of string theory?


Fig. 4: A fast-spinning baryon looks vaguely like the letter Y — three valence quarks connected by flux tubes to a “baryon vertex”.  A cartoon of how this would appear from a stringy viewpoint, analogous to Fig. 3, leads to a mystery: what, in string theory, is this vertex?!

[Experts: Notice that the vertex has nothing to do with the quarks. It’s a property of the sea — specifically, of the gluons. Thus, in a world with only gluons — a world whose strings naively form loops without ends — it must still be possible, with sufficient energy, to create a vertex-antivertex pair. Thus field theory predicts that these vertices must exist in closed string theories, though they are linearly confined.]


The baryon puzzle: what is a baryon from the string theory viewpoint?

No one knew. But isn’t it interesting that the most prominent feature of this vertex is that it is a location where a string’s end can be trapped?

Everything changed in the period 1997-2000. Following insights from many other physicists, and using D-branes as the essential tool, Juan Maldacena finally made the connection between quantum field theory and string theory precise. He was able to relate strings with gravity and extra dimensions, which you can read about in Brian Greene’s books, with the physics of particles in just three spatial dimensions, similar to those of the real world, with only non-gravitational forces.  It was soon clear that the most ambitious and radical thinking of the ’70s was correct — that almost every quantum field theory, with its particles and forces, can alternatively be viewed as a string theory. It’s a bit analogous to the way that a painting can be described in English or in Japanese — fields/particles and strings/gravity are, in this context, two very different languages for talking about exactly the same thing.

The saga of the baryon vertex took a turn in May 1998, when Ed Witten showed how a similar vertex appears in Maldacena’s examples. [Note added: I had forgotten that two days after Witten’s paper, David Gross and Hirosi Ooguri submitted a beautiful, wide-ranging paper, whose section on baryons contains many of the same ideas.] Not surprisingly, this vertex was a D-brane — specifically a D-particle, an object on which the strings extending from freely-moving quarks could end. It wasn’t yet quite satisfactory, because the gluons and quarks in Maldacena’s examples roam free and don’t form mesons or baryons. Correspondingly the baryon vertex isn’t really a physical object; if you make one, it quickly diffuses away into nothing. Nevertheless, Witten’s paper made it obvious what was going on. To the extent real-world mesons can be viewed as strings, real-world protons and neutrons can be viewed as strings attached to a D-brane.


The baryon puzzle, resolved.  A baryon is made from three strings and a point-like D-brane. [Note there is yet another viewpoint in which a baryon is something known as a skyrmion, a soliton made from meson fields — but that is an issue for another day.]

It didn’t take long for more realistic examples, with actual baryons, to be found by theorists. I don’t remember who found one first, but I do know that one of the earliest examples showed up in my first paper with Joe, in the year 2000.


Working with Joe

That project arose during my September 1999 visit to the KITP (Kavli Institute for Theoretical Physics) in Santa Barbara, where Joe was a faculty member. Some time before that I happened to have studied a field theory (called N=1*) that differed from Maldacena’s examples only slightly, but in which meson-like objects do form. One of the first talks I heard when I arrived at KITP was by Rob Myers, about a weird property of D-branes that he’d discovered. During that talk I made a connection between Myers’ observation and a feature of the N=1* field theory, and I had one of those “aha” moments that physicists live for. I suddenly knew what the string theory that describes the N=1*  field theory must look like.

But for me, the answer was bad news. To work out the details was clearly going to require a very difficult set of calculations, using aspects of string theory about which I knew almost nothing [non-holomorphic curved branes in high-dimensional curved geometry.] The best I could hope to do, if I worked alone, would be to write a conceptual paper with lots of pictures, and far more conjectures than demonstrable facts.

But I was at KITP.  Joe and I had had a good personal rapport for some years, and I knew that we found similar questions exciting. And Joe was the brane-master; he knew everything about D-branes. So I decided my best hope was to persuade Joe to join me. I engaged in a bit of persistent cajoling. Very fortunately for me, it paid off.

I went back to the east coast, and Joe and I went to work. Every week or two Joe would email some research notes with some preliminary calculations in string theory. They had such a high level of technical sophistication, and so few pedagogical details, that I felt like a child; I could barely understand anything he was doing. We made slow progress. Joe did an important warm-up calculation, but I found it really hard to follow. If the warm-up string theory calculation was so complex, had we any hope of solving the full problem?  Even Joe was a little concerned.

And then one day, I received a message that resounded with a triumphant cackle — a sort of “we got ’em!” that anyone who knew Joe will recognize. Through a spectacular trick, he’d figured out how to use his warm-up example to make the full problem easy! Instead of months of work ahead of us, we were essentially done.

From then on, it was great fun! Almost every week had the same pattern. I’d be thinking about a quantum field theory phenomenon that I knew about, one that should be visible from the string viewpoint — such as the baryon vertex. I knew enough about D-branes to develop a heuristic argument about how it should show up. I’d call Joe and tell him about it, and maybe send him a sketch. A few days later, a set of notes would arrive by email, containing a complete calculation verifying the phenomenon. Each calculation was unique, a little gem, involving a distinctive investigation of exotically-shaped D-branes sitting in a curved space. It was breathtaking to witness the speed with which Joe worked, the breadth and depth of his mathematical talent, and his unmatched understanding of these branes.

[Experts: It’s not instantly obvious that the N=1* theory has physical baryons, but it does; you have to choose the right vacuum, where the theory is partially Higgsed and partially confining. Then to infer, from Witten’s work, what the baryon vertex is, you have to understand brane crossings (which I knew about from Hanany-Witten days): Witten’s D5-brane baryon vertex operator creates a  physical baryon vertex in the form of a D3-brane 3-ball, whose boundary is an NS 5-brane 2-sphere located at a point in the usual three dimensions. And finally, a physical baryon is a vertex with n strings that are connected to nearby D5-brane 2-spheres. See chapter VI, sections B, C, and E, of our paper from 2000.]

Throughout our years of collaboration, it was always that way when we needed to go head-first into the equations; Joe inevitably left me in the dust, shaking my head in disbelief. That’s partly my weakness… I’m pretty average (for a physicist) when it comes to calculation. But a lot of it was Joe being so incredibly good at it.

Fortunately for me, the collaboration was still enjoyable, because I was almost always able to keep pace with Joe on the conceptual issues, sometimes running ahead of him. Among my favorite memories as a scientist are moments when I taught Joe something he didn’t know; he’d be silent for a few seconds, nodding rapidly, with an intent look — his eyes narrow and his mouth slightly open — as he absorbed the point.  “Uh-huh… uh-huh…”, he’d say.

But another side of Joe came out in our second paper. As we stood chatting in the KITP hallway, before we’d even decided exactly which question we were going to work on, Joe suddenly guessed the answer! And I couldn’t get him to explain which problem he’d solved, much less the solution, for several days!! It was quite disorienting.

This was another classic feature of Joe. Often he knew he’d found the answer to a puzzle (and he was almost always right), but he couldn’t say anything comprehensible about it until he’d had a few days to think and to turn his ideas into equations. During our collaboration, this happened several times. (I never said “Use your words, Joe…”, but perhaps I should have.) Somehow his mind was working in places that language doesn’t go, in ways that none of us outside his brain will ever understand. In him, there was something of an oracle.

Looking Toward The Horizon

Our interests gradually diverged after 2006; I focused on the Large Hadron Collider [also known as the Large D-brane Collider], while Joe, after some other explorations, ended up thinking about black hole horizons and the information paradox. But I enjoyed his work from afar, especially when, in 2012, Joe and three colleagues (Ahmed Almheiri, Don Marolf, and James Sully) blew apart the idea of black hole complementarity, widely hoped to be the solution to the paradox. [I explained this subject here, and also mentioned a talk Joe gave about it here.]  The wreckage is still smoldering, and the paradox remains.

Then Joe fell ill, and we began to lose him, at far too young an age.  One of his last gifts to us was his memoirs, which taught each of us something about him that we didn’t know.  Finally, on Friday last, he crossed the horizon of no return.  If there’s no firewall there, he knows it now.

What, we may already wonder, will Joe’s scientific legacy be, decades from now?  It’s difficult to foresee how a theorist’s work will be viewed a century hence; science changes in unexpected ways, and what seems unimportant now may become central in future… as was the path for D-branes themselves in the course of the 1990s.  For those of us working today, D-branes in string theory are clearly Joe’s most important discovery — though his contributions to our understanding of black holes, cosmic strings, and aspects of field theory aren’t soon, if ever, to be forgotten.  But who knows? By the year 2100, string theory may be the accepted theory of quantum gravity, or it may just be a little-known tool for the study of quantum fields.

Yet even if the latter were to be string theory’s fate, I still suspect it will be D-branes that Joe is remembered for. Because — as I’ve tried to make clear — they’re real.  Really real.  There’s one in every proton, one in every neutron. Our bodies contain them by the billion billion billions. For that insight, that elemental contribution to human knowledge, our descendants can blame Joseph Polchinski.

Thanks for everything, Joe.  We’ll miss you terribly.  You so often taught us new ways to look at the world — and even at ourselves.



by Matt Strassler at February 05, 2018 03:59 PM

January 29, 2018

Sean Carroll - Preposterous Universe

Guest Post: Nicole Yunger Halpern on What Makes Extraordinary Science Extraordinary

Nicole Yunger Halpern is a theoretical physicist at Caltech’s Institute for Quantum Information and Matter (IQIM).  She blends quantum information theory with thermodynamics and applies the combination across science, including to condensed matter; black-hole physics; and atomic, molecular, and optical physics. She writes for Quantum Frontiers, the IQIM blog, every month.

What makes extraordinary science extraordinary?

Political junkies watch C-SPAN. Sports fans watch ESPN. Art collectors watch Christie’s. I watch scientists respond to ideas.

John Preskill—Caltech professor, quantum-information theorist, and my PhD advisor—serves as the Chief Justice John Roberts of my C-SPAN. Ideas fly during group meetings, at lunch outside a campus cafeteria, and in John’s office. Many ideas encounter a laconicism compared with which Ernest Hemingway babbles. “Hmm,” I hear. “Ok.” “Wait… What?”

The occasional idea provokes an “mhm.” The final syllable has a higher pitch than the first. Usually, the inflection change conveys agreement and interest. Receiving such an “mhm” brightens my afternoon like a Big Dipper sighting during a 9 PM trudge home.

Hearing “That’s cool,” “Nice,” or “I’m excited,” I cartwheel internally.

What distinguishes “ok” ideas from “mhm” ideas? Peeling the Preskillite trappings off this question reveals its core: What distinguishes good science from extraordinary science?

I’ve been grateful for opportunities to interview senior scientists, over the past few months, from coast to coast. The opinions I collected varied. Several interviewees latched onto the question as though they pondered it daily. A couple of interviewees balked (I don’t know; that’s tricky…) but summoned up a sermon. All the responses fired me up: The more wisps of mist withdrew from the nature of extraordinary science, the more I burned to contribute.

I’ll distill, interpret, and embellish upon the opinions I received. Italics flag lines that I assembled to capture ideas that I heard, as well as imperfect memories of others’ words. Quotation marks surround lines that others constructed. Feel welcome to chime in, in the “comments” section.

One word surfaced in all, or nearly all, my conversations: “impact.” Extraordinary science changes how researchers across the world think. Extraordinary science reaches beyond one subdiscipline.

This reach reminded me of answers to a question I’d asked senior scientists when in college: “What do you mean by ‘beautiful’?”  Replies had varied, but a synopsis had crystallized: “Beautiful science enables us to explain a lot with a little.” Schrödinger’s equation, which describes how quantum systems evolve, fits on one line. But the equation describes electrons bound to atoms, particles trapped in boxes, nuclei in magnetic fields, and more. Beautiful science, which overlaps with extraordinary science, captures much of nature in a small net.
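For reference, the one-line equation in question is the time-dependent Schrödinger equation; the systems listed above differ only in the Hamiltonian \(\hat{H}\) plugged into it:

```latex
i\hbar \,\frac{\partial}{\partial t}\,\Psi(t) \;=\; \hat{H}\,\Psi(t)
```

One line of notation, with a different \(\hat{H}\) for atoms, particles in boxes, or nuclei in magnetic fields — which is exactly the "a lot with a little" the interviewees had in mind.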

Inventing a field constitutes extraordinary science. Examples include the fusion of quantum information with high-energy physics. Entanglement, quantum computation, and error correction are illuminating black holes, wormholes, and space-time.

Extraordinary science surprises us, revealing faces that we never expected nature to wear. Many extraordinary experiments generate data inexplicable with existing theories. Some extraordinary theory accounts for puzzling data; some extraordinary theory provokes experiments. I graduated from the Perimeter Scholars International Masters program,  at the Perimeter Institute for Theoretical Physics, almost five years ago. Canadian physicist Art McDonald presented my class’s commencement address. An interest in theory, he said, brought you to this institute. Plunge into theory, if you like. Theorem away. But keep a bead on experiments. Talk with experimentalists; work to understand them. McDonald won a Nobel Prize, two years later, for directing the Sudbury Neutrino Observatory (SNO). (SNOLab, with the Homestake experiment, revealed properties of subatomic particles called “neutrinos.” A neutrino’s species can change, and neutrinos have tiny masses. Neutrinos might reveal why the universe contains more matter than antimatter.)

Not all extraordinary theory clings to experiment like bubblegum to hair. Elliott Lieb and Mary Beth Ruskai proved that quantum entropies obey an inequality called “strong subadditivity” (SSA).  Entropies quantify uncertainty about which outcomes measurements will yield. Experimentalists could test SSA’s governance of atoms, ions, and materials. But no physical platform captures SSA’s essence.
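For readers curious about the inequality itself: with the von Neumann entropy S(ρ) = −Tr(ρ log ρ), strong subadditivity states that for any tripartite quantum state ρ_ABC,

```latex
S(\rho_{ABC}) + S(\rho_{B}) \;\le\; S(\rho_{AB}) + S(\rho_{BC}),
```

where ρ_AB, ρ_BC, and ρ_B denote the reduced states of the corresponding subsystems.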

Abstract mathematics underlies Lieb and Ruskai’s theorem: convexity and concavity (properties of functions), the Golden-Thompson inequality (a theorem about exponentials of matrices), etc. Some extraordinary theory dovetails with experiment; some wings away.

One interviewee sees extraordinary science in foundational science. At our understanding’s roots lie ideas that fertilize diverse sprouts. Other extraordinary ideas provide tools for calculating, observing, or measuring. Richard Feynman sped up particle-physics computations, for instance, by drawing diagrams.  Those diagrams depict high-energy physics as the creation, separation, recombination, and annihilation of particles. Feynman drove not only a technical, but also a conceptual, advance. Some extraordinary ideas transform our views of the world.

Difficulty preoccupied two experimentalists. An experiment isn’t worth undertaking, one said, if it isn’t difficult. A colleague, said another, “does the impossible and makes it look easy.”

Simplicity preoccupied two theorists. I wrung my hands, during year one of my PhD, in an email to John. The results I’d derived—now that I’d found them—looked as though I should have noticed them months earlier. What if the results lacked gristle? “Don’t worry about too simple,” John wrote back. “I like simple.”

Another theorist agreed: Simplification promotes clarity. Not all simple ideas “go the distance.” But ideas run farther when stripped down than when weighed down by complications.

Extraordinary scientists have a sense of taste. Not every idea merits exploration. Identifying the ideas that do requires taste, or style, or distinction. What distinguishes extraordinary science? More of the theater critic and Julia Child than I expected five years ago.

With gratitude to the thinkers who let me pick their brains.

by Sean Carroll at January 29, 2018 05:45 PM

Georg von Hippel - Life on the lattice

Looking for guest blogger(s) to cover LATTICE 2018
Since I will not be attending LATTICE 2018 for some excellent personal reasons, I am looking for a guest blogger or even better several guest bloggers from the lattice community who would be interested in covering the conference. Especially for advanced PhD students or junior postdocs, this might be a great opportunity to get your name some visibility. If you are interested, drop me a line either in the comment section or by email (my university address is easy to find).

by Georg v. Hippel at January 29, 2018 11:49 AM

January 25, 2018

Alexey Petrov - Symmetry factor

Rapid-response (non-linear) teaching: report

Some of you might remember my previous post about non-linear teaching, where I described a new teaching strategy that I came up with and was about to implement in teaching my undergraduate Classical Mechanics I class. Here I want to report on the outcomes of this experiment and share some of my impressions on teaching.

Course description

Our Classical Mechanics class is a gateway class for our physics majors. It is the first class they take after they are done with general physics lectures. So the students are already familiar with the (simpler version of the) material they are going to be taught. The goal of this class is to start molding physicists out of physics students. It is a rather small class (max allowed enrollment is 20 students; I had 22 in my class), which makes professor-student interaction rather easy.

Rapid-response (non-linear) teaching: generalities

To motivate the method that I proposed, I looked at some studies in experimental psychology, in particular in memory and learning studies. What I was curious about is how much is currently known about the process of learning and what suggestions I can take from the psychologists who know something about the way our brain works in retaining the knowledge we receive.

As it turns out, there are some studies on this subject (I have references, if you are interested). The earliest ones go back to the 1880s, when the German psychologist Hermann Ebbinghaus hypothesized how our brain retains information over time. The “forgetting curve” that he introduced gives an approximate representation of information retention as a function of time. His studies have been replicated, with similar conclusions, in recent experiments.

The upshot of these studies is that the loss of learned information is pretty much exponential; as Ebbinghaus’s forgetting curve shows, in about a day we retain only about 40% of what we learned.
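As a toy illustration (my own sketch, not a calculation from the post), the exponential forgetting curve can be written as R = exp(−t/S), where S is a memory-strength parameter. The value of S below is chosen purely so that one day of forgetting lands near the 40% figure quoted above:

```python
import math

def retention(t_hours, stability=26.0):
    """Approximate Ebbinghaus forgetting curve: R = exp(-t / S).

    `stability` (in hours) is an illustrative memory-strength
    parameter, not a value taken from the post or from Ebbinghaus."""
    return math.exp(-t_hours / stability)

# After one day (~24 hours), retention has dropped to roughly 40%.
one_day = retention(24)
```

The exact numbers are not the point; the qualitative shape — steep early loss that levels off — is what motivates frequent retrieval.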

Psychologists also learned that one of the ways to overcome the loss of information is to (meaningfully) retrieve it: this is how learning happens. Retrieval is critical for robust, durable, long-term learning. It appears that every time we retrieve learned information, it becomes more accessible in the future. It is, however, important how we retrieve that stored information: simply re-reading notes or looking through examples will not be as effective as re-working the lecture material. It is also important how often we retrieve the stored information.

So, here is what I decided to change in the way I teach my class in light of the above-mentioned information (no pun intended).

Rapid-response (non-linear) teaching: details

To counter the single-day information loss, I changed the way homework is assigned: instead of assigning weekly homework sets with 3–5 problems each, I introduced two types of homework assignments: short homeworks and projects.

Short homework assignments are single-problem assignments given after each class that must be done by the next class. They are designed so that a student needs to re-derive material that was discussed previously in class (with a small new twist added). For example, if the block-on-an-incline problem was discussed in class, the short assignment asks them to redo the problem with a different choice of coordinate axes. This way, instead of doing an assignment at the last minute at the end of the week, the students are forced to work out what they just learned in class every day (meaningful retrieval)!

The second type of assignment, the project homework, is designed to develop an understanding of how the topics in a given chapter relate to each other. There are as many project assignments as there are chapters. Students get two weeks to complete each one.

In the end, the students solve approximately the same number of problems over the course of the semester.

For a professor, the introduction of short homework assignments changes the way class material is presented. Depending on how students performed on the previous short homework, I adjusted the material (both speed and volume) that we discussed in class. I also designed examples for future sections in such a way that I could repeat parts of a topic that had posed difficulties in comprehension. Overall, instead of the usual “linear” progression through the course, we moved along something akin to helical motion, returning to and spending more time on topics that students found more difficult (hence “rapid-response” or “non-linear” teaching).

Other things were easy to introduce: for instance, using the Socratic method when working through examples. The lecture itself was an open discussion between the professor and the students.


So, I implemented this method in teaching the Classical Mechanics I class in the Fall 2017 semester. It was not an easy exercise, mostly because it was the first time I was teaching this class, and I had no grader help. I would say the results confirmed my expectations: the introduction of short homework assignments helps students perform better on the exams. Now, my statistics are still limited: I only had 20 students in my class. Moreover, several students decided to either largely ignore the short homework assignments or do them irregularly; they were given zero points for each missed short assignment. All students generally did well on their project assignments, yet there appears to be some correlation (see graph above) between the total number of points earned on short homework assignments and exam performance (measured by the total score on the final and the two midterms). This makes me think that the short assignments were beneficial for the students. I plan to teach this course again next year, which will increase my statistics.
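The correlation in question is easy to quantify. A minimal sketch of a Pearson correlation between homework points and exam totals — using made-up numbers, not the actual class data from the post:

```python
def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length samples."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    sd_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return cov / (sd_x * sd_y)

# Hypothetical (short-homework points, combined exam score) pairs,
# purely illustrative of the trend described in the post.
homework = [5, 12, 20, 28, 35, 40]
exams = [48, 55, 62, 70, 81, 86]
r = pearson_r(homework, exams)
```

With 20-odd students one would of course interpret such an r cautiously, which is exactly the caveat about limited statistics made above.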

I was quite surprised that my students generally liked this way of teaching. In fact, they were disappointed that I decided not to apply this method for the Mechanics II class that I am teaching this semester. They also found that problems assigned in projects were considerably harder than the problems from the short assignments (this is how it was supposed to be).

For me, this was not an easy semester. I had to develop my own set of lectures — so big thanks go to my colleagues Joern Putschke and Rob Harr, who made their notes available. I spent a lot of time preparing this course, which, I think, affected my research output last semester. Yet most of the difficulties are Wayne State-specific: Wayne State does not provide TAs for small classes, so I had to not only design all the homework assignments but also grade them (on top of developing the lectures from the ground up). During the semester, it was important to grade short assignments on the same day I received them in order to re-tune the lectures, and this took a lot of my time. TAs would certainly help to run this course — so I’ll be applying for some internal WSU educational grants to continue developing this method. I plan to employ it again next year to teach Classical Mechanics.


by apetrov at January 25, 2018 08:18 PM

January 22, 2018

Axel Maas - Looking Inside the Standard Model

Finding - and curing - disagreements
The topic of grand-unified theories has come up on this blog several times, most recently last January. To briefly recap, such theories, called GUTs for short, predict that all three forces between elementary particles emerge from a single master force. That would explain a lot of unconnected observations we have in particle physics — for example, why atoms are electrically neutral. The latter we can describe, but not yet explain.

However, if such a GUT exists, then it must not only explain the forces, but also somehow why we see the numbers and kinds of elementary particles we observe in nature. And now things become complicated. As discussed in the last entry on GUTs, there may be a serious issue in how we determine which particles are actually described by such a theory.

To understand how this issue comes about, I need to put together many different things my research partners and I have worked on during the last couple of years. All of these issues are put into expert language in the review I talked about in the previous entry. It is now finished, and if you're interested, you can get it for free from here. But it is very technical.

So, let me explain it less technically.

Particle physics is actually super involved. If we wanted to write down a theory which describes what we see, and only what we see, it would be terribly complicated. It is much simpler to introduce redundancies in the description, so-called gauge symmetries. This makes life much easier, though still not easy. However, the most prominent feature is that we add auxiliary particles to the game. Of course, they cannot really be seen, as they are just auxiliary. Some of them are very obviously unphysical, and are therefore called ghosts. They can be taken care of comparatively simply. For others, this is less simple.

Now, it turns out that the weak interaction is a very special beast. In this case, there is a unique one-to-one identification between a really observable particle and an auxiliary particle. Thus, it is almost correct to identify both. But this is due to the very special structure of this part of particle physics.

Thus, a natural question is whether, even if the weak interaction is special, it is justified to do the same for other theories. Well, in some cases it seems to be. But we suspected that it may not hold in general. And especially not in GUTs.

Recently, we went about this much more systematically. You can again access the (very, very technical) result for free here. There, we looked at a very generic class of such GUTs. Well, we actually looked at the most relevant part of them, and still by far not all of them. We also ignored a lot of the details, e.g. what would become quarks and leptons, and concentrated only on the generalization of the weak interaction and the Higgs.

We then checked, based on our earlier experience and methods, whether a one-to-one identification of experimentally accessible and auxiliary particles works. And it essentially never does. Visually, the result looks like this:

On the left, it is seen that everything works nicely with a one-to-one identification in the standard model. On the right: if a one-to-one identification worked in a GUT, everything would still be nice. But our more precise calculation shows that the actual situation, which would be seen in an experiment, is different. No one-to-one identification is possible. Thus the prediction of the GUT differs from what we already see in experiments, and a previously good GUT candidate is no longer good.

Though more checks are needed, as always, this is a baffling, and at the same time very discomforting, result.

Baffling because we originally expected to have problems only under very special circumstances. It now appears that the standard model of particles is actually the very special case, and having problems is the norm.

It is discomforting because in the powerful method of perturbation theory the one-to-one identification is essentially always made. As this tool is widely used, this seems to question the validity of many predictions on GUTs. That could have far-reaching consequences. Is this the case? Do we need to forget everything about GUTs we learned so far?

Well, not really, for two reasons. One is that we also showed that methods almost as easy to handle as perturbation theory can be used to fix the problems. This is good, because more powerful methods, like the simulations we used before, are much more cumbersome. However, this leaves us with the problem of having made wrong predictions so far. Well, this we cannot change. But this is just normal scientific progress: you try, you check, you fail, you improve, and then you try again.

And, in fact, this does not mean that GUTs are wrong. It just means that we need to consider somewhat different GUTs, and make the predictions more carefully next time. Which GUTs we need to look at we still have to figure out, and that will not be simple. But, fortunately, the improved methods mentioned above can use much of what has been done so far, so most technical results are still unbelievably useful. This will help enormously in finding GUTs which are applicable and yield a consistent picture without the one-to-one identification. GUTs are not dead. They likely just need a bit of changing.

This is indeed a dramatic development. But one which fits logically and technically to the improved understanding of the theoretical structures underlying particle physics, which were developed over the last decades. Thus, we are confident that this is just the next logical step in our understanding of how particle physics works.

by Axel Maas at January 22, 2018 04:54 PM

January 17, 2018

Sean Carroll - Preposterous Universe

Beyond Falsifiability

I have a backlog of fun papers that I haven’t yet talked about on the blog, so I’m going to try to work through them in reverse chronological order. I just came out with a philosophically-oriented paper on the thorny issue of the scientific status of multiverse cosmological models:

Beyond Falsifiability: Normal Science in a Multiverse
Sean M. Carroll

Cosmological models that invoke a multiverse – a collection of unobservable regions of space where conditions are very different from the region around us – are controversial, on the grounds that unobservable phenomena shouldn’t play a crucial role in legitimate scientific theories. I argue that the way we evaluate multiverse models is precisely the same as the way we evaluate any other models, on the basis of abduction, Bayesian inference, and empirical success. There is no scientifically respectable way to do cosmology without taking into account different possibilities for what the universe might be like outside our horizon. Multiverse theories are utterly conventionally scientific, even if evaluating them can be difficult in practice.

This is well-trodden ground, of course. We’re talking about the cosmological multiverse, not its very different relative the Many-Worlds interpretation of quantum mechanics. It’s not the best name, as the idea is that there is only one “universe,” in the sense of a connected region of space, but of course in an expanding universe there will be a horizon past which it is impossible to see. If conditions in far-away unobservable regions are very different from conditions nearby, we call the collection of all such regions “the multiverse.”

There are legitimate scientific puzzles raised by the multiverse idea, but there are also fake problems. Among the fakes is the idea that “the multiverse isn’t science because it’s unobservable and therefore unfalsifiable.” I’ve written about this before, but shockingly not everyone immediately agreed with everything I have said.

Back in 2014 the Edge Annual Question was “What Scientific Theory Is Ready for Retirement?”, and I answered Falsifiability. The idea of falsifiability, pioneered by philosopher Karl Popper and adopted as a bumper-sticker slogan by some working scientists, is that a theory only counts as “science” if we can envision an experiment that could potentially return an answer that was utterly incompatible with the theory, thereby consigning it to the scientific dustbin. Popper’s idea was to rule out so-called theories that were so fuzzy and ill-defined that they were compatible with literally anything.

As I explained in my short write-up, it’s not so much that falsifiability is completely wrong-headed, it’s just not quite up to the difficult task of precisely demarcating the line between science and non-science. This is well-recognized by philosophers; in my paper I quote Alex Broadbent as saying

It is remarkable and interesting that Popper remains extremely popular among natural scientists, despite almost universal agreement among philosophers that – notwithstanding his ingenuity and philosophical prowess – his central claims are false.

If we care about accurately characterizing the practice and principles of science, we need to do a little better — which philosophers work hard to do, while some physicists can’t be bothered. (I’m not blaming Popper himself here, nor even trying to carefully figure out what precisely he had in mind — the point is that a certain cartoonish version of his views has been elevated to the status of a sacred principle, and that’s a mistake.)

After my short piece came out, George Ellis and Joe Silk wrote an editorial in Nature, arguing that theories like the multiverse served to undermine the integrity of physics, which needs to be defended from attack. They suggested that people like me think that “elegance [as opposed to data] should suffice,” that sufficiently elegant theories “need not be tested experimentally,” and that I wanted to “to weaken the testability requirement for fundamental physics.” All of which is, of course, thoroughly false.

Nobody argues that elegance should suffice — indeed, I explicitly emphasized the importance of empirical testing in my very short piece. And I’m not suggesting that we “weaken” anything at all — I’m suggesting that we physicists treat the philosophy of science with the intellectual care that it deserves. The point is not that falsifiability used to be the right criterion for demarcating science from non-science, and now we want to change it; the point is that it never was, and we should be more honest about how science is practiced.

Another target of Ellis and Silk’s ire was Richard Dawid, a string theorist turned philosopher, who wrote a provocative book called String Theory and the Scientific Method. While I don’t necessarily agree with Dawid about everything, he does make some very sensible points. Unfortunately he coins the term “non-empirical theory confirmation,” which was an extremely bad marketing strategy. It sounds like Dawid is saying that we can confirm theories (in the sense of demonstrating that they are true) without using any empirical data, but he’s not saying that at all. Philosophers use “confirmation” in a much weaker sense than that of ordinary language, to refer to any considerations that could increase our credence in a theory. Of course there are some non-empirical ways that our credence in a theory could change; we could suddenly realize that it explains more than we expected, for example. But we can’t simply declare a theory to be “correct” on such grounds, nor was Dawid suggesting that we could.
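The philosophers’ weak sense of “confirmation” — anything that raises our credence in a theory — is exactly what Bayes’ theorem formalizes. A minimal sketch, with illustrative numbers of my own choosing (nothing here comes from Dawid or Carroll):

```python
def bayes_update(prior, likelihood, likelihood_alt):
    """Posterior credence in hypothesis H after evidence E, with a
    single rival hypothesis carrying the remaining credence.

    likelihood     = P(E | H)
    likelihood_alt = P(E | not H)
    """
    evidence = prior * likelihood + (1 - prior) * likelihood_alt
    return prior * likelihood / evidence

# A theory that accounts for an observation better than its rival
# gains credence, even if it started as the underdog.
posterior = bayes_update(prior=0.2, likelihood=0.9, likelihood_alt=0.3)
```

Evidence here need not be a novel experiment: realizing that a theory explains more than expected raises P(E | H) for already-known E, which is the “non-empirical” credence shift described above.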

In 2015 Dawid organized a conference on “Why Trust a Theory?” to discuss some of these issues, which I was unfortunately not able to attend. Now he is putting together a volume of essays, both from people who were at the conference and from some additional contributors; it’s for that volume that this current essay was written. You can find other interesting contributions on the arXiv, for example from Joe Polchinski, Eva Silverstein, and Carlo Rovelli.

Hopefully with this longer format, the message I am trying to convey will be less amenable to misconstrual. Nobody is trying to change the rules of science; we are just trying to state them accurately. The multiverse is scientific in an utterly boring, conventional way: it makes definite statements about how things are, it has explanatory power for phenomena we do observe empirically, and our credence in it can go up or down on the basis of both observations and improvements in our theoretical understanding. Most importantly, it might be true, even if it might be difficult to ever decide with high confidence whether it is or not. Understanding how science progresses is an interesting and difficult question, and should not be reduced to brandishing bumper-sticker mottos to attack theoretical approaches to which we are not personally sympathetic.

by Sean Carroll at January 17, 2018 04:44 PM