Particle Physics Planet

March 22, 2018

Christian P. Robert - xi'an's og

back to Wales [54th Gregynog Statistical Conference]

Today, provided the Air France strike let me fly to Birmingham airport!, I am back at Gregynog Hall, Wales, for the weekend conference organised there every year by some Welsh and English statistics departments, including Warwick. Looking forward to the relaxed gathering in the glorious Welsh countryside (and hoping that my knee will have sufficiently recovered for some trail running around Gregynog Hall…!) Here are the slides of the talk I will present tomorrow:

by xi'an at March 22, 2018 11:18 PM

Peter Coles - In the Dark

A photograph of Sir John Herschel

I didn’t have time to write a post today before it became necessary for me to go to the pub, so I thought I’d just share this marvellous photograph of astronomer Sir John Herschel taken in 1867 by pioneering portrait photographer Julia Margaret Cameron.

by telescoper at March 22, 2018 08:34 PM

ZapperZ - Physics and Physicists

Fermilab Accelerator Complex
This is a neat animation video of the Fermilab Accelerator Complex as it is now, and all the various experiments and capabilities that it has.

Of course, the "big ring", which was the Tevatron, is no longer running, and thus no high-energy particle collider experiments are being conducted there anymore.


by ZapperZ at March 22, 2018 05:14 PM

ZapperZ - Physics and Physicists

An Astrophysicist Describes Stephen Hawking's Last Paper
The astrophysicist in this case is, of course, Ethan Siegel, who I've cited here a few times.

In this article, he describes what Hawking's last paper is all about, if you want a simple description of it. The link to the preprint (we'll update this post if and when it is published) is also given if you don't have it already.

Here is, in a nutshell, what they do. They create a (deformed) conformal field theory that is mathematically equivalent (or dual) to an eternally inflating spacetime, and investigate some mathematical properties of that field theory. They look, in particular, at where the border of a spacetime that inflates for an eternity (forward in time) versus one that doesn't, and choose that as the interesting problem to consider. They then look at the geometries that arise from this field theory, try to map that back onto our physically inflating Universe, and draw a conclusion from that. Based on what they find, they contend that the exit from inflation doesn't give you something eternally inflating into the future, with disconnected pockets where hot Big Bangs occur, but rather that the exit is finite and smooth. In other words, it gives you a single Universe, not a series of disconnected Universes embedded in a larger multiverse.

There! Do you even need to read the actual paper after that?


BTW, let's also give some love to his co-author, Thomas Hertog, who seems to be left out of much of this discussion and many of the news articles.


by ZapperZ at March 22, 2018 12:39 PM

Emily Lakdawalla - The Planetary Society Blog

Funpost! The transportation of humans to foreign planets
For this week's Funpost, Jason answers 11 questions about humans exploring Mars.

March 22, 2018 11:00 AM

Emily Lakdawalla - The Planetary Society Blog

#LPSC2018: Titan Is Terrific!
Emily's first report from the Lunar and Planetary Science Conference is on the solar system's most atmospheriffic satellite, Saturn's moon Titan.

March 22, 2018 01:44 AM

March 21, 2018

Christian P. Robert - xi'an's og

spacings on a torus

While in Brussels last week I noticed an interesting question on X validated that I considered in the train back home and then more over the weekend. This is a question about spacings, namely how long on average does it take to cover an interval of length L when drawing unit intervals at random (with a torus handling of the endpoints)? Which immediately reminded me of Wilfrid Kendall's (Warwick) famous gif animation of coupling from the past via leaves covering a square region, from the top (forward) and from the bottom (backward)…

The problem is rather easily expressed in terms of uniform spacings, more specifically on the maximum spacing being less than 1 (or 1/L depending on the parameterisation). Except for the additional constraint at the boundary, which is not independent of the other spacings. Replacing this extra event with an independent spacing, there exists a direct formula for the expected stopping time, which can be checked rather easily by simulation. But the exact case appears to add a few more steps to the draws, 3/2 apparently. The following graph displays the regression of the Monte Carlo number of steps over 10⁴ replicas against the exact values:
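For readers who want to reproduce the Monte Carlo side of that comparison, here is a minimal sketch of the simulation as I read the problem (my own reconstruction in Python rather than R; the value of L and the number of replicas are illustrative):

```python
# Monte Carlo sketch: drop unit-length arcs with uniformly random starting
# points on a circle of circumference L until the circle is covered, and
# record how many arcs were needed.
import numpy as np

rng = np.random.default_rng(0)

def covering_time(L, rng):
    """Number of random unit arcs needed to cover a circle of circumference L."""
    starts = []
    while True:
        starts.append(rng.uniform(0, L))
        s = np.sort(np.array(starts))
        # cyclic gaps between consecutive arc starts (torus handling of endpoints)
        gaps = np.diff(np.append(s, s[0] + L))
        if gaps.max() <= 1.0:  # every gap is bridged by the unit arc starting before it
            return len(starts)

L = 5.0
draws = [covering_time(L, rng) for _ in range(10_000)]
print(np.mean(draws))
```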

by xi'an at March 21, 2018 11:18 PM

Jester - Resonaances

21cm to dark matter
The EDGES discovery of the 21cm absorption line at the cosmic dawn has been widely discussed on blogs and in popular press. Quite deservedly so.  The observation opens a new window on the epoch when the universe as we know it was just beginning. We expect a treasure trove of information about the standard processes happening in the early universe, as well as novel constraints on hypothetical particles that might have been present then. It is not a very long shot to speculate that, if confirmed, the EDGES discovery will be awarded a Nobel prize. On the other hand, the bold claim bundled with their experimental result -  that the unexpectedly large strength of the signal is an indication of interaction between the ordinary matter and cold dark matter - is very controversial. 

But before jumping to dark matter it is worth reviewing the standard physics leading to the EDGES signal. In the lowest energy (singlet) state, hydrogen may absorb a photon and jump to a slightly excited (triplet) state which differs from the true ground state just by the arrangement of the proton and electron spins. Such transitions are induced by photons of wavelength of 21cm, or frequency of 1.4 GHz, or energy of 5.9 𝜇eV, and they may routinely occur at the cosmic dawn when Cosmic Microwave Background (CMB) photons of the right energy hit neutral hydrogen atoms hovering in the universe. The evolution of the CMB and hydrogen temperatures is shown in the picture here as a function of the cosmological redshift z (large z is early time, z=0 is today). The CMB temperature is red and it decreases with time as (1+z) due to the expansion of the universe. The hydrogen temperature in blue is a bit more tricky. At the recombination time around z=1100 most protons and electrons combine to form neutral atoms, however a small fraction of free electrons and protons survives. Interactions between the electrons and CMB photons via Compton scattering are strong enough to keep the two (and consequently the hydrogen as well) at equal temperatures for some time. However, around z=200 the CMB and hydrogen temperatures decouple, and the latter subsequently decreases much faster with time, as (1+z)^2. At the cosmic dawn, z~17, the hydrogen gas is already 7 times colder than the CMB, after which light from the first stars heats it up and ionizes it again.

The quantity directly relevant for the 21cm absorption signal is the so-called spin temperature Ts, which is a measure of the relative occupation number of the singlet and triplet hydrogen states. Just before the cosmic dawn, the spin temperature equals the CMB one, and as  a result there is no net absorption or emission of 21cm photons. However, it is believed that the light from the first stars initially lowers the spin temperature down to the hydrogen one. Therefore, there should be absorption of 21cm CMB photons by the hydrogen in the epoch between z~20 and z~15. After taking into account the cosmological redshift, one should now observe a dip in the radio frequencies between 70 and 90 MHz. This is roughly what EDGES finds. The depth of the dip is described by the formula:
As the spin temperature cannot be lower than that of the hydrogen, the standard physics predicts TCMB/Ts ≤ 7, corresponding to T21 ≥ -0.2K. The surprise is that EDGES observes a larger dip, T21 ≈ -0.5K, 3.8 astrosigma away from the predicted value, as if TCMB/Ts were of order 15.
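For orientation (my own back-of-the-envelope gloss, since the formula itself is not reproduced above): if one assumes the simplest linear scaling of the dip with the temperature ratio, $T_{21} = A\,(1 - T_{\rm CMB}/T_s)$, then

$$A \approx \frac{-0.2\ \mathrm{K}}{1-7} \approx 0.033\ \mathrm{K}, \qquad T_{21} = -0.5\ \mathrm{K} \;\Rightarrow\; \frac{T_{\rm CMB}}{T_s} \approx 1 + \frac{0.5}{0.033} \approx 16,$$

which is where the "of order 15" figure comes from.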

If the EDGES result is taken at face value, it means that TCMB/Ts at the cosmic dawn was much larger than predicted in the standard scenario. Either there was a lot more photon radiation at the relevant wavelengths, or the hydrogen gas was much colder than predicted. Focusing on the latter possibility, one could imagine that the hydrogen was cooled due to interactions with cold dark matter made of relatively light (less than GeV) particles. However, this idea is very difficult to realize in practice, because it requires the interaction cross section to be thousands of barns at the relevant epoch! Not the picobarns typical for WIMPs. Many orders of magnitude more than the total proton-proton cross section at the LHC. Even in nuclear processes such values are rarely seen. And we are talking here about dark matter, whose trademark is interacting weakly. Obviously, the idea is running into all sorts of constraints that have been laboriously accumulated over the years.
One can try to save this idea by a series of evasive tricks. If the interaction cross section scales as 1/v^4, where v is the relative velocity between colliding matter and dark matter particles, it could be enhanced at the cosmic dawn when the typical velocities were at their minimum. The 1/v^4 behavior is not unfamiliar, as it is characteristic of the electromagnetic forces in the non-relativistic limit. Thus, one could envisage a model where dark matter has a minuscule electric charge, one thousandth or less that of the proton. This trick buys some mileage, but the obstacles remain enormous. The cross section is still large enough for the dark and ordinary matter to couple strongly during the recombination epoch, contrary to what is concluded from precision observations of the CMB. Therefore the milli-charged particles can constitute only a small fraction of dark matter, less than 1 percent. Finally, one needs to avoid constraints from direct detection, colliders, and emission by stars and supernovae. A plot borrowed from this paper shows that a tiny region of viable parameter space remains around 100 MeV mass and 10^-5 charge, though my guess is that this will also go away upon a more careful analysis.

So, milli-charged dark matter cooling hydrogen does not stand scrutiny as an explanation for the EDGES anomaly. This does not mean that all exotic explanations must be so implausible. Better models are being and will be proposed, and one of them could even be correct. For example, models where new particles lead to an injection of additional 21cm photons at early times seem to be more encouraging. My bet? Future observations will confirm the 21cm absorption signal, but the amplitude and other features will turn out to be consistent with the standard ΛCDM predictions. Given the number of competing experiments in the starting blocks, the issue should be clarified within the next few years. What is certain is that, this time, we will learn a lot whether or not the anomalous signal persists :)

by Mad Hatter at March 21, 2018 03:31 PM

Clifford V. Johnson - Asymptotia

London Event Tomorrow!

Perhaps you were intrigued by the review of The Dialogues, my non-fiction graphic novel about science, in Saturday’s Spectator? Well, I’ll be talking about the book tomorrow (Thursday) at the bookshop Libreria in London at 7:00 pm. Maybe see you there! #thedialoguesbook

-cvj Click to continue reading this post

The post London Event Tomorrow! appeared first on Asymptotia.

by Clifford at March 21, 2018 01:24 PM

Christian P. Robert - xi'an's og

postdoc position in London plus Seattle

Here is an announcement from Oliver Ratman for a postdoc position at Imperial College London with partners in Seattle, on epidemiology and new Bayesian methods for estimating sources of transmission with phylogenetics. As stressed by Ollie, no pre-requisites in phylogenetics are required, they are really looking for someone with solid foundations in Mathematics/Statistics, especially Bayesian Statistics, and good computing skills (R, github, MCMC, Stan). The search is officially for a Postdoc in Statistics and Pathogen Phylodynamics. Reference number is NS2017189LH. Deadline is April 07, 2018.

by xi'an at March 21, 2018 01:18 PM

Peter Coles - In the Dark

Manchester Hill – “Here we fight, and here we die”

Today is the centenary of the start of a major offensive on the Western Front by the German forces against the British and French armies during the First World War. One particular action on the first day of that offensive took place at a location now known as Manchester Hill, a region of high ground forming a salient overlooking the town of St Quentin, on this day 100 years ago, i.e. on 21st March 1918. I read about this some time ago, but thought I would do a brief post about it to mark this grim anniversary.

Lieutenant-Colonel Wilfrith Elstob, Commanding Officer, 16th Battalion Manchester Rifles.

Manchester Hill had been captured by the 2nd Battalion of the Manchester Regiment in April 1917 and in March 1918 it was held by the 16th Battalion of the same Regiment under the command of Lieutenant-Colonel Wilfrith Elstob, a schoolteacher before the War who had joined the army in 1914 as a private soldier and was promoted through the ranks. His gallantry on that day earned him a posthumous Victoria Cross with the citation:

For most conspicuous bravery, devotion to duty and self-sacrifice during operations at Manchester Redoubt, near St. Quentin, on the 21st March, 1918. During the preliminary bombardment he encouraged his men in the posts in the Redoubt by frequent visits, and when repeated attacks developed controlled the defence at the points threatened, giving personal support with revolver, rifle and bombs. Single-handed he repulsed one bombing assault driving back the enemy and inflicting severe casualties. Later, when ammunition was required, he made several journeys under severe fire in order to replenish the supply. Throughout the day Lieutenant-Colonel Elstob, although twice wounded, showed the most fearless disregard of his own safety, and by his encouragement and noble example inspired his command to the fullest degree. The Manchester Redoubt was surrounded in the first wave of the enemy attack, but by means of the buried cable Lieutenant-Colonel Elstob was able to assure his Brigade Commander that “The Manchester Regiment will defend Manchester Hill to the last.” Sometime after this post was overcome by vastly superior forces, and this very gallant officer was killed in the final assault, having maintained to the end the duty which he had impressed on his men – namely, “Here we fight, and here we die.” He set throughout the highest example of valour, determination, endurance and fine soldierly bearing.

His last action, after the Germans had broken through the last line of defences, was to use the field telephone to call down an artillery barrage onto his own position. His body was never found and he has no known grave.

You can read the stories of other soldiers who fought and died that day here.

Manchester Hill jutted out into the German lines so, although it was heavily fortified, it was very vulnerable and difficult to defend. Enemy troops were in position on three sides of the hill, and in the event of an attack it was difficult to prevent it being surrounded, isolated and destroyed. In the days and hours preceding March 21st the troops on Manchester Hill could see the Germans moving into position and knew a major offensive was imminent. Elstob repeatedly asked his superior officers for permission to withdraw, but it was repeatedly refused. When specific intelligence was received that the attack would take place in the morning of 21st March he once more contacted his HQ to request permission to withdraw. After having his request refused once more, he returned to his men and made the famous statement “This is our position. Here we fight and here we die.”

There was thick fog the following morning, hiding the inevitable German advance, which began at 6.30am with an artillery bombardment, until it was too late to prevent them encircling the British garrison. By 11.30 the British were completely encircled. Nevertheless the defenders of Manchester Hill fought off repeated attacks and managed to hold their position until late afternoon against an overwhelmingly larger force. Elstob was in the thick of the action throughout, once holding a position alone using his service revolver and hand grenades. By 4pm, however, the battle was lost and virtually all the defenders were dead. Of the 168 men (8 officers and 160 other ranks) who participated in the defence of the Manchester Hill redoubt, just 17 survived (two officers and 15 other ranks).

The German advance broke through Allied lines and stormed on, at one point even threatening Paris, but the pace of the advance led to supply difficulties and it eventually stuttered, was stopped and then flung back into a full retreat. Although German forces had been reinforced by troops no longer needed in the East after the Russian Revolution of 1917, American forces had been arriving in huge numbers – 300,000 a month – at the time of the Spring offensive, and it was this influx of troops across the Atlantic that proved decisive in the end.

We should celebrate the bravery of the defenders of Manchester Hill, especially Lieutenant-Colonel Elstob, but one can’t help asking why he was not given permission to withdraw. It is true that they delayed and disrupted the German advance, but at a terrible cost. It does seem to me that for all the courage and gallantry displayed by Elstob and his men, their sacrifice was unnecessary.

by telescoper at March 21, 2018 12:50 PM

Emily Lakdawalla - The Planetary Society Blog

Book Review: The Space Barons
A new book focuses on the eccentric leaders of SpaceX, Blue Origin, and Virgin Galactic.

March 21, 2018 11:00 AM

March 20, 2018

Christian P. Robert - xi'an's og

Bayesian maps of Africa

A rather special issue of Nature this week (1 March 2018) as it addresses Bayesian geo-cartography and mapping childhood growth failure and educational achievement (along with sexual differences) all across Africa! Including the (nice) cover of the journal, a preface by Kofi Annan, a cover article by Brian Reich and Murali Haran, and the first two major articles of the journal, one of which includes Ewan Cameron as a co-author. As I was reading this issue of Nature in the train back from Brussels, I could not access the supplementary material, so could not look at the specifics of the statistics, but the maps look quite impressive with a 5×5 km² resolution. And inclusion not only of uncertainty maps but also of predictive maps on the probability of achieving WHO 2025 goals. Surprisingly close to one in some parts of Africa. In terms of education, there are strong oppositions between different regions, with the south of the continent, including Madagascar, showing a positive difference for women in terms of years of education. While there is no reason (from my train seat) to doubt the statistical analyses, I take quite seriously the reservation of the authors that the quality of the prediction cannot be better than the quality of the data, which is “determined by the volume and fidelity of nationally representative surveys”. Which relates to an earlier post of mine about a similar concern with the deaths in Congo.

by xi'an at March 20, 2018 11:18 PM

Peter Coles - In the Dark

Equinoctial Molehills

Very busy today, what with a return to lecturing in Cardiff and so on, so I’ve just got time for a quick post to mark the fact that the Vernal Equinox in the Northern Hemisphere took place today, Tuesday 20th March 2018, at 16.15 UTC (which is 16.15 GMT). This means that the Sun has just crossed the celestial equator on its journey Northward. Some people regard this as the first day of spring, which is fair enough as it does correspond fairly well to the end of the Six Nations rugby.

It wasn’t exactly spring weather when I walked into work this morning, as there are still bits of snow around in Bute Park.

More significantly, a huge number of molehills have appeared. Not quite a mole of molehills, but still quite a few. I’m not sure of the reason for all this molar activity. Perhaps moles have special rituals for marking the Vernal Equinox?

Incidentally I was dismayed to see that my Royal Astronomical Society diary gives the time of the 2018 Vernal Equinox as 16.16 GMT while the wikipedia page I linked to above gives 16.15 GMT. I find a discrepancy of this magnitude extremely unnerving. Or am I making a mountain out of a molehill?

by telescoper at March 20, 2018 04:49 PM

ZapperZ - Physics and Physicists

Micro Fusion In Nanowires Array
This is a rather astounding result. The authors have managed to cause deuteron-deuteron fusion in an array of nanowires by igniting it using only a joule-level pulsed laser [1], i.e. not using huge, gigantic lasers such as those at the National Ignition Facility.

This is an open-access paper and you can get the full version at this link.

And no, before you jump all over this one and think that this is the next fusion power generator, you need to think again. The authors are touting this as a viable (and cheaper) ultra-fast pulsed neutron source, which can be useful in many applications and studies.


[1] A. Curtis et al., Nature Communications DOI: 10.1038/s41467-018-03445-

by ZapperZ at March 20, 2018 02:40 PM

Emily Lakdawalla - The Planetary Society Blog

LightSail 2 doubles the fun with double integrations
LightSail 2 is integrated in Prox-1! It's another important step toward launch day.

March 20, 2018 11:00 AM

Marco Frasca - The Gauge Connection

Good news from Moriond

Some days ago, the Rencontres de Moriond 2018 ended, with CERN presenting a wealth of results, including new ones about the Higgs particle. The direction that the two great experiments, ATLAS and CMS, have taken is that of improving the measurements of the Standard Model, as no evidence of possible new particles has been seen so far. The studies of the properties of the Higgs particle have also been refined as promised, and the news is really striking.

In a communication to the public (see here), CERN finally acknowledges, for the first time, a significant discrepancy between the CMS data and the Standard Model for the signal strengths in the Higgs decay channels. They claim a 17% difference. This is what I have advocated for some years and have published in reputable journals. I will discuss this below. I would like only to show you the CMS results in the figure below.

ATLAS, for its part, is seeing a significant discrepancy in the ZZ channel (2σ) and a 1σ compatibility for the WW channel. Here are their results.

On the left the WW channel is shown and on the right there are the combined γγ and ZZ channels.

The reason for the discrepancy is, as I have shown in some papers (see here, here and here), the improper use of perturbation theory to evaluate the Higgs sector. The true propagator of the theory is a sum of Yukawa-like propagators with a harmonic oscillator spectrum. I solved this sector of the Standard Model exactly. So, when the full propagator is taken into account, the discrepancy is toward an increase of the signal strength. Is it worth a try?
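Schematically (this is my transcription of that sentence, not a formula taken from the papers), such a propagator has the form

$$G(p) = \sum_{n=0}^{\infty} \frac{Z_n}{p^2 - m_n^2 + i\epsilon},$$

a tower of Yukawa-like poles whose masses m_n follow a harmonic-oscillator-like pattern, m_n ∝ n + 1/2, with the residues Z_n and the mass scale fixed by the exact solution discussed in the linked papers.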

This means that this is not physics beyond the Standard Model but, rather, the Standard Model in its full glory that is teaching us something new about quantum field theory. Now, we are eager to see the improvements in the data to come with the new run of the LHC starting now. At the summer conferences we will have reasons to be excited.

by mfrasca at March 20, 2018 09:17 AM

March 19, 2018

The n-Category Cafe

Magnitude Homology Reading Seminar, I

In Sheffield we have started a reading seminar on the recent paper of Tom Leinster and Mike Shulman Magnitude homology of enriched categories and metric spaces. The plan was to write the talks up as blog posts. Various things, including the massive strike that has been going on in universities in the UK, have meant that I’m somewhat behind with putting the first talk up. The strike also means that we haven’t had many seminars yet!

I gave the first talk, which is the one here. It is an introductory talk which just describes the idea of categorification and the paper I wrote with Richard Hepworth on categorifying the magnitude of finite graphs; this is the idea which was generalized by Tom and Mike.


Categorification means many things. In this context it is supposed to be the idea of lifting an invariant from taking values in a set to values in a category. Let’s look at two examples.

[This is possibly a caricature of what actually happened. It would be nice to have some references!] In the nineteenth century, mathematicians such as Riemann knew about the Euler characteristic of surfaces (and possibly manifolds). This is a fundamental invariant which seems to crop up in all sorts of places. Towards the end of the century Poincaré introduced homology groups $\text{H}_\star(M)$ of a manifold $M$ and was aware that

$$\chi(M)=\mathrm{rank}(\text{H}_\star(M))= \sum_i (-1)^i\, \mathrm{rank}(\text{H}_i(M)).$$

I get the impression the functorial nature of homology was not appreciated until later, but this adds another layer of structure.

Around 1985 Jones introduced his eponymous polynomial $\text{J}(L)\in \mathbb{Z}[q^{\pm 1}]$ for a knot or link $L$ in $3$-space. This gives a polynomial invariant of links. In around 1999, Khovanov introduced Khovanov homology $Kh_{\star, \star}(L)$ for a link $L$, which is a bigraded group. The Jones polynomial is obtained from it by taking the dimension (or Euler characteristic!) in an appropriate graded sense:

$$\text{J}(L)= \sum_{i,j}(-1)^i q^j\, \mathrm{rank}(Kh_{i,j}(L)).$$

In both these cases we lift an invariant which takes values in a set (either $\mathbb{Z}$ or $\mathbb{Z}[q^{\pm 1}]$) to an invariant which takes values in a category (either graded groups or bigraded groups). This lifted invariant has a richer structure and functorial properties, but is probably harder to calculate! This is what we mean by categorifying an invariant.

Magnitude of enriched categories

There was a classical notion of Euler characteristic for finite groups and also one for finite posets. We know that finite groups and finite posets are both examples of finite categories (at one extreme with only one object and at the other extreme with at most one morphism between each pair of objects). Tom found a common generalization of these Euler characteristics, which is the idea of an Euler characteristic for finite categories (we will see the definition next week). He further generalized that to the notion of an Euler characteristic for enriched categories (with an additional bit of structure, wait for next week). Finite metric spaces are examples of enriched categories and so have a notion of Euler characteristic. We decided the name was too confusing so after consulting a thesaurus we decided on “magnitude” (having toyed with the name “cardinality”). Tom later noticed something nice about the magnitude of the metric spaces that you get from finite graphs (partly because these have integer-valued metrics).

The journey from the Euler characteristics of finite posets and finite groups to the magnitude of finite graphs via a sequence of generalizations and specializations can be viewed as a trip up and then down the following picture.

hierarchy of some category enrichments

Magnitude of metric spaces (more next week)

For $X=\{x_1,\dots, x_n\}$ a finite metric space with metric $\text{d}$ we can define a matrix $Z$ by $Z_{i,j}=e^{-\text{d}(x_i,x_j)}$. The magnitude $|X|$ is defined to be the sum of the entries of the inverse matrix (if it exists): $|X|:=\sum_{i,j}(Z^{-1})_{i,j}$. It is actually more interesting if we look at what happens as we scale $X$ (or perhaps if we introduce an indeterminate into the metric). For $t>0$, we define $t\cdot X$ to have the same underlying set, but with the metric scaled by a factor of $t$. This gives us the magnitude function $\left|t\cdot X\right|$ which is a function of $t$.

We can have a look at a simple example where we take $X$ to be a three-point metric space in which two points are much, much closer to each other than they are to the third point. Here is a picture of $t\cdot X$.


Here is the graph of the magnitude function of the metric space $X$.

graph of the magnitude function

This shows in some sense how the magnitude can be viewed as an “effective number of points”. At small scales there is effectively one point, at middling scales there are effectively two points and at very large scales there are effectively three points.
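If you want to play with this yourself, here is a minimal numerical sketch (mine, not from the talk; the distances 0.1 and 100 are just illustrative) of this effective-number-of-points behaviour:

```python
# Magnitude function |t.X| of a three-point space in which two points are much
# closer to each other than to the third: it interpolates 1 -> 2 -> 3 points.
import numpy as np

# pairwise distance matrix: x1 and x2 are 0.1 apart, x3 is 100 away from both
D = np.array([[0.0,   0.1, 100.0],
              [0.1,   0.0, 100.0],
              [100.0, 100.0, 0.0]])

def magnitude(D, t):
    """Magnitude of the metric space with distance matrix D, scaled by t."""
    Z = np.exp(-t * D)             # Z_ij = exp(-t d(x_i, x_j))
    return np.linalg.inv(Z).sum()  # sum of the entries of Z^{-1}

for t in [1e-4, 1.0, 1e4]:
    print(t, magnitude(D, t))
# roughly 1 at small scales, about 2 at middling scales, and 3 at large scales
```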

Although the definition looks rather ad hoc, it turns out that it has various connections to things like measurements of biodiversity, Hausdorff dimension, volumes, potential theory and several other fun things.

Magnitude of graphs

Suppose that $G$ is a finite graph; then it gives rise to a finite metric space (which we will also write as $G$) which has the vertices of $G$ as its points and the shortest path distance as its metric, where all edges have length one.

For example we have the five-cycle graph below with $\text{d}(g_0, g_3)=2$.


Tom noticed that we can use the magnitude function of the associated metric space to get an integral power series from the graph. Firstly, we can write $q=e^{-t}$; then the entries of the matrix $Z$ are just integer powers of $q$ as all of the distances in $G$ are integral. This means that the entries of $Z^{-1}$ are just rational functions of $q$ (with integer coefficients) and hence so is their sum, the magnitude function. Moreover the denominator of this rational function is the determinant of $Z$ which, as the diagonal entries of $Z$ are all $e^{0}$, i.e., $1$, is of the form $1+\text{powers of } q$. So we can take a power series expansion of $|t\cdot G|$ to get an integer power series in $q=e^{-t}$. We denote this power series by $\# G$.

For example, for the five-cycle graph $C_5$ pictured above we have

$$\# C_5 = 5-10q+10q^2-20q^4+40q^5-40q^6+\cdots.$$

In general we can identify the first two coefficients as the number of vertices and $-2$ times the number of edges, respectively.
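As a quick sanity check on that series (my own sketch, using sympy; not part of the talk), one can build $Z$ for the five-cycle with entries $q^{\text{d}(i,j)}$, sum the entries of $Z^{-1}$ and expand in powers of $q$:

```python
# Magnitude power series of the five-cycle graph C_5.
import sympy as sp

q = sp.symbols('q')
n = 5
d = lambda i, j: min((i - j) % n, (j - i) % n)    # shortest-path distance on C_5
Z = sp.Matrix(n, n, lambda i, j: q ** d(i, j))    # Z_ij = q^{d(i,j)}

magnitude = sum(Z.inv())                          # sum of all entries of Z^{-1}
print(sp.series(sp.simplify(magnitude), q, 0, 7))
# 5 - 10*q + 10*q**2 - 20*q**4 + 40*q**5 - 40*q**6 + O(q**7)
```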

Categorifying the magnitude of graphs

As Richard Hepworth noticed, we can categorify this! In other words, we can find a homology theory which has the magnitude power series $\# G\in \mathbb{Z}[[q]]$ as its graded Euler characteristic.

For a finite graph $G$ define the magnitude chain groups as follows.

$$MC_{k,l}(G)=\left\langle (x_0,\dots, x_k) \,\big|\, x_{i-1}\ne x_{i},\quad \sum \text{d}(x_{i-1}, x_i)=l\right\rangle.$$

For example a chain group generator for the five-cycle graph from above is $(g_0, g_1, g_2, g_4, g_2)\in MC_{4,6}(C_5)$.

We define maps $\partial_i\colon MC_{k,l}(G)\to MC_{k-1,l}(G)$ for $i=1,\dots, k-1$:

$$\partial_{i}(x_0,\ldots,x_k) = \begin{cases} (x_0,\ldots,\widehat{x_i},\ldots,x_k) & \text{if}\,\, x_{i-1}<x_{i}<x_{i+1}, \\ 0 & \text{otherwise}. \end{cases}$$

where $x_{i-1}<x_{i}<x_{i+1}$ means that $x_i$ lies on a shortest path between $x_{i-1}$ and $x_{i+1}$, i.e., $\text{d}(x_{i-1},x_i)+\text{d}(x_i,x_{i+1})=\text{d}(x_{i-1},x_{i+1})$.

So for our example chain generator in $C_5$ you can check that we have

$$\partial_{i}(g_0, g_1, g_2, g_4, g_2) = \begin{cases} (g_0, g_2, g_4, g_2) & \text{if}\,\, i=1, \\ 0 & \text{otherwise}. \end{cases}$$

In the usual way, the differential $\partial\colon MC_{k,l}(G)\to MC_{k-1,l}(G)$ is defined as the alternating sum

$$\partial=\partial_1-\partial_2+\cdots+(-1)^{k-1}\partial_{k-1}.$$

One can show that this is a differential, so that $\partial\circ\partial =0$. Then taking homology gives what is defined to be the magnitude homology groups of the graph:

$$MH_{k,l}(G)= \text{H}_k(MC_{\ast,l}(G), \partial).$$

By direct computation you can calculate the ranks of the magnitude homology groups of the five-cycle graph. The following table shows the ranks $\mathrm{rank}(MH_{k,l}(C_5))$ for small $k$ and $l$.

$$\begin{array}{rrrrrrrrrrc}
&&&&&k\\
&&0&1&2&3&4&5&6&&\chi(MH_{\ast,l}(C_5))\\
&0 & 5&&&&&&&\qquad&5\\
& 1 & & 10 &&&&&&&-10 \\
&2 & && 10 &&&&&&10\\
l& 3 &&& 10 & 10 &&&&&0 \\
& 4 &&&& 30 & 10 &&&&-20\\
& 5 &&&&& 50 & 10 &&&40 \\
& 6 &&&&& 20 & 70 & 10 &&-40
\end{array}$$

The final column shows the Euler characteristics, which are just the alternating sums of entries in the rows. You can check that these are precisely the coefficients in the power series $\# C_5$ given above: this illustrates the fact that graph magnitude homology does indeed categorify graph magnitude in the sense of the following theorem.

Theorem $$\#G = \sum_{k,l\geq 0} (-1)^k q^l \,\mathrm{rank}\bigl(MH_{k,l}(G)\bigr)\in \mathbb{Z}[[q]].$$

Thus we have $MH_{\ast,\ast}$, which is a bigraded-group-valued invariant of graphs that is functorial with respect to certain maps of graphs and has properties like a Künneth Theorem and long exact sequences: richer but harder to calculate than the graph magnitude.
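A cheap way to check the Euler-characteristic consequence of the theorem for $C_5$, without computing any homology (my own sketch, not from the seminar), is to count the generators of $MC_{k,l}(C_5)$ directly and take alternating sums, since the Euler characteristic of a chain complex equals that of its homology:

```python
# Alternating sums of ranks of the magnitude chain groups MC_{k,l}(C_5):
# these should reproduce the coefficients 5, -10, 10, 0, -20, 40, -40 of #C_5.
from itertools import product

n = 5
def d(a, b):
    """Shortest-path distance on the cycle graph C_5."""
    return min((a - b) % n, (b - a) % n)

for l in range(7):
    chi = 0
    for k in range(l + 1):  # each step has length >= 1, so k <= l
        # generators of MC_{k,l}: tuples (x_0, ..., x_k) with x_{i-1} != x_i
        # and total path length l
        count = sum(
            1
            for x in product(range(n), repeat=k + 1)
            if all(x[i - 1] != x[i] for i in range(1, k + 1))
            and sum(d(x[i - 1], x[i]) for i in range(1, k + 1)) == l
        )
        chi += (-1) ** k * count
    print(l, chi)
```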

In the following weeks we will hopefully see how Mike and Tom have generalized this construction up to the top of the picture above, namely to certain enriched categories with extra structure.

by willerton at March 19, 2018 03:51 PM

Peter Coles - In the Dark

Back to work in Cardiff..

So here I am, then. Back in the offices of the Data Innovation Research Institute in Cardiff for the first time in almost a fortnight. The four-week batch of strikes over pensions has come to an end so I have returned to work. No agreement has been reached so it seems likely there will be further industrial action, but for the time being normal services are being resumed. In fact it’s not exactly back to normal because there’s a large backlog of things to be done (including marking coursework and setting examinations), but at least lectures resume and will be delivered on the normal timetable. In my Physics of the Early Universe module I have only given four (two-hour) lectures and have missed three. I return to the fray tomorrow morning to give Lecture 8. Frustratingly, that’s the only lecture I have before the Easter break, which starts on Friday and lasts for three weeks. Assuming there are no further strikes I’ll be giving lectures 9-12 (the last a revision lecture) after the holiday.

I now have to figure out how to cope with the six hours of lectures missed because of industrial action. That will be tricky, but I’ll do my level best to ensure that I cover everything needed for the examination. I spent most of this morning trying to figure out how to reorganise the remaining material, and I think I can do it as long as we don’t lose any more teaching time. At any rate I have made the decision not to give additional lectures to cover what I’ve missed. Owing to the timing of the strikes (and the fact that I only work half the time here in Cardiff) I have been on strike for all the days I would have been working for three weeks. That means I will lose three full weeks’ pay. Even if it were logistically possible to fit in 6 hours of extra lectures after Easter, I don’t think it’s reasonable for me to do that for free.

While I was on strike a group of my students emailed the Vice-Chancellor of Cardiff University (copying me in) to point out how much teaching they were missing and request some form of compensation. I have a lot of sympathy for their situation and in no way do I want to damage their education. I will do what I can to mitigate the effect of the strike but won’t take any action that reduces the effectiveness of the industrial action. It remains to be seen if any compensation will be offered by the University management, but I think their best policy would be to pressure Universities UK to stop pratting about and find a speedy settlement of the dispute.

Anyway, today is a bank holiday in Maynooth – St Patrick’s Day fell on a Saturday this year – and the rest of the week is `Study Week’, so there are no lectures or laboratory sessions. I’ll therefore be staying in Cardiff all week, which gives me the chance to go to a concert on Friday at St David’s Hall – the first for quite a while.

Now, time to get back to writing tomorrow’s lecture…

by telescoper at March 19, 2018 03:32 PM

March 18, 2018

Peter Coles - In the Dark

Anyone for Cricket?

Going through the mail that arrived during the ten days or so I’ve been in Ireland, and with the snow steadily descending outside my window, I find the handy booklet containing this year’s fixtures for Glamorgan County Cricket Club has arrived at last.

Glamorgan’s first County Championship match starts on April 20th, just a month away, but their first home game isn’t until May (against Kent). Hopefully the snow will have melted by then!

I now have a bit of planning to do in order to fit in as much cricket as I can this summer in between trips to and from Ireland as well as conferences and other things…

by telescoper at March 18, 2018 11:13 AM

March 17, 2018

Cormac O’Raifeartaigh - Antimatter (Life in a puzzling universe)

Remembering Stephen Hawking

Like many physicists, I woke to some sad news early last Wednesday morning, and to a phoneful of requests from journalists for a soundbite. In fact, although I bumped into Stephen at various conferences, I only had one significant meeting with him – he was intrigued by my research group’s discovery that Einstein once attempted a steady-state model of the universe. It was a slightly scary but very funny meeting, during which his famous sense of humour was fully at play.


Yours truly talking steady-state cosmology with Stephen Hawking

I recalled the incident in a radio interview with RTE Radio 1 on Wednesday. As I say in the piece, the first words that appeared on Stephen’s screen were “I knew..” My heart sank as I assumed he was about to say “I knew about that manuscript“. But when I had recovered sufficiently to look again, what Stephen was actually saying was “I knew ..your father”. Phew! You can find the podcast here.


Hawking in conversation with my late father (LHS) and with Ernest Walton (RHS)

RTE TV had a very nice obituary on the Six One News; I have a cameo appearance a few minutes into the piece here.

In my view, few could question Hawking’s brilliant contributions to physics, or his outstanding contribution to the public awareness of science. His legacy also includes the presence of many brilliant young physicists at the University of Cambridge today. However, as I point out in a letter in today’s Irish Times, had Hawking lived in Ireland, he probably would have found it very difficult to acquire government funding for his work. Indeed, he would have found that research into the workings of the universe does not qualify as one of the “strategic research areas” identified by our national funding body, Science Foundation Ireland. I suspect the letter will provoke an angry response from certain quarters, but it is tragically true.


The above notwithstanding, it’s important not to overstate the importance of one scientist. Indeed, today’s Sunday Times contains a good example of the dangers of science history being written by journalists. Discussing Stephen’s 1974 work on black holes, Bryan Appleyard states: “The paper in effect launched the next four decades of cutting edge physics. Odd flowers with odd names bloomed in the garden of cosmic speculation – branes, worldsheets, supersymmetry …. and, strangest of all, the colossal tree of string theory”.

What? String theory, supersymmetry and brane theory are all modern theories of particle physics (the study of the world of the very small). While these theories were used to some extent by Stephen in his research in cosmology (the study of the very large), it is ludicrous to suggest that they were launched by his work.


by cormac at March 17, 2018 08:27 PM

March 16, 2018

Sean Carroll - Preposterous Universe

Stephen Hawking’s Scientific Legacy

Stephen Hawking died Wednesday morning, age 76. Plenty of memories and tributes have been written, including these by me:

I can also point to my Story Collider story from a few years ago, about how I turned down a job offer from Hawking, and eventually took lessons from his way of dealing with the world.

Of course Hawking has been mentioned on this blog many times.

When I started writing the above pieces (mostly yesterday, in a bit of a rush), I stumbled across this article I had written several years ago about Hawking’s scientific legacy. It was solicited by a magazine at a time when Hawking was very ill and people thought he would die relatively quickly — it wasn’t the only time people thought that, only to be proven wrong. I’m pretty sure the article was never printed, and I never got paid for it; so here it is!

(If you’re interested in a much better description of Hawking’s scientific legacy by someone who should know, see this article in The Guardian by Roger Penrose.)

Stephen Hawking’s Scientific Legacy

Stephen Hawking is the rare scientist who is also a celebrity and cultural phenomenon. But he is also the rare cultural phenomenon whose celebrity is entirely deserved. His contributions can be characterized very simply: Hawking contributed more to our understanding of gravity than any physicist since Albert Einstein.

“Gravity” is an important word here. For much of Hawking’s career, theoretical physicists as a community were more interested in particle physics and the other forces of nature — electromagnetism and the strong and weak nuclear forces. “Classical” gravity (ignoring the complications of quantum mechanics) had been figured out by Einstein in his theory of general relativity, and “quantum” gravity (creating a quantum version of general relativity) seemed too hard. By applying his prodigious intellect to the most well-known force of nature, Hawking was able to come up with several results that took the wider community completely by surprise.

By acclamation, Hawking’s most important result is the realization that black holes are not completely black — they give off radiation, just like ordinary objects. Before that famous paper, he proved important theorems about black holes and singularities, and afterward studied the universe as a whole. In each phase of his career, his contributions were central.

The Classical Period

While working on his Ph.D. thesis in Cambridge in the mid-1960’s, Hawking became interested in the question of the origin and ultimate fate of the universe. The right tool for investigating this problem is general relativity, Einstein’s theory of space, time, and gravity. According to general relativity, what we perceive as “gravity” is a reflection of the curvature of spacetime. By understanding how that curvature is created by matter and energy, we can predict how the universe evolves. This may be thought of as Hawking’s “classical” period, to contrast classical general relativity with his later investigations in quantum field theory and quantum gravity.

Around the same time, Roger Penrose at Oxford had proven a remarkable result: that according to general relativity, under very broad circumstances, space and time would crash in on themselves to form a singularity. If gravity is the curvature of spacetime, a singularity is a moment in time when that curvature becomes infinitely big. This theorem showed that singularities weren’t just curiosities; they are an important feature of general relativity.

Penrose’s result applied to black holes — regions of spacetime where the gravitational field is so strong that even light cannot escape. Inside a black hole, the singularity lurks in the future. Hawking took Penrose’s idea and turned it around, aiming at the past of our universe. He showed that, under similarly general circumstances, space must have come into existence at a singularity: the Big Bang. Modern cosmologists talk (confusingly) about both the Big Bang “model,” which is the very successful theory that describes the evolution of an expanding universe over billions of years, and also the Big Bang “singularity,” which we still don’t claim to understand.

Hawking then turned his own attention to black holes. Another interesting result by Penrose had shown that it’s possible to extract energy from a rotating black hole, essentially by bleeding off its spin until it’s no longer rotating. Hawking was able to demonstrate that, although you can extract energy, the area of the event horizon surrounding the black hole will always increase in any physical process. This “area theorem” was both important in its own right, and also evocative of a completely separate area of physics: thermodynamics, the study of heat.

Thermodynamics obeys a set of famous laws. For example, the first law tells us that energy is conserved, while the second law tells us that entropy — a measure of the disorderliness of the universe — never decreases for an isolated system. Working with James Bardeen and Brandon Carter, Hawking proposed a set of laws for “black hole mechanics,” in close analogy with thermodynamics. Just as in thermodynamics, the first law of black hole mechanics ensures that energy is conserved. The second law is Hawking’s area theorem, that the area of the event horizon never decreases. In other words, the area of the event horizon of a black hole is very analogous to the entropy of a thermodynamic system — they both tend to increase over time.

Black Hole Evaporation

Hawking and his collaborators were justly proud of the laws of black hole mechanics, but they viewed them as simply a formal analogy, not a literal connection between gravity and thermodynamics. In 1972, a graduate student at Princeton University named Jacob Bekenstein suggested that there was more to it than that. Bekenstein, on the basis of some ingenious thought experiments, suggested that the behavior of black holes isn’t simply like thermodynamics, it actually is thermodynamics. In particular, black holes have entropy.

Like many bold ideas, this one was met with resistance from experts — and at this point, Stephen Hawking was the world’s expert on black holes. Hawking was certainly skeptical, and for good reason. If black hole mechanics is really just a form of thermodynamics, that means black holes have a temperature. And objects that have a temperature emit radiation — the famous “black body radiation” that played a central role in the development of quantum mechanics. So if Bekenstein were right, it would seemingly imply that black holes weren’t really black (although Bekenstein himself didn’t quite go that far).

To address this problem seriously, you need to look beyond general relativity itself, since Einstein’s theory is purely “classical” — it doesn’t incorporate the insights of quantum mechanics. Hawking knew that Russian physicists Alexander Starobinsky and Yakov Zel’dovich had investigated quantum effects in the vicinity of black holes, and had predicted a phenomenon called “superradiance.” Just as Penrose had showed that you could extract energy from a spinning black hole, Starobinsky and Zel’dovich showed that rotating black holes could emit radiation spontaneously via quantum mechanics. Hawking himself was not an expert in the techniques of quantum field theory, which at the time were the province of particle physicists rather than general relativists. But he was a quick study, and threw himself into the difficult task of understanding the quantum aspects of black holes, so that he could find Bekenstein’s mistake.

Instead, he surprised himself, and in the process turned theoretical physics on its head. What Hawking eventually discovered was that Bekenstein was right — black holes do have entropy — and that the extraordinary implications of this idea were actually true — black holes are not completely black. These days we refer to the “Bekenstein-Hawking entropy” of black holes, which emit “Hawking radiation” at their “Hawking temperature.”
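For reference (these standard formulas are facts not spelled out in the essay itself), the Hawking temperature and the Bekenstein-Hawking entropy of a Schwarzschild black hole of mass $M$ and horizon area $A$ are

$$T_H = \frac{\hbar c^3}{8\pi G M k_B}, \qquad S_{BH} = \frac{k_B c^3 A}{4 G \hbar},$$

so smaller black holes are hotter, and the entropy is proportional to the area of the horizon rather than to the enclosed volume.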

There is a nice hand-waving way of understanding Hawking radiation. Quantum mechanics says (among other things) that you can’t pin a system down to a definite classical state; there is always some intrinsic uncertainty in what you will see when you look at it. This is even true for empty space itself — when you look closely enough, what you thought was empty space is really alive with “virtual particles,” constantly popping in and out of existence. Hawking showed that, in the vicinity of a black hole, a pair of virtual particles can be split apart, one falling into the hole and the other escaping as radiation. Amazingly, the infalling particle has a negative energy as measured by an observer outside. The result is that the radiation gradually takes mass away from the black hole — it evaporates.

Hawking’s result had obvious and profound implications for how we think about black holes. Instead of being a cosmic dead end, where matter and energy disappear forever, they are dynamical objects that will eventually evaporate completely. But more importantly for theoretical physics, this discovery raised a question to which we still don’t know the answer: when matter falls into a black hole, and then the black hole radiates away, where does the information go?

If you take an encyclopedia and toss it into a fire, you might think the information contained inside is lost forever. But according to the laws of quantum mechanics, it isn’t really lost at all; if you were able to capture every bit of light and ash that emerged from the fire, in principle you could exactly reconstruct everything that went into it, even the print on the book pages. But black holes, if Hawking’s result is taken at face value, seem to destroy information, at least from the perspective of the outside world. This conundrum is the “black hole information loss puzzle,” and has been nagging at physicists for decades.

In recent years, progress in understanding quantum gravity (at a purely thought-experiment level) has convinced more people that the information really is preserved. In 1997 Hawking made a bet with American physicists Kip Thorne and John Preskill; Hawking and Thorne said that information was destroyed, Preskill said that somehow it was preserved. In 2004 Hawking conceded his end of the bet, admitting that black holes don’t destroy information. However, Thorne has not conceded for his part, and Preskill himself thinks the concession was premature. Black hole radiation and entropy continue to be central guiding principles in our search for a better understanding of quantum gravity.

Quantum Cosmology

Hawking’s work on black hole radiation relied on a mixture of quantum and classical ideas. In his model, the black hole itself was treated classically, according to the rules of general relativity; meanwhile, the virtual particles near the black hole were treated using the rules of quantum mechanics. The ultimate goal of many theoretical physicists is to construct a true theory of quantum gravity, in which spacetime itself would be part of the quantum system.

If there is one place where quantum mechanics and gravity both play a central role, it’s at the origin of the universe itself. And it’s to this question, unsurprisingly, that Hawking devoted the latter part of his career. In doing so, he established the agenda for physicists’ ambitious project of understanding where our universe came from.

In quantum mechanics, a system doesn’t have a position or velocity; its state is described by a “wave function,” which tells us the probability that we would measure a particular position or velocity if we were to observe the system. In 1983, Hawking and James Hartle published a paper entitled simply “Wave Function of the Universe.” They proposed a simple procedure from which — in principle! — the state of the entire universe could be calculated. We don’t know whether the Hartle-Hawking wave function is actually the correct description of the universe. Indeed, because we don’t actually have a full theory of quantum gravity, we don’t even know whether their procedure is sensible. But their paper showed that we could talk about the very beginning of the universe in a scientific way.

Studying the origin of the universe offers the prospect of connecting quantum gravity to observable features of the universe. Cosmologists believe that tiny variations in the density of matter from very early times gradually grew into the distribution of stars and galaxies we observe today. A complete theory of the origin of the universe might be able to predict these variations, and carrying out this program is a major occupation of physicists today. Hawking made a number of contributions to this program, both from his wave function of the universe and in the context of the “inflationary universe” model proposed by Alan Guth.

Simply talking about the origin of the universe is a provocative step. It raises the prospect that science might be able to provide a complete and self-contained description of reality — a prospect that stretches beyond science, into the realms of philosophy and theology. Hawking, always provocative, never shied away from these implications. He was fond of recalling a cosmology conference hosted by the Vatican, at which Pope John Paul II allegedly told the assembled scientists not to inquire into the origin of the universe, “because that was the moment of creation and therefore the work of God.” Admonitions of this sort didn’t slow Hawking down; he lived his life in a tireless pursuit of the most fundamental questions science could tackle.


by Sean Carroll at March 16, 2018 11:23 PM

Ben Still - Neutrino Blog

Particle Physics Brick by Brick
It has been a very long time since I last posted and I apologise for that. I have been working the LEGO analogy, as described in the pentaquark series and elsewhere, into a book. The book is called Particle Physics Brick by Brick and the aim is to stretch the LEGO analogy to breaking point while covering as much of the standard model of particle physics as possible. I have had enormous fun writing it and I hope that you will enjoy it as much if you choose to buy it.

It has been available in the UK since September 2017 and you can buy it from Foyles / Waterstones / Blackwell's / AmazonUK where it is receiving ★★★★★ reviews

It is released in the US this Wednesday, 21st March 2018, and you can buy it from all good book stores.

I just wanted to share a few reviews of the book as well because it makes me happy!

Spend a few hours perusing these pages and you'll be in a much better frame of mind to understand your place in the cosmos... The astronomically large objects of the universe are no easier to grasp than the atomically small particles of matter. That's where Ben Still comes in, carrying a box of Legos. A British physicist with a knack for explaining abstract concepts... He starts by matching the weird properties and interactions described by the Standard Model of particle physics with the perfectly ordinary blocks of a collection of Legos. Quarks and leptons, gluons and charms are assigned to various colors and combinations of plastic bricks. Once you've got that system in mind, hang on: Still races off to illustrate the Big Bang, the birth of stars, electromagnetism and all matter of fantastical-sounding phenomenon, like mesons and beta decay. "Given enough plastic bricks, the rules in this book and enough time," Still concludes, "one might imagine that a plastic Universe could be built by us, brick by brick." Remember that the next time you accidentally step on one barefoot.--Ron Charles, The Washington Post

Complex topics explained simply. An excellent book. I am Head of Physics at a school and have just ordered 60 copies of this for our L6th students for summer reading before studying the topic on particle physics early next year. Highly recommended. - Ben ★★★★★ AmazonUK

It's beautifully illustrated and very eloquently explains the fundamentals of particle ...
This is a gem of a pop science book. It's beautifully illustrated and very eloquently explains the fundamentals of particle physics without hitting you over the head with quantum field theory and Lagrangian dynamics. The author has done an exceptional job. This is a must have for all students and academics of both physics and applied maths! - Jamie ★★★★★ AmazonUK

by Ben ( at March 16, 2018 09:32 PM

March 15, 2018

Jester - Resonaances

Where were we?
Last time this blog was active, particle physics was entering a sharp curve. That the infamous 750 GeV resonance had petered out was not a big deal in itself - one expects these things to happen every now and then.  But the lack of any new physics at the LHC when it had already collected a significant chunk of data was a reason to worry. We know that we don't know everything yet about the fundamental interactions, and that there is a deeper layer of reality that needs to be uncovered (at least to explain dark matter, neutrino masses, baryogenesis, inflation, and physics at energies above the Planck scale). For a hundred years, increasing the energy of particle collisions has been the best way to increase our understanding of the basic constituents of nature. However, with nothing at the LHC and the next higher energy collider decades away, a feeling was growing that the progress might stall.

In this respect, nothing much has changed during the time when the blog was dormant, except that these sentiments are now firmly established. Crisis is no longer a whispered word, but it's openly discussed in corridors, on blogs, on arXiv, and in color magazines.  The clear message from the LHC is that the dominant paradigms about the physics at the weak scale were completely misguided. The Standard Model seems to be a perfect effective theory at least up to a few TeV, and there is no indication at what energy scale new particles have to show up. While everyone goes through the five stages of grief at their own pace, my impression is that most are already well past the denial. The open question is what should be the next steps to make sure that exploration of fundamental interactions will not halt. 

One possible reaction to a crisis is more of the same.  Historically, such an approach has often been efficient, for example it worked for a long time in the case of the Soviet economy. In our case one could easily go on with more models, more epicycles, more parameter space,  more speculations.  But the driving force for all these SusyWarpedCompositeStringBlackHairyHole enterprise has always been the (small but still) possibility of being vindicated by the LHC. Without serious prospects of experimental verification, model building is reduced to intellectual gymnastics that can hardly stir imagination.  Thus the business-as-usual is not an option in the long run: it couldn't elicit any enthusiasm among the physicists or the public,  it wouldn't attract new bright students, and thus it would be a straight path to irrelevance.

So, particle physics has to change. On the experimental side we will inevitably see, just for economical reasons, less focus on high-energy colliders and more on smaller experiments. Theoretical particle physics will also have to evolve to remain relevant.  Certainly, the emphasis needs to be shifted away from empty speculations in favor of more solid research. I don't pretend to know all the answers or have a clear vision of the optimal strategy, but I see three promising directions.

One is astrophysics, where there are much better prospects of experimental progress. The cosmos is a natural collider that is constantly testing fundamental interactions independently of current fashions or funding agencies. This gives us an opportunity to learn more about dark matter and neutrinos, and also about various hypothetical particles like axions or milli-charged matter. The most recent story of the 21cm absorption signal shows that there are still treasure troves of data waiting for us out there. Moreover, new observational windows keep opening up, as recently illustrated by the nascent gravitational wave astronomy. This avenue is of course a no-brainer, already explored for a long time by particle theorists, but I expect it will further gain in importance in the coming years.

Another direction is precision physics. This, also, has been an integral part of particle physics research for quite some time, but it should grow in relevance. The point is that one can probe very heavy particles, often beyond the reach of present colliders,  by precisely measuring low-energy observables. In the most spectacular example, studying proton decay may give insight into new particles with masses of order 10^16 GeV - unlikely to be ever attainable directly. There is a whole array of observables that can probe new physics well beyond the direct LHC reach: a myriad of rare flavor processes, electric dipole moments of the electron and neutron, atomic parity violation, neutrino scattering,  and so on. This road may be long and tedious but it is bound to succeed: at some point some experiment somewhere must observe a phenomenon that does not fit into the Standard Model. If we're very lucky, it  may be that the anomalies currently observed by the LHCb in certain rare B-meson decays are already the first harbingers of a breakdown of the Standard Model at higher energies.

Finally, I should mention formal theoretical developments. The naturalness problem of the cosmological constant and of the Higgs mass may suggest some fundamental misunderstanding of quantum field theory on our part. Perhaps this should not be too surprising. In many ways we have reached an amazing proficiency in QFT when applied to certain precision observables or even to LHC processes. Yet at the same time QFT is often used and taught in the same way as magic in Hogwarts: mechanically, blindly following prescriptions from old dusty books, without a deeper understanding of the sense and meaning. Recent years have seen a brisk development of alternative approaches: a revival of the old S-matrix techniques, new amplitude calculation methods based on recursion relations, but also complete reformulations of the QFT basics demoting sacred cows like fields, Lagrangians, and gauge symmetry. Theory alone rarely leads to progress, but it may help to make more sense of the data we already have. Could a better understanding or a complete reformulation of QFT bring new answers to the old questions? I think that is not impossible.

All in all, there are good reasons to worry, but also tons of new data in store and lots of fascinating questions to answer. How will the B-meson anomalies pan out? What shall we do after we hit the neutrino floor? Will the 21cm observations allow us to understand what dark matter is? Will China build a 100 TeV collider? Or maybe a radio telescope on the Moon instead? Are experimentalists still needed now that we have machine learning? How will physics change with the centre of gravity moving to Asia? I will tell you my take on these and other questions and highlight old and new ideas that could help us understand nature better. Let's see how far I'll get this time ;)

by Mad Hatter ( at March 15, 2018 10:43 PM

ZapperZ - Physics and Physicists

SQUID: History and Applications
No, this is not the squid that you eat. It is the Superconducting Quantum Interference Device, which is really a very clear application of quantum mechanics via the use of superconductors.

This is a lecture presented by UC-Berkeley's John Clarke at the 2018 APS March Meeting.


by ZapperZ ( at March 15, 2018 05:50 PM

Tommaso Dorigo - Scientificblogging

Some Notes On Jester's Take On The Future Of HEP
I am very glad to observe that Adam Falkowski has resumed his blogging activities (for how long, it is too early to say). The other day he published a blog entry titled "Where were we", in which he offers his view of the present status of things in HEP and the directions he foresees for the field.
I was about to leave a comment there, but since I am a very discontinuous blog reader (you either write or read, in this business - no time for both things together) I feared I would then miss any reply or ensuing discussion. Not that I mean to say anything controversial or flippant; on the contrary, I mostly agree with Adam's assessment of the situation. With some distinguos.

read more

by Tommaso Dorigo at March 15, 2018 11:23 AM

Jester - Resonaances

Next stop: tth
This was a summer of brutally dashed hopes for a quick discovery of many fundamental particles that we were imagining. For the time being we need to focus on the ones that actually exist, such as the Higgs boson. In the Run-1 of the LHC, the Higgs boson's existence and identity were firmly established, while its mass and basic properties were measured. The signal was observed with large significance in 4 different decay channels (γγ, ZZ*, WW*, ττ), and two different production modes (gluon fusion, vector-boson fusion) have been isolated. Still, there remain many fine details to sort out. The realistic goal for the Run-2 is to pinpoint the following Higgs processes:
  • (h→bb): Decays to b-quarks.
  • (Vh): Associated production with W or Z boson. 
  • (tth): Associated production with top quarks. 

It seems that the last objective may be achieved quicker than expected. The tth production process is very interesting theoretically, because its rate is proportional to the (square of the) Yukawa coupling between the Higgs boson and top quarks. Within the Standard Model, the value of this parameter is known to a good accuracy, as it is related to the mass of the top quark. But that relation can be  disrupted in models beyond the Standard Model, with the two-Higgs-doublet model and composite/little Higgs models serving as prominent examples. Thus, measurements of the top Yukawa coupling will provide a crucial piece of information about new physics.
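For concreteness (this is the standard tree-level Standard Model relation, not something spelled out in the post), the coupling and the rate scale as

$$y_t \;=\; \frac{\sqrt{2}\, m_t}{v} \;\approx\; \frac{\sqrt{2}\times 173\ \mathrm{GeV}}{246\ \mathrm{GeV}} \;\approx\; 1, \qquad \sigma(pp\to t\bar t h)\;\propto\; y_t^2,$$

so a measured tth signal strength different from 1 translates directly into a rescaling of the top Yukawa coupling relative to its Standard Model value.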

In the Run-1, a not-so-small signal of tth production was observed by the ATLAS and CMS collaborations in several channels. Assuming that Higgs decays have the same branching fraction as in the Standard Model, the tth signal strength normalized to the Standard Model prediction was estimated as

At face value, a strong evidence for the tth production was obtained in the Run-1! This fact was not advertised by the collaborations because the measurement is not clean due to a large number of top quarks produced by other processes at the LHC. The tth signal is thus a small blip on top of a huge background, and it's not excluded that some unaccounted for systematic errors are skewing the measurements. The collaborations thus preferred to play it safe, and wait for more data to be collected.

In the Run-2, with 13 TeV collisions, the tth production cross section is 4 times larger than in the Run-1, so the new data are coming at a fast pace. Both ATLAS and CMS presented their first Higgs results in early August, and the tth signal is only getting stronger. ATLAS showed their measurements in the γγ, WW/ττ, and bb final states of Higgs decay, as well as their combination:
Most channels display a signal-like excess, which is reflected by the Run-2 combination being 2.5 sigma away from zero. A similar picture is emerging in CMS, with 2-sigma signals in the γγ and WW/ττ channels. Naively combining all Run-1 and Run-2 results one then finds
At face value, this is a discovery! Of course, this number should be treated with some caution because, due to large systematic errors, a naive Gaussian combination may not represent very well the true likelihood. Nevertheless, it indicates that, if all goes well, the discovery of the tth production mode should be officially announced in the near future, maybe even this year.
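To spell out what "naively combining" means here: each measured signal strength is treated as an independent Gaussian and the combination is the inverse-variance weighted average. A minimal sketch in Python (the numbers below are placeholder values for illustration only, not the actual ATLAS/CMS results, which appear in the plots of the original post):

# Naive Gaussian combination of signal-strength measurements.
# The (mu, sigma) pairs below are illustrative placeholders only.
measurements = [
    (2.0, 0.8),   # hypothetical channel 1
    (1.5, 0.6),   # hypothetical channel 2
    (2.5, 1.0),   # hypothetical channel 3
]

weights = [1.0 / sigma**2 for _, sigma in measurements]
mu_comb = sum(w * mu for (mu, _), w in zip(measurements, weights)) / sum(weights)
sigma_comb = (1.0 / sum(weights)) ** 0.5

print(f"combined mu = {mu_comb:.2f} +/- {sigma_comb:.2f}")
print(f"deviation from zero: {mu_comb / sigma_comb:.1f} sigma")

This is also exactly the kind of combination that can overstate the significance when the systematic uncertainties of the inputs are correlated, which is the caution expressed in the paragraph above.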

Should we get excited that the measured tth rate is significantly larger than the Standard Model one? Assuming that the current central value remains, it would mean that the top Yukawa coupling is 40% larger than that predicted by the Standard Model. This is not impossible, but very unlikely in practice. The reason is that the top Yukawa coupling also controls gluon fusion - the main Higgs production channel at the LHC - whose rate is measured to be in perfect agreement with the Standard Model. Therefore, a realistic model that explains the large tth rate would also have to provide negative contributions to the gluon fusion amplitude, so as to cancel the effect of the large top Yukawa coupling. It is possible to engineer such a cancellation in concrete models, but I'm not aware of any construction where this conspiracy arises in a natural way. Most likely, the currently observed excess is a statistical fluctuation (possibly in combination with underestimated theoretical and/or experimental errors), and the central value will drift toward μ=1 as more data is collected.

by Jester ( at March 15, 2018 11:21 AM

Jester - Resonaances

Weekend Plot: update on WIMPs
There's been a lot of discussion on this blog about the LHC not finding new physics. I should, however, do justice to other experiments that also don't find new physics, often in a spectacular way. One area where this is happening is direct detection of WIMP dark matter. This weekend plot summarizes the current limits on the spin-independent scattering cross-section of dark matter particles on nucleons:
For large WIMP masses, currently the most successful detection technology is to fill up a tank with a ton of liquid xenon and wait for a passing dark matter particle to knock one of the nuclei. Recently, we have had updates from two such experiments: LUX in the US, and PandaX in China, whose limits now cut below zeptobarn cross sections (1 zb = 10^-9 pb = 10^-45 cm^2). These two experiments are currently going head-to-head, but PandaX, being larger, will ultimately overtake LUX. Soon, however, it'll have to face a fierce new competitor: the XENON1T experiment, and the plot will have to be updated next year. Fortunately, we won't need to be learning another prefix soon. Once yoctobarn sensitivity is achieved by the experiments, we will hit the neutrino floor: the irreducible background from solar and atmospheric neutrinos (gray area at the bottom of the plot). This will make detecting a dark matter signal much more challenging, and will certainly slow down the progress for WIMP masses larger than ~5 GeV. For lower masses, the distance to the floor remains large. Xenon detectors lose their steam there, and another technology is needed, like the germanium detectors of CDMS and CDEX, or the CaWO4 crystals of CRESST. On this front, too, important progress is expected soon.
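As a quick check of the unit arithmetic quoted above (1 zb = 10^-9 pb = 10^-45 cm^2), here is a trivial conversion helper in Python; it only uses the definition of the barn (10^-24 cm^2) and the standard SI prefixes:

# Cross-section unit arithmetic: 1 barn = 1e-24 cm^2, plus SI prefixes.
BARN_CM2 = 1e-24
PREFIX = {"": 1.0, "m": 1e-3, "u": 1e-6, "n": 1e-9, "p": 1e-12,
          "f": 1e-15, "a": 1e-18, "z": 1e-21, "y": 1e-24}

def to_cm2(value, unit):
    """Convert a cross section like (1, 'zb') or (10, 'fb') to cm^2."""
    prefix, barn = unit[:-1], unit[-1]
    assert barn == "b"
    return value * PREFIX[prefix] * BARN_CM2

print(to_cm2(1, "zb"))                    # 1e-45 (cm^2)
print(to_cm2(1, "zb") / to_cm2(1, "pb"))  # ~1e-09, i.e. 1 zb = 10^-9 pb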

What does the theory say about when we will find dark matter? It is perfectly viable that the discovery is waiting for us just around the corner in the remaining space above the neutrino floor, but currently there are no strong theoretical hints in favor of that possibility. Usually, dark matter experiments advertise that they're just beginning to explore the interesting parameter space predicted by theory models. This is not quite correct. If the WIMP were true to its name, that is to say if it were interacting via the weak force (meaning, coupled to the Z with order 1 strength), it would have an order 10 fb scattering cross section on neutrons. Unfortunately, that natural possibility was excluded in the previous century. Years of experimental progress have shown that the WIMPs, if they exist, must be interacting super-weakly with matter. For example, for a 100 GeV fermionic dark matter particle with a vector coupling g to the Z boson, the current limits imply g ≲ 10^-4. The coupling can be larger if the Higgs boson is the mediator of interactions between the dark and visible worlds, as the Higgs already couples very weakly to nucleons. This construction is, arguably, the most plausible one currently probed by direct detection experiments. For a scalar dark matter particle X with mass 0.1-1 TeV coupled to the Higgs via the interaction λ v h |X|^2, the experiments are currently probing the coupling λ in the 0.01-1 ballpark. In general, there's no theoretical lower limit on the dark matter coupling to nucleons. Nevertheless, the weak coupling implied by direct detection limits creates some tension for the thermal production paradigm, which requires a weak (that is, order picobarn) annihilation cross section for dark matter particles. This tension needs to be resolved by more complicated model building, e.g. by arranging for resonant annihilation or for co-annihilation.

by Jester ( at March 15, 2018 11:20 AM

Lubos Motl - string vacua and pheno

Jester's unconstructive recommended HEP reforms
Yesterday, Adam Falkowski published his first blog post since September 2016, Where were we?. He starts by saying that particle physics is in crisis – which is no longer a prohibited word – because the LHC hasn't found any physics beyond the Standard Model.

He mentions texts by Hossenfelder, Giudice, and the Economist to prove that the word "crisis" is being used. But there have always been people who preached about crises. More than two decades ago, in 1996, John Horgan published his "End of Science" diatribe. Ten years later, Šmoits introduced their own crisis hype. Adam, if you think that you're still substantially different from these three imbeciles, you might be wrong.

According to Falkowski, the continuation of particle physics as we knew it would be like the prolonged existence of the Soviet Union. Wow. Another troubling aspect of these assertions is that Falkowski continues to write business-as-usual papers on particle physics. Adam, maybe it's normal in your environment to do things that you consider worthless and be paid for them. But I think that you're showing the absence of academic integrity by doing so and I will always emphasize that this behavior is immoral.

Instead of "doing more of the same", Falkowski recommends some random buzzwords – switching to astrophysics, tabletop experiments, and precision physics, among a few others. The energy frontier should be abandoned. Be my guest: but you can only make decisions about yourself because you just denounced the offer to be the next Soviet dictator. ;-) You may switch – and you should switch – to astrophysics, tabletop experiments, and precision physics if you think it's a good idea. You're clearly not doing it – you're talking the talk but not walking the walk.

Some people work on astrophysics, tabletop experiments, and precision physics. They get some results that have only impressed others to a limited extent which is why others keep on doing other things. But you know what's happening when folks like Falkowski call for a random revolution in the field? They don't have any evidence that it would be an improvement – they just want to social engineer a new community and take credit simply for being consequential, whether or not it would bring anything good. That's wrong and it mustn't be allowed.

By the way, the increase of the energy is still the single most natural path to progress in experimental particle physics. All other experiments depend on much more specific and much more unlikely assumptions. So all these paths should be probed by somebody but Falkowski's call is nothing else than the call to ban the energy frontier and that's just totally wrong.

When Falkowski talked about the equivalence of particle physics and the Soviet Union, he also wrote the following:
But the driving force for all these SusyWarpedCompositeStringBlackHairyHole enterprise has always been the (small but still) possibility of being vindicated by the LHC.
But this is an absolute lie.

I've met most of the people who have studied SusyWarpedCompositeStringBlackHairyHole and I can assure you that the LHC has played virtually no role in their research. That the short-term experimental projects should determine what high energy theorists think about is a lie spread by Mr Sm*lin, Mr W*it, and similar feces. The more top-down or high-energy physics a theoretical physicist does, the more independent he or she is from any short-term events in experimental physics simply because the ongoing experiments are unlikely to address the most important questions directly.

In particular, an overwhelming majority of hep-th papers (as well as a substantial fraction of the hep-ph papers) has virtually nothing to do with any collider experiment of the current epoch because they're simply solving more far-reaching, long-term problems where the power of the human mind is more important than a powerful magnet. Feces may say that something like that should be impossible but it is possible and what feces say does not matter for science.

So please, Mr Falkowski, don't push for random revolutions within particle physics because your recommendations make no sense, have no justification, and are driven by your personal lack of excitement for the field. If you're not excited about your work, you should switch to another job, instead of poisoning the environment where many people actually know what they're doing and why they're doing it. You're clearly in the process of turning into another Hossenfelder-like lying piece of toxic šit and if you have a chance to stop it at all, you should do so as soon as possible. One vitriolic blog post that you write in two years may easily do more damage than all the positive things you have ever contributed to science.

by Luboš Motl ( at March 15, 2018 07:15 AM

March 14, 2018

ZapperZ - Physics and Physicists

Stephen Hawking: 1942–2018
Of course, the biggest physics news of the day is the passing of Stephen Hawking at the age of 76.

Unfortunately, popular as he was in the public arena, he left us without being awarded the highest prize in physics, the Nobel Prize. This isn't unusual, especially for a theorist, because there are many theorists whose contributions became of utmost importance only many years after they were gone.

Still, given that he was a scientist who attained a highly unusual superstar status among the public, I will not be surprised if he has had a lasting impact on the field, and on the perception of the field among the public and aspiring physicists.

RIP, Stephen.


by ZapperZ ( at March 14, 2018 01:50 PM

Tommaso Dorigo - Scientificblogging

RIP Stephen Hawking
I do not keep crocodiles[*] in my drawer, so this short piece will have to do today.... Stephen Hawking, the world-renowned British cosmologist, passed away yesterday, and with him we lost not only a bright thinker and all-round scientist, but also a person who inspired two or three generations of students and researchers, thanks to his will to live and take part in active research in spite of the difficulties he had to face, which he always managed to take with irony. Confined to a wheelchair by ALS, and incapable of even speaking without electronic assistance, he always displayed uncommon sharpness and wit.

read more

by Tommaso Dorigo at March 14, 2018 11:20 AM

The n-Category Cafe

Stabilization of Derivators

(guest post by Ian Coley)

I recently posted a paper to the arXiv which reconstructs an old paper of Alex Heller. Heller’s Stable homotopy theories and stabilization is one of a few proto-derivator papers that are still oft-cited by those of us studying derivators — a subject absent from this website since the two papers of Mike Shulman and Kate Ponto were published in 2014! Therefore before getting into the paper itself, it’s worth recalling what a derivator is supposed to be and do. For those interested in the long version, check out the nLab article or Moritz Groth’s excellent paper.

But for the short version: a prederivator is a strict 2-functor $\mathbb{D}\colon\mathbf{Cat}^{\mathrm{op}}\to\mathbf{CAT}$. For $J,K\in\mathbf{Cat}$ and a functor $u\colon J\to K$, we obtain a restriction functor $u^\ast\colon\mathbb{D}(K)\to\mathbb{D}(J)$. A natural transformation $\alpha\colon u\to v$ is mapped contravariantly to $\alpha^\ast\colon u^\ast\to v^\ast$.

We think of the domain of a derivator as diagrams in the shape of small categories, and the outputs coherent diagrams with values in some category $\mathcal{C}$. Specifically, that category $\mathcal{C}$ is the value of $\mathbb{D}$ at the one-point category $e$. We’ll call $\mathbb{D}(e)$ the underlying category of $\mathbb{D}$.

A derivator is a prederivator satisfying some more axioms, most pertinent among them being the following: each $u^\ast$ admits a left adjoint $u_!$ and a right adjoint $u_\ast$ (called the homotopy Kan extensions along $u$). As a consequence, $\mathbb{D}(e)$ admits all homotopy limits and homotopy colimits. For one final axiom, a pointed derivator is a derivator with a pointed base, which implies that each $\mathbb{D}(K)$ is also pointed. In this situation, we obtain a suspension-loop adjunction on $\mathbb{D}(e)$, where suspension $\Sigma$ is defined as the (homotopy) pushout of $0\leftarrow X\rightarrow 0$ and loop $\Omega$ is the (homotopy) pullback of $0\rightarrow X\leftarrow 0$. It’s worth emphasizing that, once we have a derivator whose base happens to be pointed, the suspension and loop are canonical adjoints and are determined by the higher structure of the derivator. Other interesting properties are also automatically satisfied, but we needn’t get into that here.
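Concretely (this is just the definition above written out as squares, nothing extra from the paper), for $X\in\mathbb{D}(e)$ the suspension and loop objects sit in coherent cocartesian and cartesian squares

$$\begin{array}{ccc} X & \to & 0\\ \downarrow & & \downarrow\\ 0 & \to & \Sigma X \end{array} \qquad\qquad \begin{array}{ccc} \Omega X & \to & 0\\ \downarrow & & \downarrow\\ 0 & \to & X \end{array}$$

with $\Sigma$ left adjoint to $\Omega$ as endofunctors of $\mathbb{D}(e)$.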

What examples in real life give rise to a pointed derivator? For any pointed combinatorial model category $\mathcal{M}$, we can consider its homotopy category $\operatorname{Ho}\mathcal{M}$. It can be shown that, for any small category $K$, $\mathcal{M}^K$ admits both a projective and an injective model structure. This leads us to a pointed derivator $\mathbb{D}_{\mathcal{M}}$, defined by $\mathbb{D}_{\mathcal{M}}(K):=\operatorname{Ho}(\mathcal{M}^K)$. For $\mathcal{C}$ a pointed $\infty$-category, we have a similar derivator $\mathbb{D}_{\mathcal{C}}(K):=\operatorname{Ho}(\mathcal{C}^{N(K)})$, where $N(K)$ is the nerve of $K$.

The next question is of stability. A pointed derivator is stable if the $(\Sigma,\Omega)$ adjunction is an adjoint equivalence. The stable homotopy category of pointed spaces is constructed for precisely this purpose, that is, so that reduced suspension and loop space become inverse equivalences. There are many (classical) ways to perform this stabilization, and we aim for the abstract formulation in derivators.

If we are given a pointed derivator $\mathbb{D}$, is there some universal nearest stable derivator? Put another way, is there some derivator $\operatorname{St}\mathbb{D}$ so that morphisms from $\mathbb{D}$ into a stable derivator $\mathbb{S}$ are the “same” as morphisms $\operatorname{St}\mathbb{D}\to\mathbb{S}$? The answer is yes, and we have the following strong result.

Theorem 7.14. Let $\mathbf{Der}_!$ be the 2-category with objects regular pointed derivators, maps cocontinuous morphisms of derivators, and natural transformations modifications. Let $\mathbf{StDer}_!$ be the full sub-2-category of stable derivators. There is a pseudofunctor $\operatorname{St}\colon\mathbf{Der}_!\to\mathbf{StDer}_!$ which is left adjoint to the inclusion. Specifically, there is a universal cocontinuous morphism of derivators $$\operatorname{stab}\colon\mathbb{D}\to\operatorname{St}\mathbb{D}$$ and precomposition with this morphism gives an equivalence of categories of cocontinuous morphisms $$\operatorname{Hom}_!(\operatorname{St}\mathbb{D},\mathbb{S})\overset{\operatorname{stab}^\ast}{\longrightarrow}\operatorname{Hom}_!(\mathbb{D},\mathbb{S})$$ for any stable derivator $\mathbb{S}$.

The adjective regular will be explained in a few paragraphs. To give a sketch of the proof, we first develop a theory of prespectrum objects in a derivator $\mathbb{D}$ (Notation 5.8). Consider the poset $V\subset\mathbb{Z}^2$ on the objects $(i,j)$ such that $|i-j|\leq 1$. We construct a pointed derivator, which we name $\operatorname{Sp}\mathbb{D}$, whose underlying category is the subcategory of $\mathbb{D}(V)$ on the objects that vanish off the diagonal. That this is a pointed derivator is a consequence of Theorem 4.10, which was stated without proof by Heller (his Proposition 7.4). The proof is nontrivial but straightforward, and it’s a fantastic exercise in the calculus of mates.

An object $X\in\operatorname{Sp}\mathbb{D}(e)$ looks like $$\begin{array}{ccccc} & & 0 & \to & X_1\\ & & \uparrow & & \uparrow\\ 0 & \to & X_0 & \to & 0\\ \uparrow & & \uparrow & & \\ X_{-1} & \to & 0 & & \end{array}$$ extending infinitely in both directions. The higher structure of the derivator encodes the comparison maps $\sigma_n\colon X_n\to \Omega X_{n+1}$ for all $n\in\mathbb{Z}$ naturally in such an object, so we rightfully call such an $X$ a prespectrum object. If all these comparison maps are isomorphisms, we call $X$ a (stable) spectrum object. We let $\operatorname{St}\mathbb{D}\subset\operatorname{Sp}\mathbb{D}$ be the full subprederivator (i.e. full subcategories at each $K\in\mathbf{Cat}$) consisting of the stable spectrum objects.

There is no guarantee that the stable spectrum prederivator is actually a derivator, let alone a stable derivator. The first big theorem of the paper (Theorem 6.12) is to show that there is a localization $\operatorname{Sp}\mathbb{D}\to\operatorname{St}\mathbb{D}$. Lemme 4.2 of Denis-Charles Cisinski in Catégories Dérivables says that the localization of any derivator is still a derivator, so that takes care of the first concern.

To prove that $\operatorname{St}\mathbb{D}$ is stable, we now need to define the adjective regular. A derivator is called regular if filtered colimits commute with finite limits. The derivator associated to any $n$-topos for $n\in[0,\infty]$ is regular, as is the derivator associated to a Grothendieck abelian category. One of the equivalent definitions of stability for a derivator is that all colimits commute with finite limits, so regularity is a kind of pre-stability assumption on $\mathbb{D}$. In this case, $\operatorname{St}\mathbb{D}$ is stable (Lemma 6.19 and Proposition 6.23). (As a side note: I am in the market for an alternative to regular as a descriptor for this situation and welcome audience suggestions.)

The last thing to settle is the universal property of the stabilization. Most specifically, the pseudonaturality of $\operatorname{St}$ on morphisms in the domain. Heller’s original proof has a critical error that we take great pains to fix in §7 of the paper, specifically in Lemma 7.4. Heller writes down a diagram incoherently (his Diagram 9.3) that he lifts to a coherent object. Perhaps the crowning achievement of the paper is the construction of this diagram coherently, which takes place through a subposet of $\mathbb{Z}^5$. Diagram yoga is absent in Heller’s original work, but our proof methodology has a diagrammatic flavor that is emblematic of the theory of derivators in modern scholarship. This makes it difficult to recreate the more ambitious diagrams within this blog, and so interested readers should check out the paper itself. Specifically, most of the subsections titled “Construction” repair baseless (though ultimately provable) claims by Heller and contain the most complex diagram shapes.

by shulman ( at March 14, 2018 07:02 AM

March 12, 2018

Clifford V. Johnson - Asymptotia

Signing Times

Well, @WalterIsaacson was signing at the same table as me at #sxsw so we got to catch up between doing our penmanship. Excited to read his Leonardo book. And he’s put #thedialoguesbook on his reading list! #graphicnovel A post shared by Clifford Johnson (@asymptotia) on Mar 11, 2018 at 1:38pm … Click to continue reading this post

The post Signing Times appeared first on Asymptotia.

by Clifford at March 12, 2018 10:59 PM

CERN Bulletin


Fresh snow layers
white veils
empty sheets of paper
waiting to be filled.
Microscopic crystals
with chaotic tracks.
A crazy flake dance
on top of red smiles
with warm caps
and gloves already soaked.

The snowmen melted
their members were sucked
by the frozen soil.
The sun appeared
with its irradiating smile
flying above the school.

March 12, 2018 04:03 PM

CERN Bulletin

24 April 2018: Ordinary General Assembly of the Staff Association!

In the first semester of each year, the Staff Association (SA) invites its members to attend and participate in the Ordinary General Assembly (OGA).

This year the OGA will be held on Thursday, 24 April 2018 from 14.00 to 16.00, Main Auditorium, Meyrin (500-1-001).

During the Ordinary General Assembly, the activity and financial reports of the SA are presented and submitted for approval to the members. This is the occasion to get a global view on the activities of the SA, its management, and an opportunity to express your opinion, particularly by taking part in votes. Other items are listed on the agenda, as proposed by the Staff Council.

Who can vote?

Ordinary members (MPE) of the SA can take part in all votes. Associated members (MPA) of the SA and/or affiliated pensioners have a right to vote on those topics that are of direct interest to them.

Who can give their opinion, and how?

The Ordinary General Assembly is also the opportunity for members of the SA to express themselves through the addition of discussion points to the agenda. For these points to be put to a vote, the request must be submitted in writing to the President of the Staff Association at least 20 days before the General Assembly, and be supported by at least 20 members of the SA. Additionally, members of the SA can ask the OGA to discuss a specific point once the agenda has been dealt with, but no decision shall be taken based on these discussions.

Can we contest the decisions?

Any decision taken by the Ordinary General Assembly can be contested through a referendum as defined in the Statute of the Staff Association.

Do not hesitate: take part in your Ordinary General Assembly on 24 April 2018. Come and make your voice count, and seize this occasion to exchange views with your staff delegates!

Statutes of the CERN Staff Association:

March 12, 2018 12:03 PM

CERN Bulletin


The GAC organises sessions with individual meetings on the last Tuesday of each month, except in July and December.

The next session will be held on:

Tuesday 27 March from 1.30 pm to 4.00 pm
Staff Association meeting room

The following sessions will take place on Tuesdays 24 April, 29 May, 26 June, 28 August, 25 September, 30 October and 27 November 2018.

The sessions of the Pensioners' Group (GAC) are open to beneficiaries of the Pension Fund (including surviving spouses) and to all those approaching retirement. We warmly encourage the latter to join our group by obtaining the necessary documents from the Staff Association.

Information:
Contact form:

March 12, 2018 12:03 PM

CERN Bulletin

Open Day at Crèche and School of the CERN Staff Association

On the morning of Saturday, 3 March 2018, the Crèche and School of the CERN Staff Association opened its doors to parents who wished to visit the establishment.

Once again, the Open Day was a great success and brought together more than 50 families for two information sessions, which included:

  • a general presentation of the establishment by the Headmistress, and
  • a visit of the facilities led by the Headmistress and her deputy.

At the end of the visit, parents were invited for a drink. This was an opportunity for both parents and professionals to have interesting discussions regarding the general conditions of the establishment and the pedagogical approach applied in the crèche and the school.

The management team was delighted to offer the parents the opportunity to participate in this event, where everyone could express their views, ask questions and find answers in a friendly and relaxed atmosphere.

March 12, 2018 12:03 PM

March 11, 2018

John Baez - Azimuth

Hypergraph Categories of Cospans


Two students in the Applied Category Theory 2018 school wrote a blog article about Brendan Fong’s theory of decorated cospans:

• Jonathan Lorand and Fabrizio Genovese, Hypergraph categories of cospans, The n-Category Café, 28 February 2018.

Jonathan Lorand is a math grad student at the University of Zurich working on symplectic and Poisson geometry with Alberto Cattaneo. Fabrizio Genovese is a grad student in computer science at the University of Oxford, working with Bob Coecke and Dan Marsden on categorical quantum mechanics, quantum field theory and the like.

Brendan was my student, so it’s nice to see newer students writing a clear summary of some of his thesis work, namely this paper:

• Brendan Fong, Decorated cospans, Theory and Applications of Categories 30 (2015), 1096–1120.

I wrote a summary of it myself, so I won’t repeat it here:

• John Baez, Decorated cospans, Azimuth, 1 May 2015.

What’s especially interesting to me is that both Jonathan and Fabrizio know some mathematical physics, and they’re part of a group who will be working with me on some problems as part of the Applied Category Theory 2018 school! Brendan and Blake Pollard and I used symplectic geometry and decorated cospans to study the black-boxing of electrical circuits and Markov processes… maybe we should try to go further with that project!

by John Baez at March 11, 2018 09:14 PM

The n-Category Cafe

Cognition, Convexity, and Category Theory

guest post by Tai-Danae Bradley and Brad Theilman

Recently in the Applied Category Theory Seminar our discussions have returned to modeling natural language, this time via Interacting Conceptual Spaces I by Joe Bolt, Bob Coecke, Fabrizio Genovese, Martha Lewis, Dan Marsden, and Robin Piedeleu. In this paper, convex algebras lie at the heart of a compositional model of cognition based on Peter Gärdenfors’ theory of conceptual spaces. We summarize the ideas in today’s post.

Sincere thanks go to Brendan Fong, Nina Otter, Fabrizio Genovese, Joseph Hirsh, and other participants of the seminar for helpful discussions and feedback.


A few weeks ago here at the Café, Cory and Jade summarized the main ideas behind the DisCoCat model, i.e. the categorical compositional distributional model of meaning developed in a 2010 paper by Coecke, Sadrzadeh, and Clark. Within the comments section of that blog entry, Coecke noted that the DisCoCat model is essentially a grammatical quantum field theory — a functor (morally) from a pregroup to finite dimensional real vector spaces. In this model, the meaning of a sentence is determined by the meanings of its constituent parts, which are themselves represented as vectors with meanings determined statistically. But as he also noted,

Vector spaces are extremely bad at representing meanings in a fundamental way, for example, lexical entailment, like tiger < big cat < mammal < animal can’t be represented in a vector space. At Oxford we are now mainly playing around with alternative models of meaning drawn from cognitive science, psychology and neuroscience. Our Interacting Conceptual Spaces I is an example of this….

This (ICS I) is the paper that we discuss in today’s blog post. It presents a new model in which words are no longer represented as vectors. Instead, they are regions within a conceptual space, a term coined by cognitive scientist Peter Gärdenfors in Conceptual Spaces: The Geometry of Thought. A conceptual space is a combination of geometric domains where convexity plays a key role. Intuitively, if we have a space representing the concept of fruit, and if two points in this space represent banana, then one expects that every point “in between” should also represent banana. The goal of ICS I is to put Gärdenfors’ idea on a more formal categorical footing, all the while adhering to the main principles of the DisCoCat model. That is, we still consider a functor out of a grammar category, namely the pregroup $\mathsf{Preg}(n,s)$, freely generated by noun type $n$ and sentence type $s$. (But in light of Preller’s argument as mentioned previously, we use the word functor with caution.) The semantics category, however, is no longer vector spaces but rather $\mathsf{ConvexRel}$, the category of convex algebras and convex relations. We make these ideas and definitions precise below.


A convex algebra is, loosely speaking, a set equipped with a way of taking formal finite convex combinations of its elements. More formally, let $A$ be a set and let $D(A)$ denote the set of formal finite sums $\sum_i p_i a_i$ of elements of $A$, where $p_i\in\mathbb{R}_{\geq 0}$ and $\sum_i p_i=1$. (We emphasize that this sum is formal. In particular, $A$ need not be equipped with a notion of addition or scaling.) A convex algebra is a set $A$ together with a function $\alpha\colon D(A)\to A$, called a “mixing operation,” that is well-behaved in the following sense:

  • the convex combination of a single element is itself, and
  • the two ways of evaluating a convex combination of a convex combination are equal.

For example, every convex subspace of $\mathbb{R}^n$ is naturally a convex algebra. (And we can’t resist mentioning that convex subspaces of $\mathbb{R}^n$ are also examples of algebras over the operad of topological simplices. But as we learned through a footnote in Tobias Fritz’s Convex Spaces I, it’s best to stick with monads rather than operads. Indeed, a convex algebra is an Eilenberg-Moore algebra of the finite distribution monad.) Join semilattices also provide an example of convex algebras. A finite convex combination of elements $a_i$ in the lattice is defined to be the join of those elements having non-zero coefficients: $\sum_i p_i a_i := \vee_i \{a_i : p_i\neq 0\}$. (In particular, the coefficients play no role on the right-hand side.)

Given two convex algebras $(A,\alpha)$ and $(B,\beta)$, a convex relation is a binary relation $R\subseteq A\times B$ that respects convexity. That is, if $R(a_i,b_i)$ for all $i=1,\ldots,n$ then $R\left(\alpha(\sum_{i=1}^n p_i a_i),\beta(\sum_{i=1}^n p_i b_i)\right)$. We then define $\mathsf{ConvexRel}$ to be the category with convex algebras as objects and convex relations as morphisms. Composition and identities are as for usual binary relations.
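To make the two definitions above concrete, here is a small illustrative sketch (ours, not from the paper) of the two mixing operations just mentioned: genuine convex combinations of points of $\mathbb{R}^n$, and the join-semilattice mixing where only the support of the weights matters.

# Illustrative sketch (not from the paper): two convex-algebra structures.

def mix_Rn(pairs):
    """Convex combination in R^n; `pairs` is a list of (weight, point) with weights summing to 1."""
    assert abs(sum(p for p, _ in pairs) - 1.0) < 1e-9
    dim = len(pairs[0][1])
    return tuple(sum(p * x[i] for p, x in pairs) for i in range(dim))

def mix_join(pairs):
    """Join-semilattice mixing (here: sets under union): join the elements carrying
    non-zero weight; the actual values of the weights play no further role."""
    out = set()
    for p, a in pairs:
        if p > 0:
            out |= a
    return out

print(mix_Rn([(0.5, (0.0, 1.0)), (0.5, (2.0, 3.0))]))     # (1.0, 2.0)
print(mix_join([(0.3, {1}), (0.7, {2, 3}), (0.0, {9})]))  # {1, 2, 3}

Checking that a given binary relation respects either mixing operation, in the sense of the definition of a convex relation, is then a finite computation for any finite list of related pairs.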

Now since in this model, the category of vector spaces is being replaced by $\mathsf{ConvexRel}$, one hopes that (in keeping with the spirit of the DisCoCat model), the latter admits a symmetric monoidal compact closed structure. Indeed it does.

  • $\mathsf{ConvexRel}$ has a symmetric monoidal structure given by the Cartesian product: We use $(A,\alpha)\otimes(B,\beta)$ to denote the set $A\times B$ equipped with mixing operation given by $$\begin{aligned} D(A\times B) &\longrightarrow A\times B\\ \sum p_i(a_i,b_i) &\mapsto \left(\alpha(\sum p_i a_i),\beta(\sum p_i b_i)\right). \end{aligned}$$ The monoidal unit is the one-point set $\star$, which has a unique convex algebra structure. We’ll denote this convex algebra by $I$.
  • Each object in $\mathsf{ConvexRel}$ is self-dual, and cups and caps are given as follows (see the note right after this list): $$\begin{aligned} \eta_A\colon I \to(A,\alpha)\otimes(A,\alpha) \quad &\text{ is the relation} \quad \{(\star,(a,a)) : a\in A\}\\ \epsilon_A\colon (A,\alpha)\otimes(A,\alpha)\to I \quad &\text{ is the relation} \quad \{((a,a),\star) : a\in A\} \end{aligned}$$
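For intuition, one can check directly (a standard compact-closure computation, not spelled out in the post) that these relations satisfy the snake identities
$$(\epsilon_A\otimes \mathrm{id}_A)\circ(\mathrm{id}_A\otimes\eta_A) \;=\; \mathrm{id}_A \;=\; (\mathrm{id}_A\otimes\epsilon_A)\circ(\eta_A\otimes\mathrm{id}_A),$$
which is exactly what licenses the wire-bending used when meanings are computed below.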

The compact closed structure guarantees that <semantics>ConvexRel<annotation encoding="application/x-tex">\mathsf{ConvexRel}</annotation></semantics> fits nicely into the DisCoCat framework: words in a sentence are assigned types according to a chosen pregroup grammar, and a sentence is deemed grammatical if it reduces to type <semantics>s<annotation encoding="application/x-tex">s</annotation></semantics>. Moreover, these type reductions in <semantics>Preg(n,s)<annotation encoding="application/x-tex">\mathsf{Preg}(n,s)</annotation></semantics> give rise to corresponding morphisms in <semantics>ConvexRel<annotation encoding="application/x-tex">\mathsf{ConvexRel}</annotation></semantics> where the meaning of the sentence can be determined. We’ll illustrate this below by computing the meaning of the sentence

<semantics>bananastastesweet.<annotation encoding="application/x-tex"> b a n a n a s\; t a s t e\; s w e e t. </annotation></semantics>

To start, note that this sentence is comprised of three grammar types:

and each corresponds to a different conceptual space: $$\text{noun, } n \rightsquigarrow N \qquad\qquad \text{sentence, } s \rightsquigarrow S \qquad\qquad \text{verb, } n^r s n^l \rightsquigarrow N\otimes S \otimes N$$ which we describe next.

Computing Meaning

The Noun Space $N$

A noun is a state $I\to N$, i.e. a convex subset of the noun space $N$. Restricting our attention to food nouns, the space $N$ is a product of color, taste, and texture domains: $N = N_{\text{color}}\otimes N_{\text{taste}}\otimes N_{\text{texture}} \subset \mathbb{R}^8$, where

  • $N_{\text{color}}$ is the RGB color cube, i.e. the set of all triples $(R,G,B)\in [0,1]^3$.
  • $N_{\text{taste}}$ is the taste tetrahedron, i.e. the convex hull of four basic tastes: sweet, sour, bitter, and salty.
  • $N_{\text{texture}}$ is the unit interval $[0,1]$, where 0 represents liquid and 1 represents solid.

The noun banana is then a product of three convex subregions of $N$:

That is, banana is the product of a yellow/green region, the convex hull of three points in the taste tetrahedron, and a subinterval of the texture interval. Other foods and beverages (avocados, chocolate, beer, etc.) can be expressed similarly.
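As a rough illustration of how such regions might be stored and intersected, here is a hedged Python sketch; it simplifies each convex region to an axis-aligned box, and every numeric range below is invented for illustration rather than taken from the paper:

# A toy sketch, not from the paper: approximate each convex region by a product
# of intervals (a box), so that intersecting two regions is coordinatewise
# interval intersection.  Coordinates (our own layout, with made-up numbers):
# (R, G, B, sweet, sour, bitter, salty, texture).

from typing import Optional, Tuple

Box = Tuple[Tuple[float, float], ...]    # one (lo, hi) interval per coordinate

def intersect(a: Box, b: Box) -> Optional[Box]:
    """Coordinatewise interval intersection; None if the result is empty."""
    out = []
    for (lo1, hi1), (lo2, hi2) in zip(a, b):
        lo, hi = max(lo1, lo2), min(hi1, hi2)
        if lo > hi:
            return None
        out.append((lo, hi))
    return tuple(out)

# "banana": green-to-yellow colours, sweet-ish taste, fairly firm texture.
banana = ((0.3, 1.0), (0.6, 1.0), (0.0, 0.3),                    # colour
          (0.3, 1.0), (0.0, 0.3), (0.0, 0.3), (0.0, 0.1),        # taste
          (0.6, 0.9))                                            # texture
# "green": an intersective adjective, so it only constrains the colour part.
green = ((0.0, 0.5), (0.6, 1.0), (0.0, 0.4),
         (0.0, 1.0), (0.0, 1.0), (0.0, 1.0), (0.0, 1.0),
         (0.0, 1.0))

print(intersect(banana, green))   # the colour coordinates narrow to greenish hues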

The Sentence Space $S$

The meaning of a sentence is a convex subset of a sentence space $S$. Here, $S$ is chosen as a simple-yet-sensible space to capture one's experience when eating and drinking. It is the join semilattice on four points

where in the first component, 0 = negative and 1 = positive, while in the second component, 0 = not surprising and 1 = surprising. For instance, $(0,1)$ represents negative and surprising while the convex subset $\{(1,1),(1,0)\}$ represents positive.

The Verb Space $N\otimes S\otimes N$

A transitive verb is a convex subset of $N\otimes S\otimes N$. For instance, if we suppose momentarily that we live in a world in which one can survive on bananas and beer alone, then the verb taste can be represented by
$$\begin{aligned} \text{taste} &= \text{Conv}(\{\text{green banana} \otimes\{(0,0)\}\otimes \text{bitter}\}\\ &\,\cup \{\text{green banana}\otimes\{(1,1)\}\otimes \text{sweet}\}\\ &\,\cup \{\text{yellow banana}\otimes\{(1,0)\}\otimes \text{sweet}\}\\ &\,\cup \{\text{beer} \otimes\{(0,1)\}\otimes \text{sweet}\}\\ &\,\cup \{\text{beer}\otimes\{(1,0)\}\otimes \text{bitter}\}) \end{aligned}$$
where Conv denotes the convex hull of the argument. Here, green is an intersective adjective, so green banana is computed by taking the intersection of the banana space with the green region of the color cube. Likewise for yellow banana.

Tying it all together

Finally, we compute the meaning of bananas taste sweet, which has grammar type reduction $n(n^r s n^l)n \leq (n n^r)s \leq s$. In $\mathsf{ConvexRel}$, this corresponds to the following morphism:
$$\overset{\text{bananas}}{N} \otimes \overset{\text{taste}}{N \otimes S \otimes N} \otimes \overset{\text{sweet}}{N} \;\xrightarrow{\;\epsilon_N\otimes 1_S\otimes\epsilon_N\;}\; S$$
$$\begin{aligned} \text{bananas taste sweet} &= (\epsilon_N\otimes 1_S\otimes \epsilon_N)(\text{bananas} \otimes \text{taste} \otimes \text{sweet})\\ &= (\epsilon_N\otimes 1_S)(\text{banana}\otimes (\text{green banana} \otimes \{(1,1)\}\\ &\qquad\qquad\qquad\qquad\quad \cup\, \text{yellow banana} \otimes\{(1,0)\}\\ &\qquad\qquad\qquad\qquad\quad \cup\, \text{beer}\otimes\{(0,1)\}))\\ &= \{(1,1),(1,0)\}\\ &= \text{positive} \end{aligned}$$
Note that the rightmost $\epsilon_N$ selects subsets of the taste space that include sweet things and then deletes "sweet." The leftmost $\epsilon_N$ selects subsets of the taste space that include banana and then deletes "banana." We are left with a convex subset of $S$, i.e. the meaning of the sentence.
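For readers who want to see the bookkeeping spelled out, here is a toy Python sketch of this final computation; it replaces the convex regions by small finite sets of labelled sample points, which is a deliberate simplification of the model above:

# A finite toy sketch of the final computation, not the paper's actual model:
# approximate each conceptual region by a finite set of sample labels, so that
# the caps epsilon_N just test for a shared label.  All labels are made up.

bananas = {"green_banana", "yellow_banana"}    # sample points of the banana region
sweet   = {"sweet"}                            # sample points of the sweet region

# the verb "taste" as a subset of N x S x N: (subject, experience, taste quality)
taste = {
    ("green_banana",  (0, 0), "bitter"),
    ("green_banana",  (1, 1), "sweet"),
    ("yellow_banana", (1, 0), "sweet"),
    ("beer",          (0, 1), "sweet"),
    ("beer",          (1, 0), "bitter"),
}

# (epsilon_N tensor 1_S tensor epsilon_N): keep the sentence component whenever
# the subject wire meets "bananas" and the object wire meets "sweet".
meaning = {s for subj, s, obj in taste if subj in bananas and obj in sweet}
print(meaning)   # {(1, 1), (1, 0)}, i.e. "positive"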

Closing Remarks

Although not shown in the example above, one can also account for relative pronouns using certain morphisms called multi-wires or spiders (these arise from commutative special dagger Frobenius structures). The authors also give a toy example from the non-food world by modeling movement of a person from one location to another, using time and space to define new noun, sentence, and verb spaces.

In short, the conceptual spaces framework seeks to capture meaning in a way that resembles human thought more closely than the vector space model. This leaves us to puzzle over a couple of questions: 1) Do all concepts exhibit a convex structure? and 2) How might the conceptual spaces framework be implemented experimentally?

by john at March 11, 2018 04:29 PM

March 09, 2018

John Baez - Azimuth

An Upper Bound on Reidemeister Moves


Graham’s number is famous for being the largest number to have ever shown up in a proof. The true story is more complicated, as I discovered by asking Graham. But here’s a much smaller but still respectable number that showed up in knot theory:

$2 \uparrow\uparrow (10 \uparrow 1,000,000)$

It's 2 to the 2 to the 2 to the 2… where we go on for $10^{1,000,000}$ times. It appears in a 2011 paper by Coward and Lackenby. It shows up in their upper bound on how many steps it can take to wiggle around one picture of a link until you get another picture of the same link.

This upper bound is ridiculously large. But because this upper bound is computable, it follows that we can decide, in a finite amount of time, whether two pictures show the same link or not. We know when to give up. This had previously been unknown!

Here’s the paper:

• Alexander Coward and Marc Lackenby, An upper bound on Reidemeister moves, American Journal of Mathematics 136 (2014), 1023–1066.

Let me spell out the details a tiny bit more.

A link is a collection of circles embedded in 3-dimensional Euclidean space. We count two links as ‘the same’, or ‘ambient isotopic’, if we can carry one to another by a smooth motion where no circle ever crosses another. (This can be made more precise.) We can draw links in the plane:

and we can get between any two diagrams of the same link by distorting the plane and also doing a sequence of ‘Reidemeister moves’. There are 3 kinds of Reidemeister moves, shown above and also here:

Coward and Lackenby found an upper bound on how many Reidemeister moves it takes to get between two diagrams of the same link. Let n be the total number of crossings in both diagrams. Then we need at most 2 to the 2 to the 2 to the 2 to the 2… Reidemeister moves, where the number of 2's in this tower is cn, where $c = 10^{1,000,000}$.

It’s fun to look at the paper and see how they get such a terrible upper bound. I’m sure they could have done much better with a bit of work, but that wasn’t the point. All they wanted was a computable upper bound.

Subsequently, Lackenby proved a polynomial upper bound on how many Reidemeister moves it takes to reduce a diagram of the unknot to a circle, like this:

If the original diagram has n crossings, he proved it takes at most $(236 n)^{11}$ Reidemeister moves. Because this is a polynomial, it follows that recognizing whether a knot diagram is a diagram of the unknot is in NP. As far as I know, it remains an open question whether this problem is in P.

• Marc Lackenby, A polynomial upper bound on Reidemeister moves, Annals of Mathematics 182 (2015), 491–564.
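By contrast, the polynomial bound is small enough to evaluate directly; a quick sketch (the choice $n = 10$ is arbitrary):

# A quick sketch (ours): the polynomial bound, evaluated at an arbitrary n.
def unknotting_move_bound(n: int) -> int:
    """Lackenby's bound (236 n)**11 on the number of Reidemeister moves needed
    to reduce an n-crossing diagram of the unknot to the round circle."""
    return (236 * n) ** 11

print(unknotting_move_bound(10))   # about 1.3 * 10**37: big, but no tower of exponentials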

As a challenge, can you tell if this diagram depicts the unknot?

If you get stuck, read Lackenby’s paper!

To learn more about any of the pictures here, click on them. For example, this unknotting process:

showed up in this paper:

• Louis Kauffman and Sofia Lambropoulou, Hard unknots and collapsing tangles, in Introductory Lectures On Knot Theory: Selected Lectures Presented at the Advanced School and Conference on Knot Theory and Its Applications to Physics and Biology, 2012, pp. 187–247.

I bumped into Coward and Lackenby’s theorem here:

• Evelyn Lamb, Laura Taalman’s Favorite Theorem, Scientific American, 8 March 2018.

It says:

Taalman’s favorite theorem gives a way to know for sure whether a knot is equivalent to the unknot, a simple circle. It shows that if the knot is secretly the unknot, there is an upper bound, based on the number of crossings in a diagram of the knot, to the number of Reidemeister moves you will have to do to reduce the knot to a circle. If you try every possible sequence of moves that is at least that long and your diagram never becomes a circle, you know for sure that the knot is really a knot and not an unknot. (Say that ten times fast.)

Taalman loves this theorem not only because it was the first explicit upper bound for the question but also because of how extravagant the upper bound is. In the original paper proving this theorem, Joel Hass and Jeffrey Lagarias got a bound of

$2^{n \cdot 10^{11}}$

where n is the number of crossings in the diagram. That’s 2 to the n hundred billionth power. Yikes! When you try to put that number into the online calculator Wolfram Alpha, even for a very small number of crossings, the calculator plays dead.

Dr. Taalman also told us about another paper, this one by Alexander Coward and Marc Lackenby, that bounds the number of Reidemeister moves needed to show whether any two given knot diagrams are equivalent. That bound involves towers of powers that also get comically large incredibly quickly. They’re too big for me to describe how big they are.

So, I wanted to find out how big they are!

If you want a more leisurely introduction to the Hass–Lagarias result, try the podcast available at Evelyn Lamb's article, or this website:

• Kevin Knudson, My favorite theorem: Laura Taalman, Episode 14.

by John Baez at March 09, 2018 05:22 PM

Tommaso Dorigo - Scientificblogging

On Lawrence Krauss, BuzzFeed, And #MeToo
Large amounts of ink (well, electrons) have been spilt over the web in the past few months to discuss the #MeToo movement. It seems this blog will eventually join the crowd, although a bit belatedly, and with a slightly different viewing angle. 
After keeping silent on the matter, I am stimulated to discuss it after a BuzzFeed article exposed several cases of alleged sexual harassment and related inappropriate behavior by world-class cosmologist-cum-science-pop-guy-cum-skeptic Lawrence Krauss. Plus, yesterday was international women's day, and I never miss a chance to miss a deadline.


by Tommaso Dorigo at March 09, 2018 11:48 AM

March 08, 2018

Clifford V. Johnson - Asymptotia

An Exhibit!

There’s actually an exhibit of process art for my book in the Fine Arts library at USC! Maybe of interest. There will be a companion exhibit about graphic novels over in the science and engineering library. Opening shortly. There’s actually an exhibit of process art for my book in the … Click to continue reading this post


by Clifford at March 08, 2018 06:07 AM

The n-Category Cafe

Cartesian Bicategories

guest post by Daniel Cicala and Jules Hedges

We continue the Applied Category Theory Seminar with a discussion of Carboni and Walters’ paper Cartesian Bicategories I. The star of this paper is the notion of ‘bicategories of relations’. This is an abstraction of relations internal to a category. As such, this paper provides excellent, if technical, examples of internal relations and other internal category theory concepts. In this post, we discuss bicategories of relations while occasionally pausing to enjoy some internal category theory such as relations, adjoints, monads, and the Kleisli construction.

We’d like to thank Brendan Fong and Nina Otter for running such a great seminar. We’d also like to thank Paweł Sobociński and John Baez for helpful discussions.

Shortly after Bénabou introduced bicategories, a program was initiated to study these through profunctor bicategories. Carboni and Walters, however, decided to study bicategories with a more relational flavor. This is not quite as far a departure as one might think. Indeed, relations and profunctors are connected. Let’s recall two facts:

  • a profunctor from $C$ to $D$ is a functor from $D^{op} \times C$ to $\mathbf{Set}$, and

  • a relation between sets $x$ and $y$ can be described with a $\{0,1\}$-valued matrix of size $x \times y$.

Heuristically, profunctors can be thought of as a generalization of relations when considering profunctors as "$\mathbf{Set}$-valued matrix of size $\text{ob}(C) \times \text{ob}(D)$". As such, a line between profunctors and relations appears. In Cartesian Bicategories I, authors Carboni and Walters walk this line and continue a study of bicategories from a relational viewpoint.

The primary accomplishment of this paper is to characterize 'bicategories of internal relations' $\mathbf{Rel}(E)$ and of 'ordered objects' $\mathbf{Ord}(E)$ in a regular category $E$. To do this, the authors begin by introducing the notion of Cartesian bicategory, an early example of a bicategory with a monoidal product. They then explore bicategories of relations, which are Cartesian bicategories whose objects are Frobenius monoids. The name "bicategories of relations" indicates their close relationship with classical relations $\mathbf{Rel}$.

We begin by defining the two most important examples of a bicategory of relations: $\mathbf{Rel}(E)$ and $\mathbf{Ord}(E)$. Knowing these bicategories will ground us as we wade through the theory of Cartesian bicategories. We finish by characterizing $\mathbf{Rel}(E)$ and $\mathbf{Ord}(E)$ in terms of the developed theory.

Internal relations

In set theory, a relation from $x$ to $y$ is a subset of $x \times y$. In category theory, things become more subtle. A relation $r$ from $x$ to $y$ internal to a category $C$ is a 'jointly monic' $C$-span $$x \xleftarrow{r_0} \hat{r} \xrightarrow{r_1} y$$ That is, for any arrows $a, b \colon u \to \hat{r}$ such that $r_0 a = r_0 b$ and $r_1 a = r_1 b$, we have $a = b$. In a category with products, this definition simplifies substantially; it is merely a monic arrow $r \colon \hat{r} \to x \times y$.

Given a span $x \xleftarrow{c} w \xrightarrow{d} y$ and the relation $r \coloneqq \langle r_0, r_1 \rangle$ from above, we say that $c$ is $r$-related to $d$ if there is an arrow $w \to \hat{r}$ so that

commutes. We will write $r \colon c \nrightarrow d$ when $c$ is $r$-related to $d$.

While we can talk about relations internal to any category, we cannot generally assemble them into another category. However, if we start with a regular category $E$, then there is a bicategory $\mathbf{Rel}(E)$ of relations internal to $E$. The objects are those of $E$. The arrows are the relations internal to $E$, with composition given by pullback:

Additionally, we have a unique 2-cell, written $r \leq s$, whenever $s \colon r_0 \nrightarrow r_1$. Diagrammatically, $r \leq s$ if there exists a commuting diagram
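In $\mathbf{Set}$ all of this becomes very concrete: a relation is a subset of $x \times y$, composition by pullback is the usual composition of relations, and the unique 2-cell is containment. Here is a minimal Python sketch (our own, with made-up finite sets), not anything from the paper:

# A minimal sketch of Rel(Set): relations are subsets of x * y, composition is
# by pullback (match the middle coordinate), and the 2-cell r <= s is containment.

def compose(s, r):
    """Composite of r ⊆ x×y followed by s ⊆ y×z."""
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

def leq(r, s):
    """There is a (unique) 2-cell r <= s exactly when r is contained in s."""
    return r <= s                      # subset order on Python sets

r = {(1, 'a'), (2, 'a'), (2, 'b')}     # a relation from {1, 2, 3} to {'a', 'b'}
s = {('a', 'X'), ('b', 'Y')}           # a relation from {'a', 'b'} to {'X', 'Y'}
print(compose(s, r))                   # {(1, 'X'), (2, 'X'), (2, 'Y')}
print(leq({(1, 'a')}, r))              # True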

Internal ordered objects

We are quite used to the idea of having an order on a set. But what about an order on a category? This is captured by $\mathbf{Ord}(E)$, the bicategory of ordered objects and ideals in a regular category $E$.

The objects of $\mathbf{Ord}(E)$ are ordered objects in $E$. An ordered object is a pair $(x,r)$ consisting of an $E$-object $x$ and a reflexive and transitive relation $r \colon x \to x$ internal to $E$.

(Puzzle: $r$ is a monic of type $r \colon \hat{r} \to x \times x$. Both reflexivity and transitivity can be defined using morphisms. What are the domains and codomains? What properties should be satisfied?)

The arrows of $\mathbf{Ord}(E)$ are a sort of 'order preserving relation' called an ideal. Precisely, an ideal $f \colon (x,r) \to (y,s)$ between ordered objects is a relation $f \colon x \nrightarrow y$ such that given

  • morphisms $a, a', b, b'$ with a common domain $z$, and

  • relations $r \colon a \nrightarrow a'$, $f \colon a' \nrightarrow b'$, and $s \colon b' \nrightarrow b$,

then $f \colon a \nrightarrow b$.

In $\mathbf{Set}$, an ordered object is a preordered set and an ideal $f \colon (x,r) \to (y,s)$ is a directed subset of $x \times y$ with the property that if it contains $s$ and $s' \leq s$, then it contains $s'$.

There is at most a single 2-cell between parallel arrows in $\mathbf{Ord}(E)$. Given $f, g \colon (x,r) \to (y,s)$, write $f \leq g$ whenever $g \colon f_0 \nrightarrow f_1$.

Cartesian Bicategories

Now that we know what bicategories we have the pleasure of working with, we can move forward with the theoretical aspects. As we work through the upcoming definitions, it is helpful to recall our motivating examples $\mathbf{Rel}(E)$ and $\mathbf{Ord}(E)$.

As mentioned above, in the early days of bicategory theory, mathematicians would study bicategories as $V$-enriched profunctor bicategories for some suitable $V$. A shrewd observation was made that when $V$ is Cartesian, a $V$-profunctor bicategory has several important commonalities with $\mathbf{Rel}(E)$ and $\mathbf{Ord}(E)$. Namely, there is the existence of a Cartesian product $\otimes$, plus for each object $x$, a diagonal arrow $\Delta \colon x \to x \otimes x$ and terminal object $\epsilon \colon x \to I$. With this insight, Carboni and Walters decided to take this structure as primitive.

To simplify coherence, we only look at locally posetal bicategories (i.e. $\mathbf{Pos}$-enriched categories). This renders 2-dimensional coherences redundant, as all parallel 2-cells manifestly commute. This assumption also endows each hom-poset with 2-cells $\leq$ and, as we will see, local meets $\wedge$. For the remainder of this article, all bicategories will be locally posetal unless otherwise stated.

Definition. A locally posetal bicategory $B$ is Cartesian when equipped with

  • a symmetric tensor product $B \otimes B \to B$,

  • a cocommutative comonoid structure, $\Delta_x \colon x \to x \otimes x$ and $\epsilon_x \colon x \to I$, on every $B$-object $x$

such that

  • every 1-arrow $r \colon x \to y$ is a lax comonoid homomorphism, i.e.

$$\Delta_y r \leq (r \otimes r) \Delta_x \quad \text{and} \quad \epsilon_y r \leq \epsilon_x$$

  • for all objects $x$, both $\Delta_x$ and $\epsilon_x$ have right adjoints $\Delta^\ast_x$ and $\epsilon^\ast_x$.

Moreover, $(\Delta_x, \epsilon_x)$ is the only cocommutative comonoid structure on $x$ admitting right adjoints.

(Question: This definition contains a slight ambiguity in the authors' use of the term "moreover". Is the uniqueness property of the cocommutative comonoid structure an additional axiom, or does it follow from the other axioms?)

If you’re not accustomed to thinking about adjoints internal to a general bicategory, place yourself for a moment in <semantics>Cat<annotation encoding="application/x-tex">\mathbf{Cat}</annotation></semantics>. Recall that adjoint functors are merely a pair of arrows (adjoint functors) together with a pair of 2-cells (unit and counit) obeying certain equations. But this sort of data can exist in any bicategory, not just <semantics>Cat<annotation encoding="application/x-tex">\mathbf{Cat}</annotation></semantics>. It is worth spending a minute to feel comfortable with this concept because, in what follows, adjoints play an important role.

Observe that the right adjoints $\Delta^\ast_x$ and $\epsilon^\ast_x$ turn $x$ into a commutative monoid object, hence a bimonoid. The (co)commutative (co)monoid structure on an object $x$ extends to a tensor product on $x \otimes x$, as seen in this string diagram:

Ultimately, we want to think of arrows in a Cartesian bicategory as generalized relations. What other considerations are required to do this? To answer this, it is helpful to first think about what a generalized function should be.

For the moment, let’s use our <semantics>Set<annotation encoding="application/x-tex">\mathbf{Set}</annotation></semantics> based intuition. For a relation to be a function, we ask that every element of the domain is related to an element of the codomain (entireness) and that the relationship is unique (determinism). How do we encode these requirements into this new, general situation? Again, let’s use intuition from relations in <semantics>Set<annotation encoding="application/x-tex">\mathbf{Set}</annotation></semantics>. Let <semantics>rx×y<annotation encoding="application/x-tex">r \nrightarrow x \times y</annotation></semantics> be a relation and <semantics>r y×x<annotation encoding="application/x-tex">r^\circ \nrightarrow y \times x</annotation></semantics> be the relation defined by <semantics>r :yx<annotation encoding="application/x-tex"> r^{\circ} \colon y \nrightarrow x </annotation></semantics> whenever <semantics>r:xy<annotation encoding="application/x-tex">r \colon x \nrightarrow y</annotation></semantics>. To say that <semantics>r<annotation encoding="application/x-tex">r</annotation></semantics> is entire is equivalent to saying that the composite relation <semantics>r r<annotation encoding="application/x-tex">r^\circ r</annotation></semantics> contains the identity relation on <semantics>x<annotation encoding="application/x-tex">x</annotation></semantics> (puzzle). To say that <semantics>r<annotation encoding="application/x-tex">r</annotation></semantics> is deterministic is to say that the composite relation <semantics>rr <annotation encoding="application/x-tex">rr^\circ</annotation></semantics> is contained by the identity (another puzzle). These two containments are concisely expressed by writing <semantics>1r r<annotation encoding="application/x-tex">1 \leq r^\circ r</annotation></semantics> and <semantics>rr 1<annotation encoding="application/x-tex">r r^\circ \leq 1</annotation></semantics>. Hence <semantics>r<annotation encoding="application/x-tex">r</annotation></semantics> and <semantics>r <annotation encoding="application/x-tex">r^\circ</annotation></semantics> form an adjoint pair! This leads us to the following definition.

Definition. An arrow of a Cartesian bicategory is a map when it has a right adjoint. Maps are closed under identity and composition. Hence, for any Cartesian bicategory $B$, there is the full subbicategory $\mathbf{Map}(B)$ whose arrows are the maps in $B$.

(Puzzle: There is an equivalence of categories $E \simeq \mathbf{Map}(\mathbf{Rel}(E))$ for a regular category $E$. What does this say for $E := \mathbf{Set}$?)
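To make the adjunction conditions concrete in $\mathbf{Rel}(\mathbf{Set})$, here is a small Python sketch (ours, not the paper's) that checks $1 \leq r^\circ r$ and $r r^\circ \leq 1$ for a finite relation; the relations that pass are exactly the graphs of functions, which is what the puzzle above is pointing at:

# A sketch (ours): checking 1 <= r°r and r r° <= 1 for a finite relation
# r ⊆ x×y, i.e. checking that r is entire and deterministic.

def compose(s, r):
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

def is_map(r, x, y):
    """True exactly when r is the graph of a function from x to y."""
    converse = {(b, a) for (a, b) in r}
    id_x = {(a, a) for a in x}
    id_y = {(b, b) for b in y}
    entire = id_x <= compose(converse, r)              # 1_x <= r° r
    deterministic = compose(r, converse) <= id_y       # r r° <= 1_y
    return entire and deterministic

x, y = {1, 2}, {'a', 'b'}
print(is_map({(1, 'a'), (2, 'b')}, x, y))   # True: the graph of a function
print(is_map({(1, 'a'), (1, 'b')}, x, y))   # False: 2 is unrelated and 1 is related twice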

We can now state what appears as Theorem 1.6 of the paper. Recall that $(-)^\ast$ refers to the right adjoint.

Theorem. Let $B$ be a locally posetal bicategory. If $B$ is Cartesian, then

  • $\mathbf{Map}(B)$ has finite bicategorical products $\otimes$,

  • the hom-posets have finite meets $\wedge$ (i.e. categorical products) and the identity arrow in $B(I,I)$ is maximal (i.e. a terminal object), and

  • bicategorical products and the biterminal object in $\mathbf{Map}(B)$ may be chosen so that $r \otimes s = (p^\ast r p) \wedge (p s p^\ast)$, where $p$ denotes the appropriate projection.

Conversely, if the first two properties are satisfied and the third defines a tensor product, then $B$ is Cartesian.

This theorem gives a nice characterisation of Cartesian bicategories. The first two axioms are straightforward enough, but what is the significance of the above tensor product equation?

It’s actually quite painless when you break it down. Note, every bicategorical product <semantics><annotation encoding="application/x-tex">\otimes</annotation></semantics> comes with projections <semantics>p<annotation encoding="application/x-tex">p</annotation></semantics> and inclusions <semantics>p *<annotation encoding="application/x-tex">p^\ast</annotation></semantics>. Now, let <semantics>r:wy<annotation encoding="application/x-tex">r \colon w \to y</annotation></semantics> and <semantics>s:xz<annotation encoding="application/x-tex">s \colon x \to z</annotation></semantics> which gives <semantics>rs:wxyz<annotation encoding="application/x-tex">r \otimes s \colon w \otimes x \to y \otimes z</annotation></semantics>. One canonical arrow of type <semantics>wxyz<annotation encoding="application/x-tex">w \otimes x \to y \otimes z</annotation></semantics> is <semantics>p *rp<annotation encoding="application/x-tex">p^\ast r p</annotation></semantics> which first projects to <semantics>w<annotation encoding="application/x-tex">w</annotation></semantics>, arrives at <semantics>y<annotation encoding="application/x-tex">y</annotation></semantics> via <semantics>r<annotation encoding="application/x-tex">r</annotation></semantics>, which then includes into <semantics>yw<annotation encoding="application/x-tex">y \otimes w</annotation></semantics>. The other arrow is similar, except we first project to <semantics>x<annotation encoding="application/x-tex">x</annotation></semantics>. The above theorem says that by combining these two arrows with a meet <semantics><annotation encoding="application/x-tex">\wedge</annotation></semantics>, the only available operation, we get our tensor product.

Bicategories of relations

The next stage is to add to Cartesian bicategories the property that each object is a Frobenius monoid. In this section we will study such bicategories and see that Cartesian plus Frobenius provides a reasonable axiomatization of relations.

Recall that an object with monoid and comonoid structures is called a Frobenius monoid if the equation

holds. If you’re not familiar with this equation, it has an interesting history as outlined by Walters. Now, if every object in a Cartesian bicategory is a Frobenius monoid, we call it a bicategory of relations. This term is a bit overworked as it commonly refers to <semantics>Rel(E)<annotation encoding="application/x-tex">\mathbf{Rel}(E)</annotation></semantics>. Therefore, we will be careful to call the latter a “bicategory of internal relations”.

Why are bicatgories of relations better than simply Cartesian bicategories? For one, they admit a compact closed structure! This appears as Theorem 2.4 in the paper.

Theorem. A bicategory of relations has a compact closed structure. Objects are self-dual via the unit

$$\Delta \epsilon^\ast_x \colon I \to x \otimes x$$

and counit

$$\epsilon \Delta^\ast_x \colon x \otimes x \to I.$$

Moreover, the dual $r^\circ$ of any arrow $r \colon x \to y$ satisfies

$$(r \otimes id) \Delta \leq (1 \otimes r^\circ) \Delta r$$

and

$$\Delta^\ast (r \otimes id) \leq r \Delta^\ast (1 \otimes r^\circ).$$

Or if you prefer string diagrams, the above inequalities are respectively


Because a bicategory of relations is Cartesian, maps are still present. In fact, they have a very nice characterization here.

Lemma. In a bicategory of relations, an arrow $r$ is a map iff it is a (strict) comonoid homomorphism iff $r \dashv r^\circ$.

As one would hope, the adjoint of a map corresponds with the involution coming from the compact closed structure. The following corollary provides further evidence that maps are well-behaved.

Corollary. In a bicategory of relations:

  • $f$ is a map implies $f^\circ = f^\ast$. In particular, multiplication is adjoint to comultiplication and the unit is adjoint to the counit.

  • for maps $f$ and $g$, if $f \leq g$ then $f = g$.

But maps don’t merely behave in a nice way. They also contain a lot of the information about a Cartesian bicategory and, when working with bicategories of relations, the local information is quite fruitful too. This is made precise in the following corollary.

Corollary. Let $F \colon B \to D$ be a pseudofunctor between bicategories of relations. The following are equivalent:

  • $F$ strictly preserves the Frobenius structure.

  • The restriction $F \colon \mathbf{Map}(B) \to \mathbf{Map}(D)$ strictly preserves the comonoid structure.

  • $F$ preserves local meets and $I$.

Characterizing bicategories of internal relations

The entire point of the theory developed above is to be able to prove things about certain classes of bicategories. In this section, we provide a characterization theorem for bicategories of internal relations. Freyd had already given this characterization using allegories. However, he relied on a proof by contradiction whereas using bicategories of relations allows for a constructive proof.

A bicategory of relations is meant to generalize bicategories of internal relations. Given a bicategory of relations, we’d like to know when an arrow is “like an internal relation”.

Definition. An arrow $r \colon x \to y$ is a tabulation if there exist maps $f \colon z \to x$ and $g \colon z \to y$ such that $r = g f^\ast$ and $f^\ast f \wedge g^\ast g = 1_z$.

This definition seems bizarre on its face, but it really is analogous to the jointly monic span definition of an internal relation. That $r = g f^\ast$ is saying that $r$ is like a span $x \xleftarrow{f} z \xrightarrow{g} y$. The equation $f^\ast f \wedge g^\ast g = 1_z$ implies that this span is jointly monic (puzzle).

A bicategory of relations is called functionally complete if every arrow $r \colon x \to I$ has a tabulation $i \colon x_r \to x$ and $t \colon x_r \to I$. One can show that the existence of these tabulations, together with compact closedness, is sufficient to obtain a unique (up to isomorphism) tabulation for every arrow. We now provide the characterization, presented as Theorem 3.5.

Theorem. Let $B$ be a functionally complete bicategory of relations. Then:

  • $\mathbf{Map}(B)$ is a regular category (all 2-arrows are trivial by an above corollary)

  • There is a biequivalence of bicategories $\mathbf{Rel}(\mathbf{Map}(B)) \simeq B$, obtained by sending the relation $\langle f, g \rangle$ of $\mathbf{Rel}(\mathbf{Map}(B))$ to the arrow $g f^\circ$ of $B$.

So all functionally complete bicategories of relations are bicategories of internal relations. An interesting question is whether any regular category can be realized as $\mathbf{Map}(B)$ for some functionally complete bicategory of relations. Perhaps a knowledgeable passerby will gift us with an answer in the comments!

From this theorem, we can classify some important types of categories. For instance, bicategories of relations internal to a Heyting category are exactly the functionally complete bicategories of relations having all right Kan extensions. Bicategories of relations internal to an elementary topos are exactly the functionally complete bicategories of relations $B$ such that $B(x,-)$ is representable in $\mathbf{Map}(B)$ for all objects $x$.

Characterizing ordered object bicategories

The goal of this section is to characterize the bicategory $\mathbf{Ord}(E)$ of ordered objects and ideals. We already introduced $\mathbf{Ord}(E)$ earlier, but that definition isn't quite abstract enough for our purposes. An equivalent way of defining an ordered object in $E$ is as an $E$-object $x$ together with a relation $r$ on $x$ such that $1 \leq r$ and $r r \leq r$. Does this data look familiar? An ordered object in $E$ is simply a monad in $\mathbf{Rel}(E)$!
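Concretely, in $\mathbf{Rel}(\mathbf{Set})$ the two monad inequalities are just reflexivity and transitivity, so an ordered object is a set with a preorder. A small Python check (our own illustration, with an arbitrary example relation):

# A small check (ours) that a relation r on a finite set x is a monad in
# Rel(Set), i.e. reflexive (1 <= r) and transitive (r r <= r).

def compose(s, r):
    return {(a, c) for (a, b) in r for (b2, c) in s if b == b2}

def is_monad_in_rel(r, x):
    identity = {(a, a) for a in x}
    return identity <= r and compose(r, r) <= r

x = {1, 2, 3}
divides = {(a, b) for a in x for b in x if b % a == 0}
print(is_monad_in_rel(divides, x))            # True: divisibility is a preorder
print(is_monad_in_rel({(1, 2), (2, 3)}, x))   # False: neither reflexive nor transitive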

(Puzzle: What is a monad in a general bicategory? Hint: how are adjoints defined in a general bicategory?)

Quite a bit is known about monads, and we can now apply that knowledge to our study of $\mathbf{Ord}(E)$.

Recall that any monad in $\mathbf{Cat}$ gives rise to a category of adjunctions. The initial object of this category is the Kleisli category. Since the Kleisli category can be defined using a universal property, we can define a Kleisli object in any bicategory. In general, a Kleisli object for a monad $t \colon x \to x$ need not exist, but when it does, it is defined as an arrow $k \colon x \to x_t$ plus a 2-arrow $\theta \colon k t \to k$ such that, given any arrow $f \colon x \to y$ and 2-arrow $\alpha \colon f t \to f$, there exists a unique arrow $h \colon x_t \to y$ such that $h k = f$. The pasting diagrams involved also commute:

As in the case of working inside $\mathbf{Cat}$, we would expect $k$ to be on the left of an adjoint pair, and indeed it is. We get a right adjoint $k^\ast$ such that the composite $k^\ast k$ is our original monad $t$. The benefit of working in the locally posetal case is that we also have $k k^\ast = 1$. This realizes $t$ as an idempotent:

$$t t = k^\ast k k^\ast k = k^\ast k = t.$$

It follows that the Kleisli object construction is exactly an idempotent splitting of $t$! This means we can start with an exact category $E$ and construct $\mathbf{Ord}(E)$ by splitting the idempotents of $\mathbf{Rel}(E)$. With this in mind, we move on to the characterization, presented as Theorem 4.6.

Theorem. A bicategory $B$ is biequivalent to a bicategory $\mathbf{Ord}(E)$ if and only if

  • $B$ is Cartesian,

  • every monad in $B$ has a Kleisli object,

  • for each object $x$, there is a monad on $x$ and a Frobenius monoid $x_0$ that is isomorphic to the monad's Kleisli object,

  • given a Frobenius monoid $x$ and $f \colon x \to x$ with $f \leq 1$, $f$ splits.

Final words

The authors go on to look closer at bicategories of relations inside Grothendieck topoi and abelian categories. Both of these are regular categories, and so fit into the picture we’ve just painted. However, each have additional properties and structure that compels further study.

Much of what we have done can be done in greater generality. For instance, we can drop the local posetal requirement. However, this would greatly complicate matters by requiring non-trivial coherence conditions.

by john at March 08, 2018 01:51 AM

March 07, 2018

The n-Category Cafe

Hypergraph Categories of Cospans

guest post by Jonathan Lorand and Fabrizio Genovese

In the Applied Category Theory Seminar, we most recently read Brendan Fong’s article on decorated cospans. This construction is part of a larger framework, developed in Brendan Fong’s PhD thesis, for studying interconnected, open, network-style systems. A paradigmatic example: systems composed of electric circuits having input and output terminals, allowing for composition of smaller circuits into larger. An aim of Brendan’s framework is to give, for any such kind of system, a unified categorical way to describe both the formal, symbolic language of such systems (their “syntax”) as well as the behavior of the systems that these formal symbols represent (the “semantics”). For circuits: syntax is formal rules for combining circuit diagram nomenclature; semantics is a (mathematical) description of how real-life circuits behave in the presence of voltages, currents, etc.. Decorated cospans are a tool ideal for “syntax”; decorated corelations are designed to handle “semantics” and are flexible enough to model any so-called hypergraph category. We’ll focus on the former, and hint at the latter.

John Baez has written several nice blog posts already about decorated cospans, corelations, and his work with Brendan Fong on passive linear networks of circuits; we hope the present post navigates a course which complements these. We set the scene with hypergraph categories, then discuss (decorated) cospans, and give a very brief introduction to corelations.

We’d like to thank Brendan for his many helpful comments, his technical support, and diagram sharing. Also thanks to Brendan, Nina and the whole “Adjoint team” for a great seminar experience so far.

Hypergraph categories

The name “hypergraph category” is rather new, introduced in [F1] and [K]. The concept itself is apparently older, going back at least to Carboni and Walters.

First of all, why “hypergraph”? There are several notions which fall under the name “hypergraph”, and we do not follow conventions too closely here. For us, a hypergraph is simply a kind of graph which has different types of nodes and edges, with edges allowed to be possibly “open” in the sense that they are attached only on one end to some node. Composition of such graphs to build new graphs is possible by connecting open edges together (though we only allow edges of the same type to connect). The following picture illustrates such a composition, with different types of edges indicated by different types of lines.

A hypergraph category is a categorical version of hypergraphs, where we think of edges as objects and nodes as morphisms. Here is a concise definition, we’ll then do some unpacking: a hypergraph category is a symmetric monoidal category such that each object is equipped with the structure of a special commutative Frobenius monoid, and such that these structures satisfy a certain compatibility with the monoidal product.

We’ll use the string diagram language for symmetric monoidal categories, and introduce four new symbols corresponding to (co)monoid maps. A special commutative Frobenius monoid structure (SCFM structure) on an object \(X\) of a symmetric monoidal category \((C, \otimes)\) is given by maps

(called the multiplication, unit, comultiplication, and counit), satisfying the axioms of a commutative monoid

and cocommutative comonoid

as well as the “Frobenius” and “special” axioms

This set of axioms is not minimal; in particular, if a monoid and comonoid structure satisfies the Frobenius law, then commutativity and cocommutativity imply each other. Thus the term “commutative” Frobenius monoid entails “bicommutativity”.

Suppose an object \(X\) in \((C, \otimes)\) carries a SCFM structure. A coherence result known as the “spider theorem” tells us, roughly, that any morphism \(f: X^m \rightarrow X^n\) described by a connected string diagram, and defined only using operations and canonical maps coming from \((C, \otimes)\) and the maps \((\mu, \eta, \delta, \epsilon)\), can be depicted simply as a “spider”: a diagram with one node, \(m\) input legs and \(n\) output legs.

This result, though, is only about maps between monoidal powers of a single object \(X\). The interaction of SCFM structures on different objects of a hypergraph category is provided by the following compatibility axioms: in a hypergraph category we require

In this way one obtains a kind of category whose calculus of string diagrams reflects the initial intuition given by hypergraphs. More details, in particular on diagrammatics, can be found in [BGKSZ] and [K].

As indicated above, hypergraphs are a natural structure for describing open, network-like systems. The word “open” means that one has a notion of input, output and composition/interaction. Here is a summary of this “network intuition” in terms of hypergraph categories

One wishes also to compare/translate between different hypergraph categories. A hypergraph functor \((C, \otimes) \rightarrow (C', \boxtimes)\) between hypergraph categories is a strong symmetric monoidal functor \((F, \varphi)\) such that, for each object \(X\) of \(C\), the SCFM structure \((F X, \mu_{F X}, \eta_{F X}, \delta_{F X}, \epsilon_{F X})\) on \(F X\) coincides with the one induced by \(F\):

\((F X,\; F \mu_{X} \circ \varphi_{X,X},\; F \eta_{X} \circ \varphi_I,\; \varphi_{X,X}^{-1} \circ F \delta_{X},\; \varphi_I^{-1} \circ F \epsilon_{X})\)

An equivalence of hypergraph categories is an equivalence of symmetric monoidal categories which is built of hypergraph functors.

An important remark about hypergraph categories is that the SCFM maps \((\mu_X, \eta_X, \delta_X, \epsilon_X)\) on each object \(X\) in \(C\) are not required to be natural in \(X\). Morphisms in \(C\) need not interact in a coherent way with the SCFM structures. In particular, a given symmetric monoidal category may admit two hypergraph structures which are hypergraph inequivalent.

Another remark is that hypergraph categories are automatically compact closed, with every object self-dual: the maps


(think: cup and cap) satisfy

as well as the vertically reflected equations (think: snake!). The first equality uses the Frobenius law, the second the (co)unital laws.

Cospan Categories

Cospans lead to hypergraph categories following a familiar construction. One starts with a category \(C\) having finite colimits, viewing it as a symmetric monoidal category \((C, +)\) with the coproduct as monoidal product. From this we define a category whose objects are the objects of \(C\), and whose morphisms \(X \rightarrow Y\) are cospans in \(C\)

considered up to a notion of isomorphism. The source and target of the cospan are the “feet”, while \(N\) is the “apex”. Cospans are isomorphic for us if they have the same feet and if there exists an isomorphism between their apexes which makes the evident triangles commute. The maps \(i\) and \(o\) (the “legs” of the cospan) are labeled to hint at the interpretation that we are thinking of \(X\) as representing inputs, and \(Y\) as representing outputs.

Composition of cospans \(X \overset{i_X}{\rightarrow} N \overset{o_Y}{\leftarrow} Y\) and \(Y \overset{i_Y}{\rightarrow} M \overset{o_Z}{\leftarrow} Z\) is given by taking a pushout over the shared foot

Considering only isomorphism classes of cospans makes this composition well-defined and associative, and one checks that this builds us a category. It’s called \(Cospan(C)\).

\(Cospan(C)\) has a symmetric monoidal structure induced from \(C\), and comes with a canonical hypergraph structure. To see roughly how this works, note that we have an “identity on objects” embedding \(C \hookrightarrow Cospan(C)\) defined by sending a morphism \(f: X \rightarrow Y\) in \(C\) to the cospan \(X \overset{f}{\rightarrow} Y \overset{1_Y}{\leftarrow} Y\). This gives us a copy of \(C\) inside \(Cospan(C)\) and induces a monoidal structure on \(Cospan(C)\).

The directional symmetry in the definition of a cospan means that cospans actually come in pairs

Call such cospans opposites of one another. Note that, alongside the above embedding, there is an analogous embedding \(C^{op} \hookrightarrow Cospan(C)\), sending \(f^{op}: Y \rightarrow X\) to \(Y \overset{1_Y}{\rightarrow} Y \overset{f}{\leftarrow} X\).

For each object \(X\) in \((C, +)\) we have the copairing \([1_X, 1_X]: X + X \rightarrow X\) and a unique map \(!: \emptyset \rightarrow X\) from our initial object \(\emptyset\) in \((C, +)\). The images of these morphisms under \(C \hookrightarrow Cospan(C)\) give us a multiplication map \(\mu\) and a unit map \(\eta\) for a commutative monoid structure on \(X\). By defining comultiplication \(\delta\) and counit \(\epsilon\) via the cospans which are opposite to \(\mu\) and \(\eta\), respectively, one obtains a SCFM on each object of \(Cospan(C)\), with compatibility making \(Cospan(C)\) a hypergraph category.

As a simple example to play with, consider the category \(FinSet\) whose objects are the sets \(\emptyset\), \(\{1\}\), \(\{1, 2\}\), \(\{1, 2, 3\}\), … and whose morphisms are functions between these sets. This category has finite colimits, with initial object \(\emptyset\) and coproduct defined by

\(\{1,2,...,n\} + \{1,2,...,m\} = \{1,2,...,n+m\};\)

these make \((FinSet, +)\) a symmetric strict monoidal category. We think of \(\{1\}\) as the generator of the objects; all other objects may be built from this one using the coproduct (thinking of \(\emptyset\) as the zero-th monoidal power of \(\{1\}\)). Graphically we depict \(\{1\}\) as a black dot, and any object \(\{1,2,...,n\}\) as a cloud of \(n\) dots (since our monoidal product is strictly commutative, the order of how we juxtapose our dots doesn’t matter, so we just use “clouds”).

Suppose we are composing cospans \(X \overset{i_X}{\rightarrow} N \overset{o_Y}{\leftarrow} Y\) and \(Y \overset{i_Y}{\rightarrow} M \overset{o_Z}{\leftarrow} Z\) in \(FinSet\). The pushout over the shared foot \(Y\) implements the merging of the two apexes according to the “instructions” given by the output function \(o_Y\) of the first cospan and the input function \(i_Y\) of the second, as illustrated below in red

In this image, the bottom “cloud” is \(Y\), on the left is \(N\), on the right is \(M\), and above is \(N +_Y M\), while \(X\) and \(Z\), and the maps from them, are not depicted.
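
To make the pushout step concrete, here is a minimal Python sketch (ours, not from the post) of composing two cospans of finite sets. The apexes are Python sets, the legs are dictionaries, and the pushout N +_Y M is built as a quotient of the disjoint union of N and M by the relation o_Y(y) ~ i_Y(y), using a small union-find; the function name compose_cospans and the representation of pushout elements as frozensets are simply choices made for this illustration.

def compose_cospans(N, M, i_X, o_Y, i_Y, o_Z):
    """Compose cospans X -> N <- Y and Y -> M <- Z in FinSet.

    N, M are the apexes (Python sets); the legs are dicts, e.g. i_X : X -> N.
    Returns (leg_X, leg_Z, j_N, j_M): the legs of the composite cospan into
    the pushout N +_Y M, plus the pushout maps j_N, j_M from N and M.
    Elements of the pushout are equivalence classes, stored as frozensets
    of tagged elements ('N', n) and ('M', m)."""
    assert set(o_Y) == set(i_Y), "the two cospans must share the foot Y"

    # union-find over the disjoint union of N and M
    parent = {('N', n): ('N', n) for n in N}
    parent.update({('M', m): ('M', m) for m in M})

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # glue along Y: identify o_Y(y) in N with i_Y(y) in M
    for y in o_Y:
        parent[find(('N', o_Y[y]))] = find(('M', i_Y[y]))

    # the pushout N +_Y M: each element is an equivalence class
    classes = {}
    for x in parent:
        classes.setdefault(find(x), set()).add(x)
    cls = {x: frozenset(classes[find(x)]) for x in parent}

    j_N = {n: cls[('N', n)] for n in N}      # pushout map N -> N +_Y M
    j_M = {m: cls[('M', m)] for m in M}      # pushout map M -> N +_Y M
    leg_X = {x: j_N[i_X[x]] for x in i_X}    # composite left leg  j_N . i_X
    leg_Z = {z: j_M[o_Z[z]] for z in o_Z}    # composite right leg j_M . o_Z
    return leg_X, leg_Z, j_N, j_M

# Example: compose {1,2} -> {a,b} <- {1} with {1} -> {c} <- {1}.
leg_X, leg_Z, j_N, j_M = compose_cospans(
    N={'a', 'b'}, M={'c'},
    i_X={1: 'a', 2: 'b'}, o_Y={1: 'b'},
    i_Y={1: 'c'}, o_Z={1: 'c'})
# 'b' and 'c' are merged into one pushout element, so leg_X[2] == leg_Z[1],
# which is exactly the kind of merging pictured above.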

Decorated cospans

In the setting of electrical circuits, one may view circuit diagrams as a set of nodes \(N\) which is “decorated” by the other symbols of the circuit diagram. For given \(N\), this information may be encoded mathematically by specifying, for example, a set of edges \(E\) with source and target maps \(s, t : E \rightarrow N\), as well as other information, such as an assignment of resistances to edges by specifying a function \(r : E \rightarrow (0, \infty)\).

Thinking of the set of nodes \(N\) as the apex of a cospan in \(FinSet\), the images in \(N\) of the feet \(X\) and \(Y\) correspond to choices of input and output terminals, respectively, and the legs of the cospan allow further flexibility in connecting, for example, multiple inputs of one circuit to an output terminal of another, as illustrated above. John Baez’s blog post has nice examples of circuits being composed.

Decorated cospans are designed to capture this intuitive notion of composition of network diagrams. This must be done in such a way that the cospans and their decorations compose together in the desired manner.

Formally, Brendan achieves this using a lax monoidal functor \((F, \varphi) : (C, +) \rightarrow (Set, \times)\) to encode the decorations on the apexes of cospans in \(Cospan(C)\). (One may actually use any monoidal category \((D, \otimes)\) in place of \((Set, \times)\).) Given such a functor \(F\), he constructs a category \(F Cospan(C)\) of “\(F\)-decorated cospans in \(C\)” where the objects are the objects of \(C\), and morphisms are represented by pairs consisting of a cospan \(X \overset{i}{\rightarrow} N \overset{o}{\leftarrow} Y\) in \(C\) and an element \(s \in F N\) of the image of the apex \(N\) under \(F\). This element is the decoration of the apex. Strictly speaking, morphisms in \(F Cospan(C)\) are isomorphism classes of such pairs; two such pairs are isomorphic if their constituent cospans are isomorphic via an isomorphism \(n : N \rightarrow N'\) between their apexes which is also compatible with the decorations: \((F n)(s) = s'\).

Composition of decorated cospans works, on representatives of morphisms in \(F Cospan(C)\), by composing the constituent cospans and composing the decorations \(s\) and \(t\) by sending \((s, t) \in F N \times F M\) to a decoration on the composed apex \(N +_Y M\) via the map

\(F[j_N, j_M]: F N \times F M \rightarrow F(N +_Y M)\)

(recall that \(j_N\) and \(j_M\) denote the pushout maps involved in composing our cospans). A key point is that we can use the copairing \([j_N, j_M] : N + M \rightarrow N +_Y M\) to implement the “merging” of the decorations on \(N\) and \(M\).
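
Continuing the FinSet sketch from earlier, here is one way the decoration step can look in code when a decoration on an apex is simply a set of undirected edges between its elements (a stripped-down version of the graph decorations that serve as a running example for this construction). It reuses the hypothetical compose_cospans function from the sketch above; pushing each edge forward along j_N or j_M and taking the union plays the role of F[j_N, j_M] applied to the pair of decorations.

def compose_decorated(N, M, i_X, o_Y, i_Y, o_Z, edges_N, edges_M):
    """Compose decorated cospans where a decoration on an apex is a set of
    undirected edges, each edge a frozenset of apex elements.

    The composite decoration is obtained by pushing each edge set forward
    along the pushout maps j_N, j_M and taking the union of the results."""
    leg_X, leg_Z, j_N, j_M = compose_cospans(N, M, i_X, o_Y, i_Y, o_Z)
    pushed_N = {frozenset(j_N[v] for v in e) for e in edges_N}
    pushed_M = {frozenset(j_M[v] for v in e) for e in edges_M}
    return leg_X, leg_Z, pushed_N | pushed_M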

Under the assumptions that the “decoration category” \((D, \otimes)\), and \(F\), are braided (we keep these assumptions now), the category \(F Cospan(C)\) inherits a monoidal and hypergraph structure from \(Cospan(C)\). Here is a summary:

In [F1] (see Theorem 4.1) a method is given for constructing hypergraph functors between decorated cospan categories \(F Cospan(C)\) and \(G Cospan(C')\), starting from the data of a natural transformation which relates \(C\) and \(F\) to \(C'\) and \(G\).

In [F2], decorated cospans are generalized to decorated corelations, and in this setting a theorem is proved which is analogous to the one mentioned above. This gives, in particular, a way of constructing functors from hypergraph categories of decorated cospans (think: syntax) to hypergraph categories of corelations (think: semantics).

Although we won’t have the space to explain decorated corelations properly, we’ll conclude our post by introducing the basic idea of corelations and indicating their potential for describing the “semantics”, or behavior, of open, network-like systems.


The corelation approach to describing the behavior of open systems is linked to the idea of “black-boxing”. Suppose one has two different circuit diagrams - described in a decorated cospan category as above - which have the same set of input and output terminals <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics> and <semantics>Y<annotation encoding="application/x-tex">Y</annotation></semantics>, and suppose we build these circuits. The real-life circuits will impose certain relations on the values of the voltages and currents which may be simultaneously measured at all of the terminals. Not every configuration of possible values is realizable because circuits obey certain laws, such as Ohm’s law relating currents, voltages and resistances.

Note that two different circuits might impose the same relations between their terminals. With respect to an agreed upon mode of measurement, the relations imposed by a circuit on its terminals may be called the (external) “behavior” of a circuit. We’ll say that two circuits behave the same way if they are indistinguishable by making measurements at their terminals. In other words, if each circuit were covered by an opaque black box, with only the terminals exposed, we would not know which circuit is which.

Decorated corelations are designed to capture exactly the information about the “black-box behavior” of open, network-style systems. Brendan gives at least two reasons why one might wish to keep track only of this “behavioral” information, rather than all of the internal workings of components of open systems. One reason is conceptual clarity: the focus is kept on the behaviorally relevant information. A second reason is computational: when smaller components are composed into larger, the amount of syntactical information may increase explosively, while semantically relevant information may be more stable. As a simplified illustration: in the below composition of circuit diagrams, the parts of the resulting larger circuit which have no paths to any terminals are not relevant for the behavior of the circuit.

To give a sketch of how corelations work, let’s use the category \(FinSet\) mentioned above. Suppose we have cospans \(X \overset{i_X}{\rightarrow} N \overset{o_Y}{\leftarrow} Y\) and \(Y \overset{i_Y}{\rightarrow} M \overset{o_Z}{\leftarrow} Z\) in \(FinSet\), ready to be composed like so

Their composite

is the cospan \(X \overset{j_N i_X}{\rightarrow} N +_Y M \overset{j_M o_Z}{\leftarrow} Z\). The union of the images of the legs of this cospan is precisely the image of the copairing \([j_N i_X, j_M o_Z]: X + Z \longrightarrow N +_Y M\). Points outside of this image may be irrelevant, so we may wish to “discard” them. A way of implementing “discarding” is to focus our attention on cospans \(X \overset{i}{\rightarrow} N \overset{o}{\leftarrow} Y\) for which \([i, o]: X + Y \rightarrow N\) is surjective. These are called corelations. To compose corelations \(X \overset{i_X}{\rightarrow} N \overset{o_Y}{\leftarrow} Y\) and \(Y \overset{i_Y}{\rightarrow} M \overset{o_Z}{\leftarrow} Z\), we compose them as cospans to obtain \(X \overset{j_N i_X}{\rightarrow} N +_Y M \overset{j_M o_Z}{\leftarrow} Z\) and then restrict the map \([j_N i_X, j_M o_Z]\) to its “surjective part”, to obtain again a corelation.
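
In code, “discarding” just means keeping only the apex elements hit by one of the two legs of the composite cospan. A minimal sketch, again reusing the hypothetical compose_cospans function from the earlier sketch (and implicitly the surjection/injection factorization system on FinSet):

def compose_corelations(N, M, i_X, o_Y, i_Y, o_Z):
    """Compose cospans of finite sets, then keep only the part of the
    pushout apex lying in the image of the copairing of the two composite
    legs -- i.e. pass to the corelation the composite cospan determines."""
    leg_X, leg_Z, _, _ = compose_cospans(N, M, i_X, o_Y, i_Y, o_Z)
    reachable_apex = set(leg_X.values()) | set(leg_Z.values())
    return leg_X, leg_Z, reachable_apex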

Factorization systems \((\mathcal{E}, \mathcal{M})\) in a category give a way to make this idea precise and general; in [F2] factorization systems are used to define a general notion of corelation which is modeled on the above example. Then, given a category \(C\) with finite colimits and a factorization system \((\mathcal{E}, \mathcal{M})\) with suitable properties, one can build a hypergraph category \((Corel_{(\mathcal{E}, \mathcal{M})}(C), +)\) where objects are those of \(C\) and morphisms are so-called \((\mathcal{E}, \mathcal{M})\)-corelations. As in the case of cospans, corelations can be decorated, leading to decorated corelation categories. The scope of this construction turns out to be quite general: in [F3] it is proved that any hypergraph category is hypergraph equivalent to some decorated corelation category.

Further reading

by john ( at March 07, 2018 04:40 PM

March 04, 2018

John Baez - Azimuth

Coarse-Graining Open Markov Processes

Kenny Courser and I have been working hard on this paper for months:

• John Baez and Kenny Courser, Coarse-graining open Markov processes.

It may be almost done. So, it would be great if people here could take a look and comment on it! It’s a cool mix of probability theory and double categories. I’ve posted a similar but non-isomorphic article on the n-Category Café, where people know a lot about double categories. But maybe some of you here know more about Markov processes!

‘Coarse-graining’ is a standard method of extracting a simple Markov process from a more complicated one by identifying states. We extend coarse-graining to open Markov processes. An ‘open’ Markov process is one where probability can flow in or out of certain states called ‘inputs’ and ‘outputs’. One can build up an ordinary Markov process from smaller open pieces in two basic ways:

• composition, where we identify the outputs of one open Markov process with the inputs of another,


• tensoring, where we set two open Markov processes side by side.

A while back, Brendan Fong, Blake Pollard and I showed that these constructions make open Markov processes into the morphisms of a symmetric monoidal category:

A compositional framework for Markov processes, Azimuth, January 12, 2016.

Here Kenny and I go further by constructing a symmetric monoidal double category where the 2-morphisms include ways of coarse-graining open Markov processes. We also extend the previously defined ‘black-boxing’ functor from the category of open Markov processes to this double category.

But before you dive into the paper, let me explain all this stuff a bit more….

Very roughly speaking, a ‘Markov process’ is a stochastic model describing a sequence of transitions between states in which the probability of a transition depends only on the current state. But the only Markov processes we’ll talk about are continuous-time Markov processes with a finite set of states. These can be drawn as labeled graphs:

where the number labeling each edge describes the probability per time of making a transition from one state to another.

An ‘open’ Markov process is a generalization in which probability can also flow in or out of certain states designated as ‘inputs’ and ‘outputs’:

Open Markov processes can be seen as morphisms in a category, since we can compose two open Markov processes by identifying the outputs of the first with the inputs of the second. Composition lets us build a Markov process from smaller open parts—or conversely, analyze the behavior of a Markov process in terms of its parts.

In this paper, Kenny and I extend the study of open Markov processes to include coarse-graining. ‘Coarse-graining’ is a widely studied method of simplifying a Markov process by mapping its set of states X onto some smaller set X' in a manner that respects the dynamics. Here we introduce coarse-graining for open Markov processes. And we show how to extend this notion to the case of maps p: X \to X' that are not surjective, obtaining a general concept of morphism between open Markov processes.
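
As a toy illustration of what ‘identifying states’ does to the dynamics, here is a short Python sketch. It represents a Markov process by its Hamiltonian (the infinitesimal stochastic matrix defined later in this post) and lumps states together along a surjection p, using the standard strong-lumpability recipe for Markov chains; this is only meant to convey the flavor of coarse-graining, not to reproduce the definition in Section 3 of our paper.

import numpy as np

def lump(H, p):
    """Coarse-grain an infinitesimal stochastic matrix H (columns sum to 0)
    along a surjection p from states 0..n-1 onto coarse states 0..m-1.

    Requires the usual lumpability condition: for each coarse state k, the
    total rate into p^{-1}(k) is the same from every state j in a given
    fibre p^{-1}(l).  Returns the coarse-grained Hamiltonian."""
    n = H.shape[0]
    m = max(p) + 1
    fibres = [[j for j in range(n) if p[j] == l] for l in range(m)]
    H_coarse = np.zeros((m, m))
    for l, fibre in enumerate(fibres):
        for k in range(m):
            # total rate from each j in the fibre of l into the fibre of k
            rates = [sum(H[i, j] for i in fibres[k]) for j in fibre]
            assert np.allclose(rates, rates[0]), "not lumpable along p"
            H_coarse[k, l] = rates[0]
    return H_coarse    # its columns still sum to zero

Since each column of the coarse matrix is a sum of columns of H restricted to a fibre, the result is again infinitesimal stochastic, so it really is a Markov process on the smaller state set.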

Since open Markov processes are already morphisms in a category, it is natural to treat morphisms between them as morphisms between morphisms, or ‘2-morphisms’. We can do this using double categories!

Double categories were first introduced by Ehresmann around 1963. Since then, they’ve been used in topology and other branches of pure math—but more recently they’ve been used to study open dynamical systems and open discrete-time Markov chains. So, it should not be surprising that they are also useful for open Markov processes.

A 2-morphism in a double category looks like this:

While a mere category has only objects and morphisms, here we have a few more types of things. We call A, B, C and D ‘objects’, f and g ‘vertical 1-morphisms’, M and N ‘horizontal 1-cells’, and \alpha a ‘2-morphism’. We can compose vertical 1-morphisms to get new vertical 1-morphisms and compose horizontal 1-cells to get new horizontal 1-cells. We can compose the 2-morphisms in two ways: horizontally by setting squares side by side, and vertically by setting one on top of the other. The ‘interchange law’ relates vertical and horizontal composition of 2-morphisms.

In a ‘strict’ double category all these forms of composition are associative. In a ‘pseudo’ double category, horizontal 1-cells compose in a weakly associative manner: that is, the associative law holds only up to an invertible 2-morphism, the ‘associator’, which obeys a coherence law. All this is just a sketch; for details on strict and pseudo double categories try the paper by Grandis and Paré.

Kenny and I construct a double category \mathbb{M}\mathbf{ark} with:

  1. finite sets as objects,
  2. maps between finite sets as vertical 1-morphisms,
  3. open Markov processes as horizontal 1-cells,
  4. morphisms between open Markov processes as 2-morphisms.

I won’t give the definition of item 4 here; you gotta read our paper for that! Composition of open Markov processes is only weakly associative, so \mathbb{M}\mathbf{ark} is a pseudo double category.

This is how our paper goes. In Section 2 we define open Markov processes and steady state solutions of the open master equation. In Section 3 we introduce coarse-graining first for Markov processes and then open Markov processes. In Section 4 we construct the double category \mathbb{M}\mathbf{ark} described above. We prove this is a symmetric monoidal double category in the sense defined by Mike Shulman. This captures the fact that we can not only compose open Markov processes but also ‘tensor’ them by setting them side by side.

For example, if we compose this open Markov process:

with the one I showed you before:

we get this open Markov process:

But if we tensor them, we get this:

As compared with an ordinary Markov process, the key new feature of an open Markov process is that probability can flow in or out. To describe this we need a generalization of the usual master equation for Markov processes, called the ‘open master equation’.

This is something that Brendan, Blake and I came up with earlier. In this equation, the probabilities at input and output states are arbitrarily specified functions of time, while the probabilities at other states obey the usual master equation. As a result, the probabilities are not necessarily normalized. We interpret this by saying probability can flow either in or out at both the input and the output states.

If we fix constant probabilities at the inputs and outputs, there typically exist solutions of the open master equation with these boundary conditions that are constant as a function of time. These are called ‘steady states’. Often these are nonequilibrium steady states, meaning that there is a nonzero net flow of probabilities at the inputs and outputs. For example, probability can flow through an open Markov process at a constant rate in a nonequilibrium steady state. It’s like a bathtub where water is flowing in from the faucet, and flowing out of the drain, but the level of the water isn’t changing.
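
Here is a small numerical sketch of finding such a steady state, following the convention just described: the boundary (input and output) probabilities are held fixed while the interior components of Hp must vanish. The function name steady_state and the use of NumPy are choices made for this illustration, not code from the papers.

import numpy as np

def steady_state(H, boundary, boundary_values):
    """Find a steady state of the open master equation.

    H: infinitesimal stochastic matrix (columns sum to 0).
    boundary: list of input/output state indices whose probabilities are fixed.
    boundary_values: the fixed probabilities at those states.
    Interior states i must satisfy (H p)_i = 0.
    Returns p and the net flows (H p) at the boundary states."""
    n = H.shape[0]
    interior = [i for i in range(n) if i not in boundary]
    p = np.zeros(n)
    p[boundary] = boundary_values
    # (H p)_interior = 0  means  H[int, int] p_int = -H[int, bdry] p_bdry
    A = H[np.ix_(interior, interior)]
    b = -H[np.ix_(interior, boundary)] @ p[boundary]
    p[interior] = np.linalg.solve(A, b)
    flows = (H @ p)[boundary]   # nonzero in a nonequilibrium steady state
    return p, flows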

Brendan, Blake and I studied the relation between probabilities and flows at the inputs and outputs that holds in steady state. We called the process of extracting this relation from an open Markov process ‘black-boxing’, since it gives a way to forget the internal workings of an open system and remember only its externally observable behavior. We showed that black-boxing is compatible with composition and tensoring. In other words, we showed that black-boxing is a symmetric monoidal functor.

In Section 5 of our new paper, Kenny and I show that black-boxing is compatible with morphisms between open Markov processes. To make this idea precise, we prove that black-boxing gives a map from the double category \mathbb{M}\mathbf{ark} to another double category, called \mathbb{L}\mathbf{inRel}, which has:

  1. finite-dimensional real vector spaces U,V,W,\dots as objects,
  2. linear maps f : V \to W as vertical 1-morphisms from V to W,
  3. linear relations R \subseteq V \oplus W as horizontal 1-cells from V to W,
  4. squares

    obeying (f \oplus g)R \subseteq S as 2-morphisms.

Here a ‘linear relation’ from a vector space V to a vector space W is a linear subspace R \subseteq V \oplus W. Linear relations can be composed in the usual way we compose relations. The double category \mathbb{L}\mathbf{inRel} becomes symmetric monoidal using direct sum as the tensor product, but unlike \mathbb{M}\mathbf{ark} it is a strict double category: that is, composition of linear relations is associative.

Our main result, Theorem 5.5, says that black-boxing gives a symmetric monoidal double functor

\blacksquare : \mathbb{M}\mathbf{ark} \to \mathbb{L}\mathbf{inRel}

As you’ll see if you check out our paper, there’s a lot of nontrivial content hidden in this short statement! The proof requires a lot of linear algebra and also a reasonable amount of category theory. For example, we needed this fact: if you’ve got a commutative cube in the category of finite sets:

and the top and bottom faces are pushouts, and the two left-most faces are pullbacks, and the two left-most arrows on the bottom face are monic, then the two right-most faces are pullbacks. I think it’s cool that this is relevant to Markov processes!

Finally, in Section 6 we state a conjecture. First we use a technique invented by Mike Shulman to construct symmetric monoidal bicategories \mathbf{Mark} and \mathbf{LinRel} from the symmetric monoidal double categories \mathbb{M}\mathbf{ark} and \mathbb{L}\mathbf{inRel}. We conjecture that our black-boxing double functor determines a functor between these symmetric monoidal bicategories. This has got to be true. However, double categories seem to be a simpler framework for coarse-graining open Markov processes.

Finally, let me talk a bit about some related work. As I already mentioned, Brendan, Blake and I constructed a symmetric monoidal category where the morphisms are open Markov processes. However, we formalized such Markov processes in a slightly different way than Kenny and I do now. We defined a Markov process to be one of the pictures I’ve been showing you: a directed multigraph where each edge is assigned a positive number called its ‘rate constant’. In other words, we defined it to be a diagram

where X is a finite set of vertices or ‘states’, E is a finite set of edges or ‘transitions’ between states, the functions s,t : E \to X give the source and target of each edge, and r : E \to (0,\infty) gives the rate constant for each transition. We explained how from this data one can extract a matrix of real numbers (H_{i j})_{i,j \in X} called the ‘Hamiltonian’ of the Markov process, with two properties that are familiar in this game:

H_{i j} \geq 0 if i \neq j,

\sum_{i \in X} H_{i j} = 0 for all j \in X.

A matrix with these properties is called ‘infinitesimal stochastic’, since these conditions are equivalent to \exp(t H) being stochastic for all t \ge 0.

In our new paper, Kenny and I skip the directed multigraphs and work directly with the Hamiltonians! In other words, we define a Markov process to be a finite set X together with an infinitesimal stochastic matrix (H_{ij})_{i,j \in X}. This allows us to work more directly with the Hamiltonian and the all-important ‘master equation’

\displaystyle{    \frac{d p(t)}{d t} = H p(t)  }

which describes the evolution of a time-dependent probability distribution

p(t) : X \to \mathbb{R}
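
For concreteness, here is a short sketch (again mine, not the paper's) that builds the Hamiltonian from the graph data (X, E, s, t, r) described above, checks the two infinitesimal stochastic conditions, verifies numerically that \exp(t H) is stochastic, and evolves a probability distribution with the master equation. The example rates are made up, and SciPy's matrix exponential is assumed to be available.

import numpy as np
from scipy.linalg import expm

def hamiltonian(n_states, edges):
    """edges: list of (source, target, rate) triples with source != target.
    Returns the matrix H with H[i, j] = total rate of transitions j -> i
    for i != j, and H[j, j] = -(total rate out of j)."""
    H = np.zeros((n_states, n_states))
    for src, tgt, rate in edges:
        H[tgt, src] += rate
        H[src, src] -= rate
    return H

H = hamiltonian(3, [(0, 1, 2.0), (1, 2, 0.5), (2, 0, 1.0)])
assert np.allclose(H.sum(axis=0), 0)                          # columns sum to zero
assert all(H[i, j] >= 0 for i in range(3) for j in range(3) if i != j)

U = expm(1.7 * H)                                             # exp(tH) for t = 1.7
assert np.allclose(U.sum(axis=0), 1) and (U >= -1e-12).all()  # so exp(tH) is stochastic

# master equation: p(t) = exp(tH) p(0)
p0 = np.array([1.0, 0.0, 0.0])
print(expm(2.0 * H) @ p0)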

Clerc, Humphrey and Panangaden have constructed a bicategory with finite sets as objects, ‘open discrete labeled Markov processes’ as morphisms, and ‘simulations’ as 2-morphisms. They use the word ‘open’ in a pretty similar way to me. But their open discrete labeled Markov processes are also equipped with a set of ‘actions’ which represent interactions between the Markov process and the environment, such as an outside entity acting on a stochastic system. A ‘simulation’ is then a function between the state spaces that maps the inputs, outputs and set of actions of one open discrete labeled Markov process to the inputs, outputs and set of actions of another.

Another compositional framework for Markov processes was discussed by de Francesco Albasini, Sabadini and Walters. They constructed an algebra of ‘Markov automata’. A Markov automaton is a family of matrices with non-negative real coefficients that is indexed by elements of a binary product of sets, where one set represents a set of ‘signals on the left interface’ of the Markov automaton and the other set plays the analogous role for the right interface.

So, double categories are gradually invading the theory of Markov processes… as part of the bigger trend toward applied category theory. They’re natural things; scientists should use them.

by John Baez at March 04, 2018 03:55 AM

March 03, 2018

John Baez - Azimuth

Nonstandard Integers as Complex Numbers


I just read something cool:

• Joel David Hamkins, Nonstandard models of arithmetic arise in the complex numbers, 3 March 2018.

Let me try to explain it in a simplified way. I think all cool math should be known more widely than it is. Getting this to happen requires a lot of explanations at different levels.

Here goes:

The Peano axioms are a nice set of axioms describing the natural numbers. But thanks to Gödel’s incompleteness theorem, these axioms can’t completely nail down the structure of the natural numbers. So, there are lots of different ‘models’ of Peano arithmetic.

These are often called ‘nonstandard’ models. If you take a model of Peano arithmetic—say, your favorite ‘standard’ model —you can get other models by throwing in extra natural numbers, larger than all the standard ones. These nonstandard models can be countable or uncountable. For more, try this:

Nonstandard models of arithmetic, Wikipedia.

Starting with any of these models you can define integers in the usual way (as differences of natural numbers), and then rational numbers (as ratios of integers). So, there are lots of nonstandard versions of the rational numbers. Any one of these will be a field: you can add, subtract, multiply and divide your nonstandard rationals, in ways that obey all the usual basic rules.

Now for the cool part: if your nonstandard model of the natural numbers is small enough, your field of nonstandard rational numbers can be found somewhere in the standard field of complex numbers!

In other words, your nonstandard rationals are a subfield of the usual complex numbers: a subset that’s closed under addition, subtraction, multiplication and division by things that aren’t zero.

This is counterintuitive at first, because we tend to think of nonstandard models of Peano arithmetic as spooky and elusive things, while we tend to think of the complex numbers as well-understood.

However, the field of complex numbers is actually very large, and it has room for many spooky and elusive things inside it. This is well-known to experts, and we’re just seeing more evidence of that.

I said that all this works if your nonstandard model of the natural numbers is small enough. But what is “small enough”? Just the obvious thing: your nonstandard model needs to have a cardinality smaller than that of the complex numbers. So if it’s countable, that’s definitely small enough.

This fact was recently noticed by Alfred Dolich at a pub after a logic seminar at the City University of New York. The proof is very easy if you know this result: any field of characteristic zero whose cardinality is less than or equal to that of the continuum is isomorphic to some subfield of the complex numbers. So, unsurprisingly, it turned out to have been repeatedly discovered before.

And the result I just mentioned follows from this: any two algebraically closed fields of characteristic zero that have the same uncountable cardinality must be isomorphic. So, say someone hands you a field F of characteristic zero whose cardinality is smaller than that of the continuum. You can take its algebraic closure by throwing in roots to all polynomials, and its cardinality won’t get bigger. Then you can throw in even more elements, if necessary, to get a field whose cardinality is that of the continuum. The resulting field must be isomorphic to the complex numbers. So, F is isomorphic to a subfield of the complex numbers.

To round this off, I should say a bit about why nonstandard models of Peano arithmetic are considered spooky and elusive. Tennenbaum’s theorem says that for any countable non-standard model of Peano arithmetic there is no way to code the elements of the model as standard natural numbers such that either the addition or multiplication operation of the model is a computable function on the codes.

We can, however, say some things about what these countable nonstandard models are like as ordered sets. They can be linearly ordered in a way compatible with addition and multiplication. And then they consist of one copy of the standard natural numbers, followed by a lot of copies of the standard integers, which are packed together in a dense way: that is, for any two distinct copies, there’s another distinct copy between them. Furthermore, for any of these copies, there’s another copy before it, and another after it.

I should also say what’s good about algebraically closed fields of characteristic zero: they are uncountably categorical. In other words, any two models of the axioms for an algebraically closed field with the same cardinality must be isomorphic. (This is not true for the countable models: it’s easy to find different countable algebraically closed fields of characteristic zero. They are not spooky and elusive.)

So, any algebraically closed field whose cardinality is that of the continuum is isomorphic to the complex numbers!

For more on the logic of complex numbers, written at about the same level as this, try this post of mine:

The logic of real and complex numbers, Azimuth 8 September 2014.

by John Baez at March 03, 2018 07:29 PM

March 02, 2018

Cormac O’Raifeartaigh - Antimatter (Life in a puzzling universe)

Snowbound academics are better academics

Like most people in Ireland, I am working at home today. We got quite a dump of snow in the last two days, and there is no question of going anywhere until the roads clear. Worse, our college closed quite abruptly and I was caught on the hop – there are a lot of things (flash drives, books and papers) sitting smugly in my office that I need for my usual research.


The college on Monday evening

That said, I must admit I’m finding it all quite refreshing. For the first time in years, I have time to read interesting things in my daily email; all those postings from academic listings that I never seem to get time to read normally. I’m enjoying it so much, I wonder how much stuff I miss the rest of the time.


The view from my window as I write this

This morning, I thoroughly enjoyed a paper by Nicholas Campion on the representation of astronomy and cosmology in the works of William Shakespeare. I’ve often wondered about this as Shakespeare lived long enough to know of Galileo’s ground-breaking astronomical observations. However, anyone expecting coded references to new ideas about the universe in Shakespeare’s sonnets and plays will be disappointed; apparently he mainly sticks to classical ideas, with a few vague references to the changing order.

I’m also reading about early attempts to measure the parallax of light from a comet, especially by the great Danish astronomer Tycho Brahe. This paper comes courtesy of the History of Astronomy Discussion Group listings, a really useful resource for anyone interested in the history of astronomy.

While I’m reading all this, I’m also trying to keep abreast of a thoroughly modern debate taking place worldwide, concerning the veracity of an exciting new result in cosmology on the formation of the first stars. It seems a group studying the cosmic microwave background think they have found evidence of a signal representing the absorption of radiation from the first stars. This is exciting enough if correct, but the dramatic part is that the signal is much larger than expected, and one explanation is that this effect may be due to the presence of Dark Matter.

If true, the result would be a major step in our understanding of the formation of stars,  plus a major step in the demonstration of the existence of Dark Matter. However, it’s early days – there are many possible sources of a spurious signal and signals that are larger than expected have a poor history in modern physics! There is a nice article on this in The Guardian, and you can see some of the debate on Peter Coles’s blog In the Dark.  Right or wrong, it’s a good example of how scientific discovery works – if the team can show they have taken all possible spurious results into account, and if other groups find the same result, skepticism will soon be converted into excited acceptance.

All in all, a great day so far. My only concern is that this is the way academia should be – with our day-to-day commitments in teaching and research, it’s easy to forget there is a larger academic world out there.


Of course, the best part is the walk into the village when it finally stops chucking down. I can’t believe my local pub is open!


Dunmore East in the snow today


by cormac at March 02, 2018 01:44 PM

Lubos Motl - string vacua and pheno

General strategy of naturalness is just plain logical inference
Backreaction has launched another crusade against naturalness in high-energy physics.
Who is crazy now? (In which I am stunned to encounter people who agree with me that naturalness is nonsense.)
It may be appropriate to start with an answer to the question in that title. Sabine Hossenfelder isn't crazy. Instead, these statements are yet another almost rigorous proof that she is a 100% incompetent, fake physicist – a layman who has no clue about modern physics. It's something else than being crazy. Over 7.5 billion people in the world misunderstand naturalness but they're not "crazy".

We're offered a picture of white teeth with the rhetorical question: Are they natural? Well, sometimes one can feel that the whiteness has been enhanced (but sometimes by very natural procedures and stuff!). But lots of people have really natural white teeth. And things like fillings are surely a symptom of unnatural, artificial interventions into the teeth, aren't they? So if the dental context were discussed cleverly, it could provide us with good enough analogies to particle physics: violations of the "beauty" such as the dental fillings surely betray an unnatural intervention. But she doesn't want to do that.

I won't discuss her suggestions that some hassles with her tooth crown demonstrate a point in particle physics. They don't and the suggestion is so dumb that I won't honor it with a response. The color and slope of the teeth follow from some (statistical) laws of biology, chemistry, and physics. And these laws may be discussed from the viewpoint of naturalness, too. The idea that naturalness fails in these contexts is as wrong as all of her other statements about contemporary particle physics.

Naturalness is an argument in favor of a physical theory – and against other, unnatural theories. There are various detailed meanings of the word "naturalness" that differ by their precision. One may talk about naturalness at a very general level – something that allows us to distinguish whether something has evolved in nature or whether it has been created through the human creation or modified by human interventions or engineering. We want the laws of physics to be natural in this sense because we assume that they weren't constructed by a human engineer.

The most general meaning of naturalness may be an expression of someone's feelings and emotions which lacks any definition that could be written in terms of mathematical symbols or at least words.

But particle physicists tend to use a more special type of naturalness. The dimensionless parameters that determine the natural theory – a theory that gets a good grade from naturalness, and this grade matters – shouldn't be extremely large or extremely small in comparison with one: they should be of order one. The most precise, technical naturalness formulates these requirements more accurately and takes the assumed global symmetries into account. One may insert some reasoning that has led most phenomenologists to assume that for every massive scalar particle such as the Higgs boson, there should exist additional particles whose masses are of the same order as the Higgs mass and that help to keep the Higgs boson comparably light. I've always had doubts about this "strongest" form of naturalness – scales may be hypothetically generated as exponentially small by various mechanisms etc. – but it was still true that these arguments have increased the probability that the simple picture with "additional light enough particles" is right – increased but not to 100%.

Let me provide you with an unnatural theory that I will call the Leo Vuyk 2020 theory. He hasn't invented this precise theory yet but he's free to plagiarize me. The theory says that the world is a giant strawberry whose vital characteristic, the Vuykness, is a real number, a parameter known as \(LV\). It defines the ratio of the diameters of the strawberry and its pit. Strawberries usually don't have pits but Leo Vuyk's strawberry has one.

Now, the parameter \(LV\) contains the answer to all questions in the world. It's approximately equal to \(42.05121973\dots\). Forty-two was explained in the Hitchhiker's Guide. But it's the other digits that matter. If you divide the decimal expansion of \(LV\) into 10-digit segments, the segments are equal to the results of all chronologically sorted quantitative experiments ever done by the scientists. So if you read it carefully, there will be the digits \(0000000125\) that mean that the Higgs boson mass is \(125\GeV\), and so on.

It's a theory that contains the answers to all quantitative questions about Nature.

What is wrong with that theory? It's not predictive, some people would say. But it's just because we haven't found a good enough way to calculate the precise value of \(LV\), Leo could object, and such a formula for \(LV\) could perhaps materialize in the future. A more lasting problem is that the theory is not natural because the constant \(LV\) contains too many zeroes – many more zeroes than a random number of the same type. So it's extremely unlikely that Nature would pick this value by accident.

That's the actual, logical, technical explanation why we "don't like" such a theory. For the theory to be right, some parameters – here just its only parameter \(LV\) – have to take values that are very unlikely to emerge from a hypothetical calculation that Leo Vuyk may dream about. (The theory also seems to require a preferred reference frame where the measurements are ordered chronologically, it requires one to objectively divide processes into measurements and non-measurements, and it has other big problems.)

By construction, I needed the value of \(LV\) to be fine-tuned. The example was chosen as an extreme form of fine-tuning. As long as physicists remain physicists, they will need to refuse theories that are as unnatural as the Leo Vuyk 2020 theory above. This is not something that can go away after null results coming from the LHC after 5 or 10 years.

The reason why we refused the 2020 theory as an unnatural one wasn't formulated as a rigorous proof. It was evidence of a probabilistic character. But that's exactly how natural science almost always works. Science is a method that allows us to say that certain things are more likely and certain things are less likely. If a theory needs to be extremely lucky with the digits in its parameter(s) in order to agree with the empirical data (and in order to be logically consistent), the validity of the theory itself is less likely.

What we're doing isn't really a consequence of some assumed laws of physics – that could be replaced with other laws of physics. The probabilistic reasoning above only assumes pure mathematics – well, the probabilistic calculus or Bayesian inference, if you wish. And pure mathematics can't ever go away. So some kind of naturalness will always be assumed because it's absolutely inseparable from any rational thinking.

What has arguably failed after the null results from the LHC isn't naturalness as a principle. What has failed is an extreme version or simplification of naturalness that says that the absolute value of a random number \(x\) normally distributed around \(0\) with the standard deviation \(1\) cannot be greater than \(2\) (in combination with another assumption, namely that the field content that is sufficient for a discussion of the value of parameters is as minimal as the MSSM or something like that). Well, such a recipe works approximately 95% of the time. But it doesn't always work. Statistics happens. Sometimes this rule of thumb, like all rules of thumb, is unavoidably violated.
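
(The "approximately 95%" figure is just the familiar two-sigma rule, which is easy to check numerically; a one-line sketch, nothing more:)

import math

# probability that a standard normal variable lies within two standard deviations
p = math.erf(2 / math.sqrt(2))       # P(|x| <= 2)
print(round(p, 4), round(1 - p, 4))  # ~0.9545 and ~0.0455: the bet fails about 5% of the time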

So the people who built all their research on the bet that the normal distribution may be assumed to be squeezed between \(-2\leq x \leq 2\) were never guaranteed to win that bet, and they have arguably lost this particular one. I have always been among those who have criticized the selective belief in "new physics around the corner", which was always driven by wishful thinking, not by solid evidence (it's wishful thinking because the early new discoveries they believed in would be more exciting, and they wanted that – and they wanted to get the prizes for boldly guessing those discoveries in advance). Numbers like \(1/137\) are still "of order one", although barely so. (The fine-structure constant may be calculated from more fundamental constants at the more relevant, higher energy scales and they're "closer to being of order one" than \(1/137\).) Values of parameters like that may occur – and they have already occurred.

But such failures of the LHC to find new physics after several years can never "debunk" the principle of naturalness in its general form – because no physical experiment can ever refute the laws of mathematics.

I had to laugh when I read her comments about the changing Zeitgeist:
[Everyone has always politely informed me that I was a stupid crank.]

But this time it’s different. One day into the conference I notice that all I was about to say has already been said. The meeting, it seems, collected the world’s naturalness skeptics...
Well, there's a simple explanation why "this time it's different". It's because she has attended a conference that unnaturally collected various people – overwhelmingly third-class physicists such as Hossenfelder herself – who were skeptical about naturalness. As I mentioned, such people aren't rare, well over 7 billion people in the world wouldn't support naturalness with a logical defense. ;-) When you apply filters like that, what you get is a group that is skeptical towards naturalness. What a surprise. But there is no way to deduce anything from a poll taken among these cherry-picked people.

On top of that, these people generally have nothing to say. They may refuse modern physics but they have nothing to replace it with. Hossenfelder herself has admitted that she had nothing to say at the conference. She has nothing to say anywhere. She just emits irrational anti-science tirades and it's apparently enough for her brain-dead readers.

That conference took place in Aachen. It became a politically important city (on the German–Benelux border) during the reign of Pepin the Short, a Frankish ruler. At school, as kids, we would always laugh hysterically when Pepin the Short was mentioned because his name sounds as if he had a short penis. Aachen is called "Cáchy" in Czech, a name that arose as a degeneration of "z Aachen", i.e. "from Aachen".

When people completely lose it – and lose their ability to rationally think – they may start to build on the very existence of similar losers who gather in a scientifically irrelevant city, and on dumb and illogical analogies with tooth crowns. But scientists actually need to work with ideas that make some sense – and in the case of physical theories, it always includes the condition that the theories have to embrace some kind of naturalness at one level or another.

by Luboš Motl ( at March 02, 2018 12:14 PM

March 01, 2018

John Baez - Azimuth

Cartesian Bicategories

Two students in the Applied Category Theory 2018 school have blogged about a classic paper in category theory:

• Daniel Cicala and Jules Hedges, Cartesian bicategories, The n-Category Café, 19 February 2018.

Jules Hedges is a postdoc in the computer science department at Oxford who is applying category theory to game theory and economics. Daniel Cicala is a grad student working with me on a compositional approach to graph rewriting, which is about stuff like this:

This picture shows four ‘open graphs’: graphs with inputs and outputs. The vertices are labelled with operations. The top of the picture shows a ‘rewrite rule’ where one open graph is turned into another: the operation of multiplying by 2 is replaced by the operation of adding something to itself. The bottom of the picture shows one way we can ‘apply’ this rule: this takes us from the open graph at bottom left to the open graph at bottom right.

So, we can use graph rewriting to think about ways to transform a computer program into another, perhaps simpler, computer program that does the same thing.

How do we formalize this?

A computer program wants to be a morphism, since it’s a process that turns some input into some output. Rewriting wants to be a 2-morphism, since it’s a ‘meta-process’ that turns some program into some other program. So, there should be some bicategory with computer programs (or labelled open graphs!) as morphisms and rewrites as 2-morphisms. In fact there should be a bunch of such bicategories, since there are a lot of details that one can tweak.

Together with my student Kenny Courser, Daniel has been investigating these bicategories:

• Daniel Cicala, Spans of cospans, Theory and Applications of Categories 33 (2018), 131–147.

Abstract. We discuss the notion of a span of cospans and define, for them, horizontal and vertical composition. These compositions satisfy the interchange law if working in a topos C and if the span legs are monic. A bicategory is then constructed from C-objects, C-cospans, and doubly monic spans of C-cospans. The primary motivation for this construction is an application to graph rewriting.

• Daniel Cicala, Spans of cospans in a topos, Theory and Applications of Categories 33 (2018), 1–22.

Abstract. For a topos T, there is a bicategory MonicSp(Csp(T)) whose objects are those of T, morphisms are cospans in T, and 2-morphisms are isomorphism classes of monic spans of cospans in T. Using a result of Shulman, we prove that MonicSp(Csp(T)) is symmetric monoidal, and moreover, that it is compact closed in the sense of Stay. We provide an application which illustrates how to encode double pushout rewrite rules as 2-morphisms inside a compact closed sub-bicategory of MonicSp(Csp(Graph)).

This stuff sounds abstract and esoteric when they talk about it, but it’s really all about things like the picture above—and it’s an important part of network theory!

Recently Daniel Cicala has noticed that some of the bicategories he’s getting are ‘cartesian bicategories’ in the sense of this paper:

• Aurelio Carboni and Robert F. C. Walters, Cartesian bicategories I, Journal of Pure and Applied Algebra 49 (1987), 11–32.

And that’s the paper he’s blogging about now with Jules Hedges!

by John Baez at March 01, 2018 07:47 PM

Sean Carroll - Preposterous Universe

Dark Matter and the Earliest Stars

So here’s something intriguing: an observational signature from the very first stars in the universe, which formed about 180 million years after the Big Bang (a little over one percent of the current age of the universe). This is exciting all by itself, and well worthy of our attention; getting data about the earliest generation of stars is notoriously difficult, and any morsel of information we can scrounge up is very helpful in putting together a picture of how the universe evolved from a relatively smooth plasma to the lumpy riot of stars and galaxies we see today. (Pop-level writeups at The Guardian and Science News, plus a helpful Twitter thread from Emma Chapman.)

But the intrigue gets kicked up a notch by an additional feature of the new results: the data imply that the cosmic gas surrounding these early stars is quite a bit cooler than we expected. What’s more, there’s a provocative explanation for why this might be the case: the gas might be cooled by interacting with dark matter. That’s quite a bit more speculative, of course, but sensible enough (and grounded in data) that it’s worth taking the possibility seriously.

[Update: skepticism has already been raised about the result. See this comment by Tim Brandt below.]

Illustration: NR Fuller, National Science Foundation

Let’s think about the stars first. We’re not seeing them directly; what we’re actually looking at is the cosmic microwave background (CMB) radiation, from about 380,000 years after the Big Bang. That radiation passes through the cosmic gas spread throughout the universe, occasionally getting absorbed. But when stars first start shining, they can very gently excite the gas around them (the 21cm hyperfine transition, for you experts), which in turn can affect the wavelength of radiation that gets absorbed. This shows up as a tiny distortion in the spectrum of the CMB itself. It’s that distortion which has now been observed, and the exact wavelength at which the distortion appears lets us work out the time at which those earliest stars began to shine.
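To put numbers on the “exact wavelength” (these figures aren’t spelled out in this post, but they are the standard ones for this measurement): the 21cm line has a rest frequency of about 1420 MHz, and the EDGES absorption feature is centered near 78 MHz, so\[

1+z = \frac{\nu_{\rm rest}}{\nu_{\rm obs}} \approx \frac{1420\,{\rm MHz}}{78\,{\rm MHz}} \approx 18,

\] i.e. a redshift of \(z \approx 17\), which in the standard cosmology corresponds to roughly 180 million years after the Big Bang – the epoch quoted above.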

Two cool things about this. First, it’s a tour de force bit of observational cosmology by Judd Bowman and collaborators. Not that collecting the data is hard by modern standards (observing the CMB is something we’re good at), but that the researchers were able to account for all of the different ways such a distortion could be produced other than by the first stars. (Contamination by such “foregrounds” is a notoriously tricky problem in CMB observations…) Second, the experiment itself is totally charming. EDGES (Experiment to Detect Global EoR [Epoch of Reionization] Signature) is a small-table-sized gizmo surrounded by a metal mesh, plopped down in a desert in Western Australia. Three cheers for small science!

But we all knew that the first stars had to be somewhen; it was just a matter of when. The surprise is that the spectral distortion is larger than expected (at 3.8 sigma), a sign that the cosmic gas surrounding the stars is colder than expected (and can therefore absorb more radiation). Why would that be the case? It’s not easy to come up with explanations — there are plenty of ways to heat up gas, but it’s not easy to cool it down.

One bold hypothesis is put forward by Rennan Barkana in a companion paper. One way to cool down gas is to have it interact with something even colder. So maybe — cold dark matter? Barkana runs the numbers, given what we know about the density of dark matter, and finds that we could get the requisite amount of cooling with a relatively light dark-matter particle — less than five times the mass of the proton, well less than expected in typical models of Weakly Interacting Massive Particles. But not completely crazy. And not really constrained by current detection limits from underground experiments, which are generally sensitive to higher masses.

The tricky part is figuring out how the dark matter could interact with the ordinary matter to cool it down. Barkana doesn’t propose any specific model, but looks at interactions that depend sharply on the relative velocity of the particles, as \(v^{-4}\). You might get that, for example, if there were an extremely light (perhaps massless) boson mediating the interaction between dark and ordinary matter. There are already tight limits on such things, but not enough to completely squelch the idea.
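For orientation (this is the textbook benchmark for such a velocity dependence, not Barkana’s specific proposal): Coulomb-like scattering through a massless mediator gives exactly this scaling, as in the Rutherford cross section\[

\frac{d\sigma}{d\Omega} = \left(\frac{Z_1 Z_2 e^2}{2 m v^2}\right)^2 \frac{1}{\sin^4(\theta/2)} \propto \frac{1}{v^4},

\] (Gaussian units, with \(m\) the reduced mass and \(v\) the relative velocity), which is why a very light mediator between dark and ordinary matter naturally produces the \(v^{-4}\) behavior.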

This is all extraordinarily speculative, but worth keeping an eye on. It will be full employment for particle-physics model-builders, who will be tasked with coming up with full theories that predict the right relic abundance of dark matter, have the right velocity-dependent force between dark and ordinary matter, and are compatible with all other known experimental constraints. It’s worth doing, as currently all of our information about dark matter comes from its gravitational interactions, not its interactions directly with ordinary matter. Any tiny hint of that is worth taking very seriously.

But of course it might all go away. More work will be necessary to verify the observations, and to work out the possible theoretical implications. Such is life at the cutting edge of science!

by Sean Carroll at March 01, 2018 12:00 AM

February 28, 2018

Lubos Motl - string vacua and pheno

1,025,438 grandkids of a six-dimensional theory
String theory is like evolution, bottom-up QFTs are creationism

Jacques Distler and three co-authors (CDTZ, Austin and Maryland) have published an impressively technical paper "classifying" certain exotic beasts,
Tinkertoys for the \(E_8\) Theory.
Their paper is accompanied by a cool interactive website which you should investigate. In particular, if you're asking why they didn't include "all the stuff" from the website in their paper, try this subpage on four punctured spheres. Pick a combination of the four parameters and press "go". What you get are some diagrams of S-duality frames (1,025,438 of them) where you see two circles with some objects and left-right arrows connecting them. You can click at the objects and get additional data on the "fixtures".

If you don't understand what the data means and how to use them, don't worry. 7 billion people don't understand it and my estimate is that 30 people in the world may deal with the data – to the extent that they would have a chance to discover a typo if one were artificially introduced. ;-)

What did they do? They discussed some 4-dimensional quantum field theories with enhanced supersymmetry – the \(\NNN=2\) gauge theories in \(D=4\) – that may be derived from a master theory, the six-dimensional \((2,0)\) theory – by operations that require lots of group theory involving \(E_8\), the largest exceptional Lie group (that the laymen surely know as the Garrett Lisi group after a famous crackpot surfer), as well as geometric operations introducing and curing singularities within a two-sphere (injured by 3 or 4 punctures).

Now, all this science may be formally considered "non-gravitational quantum field theory with lots of geometry and group theory". So there's no full-blown string/M-theory in it. On the other hand, everything about their construction is completely stringy – in the sense that basically everyone who understands these quantum field theories and operations with them is a string theorist.

They're elaborating upon the "class-S" construction by Davide Gaiotto from April 2009 as well as Gaiotto, Moore, and Neitzke from July 2009. Now, I know Jacques Distler in person – as well as from tons of interactions on the web. I've spent years next to Davide Gaiotto and Andy Neitzke (who co-authored papers together) at Harvard, and years next to Greg Moore at Rutgers. So all this stuff has been pushed largely by brilliant minds whom I know very well. I believe that their expertise is rare and I think it's nontrivial to find grad students who would be able and willing to learn the necessary stuff even at the level of a "useful brute force assistant".

The six-dimensional \((2,0)\) theory is a non-gravitational local quantum field theory in 5+1 dimensions (well, a family of such theories). So formally, it's as non-gravitational or non-stringy as you can get. However, almost all its definitions and relationships that are known are tightly incorporated into the wisdom of string/M-theory. Note that \((2,0)\) refers to the amount of supersymmetry. In 5+1 dimensions, the supercharges transform as 5+1-dimensional spinors. Those spinors of \(Spin(5,1)\) include two inequivalent chiral representations, just like in \(Spin(4)\approx SU(2)\times SU(2)\) – which has the same types of spinors and where we "cancelled" the temporal and spatial dimensions – so unlike the case of \(Spin(3,1)\) or \(D=4\), it's not enough to say "how much SUSY". You must say "how much left-handed SUSY" and "how much right-handed SUSY". The \((2,0)\) number means that it's enhanced supersymmetry but both supersymmetries have the same chirality in six dimensions.

There are numerous relations of this field theory to string/M-theory. It may be obtained as a low-energy limit of M5-brane dynamics within M-theory; or as some approximation of type IIB string theory on a singularity within a 4-dimensional Euclidean space. It follows that the large \(N\) limit of the \(SU(N)\) \((2,0)\) theories is Maldacena-dual to an \(AdS_7\times S^4\) compactification of M-theory. That's the main link to the higher-dimensional physics.

But the \((2,0)\) theory has lots of cool relationships to lower-dimensional (compared to six) physics. It has lots of descendants. The \((2,0)\) theory may be formally associated with an ADE gauge group – either \(SU(N)\) or \(SO(2N)\) or \(E_6,E_7,E_8\). The first two are infinite families, the latter three are isolated exceptions. Distler et al. studied the largest isolated exception.

The very existence of these theories – which is "somewhat" hypothetical and if you deny the existence of string/M-theory, you could probably consistently deny the very existence of the \((2,0)\) theories as well – is enough to prove many marvelous facts about quantum field theories and string theory vacua. In particular, the \(\NNN=4\) gauge theory in \(D=4\) has the \(SL(2,\ZZ)\) S-duality group. This duality is obvious if you realize that the \(D=4\) theory may be obtained by compactifying the \((2,0)\) theory on a 2-torus: the duality is nothing else than the group of "large" coordinate transformations on the two-torus.
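Concretely – this is just the standard way to phrase it, not anything specific to the new paper – the complex structure \(\tau\) of the two-torus is only defined up to the large coordinate transformations\[

\tau \to \frac{a\tau+b}{c\tau+d},\qquad a,b,c,d\in\ZZ,\quad ad-bc=1,

\] and once \(\tau\) is identified with the complexified coupling \(\tau = \frac{\theta}{2\pi}+\frac{4\pi i}{g^2}\) of the four-dimensional theory, this geometric redundancy becomes the \(SL(2,\ZZ)\) S-duality group.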

Distler et al., and the predecessors, consider a more complicated compactification than the simple two-torus. One replaces the two-torus by a more general Riemann surface, especially a simple two-sphere, one that may have "punctures", and there may be a "partial topological twist" associated with every "puncture". At any rate, there are many ways how to get rid of the two extra dimensions and dimensionally reduce the six-dimensional theory to a four-dimensional one.

One ends up with many \(\NNN=2\) theories in \(D=4\) – the "Seiberg-Witten" amount of supersymmetry in a four-dimensional spacetime. Depending on the types of topological twists etc., one gets many possible "descendants" of the six-dimensional theory. Some of them work, some of them don't work.

I can't explain all the details – one reason is that I obviously fail to understand all the details – but I want to make a general big-picture claim that is too philosophical to appear in such papers – and maybe the authors aren't sufficiently playful to invent it.

As I announced in the subtitle, this construction of the four-dimensional theories follows the logic of Darwin's evolution – with common ancestors and mutations – while the usual bottom-up construction of quantum field theories (a construction where we list the possible pieces, i.e. fields, and combine them and their interactions) is the creationist attitude to building quantum field theories.

If the species were identified with four-dimensional quantum field theories, they could arise in two basic ways: creation and evolution. Creation means that God asks His assistant for the building blocks and decides, as if he were an engineer, how to combine the building blocks into a four-dimensional theory with some fields and interactions. So God decides it would be fun to have something like an elephant, so He asks his assistant for a trunk and just constructs an elephant.

Gaiotto, Neitzke, Distler, Moore, and others are approaching the birth of the four-dimensional theories differently, by the evolutionary path. They pick their ancestor, in this case the \(E_8\) \((2,0)\) theory in six dimensions, and let it live. They pick several clones of it and mutate them – which happens in ordinary life – and they get various descendants with a modified DNA.

A big difference is that in the creationist attitudes to the spectrum of the quantum field theories, you basically impose your conditions "how the theory should behave at long distances" as the initial constraints that dictate everything. On the other hand, the stringy evolutionary attitude is different. You allow the choices that may occur at the fundamental, DNA level – like the partial topological twists assigned to the punctures – and what the theory looks like at low energies is a surprise.

Note that this is exactly how it works in real-world biological evolution. Some DNA may be mutated – DNA is more fundamental than the anatomy of an animal – and what anatomy and physiology the mutated animal ends up with is a surprise. Whether such a new animal species is "viable" is only determined later, by natural selection.

The stringy attitude based on the \((2,0)\) theory simply acknowledges that analogously to the DNA, some local choices in the Riemann surface of the 2 compactified dimensions are more fundamental, and the long-distance behavior of the resulting four-dimensional quantum field theory is an emerged, derived consequence of the more fundamental choices.

In biology, even if we didn't know about DNA, it would be more right to assume that "something like that exists" and is more fundamental (the more primary cause) than the anatomy of the animal species. This DNA-centered attitude is analogous to the stringy one and those who want to avoid the stringy origin of the theories are analogous to those who insist on creationism. Both of them imagine that it's scientifically OK to imagine that the Creator starts with the long-distance or anatomic result He wants to get – while the stringy or evolutionary folks know that the evolution of species or four-dimensional QFTs is a scientific process that has to obey some laws of Nature, too.

by Luboš Motl ( at February 28, 2018 07:17 AM

February 27, 2018

Tommaso Dorigo - Scientificblogging

Excited B_c State Seen By ATLAS Is Not Confirmed By LHCb
Statistical hypothesis testing is quite boring if you apply it to cases where you know the answer, or where the data speak loud and clear about one hypothesis being true or false. Life at the interface between testability and untestability is much more fun.

read more

by Tommaso Dorigo at February 27, 2018 04:58 PM

Lubos Motl - string vacua and pheno

Draining several parts of the swampland simultaneously
In 2005, my office was next to Cumrun Vafa's, so I could watch him while he was plotting his plans to drain the swampland, more than a decade before Donald Trump (well, "drain the swamp" has been used for very different goals since the 19th century). His work preceded the main wave of the anti-stringy hysteria – which came in 2006 when two notorious cranks published their books – but I think it was at least partly motivated by various ludicrous claims about "unpredictive string theory".

He pointed out – and tried to clearly articulate and decorate with a new term – something that all string theorists had always seen. The effective laws of physics that you may derive as long-distance approximations of string theory aren't just "any" or "generic" effective field theories. Effective field theories allow many features that are prohibited in string theory. String theory seems to imply some extra conditions and regularities that couldn't have been derived from effective field theory itself.

These days, the Weak Gravity Conjecture (WGC), the first hep-th preprint of 2006, is probably the most well-known example of these swampland-like restrictions. Some people even know the term WGC while they're ignorant about the general concept of the swampland.

But WGC is just one type of general predictions that are special for string theory. Cumrun pointed out that the volume of moduli spaces seems to be finite in string theory – while effective field theories seemed to allow both finite and infinite volumes.

Also, the number of field species seems to be finite, like the ranks and dimensions of gauge groups. And in May 2006, Ooguri and Vafa discussed some general but characteristically stringy behavior of the particle spectrum near special points of the moduli space. The density grows with the mass in a certain way. I would add that the omnipresent "towers of states" in string theory have to be there because they're the precursors to the exponentially growing towers of black hole microstates in quantum gravity.
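The best-remembered statement from that paper – I am just quoting the standard form of their "distance conjecture", not adding anything new – is that as you move a geodesic distance \(\Delta\phi\) in moduli space, an infinite tower of states becomes exponentially light,\[

m \sim m_0\, e^{-\alpha\, \Delta\phi / M_{\rm Pl}},

\] with \(\alpha\) of order one, so the effective description necessarily breaks down far away in field space.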

Among similar observations, we also find the observation that it's hard, to say the least, to realize "large scale inflation" within string theory. With canonically normalized kinetic terms for scalar fields, it seems hard to allow an inflaton to move by much more than one Planck mass. It seemed like string theory would prefer Guth's old inflation over Linde's new inflation.

In all of this business, the observed extra constraints were usually figured out by "experimentally watching" properties of string theory's vacua. But string theory is a consistent theory of quantum gravity and many of us find it likely that it's the only consistent theory of quantum gravity. So these stringy constraints may also be viewed as hypothetical restrictions that consistency of quantum gravity (a seemingly more general framework) imposes on effective field theories that may be incorporated.

More generally, seemingly non-stringy arguments may be constructed to argue that the conditions implied by string theory actually follow from the coherent unification of quantum mechanics and gravity. So far, all clear enough statements are consistent with the assumption that "string theory" and "consistent quantum gravity" are the same beast observed from different directions.

In the first hep-th preprint today,
Emergence and the Swampland Conjectures
Ben Heidenreich, Matthew Reece, and Tom Rudelius are trying to bring some order to these seemingly unrelated, chaotic observations about the "extra constraints" imposed by string theory. They derive the rich tower of light states from a new assumption they propose as a more fundamental one: the assumption that loop corrections drive both gravity and scalar interactions to strong coupling at the same scale (well, this is a sort of "soft unification" assumption and I will use that term for their assumption).

With this assumption, the collection of states that become light near a point of the moduli space automatically needs to be a "rich tower of states". The same "soft unification" assumption of theirs also seems to imply that the "large field inflation" should be prohibited.

These qualitative properties of string theory and/or quantum gravity should be understood increasingly well – including various logical relationships between seemingly independent assumptions of this sort. Along with some progress in the information loss paradox and entanglement/glue duality, a crisp new definition of string theory or quantum gravity could ultimately emerge in front of the eyes of someone whose thinking about the matters is simply clever.

by Luboš Motl ( at February 27, 2018 07:12 AM

February 25, 2018

February 22, 2018

Tommaso Dorigo - Scientificblogging

Diboson Resonances Review Available Online
This is just a short note - a bit of record-keeping, if you like - to report that my long review on "Collider Searches for Diboson Resonances" has now appeared on the online Elsevier site of the journal "Progress in Particle and Nuclear Physics".
I had previously pointed to the preprint version of the same article on this blog, with the aim of getting feedback from experts in the field, and I am happy that this has indeed happened: I was able to integrate some corrections from Robert Shrock, a theorist at SUNY, as well as some additions to the reference list from a couple of other colleagues.

read more

by Tommaso Dorigo at February 22, 2018 04:09 PM

Lubos Motl - string vacua and pheno

Questionable value of inequalities in physics
Bill Zajc brought my attention to a very good talk that Raphael Bousso gave about his recent and older work. Inequalities play a very important role in his work. I am much more willing to appreciate the value of an inequality now than I was when I was a kid or a teenager. But much of that old sentiment has survived: I don't really believe that a typical inequality tells us too much about the laws of physics.

First, my initial realization is that inequalities incorporate much less information than identities. Imagine that you're asked how much is \(8+9\). Many of you will be able to answer\[

8+9 = 17.

\] The percentage of TRF readers who can do it is significantly higher than in almost all other websites in the world. ;-) OK, but some people could also say that they're not quite sure but\[

8+9 \gt 10.

\] Eight plus nine is greater than ten, they figure out. That's nice and it happens to be true. But this truth is much less unique. In fact, someone else could say\[

8+9 \gt 12

\] which is another inequality of the same type – a strictly stronger one, in fact.

So inequalities seem to be "far less unique" than identities. You could ask what is the strongest possible inequality of this kind. The answer would be something like\[

8+9 \gt 16.999\dots

\] Well, there is no single "strongest" inequality of this kind because the set of numbers that are smaller than \(8+9\) has a supremum but not a maximum – the limit, \(17\), is already outside the set. So you may replace the inequality by the statement that "the supremum of a set is \(17\)" but if you do so, the statement becomes an equality or identity. It is no longer an inequality.

Now, if you have competed in mathematical olympiads, you must have solved a large number of problems of the form "prove the inequality [this or that]". There are lots of inequalities you may prove. For positive numbers \(X_i\gt 0\), \(i=1,2,\dots N\), the arithmetic average is larger than the geometric one:\[

\frac{1}{N}\sum_{i=1}^N X_i \geq \sqrt[N]{\prod_{i=1}^N X_i}.

\] Whenever \(X_1=X_2=\dots = X_N\) is violated, the sign \(\geq\) may be replaced with the sharp \(\gt\). That's great. As kids, you may have learned some proofs of that inequality – and similar ones. You may have invented your favorite proofs yourself. Some of the fancier, "adult" proofs could involve the search for the minimum using the vanishing derivative. Many of us loved to design such transparent proofs and we were sometimes told that these "proofs based on calculus weren't allowed". But the proofs based on calculus are straightforward. Even in the "worst possible case", the inequality still holds, so it holds everywhere.
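Just to show the flavor of the simplest such proof: for \(N=2\),\[

\frac{X_1+X_2}{2} - \sqrt{X_1 X_2} = \frac{\left(\sqrt{X_1}-\sqrt{X_2}\right)^2}{2} \geq 0,

\] with equality exactly when \(X_1=X_2\); the general \(N\) may then be reached by induction.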

I don't want to waste your time with particular proofs. But what I want to emphasize is that the inequalities – such as the ordering of the arithmetic and geometric mean – are purely mathematical results. You may prove them by pure thought. The inequalities have some assumptions, such as \(X_i\gt 0\) here, but everything else follows from the laws of mathematics.

A point you should notice is that no laws of physics are needed to prove a purely mathematical inequality. Equivalently, when you prove such an inequality, you're not learning anything about the laws of physics, either. Imagine that you may hire as many great pure mathematicians as you want. There are many candidates and most of them are unable to look for the right laws of physics – which needs some special creativity as well as a comparison with some empirical data.

With these employees, it's clear that you're no longer interested in the detailed proofs of the inequalities. There are many ways to prove an inequality. You're not even interested in the inequalities themselves – there are many inequalities you may write down, as the example \(8+9\gt 10\) or \(12\) was supposed to remind you.

Instead, with this team of collaborators, you will be interested in the assumptions that are needed to prove the inequality.

So the statements such as \(X_i\in \RR^+\) may remain important because they're the types of statements that remain relevant in physics. In the context of physics, we have lots of defining identities for physical quantities such as the density of the electromagnetic energy:\[

\rho = \frac{|\vec E|^2 + |\vec B|^2}{2}.

\] By pure mathematics, the real vectors \(\vec E,\vec B\) automatically give you \(\rho \geq 0\). Is that statement important? Is it fundamental? Well, it's important enough because you need the positivity of the energy to make many other, physically important statements. The vacuum is stable. Superluminal signals or tachyons are outlawed. And so on. But I would say that the statement isn't fundamental. It's a derived one, almost by construction.

In physics, the energy conditions – some variations of the positivity of the energy density – are an intermediate case. Sometimes, you want to view them as purely derived mathematical statements that follow from others. Sometimes, you want to view them as general axioms that constrain your theories – and these theories' formulae for the energy density \(\rho\) in terms of the more fundamental fields. Only in the second approach may the energy conditions become "fundamental". And I think that the fundamental status of such theories (or axiomatic systems) is unavoidably temporary.

As we agreed with Bill, there are two inequalities linked to important principles in old enough physics. One of them is\[

\Delta S \geq 0.

\] The entropy never (macroscopically) decreases. It's the second law of thermodynamics. Just like in the case of the energy conditions, it may be either viewed as an axiom or a fundamental principle; or as a derived mathematical statement.

In thermodynamics, the second law of thermodynamics is a fundamental principle. Thermodynamics was formulated before statistical physics. People were trying to construct a perpetuum mobile and after some failed attempts, they realized that the efforts were probably futile and their failures could have been generalized: the perpetuum mobile is impossible.

Some would-be perpetuum mobile gadgets are impossible because they violate the first law of thermodynamics, the energy conservation law. Others are impossible because they need the heat to move from the colder body to a warmer one, and processes like that are also impossible. They tried to think about the various ways to describe what's "wrong" about these apparently impossible processes and they invented the notion of entropy – decades before Ludwig Boltzmann wrote entropy as the logarithm of the number of macroscopically indistinguishable microstates:\[

S = k_B \ln N

\] Within Boltzmann's and other smart men's statistical physics, the second law becomes a mathematically derived law. The principle may suddenly be given a proof, and the proposition along with its proof is usually called the H-theorem. My personal favorite proof – discussed in many TRF blog posts – uses time reversal. The probability of the transition \(A\to B\) between two ensembles of microstates is related to the probability of \(B^* \to A^*\), the time-reversed evolution of the time-reversed states.

The probability for ensembles is calculated as a sum over the final microstates – \(B_i\) or \(A^*_j\) in this case. The summing appears because "OR" in the final state means that we don't care which microstate is obtained and the probabilities in this kind of final "OR" should be summed. But when it comes to the initial state, the probabilities should be averaged over the initial microstates. (The difference between summing and averaging – operations that take final and initial microstates into account – is the ultimate source of all the arrows of time. The past differs from the future already because of the basic calculus of probabilities applied to statements about events in time. Everyone who claims that there's no arrow of time at the level of basic probability and the asymmetry has to be artificially added by some engineering of the laws of physics – e.g. Sean Carroll – is a plain moron.) The averaging could be arithmetic but it could have some unequal weights, too. "OR" in the assumptions or the initial state means that the initial pie has to be divided to slices and the evolution of the slices has to be computed separately. The factor of \(1/N_{\rm initial}\) arises from the need to divide the initial pie of 100% of the probability.

OK, so the probability for \(P(A\to B)\) is a sum over the \(A_i,B_j\) microstates with the extra factor of \(1/N_A\); for \(P(B^*\to A^*)\), the extra factor is \(1/N_{B^*} = 1/N_B\). The numbers \(N_A,N_B\) may be written as the exponentiated entropies, \(N_A = \exp(S_A/k_B)\) etc., and when the entropies of \(A,B\) are macroscopically different, \(N_A,N_B\) differ by a huge number of orders of magnitude. Probabilities cannot exceed one so at most one of the two probabilities is comparable to one, the other must be infinitesimal i.e. basically zero. The probability that may be comparable to 100% is the probability of the evolution from a smaller \(N_A\) to a larger \(N_B\) because the fraction \(1/N_A\) isn't suppressing the number so much; the reverse evolution is prohibited! That's a very general proof of the second law. The conclusion is that either the probability \(P(A\to B)=0\) or \(N_A\leq N_B\).
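In formulas, the argument of the previous paragraphs is simply\[

P(A\to B) = \frac{1}{N_A}\sum_{i,j} P(A_i\to B_j),\qquad P(B^*\to A^*) = \frac{1}{N_B}\sum_{i,j} P(B_j^*\to A_i^*),

\] and because the microscopic time-reversal symmetry equates the summands term by term, the ratio of the two macroscopic probabilities is\[

\frac{P(A\to B)}{P(B^*\to A^*)} = \frac{N_B}{N_A} = \exp\left(\frac{S_B-S_A}{k_B}\right),

\] which is a huge number whenever \(S_B\gt S_A\) macroscopically – so only the entropy-increasing direction may have a probability comparable to one.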

That's nice. Statistical physics has allowed us to demystify the principles of thermodynamics. These principles are suddenly mathematical implications of models we have constructed – a huge class of models (the proof is easily generalized to quantum mechanics, too). It's a great story from the history of physics.

With hindsight, was the inequality \(\Delta S\gt 0\) important? And what did it allow us to do? Well, I would say that the thermodynamical version of the second law – when it was an unquestioned principle – was useful mainly practically. It has saved lots of time for sensible practical people who could have developed new engines instead of wasting time with the hopeless task to build a perpetuum mobile. Thermodynamics has been praised by Einstein as a principled counterpart of relativity – a theorists' invention par excellence. However, there's an equally good viewpoint that dismisses thermodynamics as a method of ultimate bottom-up phenomenologists if not engineers!

Those people were mostly practical men, not theorists. Did the principle help theorists to build better theories of Nature? I am not so sure. Well, people finally built statistical physics and understood the importance of the atoms, their entropy etc. But that progress didn't directly follow from the principles of thermodynamics. One may verify that the atomic hypothesis and statistical physics allow us to justify lots of previous knowledge from thermodynamics and other branches of physics. But you need to guess that there are atoms and you should count their microstates after an independent divine intervention. The principles of thermodynamics alone aren't a sufficient guide.

And if you only want to understand the laws of Nature "in principle", one could even argue, in an extreme way, that you don't need the second law of thermodynamics at all. Without some understanding of the law, you would have no idea what you should build as an engineer etc. Well, your ignorance would be embarrassing and hurtful even for some folks who are much more theoretical than inventors building new engines. But it's still true that from an extreme theorist's perspective, the second law of thermodynamics is just one mathematical consequence of your laws of physics for the microstates, one that you don't need to know if you want to claim that you understand how Nature works in principle. (Just to be sure, I don't invite young readers to become theorists who are such extreme Fachidiots. Sometimes it's useful to know that there's some world around you.)

The second big inequality of well-established physics I want to mention is the uncertainty principle, e.g. in the form\[

\Delta X \cdot \Delta P \geq\frac{\hbar}{2}.

\] Using the apparatus of the wave functions, that inequality may be proven – a more general one may be proven for any pair of operators \(F,G\) and the commutator appears on the right hand side in that general case. For \(X,P\), the inequality is saturated by Gaussian packets moved anywhere in the phase space. Again, the inequality may be understood in two historically different mental perspectives:
  • as a principle that tells us something deeply surprising and incompatible with the implicit assumptions of many people who thought about physics so far
  • as a derived mathematical consequence of some laws after those laws become known.
These two stages are analogous to the second law of thermodynamics. That law was first found as a "moral guess" explaining the continuing failure of the (not yet?) crackpots who had been working on perpetuum mobile gadgets. Those generally assumed that it was possible but the principle says that it isn't possible. Here, almost all physicists assumed it was always in principle possible to measure the position and momentum at the same time but it isn't possible.

The second role of the uncertainty principle is a derived mathematical fact that follows from some computations involving wave functions, their inner products, and matrix elements of linear operators. That's analogous to the H-theorem – the inequality is derived from something else. That "something else" is finally more important for practicing physicists. In particular, \(XP-PX = i\hbar\) is an identity that replaces the inequality above. This identity, a nonzero commutator, is more specific and useful than the inequality although by some slightly creative thinking, one could argue that the nonzero commutator "directly" follows from the inequality.
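For the record, the general form alluded to above – the Robertson relation – reads\[

\Delta F\cdot \Delta G \geq \frac{1}{2}\left|\langle [F,G]\rangle\right|,

\] and it reduces to \(\Delta X\cdot\Delta P\geq \hbar/2\) once you insert \(F=X\), \(G=P\) and the commutator \(XP-PX=i\hbar\).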

Being skeptical about the value of inequalities since my childhood, I have gradually refined my attitude to similar claimed breakthroughs. If someone talks about some important new inequality, I want to know whether it is a postulated principle – that cannot be proven at this moment – or a derived mathematical fact. If it is a derived mathematical fact, I want to know what it is derived from, what are the most nontrivial assumptions that you need to make to complete the proof. These assumptions may be more important than the final inequality.

If it is claimed to be a postulated principle without a proof, I want to know what is the evidence, or at least what problems such an inequality would explain, and whether the inequality is at least partially canonical or unique, or whether it is similar to \(9+8\gt 10\). My general attitude is: Don't get carried away when someone tries to impress you with a new inequality. Inequalities may be cheap and non-unique.

The second law of thermodynamics and the uncertainty principle were examples of "valuable inequalities in well-established physics". The energy conditions arguably belong to that category, too. In the context of general relativity, the Null Energy Condition is the one that is most credible. It makes sense to believe that \(T_{\mu\nu} k^\mu k^\nu \geq 0\) for any null vector \(k^\mu\). When some cosmological constant is nonzero, you probably need to add some terms, and when some entropy flows through some places, you need to fix it, too. Raphael knows that the Null Energy Condition (NEC) is the most appropriate one among the possible energy conditions. I think that good physicists generally agree here. Weak and Strong Energy Conditions may superficially look natural but there are proofs of concept indicating that both may be violated.

The vague concept of the energy condition is important because that's what may be linked to the stability of the vacuum – if energy density could be negative, clumps of positive- and negative-energy matter could spontaneously be born in the vacuum, without violating the energy conservation law, and that would mean that the vacuum is unstable. Related, almost equivalent, consequences would be traversable wormholes, tachyons, superluminal signals and influences, and so on. One may show the equivalence between these principles – or the pathologies that violate the principles – by thinking about some special backgrounds etc.

One may also think about more complex and more general backgrounds and situations and look for more complicated versions of "the" inequalities. What is "the" generalization of the inequalities for one case or another? Well, I am afraid that there isn't necessarily a good canonical answer. The set of inequalities in mathematics contains lots of useless gibberish such as \(9+8\gt 10\) and it seems that if "being a generalization of some other inequality" is your only criterion, you're still far more likely to find gibberish than something important.

When it comes to holography, we generally agree that the bound on entropy of the form\[

S \leq \frac{A}{4G}

\] is the most general and "deep" insight of that type. Jacob Bekenstein was essential in finding this kind of law. But he also found the other Bekenstein bounds. There were various products of energies and radii on the right hand side. These laws generally applied to static situations only. But were the laws true? And were they fundamental?

Well, I don't know the answer to the first question but what I can say is that I don't really care. If it's true that the entropy is never larger than some product of a radius and the energy defined in a certain way, well, then it's true. But I will only believe it if you give me some proof. And the proof will unavoidably be similar to a proof of a purely mathematical inequality, such as the ordering of the arithmetic and geometric means. And when something may be proven purely mathematically, there's just no physical beef. Some proposed inequalities may be proven to be true, others may be proven to be false. But both groups contain infinitely many inequalities and most of them aren't really special or insightful. So why should we care about them? They will remain mathematical technicalities. They can't become deep physical principles – those should be more verbal or philosophical in character.

Some two decades ago, Raphael Bousso began to produce the covariant entropy bounds. They were supposed to hold for any time-dependent background, and the maximum entropy crossing a surface \(\Sigma\) could be bounded by some expressions such as \(A/4G\) for some properly, nontrivially chosen area \(A\), assuming that the surface \(\Sigma\) was null. Despite the fact that I think that the choice of the null slices proves Bousso's very good taste and is more likely to be on the right track than spacelike or timelike slices, I still feel that the non-uniqueness of such inequalities may be even more extreme than in the examples of the assorted static "Bekenstein bounds", and I haven't ever cared even about those.

In all such cases, I want to know what are the most nontrivial assumptions from which such inequalities, assuming they are true, may be proven – in that case, I am more interested in these "more fundamental" assumptions than the inequalities themselves. And if the inequalities are sold as principles with consequences, I want to know what are the proposed consequences, why it's better to believe in the inequality than in its violation. So I want to know either some assumption or consequences of such inequalities that are already considered important in the physics research, otherwise the whole game seems to be a purely mathematical and redundant addition to physics – not too different from a solution to a particular and randomly chosen exam problem.

That seems important to me because lots of this "unregulated search for new principles" is nothing else than indoctrination. Penrose's Cosmic Censorship Conjecture is an example. It may be a rather interesting – perhaps mathematical – question about solutions to the classical general relativity. But Penrose also offered us a potential answer, without a proof, at the same moment. And because he was so famous, people started to prefer his answer over its negation even though there was no truly rational reason for that attitude. With influences by famous people like that, physics may easily deteriorate to a religious cult. And the faith in the Cosmic Censorship Conjecture has been a religious cult of a sort. Even the weakest "flavors" extracted from the Cosmic Censorship Conjecture are considered false in \(D\gt 4\) these days.

The holographic and entropy bounds are supposed to be very important because they should lead us to a deeper, more holographic way to formulate the laws of quantum gravity. But is that hope justifiable? We saw that even in the case of the second law of thermodynamics where the relationship was actually correct, the principle of thermodynamics wasn't a terribly constructive guide in the search for statistical physics and the atomic hypothesis. In the case of the entropy bounds, we may expect that those won't be too helpful guides, either. On top of that, the very "sketch of the network of laws" may be invalid. The fundamental laws of quantum gravity may invalidate the entropy bounds in the Planckian regime and so on.

So it's possible that something comes out of these considerations but one must be careful not to get brainwashed. These covariant entropy bounds and similar stuff were a set of ideas that was supposed to lead to insights such as "entanglement is glue" – to the entanglement minirevolution in quantum gravity. But the historical fact seems to be that the entanglement minirevolution was started by very different considerations. As guides, the covariant entropy bounds etc. turned out to be rather useless.

One must be equally careful not to get brainwashed by a religious faith in the case of the Weak Gravity Conjecture (WGC), another inequality or a family of inequalities that I co-authored. Gravity is the weakest force, a general principle of quantum gravity seems to say. What it means is that we must be able to find (light enough) particle species whose gravitational attraction (to another copy of the same particle) is weaker than the electromagnetic or similar non-gravitational force between the two particles.
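In its simplest four-dimensional form – and up to order-one factors that depend on the precise formulation – the statement is that for a \(U(1)\) with gauge coupling \(g\), there must exist a particle of mass \(m\) and charge \(q\) obeying\[

m \lesssim g\, q\, M_{\rm Pl},

\] i.e. a state whose charge-to-mass ratio is at least that of an extremal black hole, so that extremal black holes are able to shed their charge by emitting it.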

There are reasons to think it's true that are "principled" – and therefore analogous to the "non-existence of perpetuum mobile of the second kind" or the "tachyons and traversable wormholes" in the case of energy conditions. Among them, you find the required non-existence of remnants and the need for extremal black holes to Hawking evaporate, after all. And there are also ways to argue that the Weak Gravity Conjecture is true because it's a derived fact from some stringy vacua – although these proofs are less rigorous at this moment, they're analogous to the derivation of the H-theorem encoding the second law of thermodynamics.

We would like to know the most exact form of the WGC for the most general vacuum of quantum gravity. And we would also like to find the theory (probably a new description of string theory) that makes the validity of this inequality manifest. So far, the proofs of WGC may exist within families of string vacua (or descriptions) but the proof heavily depends on the description.

I think it's fair to say that – unless I missed something – there is no solid reason to think that there exists "the" canonical form of the most general WGC-style inequality. The existence of a unique inequality is wishful thinking. Lots of inequalities may still be true but they may resemble \(8+9\gt 10\). So people must be warned. All of it looks very interesting but you may end up looking for a holy grail that doesn't exist. Well, it may exist but I can't guarantee (prove) it for you.

And even if we understood the most general form of WGC and what it implies for many vacua, would it help us to find the deeper formulation of string/M-theory? This statement is also uncertain. String theory seems to be more predictive than effective quantum field theories where WGC may apparently be easily violated. But effective QFTs probably mistreat the black hole interior and other things. Maybe if you just require some higher spacetime consistency, the WGC may follow – directly from some refined pseudo-local spacetime treatment. There have been lots of interesting papers linking the validity of WGC to other, seemingly totally different inequalities – well, including the aforementioned Cosmic Censorship Conjecture.

Many of us still feel that some very deep insights could be waiting for those who "play" with very similar ideas, but this belief isn't demonstrated and it may turn out to be false, too. I want people to think hard about it but only if they realize that no one can promise them that such a search will lead to a breakthrough. Even if someone found a real breakthrough while playing with the WGC, I wouldn't take credit for it, because what the person happens to be playing with "right before" she makes the big new discovery could be a coincidence.

At the end, even if the thinking about WGC could help you to think about the "right type of questions" – how is it possible that string theory imposes this constraint that effective QFTs seem to be indifferent to – there are probably other and perhaps more direct ways, completely avoiding the WGC, to get to the deeper principles. There have been ways to formulate quantum mechanics without thinking about the Heisenberg inequality, too. After all, Heisenberg wrote down quantum mechanics in 1925 and the inequality was only pointed out in 1927 – two hours later! (OK, without the warp speed, it was two years, thanks to Paul.) So by basic chronology, the inequality couldn't have been too useful in the search for the new laws of modern physics – quantum mechanics. At the level of chronology, the example of the uncertainty principle is different from the example of the second law.

When we generalize these thoughts a little bit more, it seems reasonable that bright people who play with these and similar ideas are more likely to make a breakthrough. But inequalities such as generalized energy conditions, generalized holographic bounds, and weak-gravity-like conditions are just landmarks that you may use for orientation, so that you don't get lost in the spacetimeless realm without any fixed point. And there's no "really strong evidence" supporting the belief that playing with such inequalities will be very helpful. It might be that most of the work spent on games like that will be analogous to the purely mathematical efforts designed to prove mathematical inequalities such as the inequality between the arithmetic and geometric means.

At the end, what we really want are the truly fundamental new principles and I think that inequalities can't be new principles of full-blown (e.g. constructive) theories.

And that's the memo.

by Luboš Motl ( at February 22, 2018 03:14 PM

Clifford V. Johnson - Asymptotia

A Chat with Henry Jenkins!

Yesterday Henry Jenkins and I had a great chat as a Facebook Live event. The video is here. The conversation started with the movie Black Panther, but wandered into many topics related to culture, media, science, representation, and beyond. Among other things, we talked about what we enjoyed about the movie, what graphic novels and comics we're reading now, and what comics source material we'd love to see given a film treatment. Oh, yes, we also mentioned The Dialogues!


-cvj Click to continue reading this post

The post A Chat with Henry Jenkins! appeared first on Asymptotia.

by Clifford at February 22, 2018 03:05 PM

Clifford V. Johnson - Asymptotia

Talk Nerdy!

I was on the Talk Nerdy podcast recently, talking with host Cara Santa Maria about all sorts of things. It was a fun conversation ranging over many topics in science, including some of the latest discoveries in astronomy using gravitational waves in concert with traditional telescopes to learn new things about our universe. And yes, my book The Dialogues was discussed too! A link to the podcast is here. You can find Talk Nerdy on many of your favourite podcast platforms. Why not subscribe? The whole show is full of great conversations!

-cvj Click to continue reading this post

The post Talk Nerdy! appeared first on Asymptotia.

by Clifford at February 22, 2018 02:35 PM

February 08, 2018

Sean Carroll - Preposterous Universe

Why Is There Something, Rather Than Nothing?

A good question!

Or is it?

I’ve talked before about the issue of why the universe exists at all (1, 2), but now I’ve had the opportunity to do a relatively careful job with it, courtesy of Eleanor Knox and Alastair Wilson. They are editing an upcoming volume, the Routledge Companion to the Philosophy of Physics, and asked me to contribute a chapter on this topic. Final edits aren’t done yet, but I’ve decided to put the draft on the arxiv:

Why Is There Something, Rather Than Nothing?
Sean M. Carroll

It seems natural to ask why the universe exists at all. Modern physics suggests that the universe can exist all by itself as a self-contained system, without anything external to create or sustain it. But there might not be an absolute answer to why it exists. I argue that any attempt to account for the existence of something rather than nothing must ultimately bottom out in a set of brute facts; the universe simply is, without ultimate cause or explanation.

As you can see, my basic tack hasn’t changed: this kind of question might be the kind of thing that doesn’t have a sensible answer. In our everyday lives, it makes sense to ask “why” this or that event occurs, but such questions have answers only because they are embedded in a larger explanatory context. In particular, the world of our everyday experience is an emergent approximation with an extremely strong arrow of time, such that we can safely associate “causes” with subsequent “effects.” The universe, considered as all of reality (i.e. let’s include the multiverse, if any), isn’t like that. The right question to ask isn’t “Why did this happen?”, but “Could this have happened in accordance with the laws of physics?” As far as the universe and our current knowledge of the laws of physics is concerned, the answer is a resounding “Yes.” The demand for something more — a reason why the universe exists at all — is a relic piece of metaphysical baggage we would be better off to discard.

This perspective gets pushback from two different sides. On the one hand we have theists, who believe that they can answer why the universe exists, and the answer is God. As we all know, this raises the question of why God exists; but aha, say the theists, that’s different, because God necessarily exists, unlike the universe which could plausibly have not. The problem with that is that nothing exists necessarily, so the move is pretty obviously a cheat. I didn’t have a lot of room in the paper to discuss this in detail (in what after all was meant as a contribution to a volume on the philosophy of physics, not the philosophy of religion), but the basic idea is there. Whether or not you want to invoke God, you will be left with certain features of reality that have to be explained by “and that’s just the way it is.” (Theism could possibly offer a better account of the nature of reality than naturalism — that’s a different question — but it doesn’t let you wiggle out of positing some brute facts about what exists.)

The other side are those scientists who think that modern physics explains why the universe exists. It doesn’t! One purported answer — “because Nothing is unstable” — was never even supposed to explain why the universe exists; it was suggested by Frank Wilczek as a way of explaining why there is more matter than antimatter. But any such line of reasoning has to start by assuming a certain set of laws of physics in the first place. Why is there even a universe that obeys those laws? This, I argue, is not a question to which science is ever going to provide a snappy and convincing answer. The right response is “that’s just the way things are.” It’s up to us as a species to cultivate the intellectual maturity to accept that some questions don’t have the kinds of answers that are designed to make us feel satisfied.

by Sean Carroll at February 08, 2018 05:19 PM

February 07, 2018

Axel Maas - Looking Inside the Standard Model

How large is an elementary particle?
Recently, in the context of a master thesis, our group has begun to determine the size of the W boson. The natural questions about this project are: Why do you do that? Do we not know it already? And do elementary particles have a size at all?

It is best to answer these questions in reverse order.

So, do elementary particles have a size at all? Well, elementary particles are called elementary as they are the most basic constituents. In our theories today, they start out as pointlike. Only particles made from other particles, so-called bound states like a nucleus or a hadron, have a size. And now comes the but.

First of all, we do not yet know whether our elementary particles are really elementary. They may also be bound states of even more elementary particles. But in experiments we can only determine upper bounds on the size. Making better experiments will reduce this upper bound. Eventually, we may see that a particle previously thought of as point-like has a size. This has happened quite frequently over time, and it has always opened up a new level of elementary particle theories. Therefore measuring the size is important. But for us, as theoreticians, this type of question is only important if we have an idea about what the more elementary particles could be. And while some of our research is going in this direction, this project is not.

The other issue is that quantum effects give all elementary particles an 'apparent' size. This comes about because of how we measure the size of a particle. We do this by shooting some other particle at it and measuring how strongly it becomes deflected. A truly pointlike particle has a very characteristic deflection profile. But quantum effects allow for additional particles to be created and destroyed in the vicinity of any particle. In particular, they allow for the existence of another particle of the same type, at least briefly. We cannot distinguish whether we hit the original particle or one of these. Since they are not at the same place as the original particle, their average distance looks like a size. This gives even a pointlike particle an apparent size, which we can measure. In this sense even an elementary particle has a size.

So, how can we then distinguish this size from an actual size of a bound state? We can do this by calculations. We determine the apparent size due to the quantum fluctuations and compare it to the measurement. Deviations indicate an actual size. This is because for a real bound state we can scatter somewhere in its structure, and not only in its core. This difference looks pictorially like this:

So, do we know the size already? Well, as said, we can only determine upper limits. Searching for them is difficult, and often goes via detours. One such detour is via so-called anomalous couplings. Measuring how they depend on energy provides indirect information on the size. There is an active program underway at CERN to do this experimentally. The results so far say that the size of the W is below 10^-16 meters (0.0000000000000001 m). This seems tiny, but in the world of particle physics this is not that strong a limit.
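A standard way to phrase such a size limit in scattering language is through a form factor F(q^2), which describes how the deflection pattern deviates from that of a point. This is a textbook parameterization added here for illustration, not a formula taken from the project described in the post:

F(q^2) \approx 1 - \frac{\langle r^2 \rangle}{6}\, q^2 + \dots

A truly pointlike particle has F(q^2) = 1 for every momentum transfer q; a nonzero slope at q^2 = 0 defines a mean squared radius \langle r^2 \rangle, and, loosely speaking, the quoted limit says that any such radius for the W must lie below roughly 10^-16 m.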

And now the interesting question: Why do we do this? As written, we do not want to make the W a bound state of something new. But one of our main research topics is driven by an interesting theoretical structure. If the standard model is taken seriously, the particle which we observe in an experiment and call the W is actually not the W of the underlying theory. Rather, it is a bound state, which is very, very similar to the elementary particle, but actually built from the elementary particles. The difference has been so small that identifying one with the other was a very good approximation up to today. But with better and better experiments this may change. Thus, we need to test this.

Because the thing we measure is then a bound state, it should have a (probably tiny) size. This would be a hallmark of this theoretical structure, and a sign that we understood it. If the size is such that it could actually be measured at CERN, then this would be an important test of our theoretical understanding of the standard model.

However, this is not a simple quantity to calculate. Bound states are intrinsically complicated. Thus, we use simulations for this purpose. In fact, we take the same detour as the experiments, and will determine an anomalous coupling. From this we then infer the size indirectly. In addition, the need to perform efficient simulations forces us to simplify the problem substantially. Hence, we will not get the perfect number. But we may get the order of magnitude, or perhaps be within a factor of two or so. And this is all we currently need to say whether a measurement is possible, or whether this will have to wait for the next generation of experiments. And thus whether we will know within a few years, or within a few decades, whether we understood the theory.

by Axel Maas ( at February 07, 2018 11:18 AM

February 05, 2018

Matt Strassler - Of Particular Significance

In Memory of Joe Polchinski, the Brane Master

This week, the community of high-energy physicists — of those of us fascinated by particles, fields, strings, black holes, and the universe at large — is mourning the loss of one of the great theoretical physicists of our time, Joe Polchinski. It pains me deeply to write these words.

Everyone who knew him personally will miss his special qualities — his boyish grin, his slightly wicked sense of humor, his charming way of stopping mid-sentence to think deeply, his athleticism and friendly competitiveness. Everyone who knew his research will feel the absence of his particular form of genius, his exceptional insight, his unique combination of abilities, which I’ll try to sketch for you below. Those of us who were lucky enough to know him both personally and scientifically — well, we lose twice.


Polchinski — Joe, to all his colleagues — had one of those brains that works magic, and works magically. Scientific minds are as individual as personalities. Each physicist has a unique combination of talents and skills (and weaknesses); in modern lingo, each of us has a superpower or two. Rarely do you find two scientists who have the same ones.

Joe had several superpowers, and they were really strong. He had a tremendous knack for looking at old problems and seeing them in a new light, often overturning conventional wisdom or restating that wisdom in a new, clearer way. And he had prodigious technical ability, which allowed him to follow difficult calculations all the way to the end, on paths that would have deterred most of us.

One of the greatest privileges of my life was to work with Joe, not once but four times. I think I can best tell you a little about him, and about some of his greatest achievements, through the lens of that unforgettable experience.

[To my colleagues: this post was obviously written in trying circumstances, and it is certainly possible that my memory of distant events is foggy and in error.  I welcome any corrections that you might wish to suggest.]

Our papers between 1999 and 2006 were a sequence of sorts, aimed at understanding more fully the profound connection between quantum field theory — the language of particle physics — and string theory — best-known today as a candidate for a quantum theory of gravity. In each of those papers, as in many thousands of others written after 1995, Joe’s most influential contribution to physics played a central role. This was the discovery of objects known as “D-branes”, which he found in the context of string theory. (The term is a generalization of the word `membrane’.)

I can already hear the polemical haters of string theory screaming at me. ‘A discovery in string theory,’ some will shout, pounding the table, ‘an untested and untestable theory that’s not even wrong, should not be called a discovery in physics.’ Pay them no mind; they’re not even close, as you’ll see by the end of my remarks.

The Great D-scovery

In 1989, Joe, working with two young scientists, Jin Dai and Rob Leigh, was exploring some details of string theory, and carrying out a little mathematical exercise. Normally, in string theory, strings are little lines or loops that are free to move around anywhere they like, much like particles moving around in this room. But in some cases, particles aren’t in fact free to move around; you could, for instance, study particles that are trapped on the surface of a liquid, or trapped in a very thin whisker of metal. With strings, there can be a new type of trapping that particles can’t have — you could perhaps trap one end, or both ends, of the string within a surface, while allowing the middle of the string to move freely. The place where a string’s end may be trapped — whether a point, a line, a surface, or something more exotic in higher dimensions — is what we now call a “D-brane”.  [The `D’ arises for uninteresting technical reasons.]

Joe and his co-workers hit the jackpot, but they didn’t realize it yet. What they discovered, in retrospect, was that D-branes are an automatic feature of string theory. They’re not optional; you can’t choose to study string theories that don’t have them. And they aren’t just surfaces or lines that sit still. They’re physical objects that can roam the world. They have mass and create gravitational effects. They can move around and scatter off each other. They’re just as real, and just as important, as the strings themselves!


Fig. 1: D branes (in green) are physical objects on which a fundamental string (in red) can terminate.

It was as though Joe and his collaborators started off trying to understand why the chicken crossed the road, and ended up discovering the existence of bicycles, cars, trucks, buses, and jet aircraft.  It was that unexpected, and that rich.

And yet, nobody, not even Joe and his colleagues, quite realized what they’d done. Rob Leigh, Joe’s co-author, had the office next to mine for a couple of years, and we wrote five papers together between 1993 and 1995. Yet I think Rob mentioned his work on D-branes to me just once or twice, in passing, and never explained it to me in detail. Their paper had less than twenty citations as 1995 began.

In 1995 the understanding of string theory took a huge leap forward. That was the moment when it was realized that all five known types of string theory are different sides of the same die — that there’s really only one string theory.  A flood of papers appeared in which certain black holes, and generalizations of black holes — black strings, black surfaces, and the like — played a central role. The relations among these were fascinating, but often confusing.

And then, on October 5, 1995, a paper appeared that changed the whole discussion, forever. It was Joe, explaining D-branes to those of us who’d barely heard of his earlier work, and showing that many of these black holes, black strings and black surfaces were actually D-branes in disguise. His paper made everything clearer, simpler, and easier to calculate; it was an immediate hit. By the beginning of 1996 it had 50 citations; twelve months later, the citation count was approaching 300.

So what? Great for string theorists, but without any connection to experiment and the real world.  What good is it to the rest of us? Patience. I’m just getting to that.

What’s it Got to Do With Nature?

Our current understanding of the make-up and workings of the universe is in terms of particles. Material objects are made from atoms, themselves made from electrons orbiting a nucleus; and the nucleus is made from neutrons and protons. We learned in the 1970s that protons and neutrons are themselves made from particles called quarks and antiquarks and gluons — specifically, from a “sea” of gluons and a few quark/anti-quark pairs, within which sit three additional quarks with no anti-quark partner… often called the `valence quarks’.  We call protons and neutrons, and all other particles with three valence quarks, `baryons”.   (Note that there are no particles with just one valence quark, or two, or four — all you get is baryons, with three.)

In the 1950s and 1960s, physicists discovered short-lived particles much like protons and neutrons, with a similar sea, but which  contain one valence quark and one valence anti-quark. Particles of this type are referred to as “mesons”.  I’ve sketched a typical meson and a typical baryon in Figure 2.  (The simplest meson is called a “pion”; it’s the most common particle produced in the proton-proton collisions at the Large Hadron Collider.)



Fig. 2: Baryons (such as protons and neutrons) and mesons each contain a sea of gluons and quark-antiquark pairs; baryons have three unpaired “valence” quarks, while mesons have a valence quark and a valence anti-quark.  (What determines whether a quark is valence or sea involves subtle quantum effects, not discussed here.)

But the quark/gluon picture of mesons and baryons, back in the late 1960s, was just an idea, and it was in competition with a proposal that mesons are little strings. These are not, I hasten to add, the “theory of everything” strings that you learn about in Brian Greene’s books, which are a billion billion times smaller than a proton. In a “theory of everything” string theory, often all the types of particles of nature, including electrons, photons and Higgs bosons, are tiny tiny strings. What I’m talking about is a “theory of mesons” string theory, a much less ambitious idea, in which only the mesons are strings.  They’re much larger: just about as long as a proton is wide. That’s small by human standards, but immense compared to theory-of-everything strings.

Why did people think mesons were strings? Because there was experimental evidence for it! (Here’s another example.)  And that evidence didn’t go away after quarks were discovered. Instead, theoretical physicists gradually understood why quarks and gluons might produce mesons that behave a bit like strings. If you spin a meson fast enough (and this can happen by accident in experiments), its valence quark and anti-quark may separate, and the sea of objects between them forms what is called a “flux tube.” See Figure 3. [In certain superconductors, somewhat similar flux tubes can trap magnetic fields.] It’s kind of a thick string rather than a thin one, but still, it shares enough properties with a string in string theory that it can produce experimental results that are similar to string theory’s predictions.


Fig. 3: One reason mesons behave like strings in experiment is that a spinning meson acts like a thick string, with the valence quark and anti-quark at the two ends.

And so, from the mid-1970s onward, people were confident that quantum field theories like the one that describes quarks and gluons can create objects with stringy behavior. A number of physicists — including some of the most famous and respected ones — made a bolder, more ambitious claim: that quantum field theory and string theory are profoundly related, in some fundamental way. But they weren’t able to be precise about it; they had strong evidence, but it wasn’t ever entirely clear or convincing.

In particular, there was an important unresolved puzzle. If mesons are strings, then what are baryons? What are protons and neutrons, with their three valence quarks? What do they look like if you spin them quickly? The sketches people drew looked something like Figure 3. A baryon would perhaps become three joined flux tubes (with one possibly much longer than the other two), each with its own valence quark at the end.  In a stringy cartoon, that baryon would be three strings, each with a free end, with the strings attached to some sort of junction. This junction of three strings was called a “baryon vertex.”  If mesons are little strings, the fundamental objects in a string theory, what is the baryon vertex from the string theory point of view?!  Where is it hiding — what is it made of — in the mathematics of string theory?


Fig. 4: A fast-spinning baryon looks vaguely like the letter Y — three valence quarks connected by flux tubes to a “baryon vertex”.  A cartoon of how this would appear from a stringy viewpoint, analogous to Fig. 3, leads to a mystery: what, in string theory, is this vertex?!

[Experts: Notice that the vertex has nothing to do with the quarks. It’s a property of the sea — specifically, of the gluons. Thus, in a world with only gluons — a world whose strings naively form loops without ends — it must still be possible, with sufficient energy, to create a vertex-antivertex pair. Thus field theory predicts that these vertices must exist in closed string theories, though they are linearly confined.]


The baryon puzzle: what is a baryon from the string theory viewpoint?

No one knew. But isn’t it interesting that the most prominent feature of this vertex is that it is a location where a string’s end can be trapped?

Everything changed in the period 1997-2000. Following insights from many other physicists, and using D-branes as the essential tool, Juan Maldacena finally made the connection between quantum field theory and string theory precise. He was able to relate strings with gravity and extra dimensions, which you can read about in Brian Greene’s books, with the physics of particles in just three spatial dimensions, similar to those of the real world, with only non-gravitational forces.  It was soon clear that the most ambitious and radical thinking of the ’70s was correct — that almost every quantum field theory, with its particles and forces, can alternatively be viewed as a string theory. It’s a bit analogous to the way that a painting can be described in English or in Japanese — fields/particles and strings/gravity are, in this context, two very different languages for talking about exactly the same thing.

The saga of the baryon vertex took a turn in May 1998, when Ed Witten showed how a similar vertex appears in Maldacena’s examples. [Note added: I had forgotten that two days after Witten’s paper, David Gross and Hirosi Ooguri submitted a beautiful, wide-ranging paper, whose section on baryons contains many of the same ideas.] Not surprisingly, this vertex was a D-brane — specifically a D-particle, an object on which the strings extending from freely-moving quarks could end. It wasn’t yet quite satisfactory, because the gluons and quarks in Maldacena’s examples roam free and don’t form mesons or baryons. Correspondingly the baryon vertex isn’t really a physical object; if you make one, it quickly diffuses away into nothing. Nevertheless, Witten’s paper made it obvious what was going on. To the extent real-world mesons can be viewed as strings, real-world protons and neutrons can be viewed as strings attached to a D-brane.


The baryon puzzle, resolved.  A baryon is made from three strings and a point-like D-brane. [Note there is yet another viewpoint in which a baryon is something known as a skyrmion, a soliton made from meson fields — but that is an issue for another day.]

It didn’t take long for more realistic examples, with actual baryons, to be found by theorists. I don’t remember who found one first, but I do know that one of the earliest examples showed up in my first paper with Joe, in the year 2000.


Working with Joe

That project arose during my September 1999 visit to the KITP (Kavli Institute for Theoretical Physics) in Santa Barbara, where Joe was a faculty member. Some time before that I happened to have studied a field theory (called N=1*) that differed from Maldacena’s examples only slightly, but in which meson-like objects do form. One of the first talks I heard when I arrived at KITP was by Rob Myers, about a weird property of D-branes that he’d discovered. During that talk I made a connection between Myers’ observation and a feature of the N=1* field theory, and I had one of those “aha” moments that physicists live for. I suddenly knew what the string theory that describes the N=1*  field theory must look like.

But for me, the answer was bad news. To work out the details was clearly going to require a very difficult set of calculations, using aspects of string theory about which I knew almost nothing [non-holomorphic curved branes in high-dimensional curved geometry.] The best I could hope to do, if I worked alone, would be to write a conceptual paper with lots of pictures, and far more conjectures than demonstrable facts.

But I was at KITP.  Joe and I had had a good personal rapport for some years, and I knew that we found similar questions exciting. And Joe was the brane-master; he knew everything about D-branes. So I decided my best hope was to persuade Joe to join me. I engaged in a bit of persistent cajoling. Very fortunately for me, it paid off.

I went back to the east coast, and Joe and I went to work. Every week or two Joe would email some research notes with some preliminary calculations in string theory. They had such a high level of technical sophistication, and so few pedagogical details, that I felt like a child; I could barely understand anything he was doing. We made slow progress. Joe did an important warm-up calculation, but I found it really hard to follow. If the warm-up string theory calculation was so complex, had we any hope of solving the full problem?  Even Joe was a little concerned.

And then one day, I received a message that resounded with a triumphant cackle — a sort of “we got ’em!” that anyone who knew Joe will recognize. Through a spectacular trick, he’d figured out how to use his warm-up example to make the full problem easy! Instead of months of work ahead of us, we were essentially done.

From then on, it was great fun! Almost every week had the same pattern. I’d be thinking about a quantum field theory phenomenon that I knew about, one that should be visible from the string viewpoint — such as the baryon vertex. I knew enough about D-branes to develop a heuristic argument about how it should show up. I’d call Joe and tell him about it, and maybe send him a sketch. A few days later, a set of notes would arrive by email, containing a complete calculation verifying the phenomenon. Each calculation was unique, a little gem, involving a distinctive investigation of exotically-shaped D-branes sitting in a curved space. It was breathtaking to witness the speed with which Joe worked, the breadth and depth of his mathematical talent, and his unmatched understanding of these branes.

[Experts: It’s not instantly obvious that the N=1* theory has physical baryons, but it does; you have to choose the right vacuum, where the theory is partially Higgsed and partially confining. Then to infer, from Witten’s work, what the baryon vertex is, you have to understand brane crossings (which I knew about from Hanany-Witten days): Witten’s D5-brane baryon vertex operator creates a  physical baryon vertex in the form of a D3-brane 3-ball, whose boundary is an NS 5-brane 2-sphere located at a point in the usual three dimensions. And finally, a physical baryon is a vertex with n strings that are connected to nearby D5-brane 2-spheres. See chapter VI, sections B, C, and E, of our paper from 2000.]

Throughout our years of collaboration, it was always that way when we needed to go head-first into the equations; Joe inevitably left me in the dust, shaking my head in disbelief. That’s partly my weakness… I’m pretty average (for a physicist) when it comes to calculation. But a lot of it was Joe being so incredibly good at it.

Fortunately for me, the collaboration was still enjoyable, because I was almost always able to keep pace with Joe on the conceptual issues, sometimes running ahead of him. Among my favorite memories as a scientist are moments when I taught Joe something he didn’t know; he’d be silent for a few seconds, nodding rapidly, with an intent look — his eyes narrow and his mouth slightly open — as he absorbed the point.  “Uh-huh… uh-huh…”, he’d say.

But another side of Joe came out in our second paper. As we stood chatting in the KITP hallway, before we’d even decided exactly which question we were going to work on, Joe suddenly guessed the answer! And I couldn’t get him to explain which problem he’d solved, much less the solution, for several days!! It was quite disorienting.

This was another classic feature of Joe. Often he knew he’d found the answer to a puzzle (and he was almost always right), but he couldn’t say anything comprehensible about it until he’d had a few days to think and to turn his ideas into equations. During our collaboration, this happened several times. (I never said “Use your words, Joe…”, but perhaps I should have.) Somehow his mind was working in places that language doesn’t go, in ways that none of us outside his brain will ever understand. In him, there was something of an oracle.

Looking Toward The Horizon

Our interests gradually diverged after 2006; I focused on the Large Hadron Collider [also known as the Large D-brane Collider], while Joe, after some other explorations, ended up thinking about black hole horizons and the information paradox. But I enjoyed his work from afar, especially when, in 2012, Joe and three colleagues (Ahmed Almheiri, Don Marolf, and James Sully) blew apart the idea of black hole complementarity, widely hoped to be the solution to the paradox. [I explained this subject here, and also mentioned a talk Joe gave about it here.]  The wreckage is still smoldering, and the paradox remains.

Then Joe fell ill, and we began to lose him, at far too young an age.  One of his last gifts to us was his memoirs, which taught each of us something about him that we didn’t know.  Finally, on Friday last, he crossed the horizon of no return.  If there’s no firewall there, he knows it now.

What, we may already wonder, will Joe’s scientific legacy be, decades from now?  It’s difficult to foresee how a theorist’s work will be viewed a century hence; science changes in unexpected ways, and what seems unimportant now may become central in future… as was the path for D-branes themselves in the course of the 1990s.  For those of us working today, D-branes in string theory are clearly Joe’s most important discovery — though his contributions to our understanding of black holes, cosmic strings, and aspects of field theory aren’t soon, if ever, to be forgotten.  But who knows? By the year 2100, string theory may be the accepted theory of quantum gravity, or it may just be a little-known tool for the study of quantum fields.

Yet even if the latter were to be string theory’s fate, I still suspect it will be D-branes that Joe is remembered for. Because — as I’ve tried to make clear — they’re real.  Really real.  There’s one in every proton, one in every neutron. Our bodies contain them by the billion billion billions. For that insight, that elemental contribution to human knowledge, our descendants can blame Joseph Polchinski.

Thanks for everything, Joe.  We’ll miss you terribly.  You so often taught us new ways to look at the world — and even at ourselves.



by Matt Strassler at February 05, 2018 03:59 PM

January 29, 2018

Sean Carroll - Preposterous Universe

Guest Post: Nicole Yunger Halpern on What Makes Extraordinary Science Extraordinary

Nicole Yunger Halpern is a theoretical physicist at Caltech’s Institute for Quantum Information and Matter (IQIM).  She blends quantum information theory with thermodynamics and applies the combination across science, including to condensed matter; black-hole physics; and atomic, molecular, and optical physics. She writes for Quantum Frontiers, the IQIM blog, every month.

What makes extraordinary science extraordinary?

Political junkies watch C-SPAN. Sports fans watch ESPN. Art collectors watch Christie’s. I watch scientists respond to ideas.

John Preskill—Caltech professor, quantum-information theorist, and my PhD advisor—serves as the Chief Justice John Roberts of my C-SPAN. Ideas fly during group meetings, at lunch outside a campus cafeteria, and in John’s office. Many ideas encounter a laconicism compared with which Ernest Hemingway babbles. “Hmm,” I hear. “Ok.” “Wait… What?”

The occasional idea provokes an “mhm.” The final syllable has a higher pitch than the first. Usually, the inflection change conveys agreement and interest. Receiving such an “mhm” brightens my afternoon like a Big Dipper sighting during a 9 PM trudge home.

Hearing “That’s cool,” “Nice,” or “I’m excited,” I cartwheel internally.

What distinguishes “ok” ideas from “mhm” ideas? Peeling the Preskillite trappings off this question reveals its core: What distinguishes good science from extraordinary science?

I’ve been grateful for opportunities to interview senior scientists, over the past few months, from coast to coast. The opinions I collected varied. Several interviewees latched onto the question as though they pondered it daily. A couple of interviewees balked (I don’t know; that’s tricky…) but summoned up a sermon. All the responses fired me up: The more wisps of mist withdrew from the nature of extraordinary science, the more I burned to contribute.

I’ll distill, interpret, and embellish upon the opinions I received. Italics flag lines that I assembled to capture ideas that I heard, as well as imperfect memories of others’ words. Quotation marks surround lines that others constructed. Feel welcome to chime in, in the “comments” section.

One word surfaced in all, or nearly all, my conversations: “impact.” Extraordinary science changes how researchers across the world think. Extraordinary science reaches beyond one subdiscipline.

This reach reminded me of answers to a question I’d asked senior scientists when in college: “What do you mean by ‘beautiful’?”  Replies had varied, but a synopsis had crystallized: “Beautiful science enables us to explain a lot with a little.” Schrodinger’s equation, which describes how quantum systems evolve, fits on one line. But the equation describes electrons bound to atoms, particles trapped in boxes, nuclei in magnetic fields, and more. Beautiful science, which overlaps with extraordinary science, captures much of nature in a small net.
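For concreteness, the one line in question is the time-dependent Schrödinger equation (quoted here as a standard statement for illustration, not something added by the post's author):

i\hbar\, \frac{\partial}{\partial t} |\psi(t)\rangle = \hat{H}\, |\psi(t)\rangle

The same single line, with a different Hamiltonian \hat{H} in each case, covers the bound electrons, trapped particles, and nuclei in magnetic fields listed above.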

Inventing a field constitutes extraordinary science. Examples include the fusion of quantum information with high-energy physics. Entanglement, quantum computation, and error correction are illuminating black holes, wormholes, and space-time.

Extraordinary science surprises us, revealing faces that we never expected nature to wear. Many extraordinary experiments generate data inexplicable with existing theories. Some extraordinary theory accounts for puzzling data; some extraordinary theory provokes experiments. I graduated from the Perimeter Scholars International Masters program,  at the Perimeter Institute for Theoretical Physics, almost five years ago. Canadian physicist Art McDonald presented my class’s commencement address. An interest in theory, he said, brought you to this institute. Plunge into theory, if you like. Theorem away. But keep a bead on experiments. Talk with experimentalists; work to understand them. McDonald won a Nobel Prize, two years later, for directing the Sudbury Neutrino Observatory (SNO). (SNOLab, with the Homestake experiment, revealed properties of subatomic particles called “neutrinos.” A neutrino’s species can change, and neutrinos have tiny masses. Neutrinos might reveal why the universe contains more matter than antimatter.)

Not all extraordinary theory clings to experiment like bubblegum to hair. Elliott Lieb and Mary Beth Ruskai proved that quantum entropies obey an inequality called “strong subadditivity” (SSA).  Entropies quantify uncertainty about which outcomes measurements will yield. Experimentalists could test SSA’s governance of atoms, ions, and materials. But no physical platform captures SSA’s essence.
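For readers who want the statement itself, strong subadditivity says the following, written in standard notation for the von Neumann entropy S of a tripartite state \rho_{ABC} (this line is supplied for illustration and is not part of the original post):

S(\rho_{AB}) + S(\rho_{BC}) \ge S(\rho_{ABC}) + S(\rho_{B})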

Abstract mathematics underlies Lieb and Ruskai’s theorem: convexity and concavity (properties of functions), the Golden-Thompson inequality (a theorem about exponentials of matrices), etc. Some extraordinary theory dovetails with experiment; some wings away.

One interviewee sees extraordinary science in foundational science. At our understanding’s roots lie ideas that fertilize diverse sprouts. Other extraordinary ideas provide tools for calculating, observing, or measuring. Richard Feynman sped up particle-physics computations, for instance, by drawing diagrams.  Those diagrams depict high-energy physics as the creation, separation, recombination, and annihilation of particles. Feynman drove not only a technical, but also a conceptual, advance. Some extraordinary ideas transform our views of the world.

Difficulty preoccupied two experimentalists. An experiment isn’t worth undertaking, one said, if it isn’t difficult. A colleague, said another, “does the impossible and makes it look easy.”

Simplicity preoccupied two theorists. I wrung my hands, during year one of my PhD, in an email to John. The results I’d derived—now that I’d found them— looked as though I should have noticed them months earlier. What if the results lacked gristle? “Don’t worry about too simple,” John wrote back. “I like simple.”

Another theorist agreed: Simplification promotes clarity. Not all simple ideas “go the distance.” But ideas run farther when stripped down than when weighed down by complications.

Extraordinary scientists have a sense of taste. Not every idea merits exploration. Identifying the ideas that do requires taste, or style, or distinction. What distinguishes extraordinary science? More of the theater critic and Julia Child than I expected five years ago.

With gratitude to the thinkers who let me pick their brains.

by Sean Carroll at January 29, 2018 05:45 PM

Georg von Hippel - Life on the lattice

Looking for guest blogger(s) to cover LATTICE 2018
Since I will not be attending LATTICE 2018 for some excellent personal reasons, I am looking for a guest blogger or even better several guest bloggers from the lattice community who would be interested in covering the conference. Especially for advanced PhD students or junior postdocs, this might be a great opportunity to get your name some visibility. If you are interested, drop me a line either in the comment section or by email (my university address is easy to find).

by Georg v. Hippel ( at January 29, 2018 11:49 AM

January 25, 2018

Alexey Petrov - Symmetry factor

Rapid-response (non-linear) teaching: report

Some of you might remember my previous post about non-linear teaching, where I described a new teaching strategy that I came up with and was about to implement in teaching my undergraduate Classical Mechanics I class. Here I want to report on the outcomes of this experiment and share some of my impressions on teaching.

Course description

Our Classical Mechanics class is a gateway class for our physics majors. It is the first class they take after they are done with general physics lectures. So the students are already familiar with the (simpler version of the) material they are going to be taught. The goal of this class is to start molding physicists out of physics students. It is a rather small class (max allowed enrollment is 20 students; I had 22 in my class), which makes professor-student interaction rather easy.

Rapid-response (non-linear) teaching: generalities

To motivate the method that I proposed, I looked at some studies in experimental psychology, in particular in memory and learning studies. What I was curious about is how much is currently known about the process of learning and what suggestions I can take from the psychologists who know something about the way our brain works in retaining the knowledge we receive.

As it turns out, there are some studies on this subject (I have references, if you are interested). The earliest ones go back to the 1880s, when German psychologist Hermann Ebbinghaus hypothesized about the way our brain retains information over time. The “forgetting curve” that he introduced gives an approximate representation of information retention as a function of time. His studies have been replicated with similar conclusions in recent experiments.

The upshot of these studies is that loss of learned information is pretty much exponential; as can be seen from the Ebbinghaus forgetting curve, in about a day we only retain about 40% of what we learned.
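A minimal sketch of the forgetting curve, assuming the simple exponential form often used to summarize Ebbinghaus's results (the stability parameter S below is illustrative, chosen only to reproduce the roughly 40% figure quoted above):

R(t) = e^{-t/S}

Here R is the fraction of material retained after time t; taking S of about 1.1 days gives R(1 day) of about 0.4.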

Psychologists also learned that one of the ways to overcome the loss of information is to (meaningfully) retrieve it: this is how learning  happens. Retrieval is critical for robust, durable, and long-term learning. It appears that every time we retrieve learned information, it becomes more accessible in the future. It is, however, important how we retrieve that stored information: simple re-reading of notes or looking through the examples will not be as effective as re-working the lecture material. It is also important how often we retrieve the stored info.

So, here is what I decided to change in the way I teach my class in light of the above-mentioned information (no pun intended).

Rapid-response (non-linear) teaching: details

To counter the single-day information loss, I changed the way homework is assigned: instead of assigning homework sets with 3-4-5 problems per week, I introduced two types of homework assignments: short homeworks and projects.

Short homework assignments are single-problem assignments given after each class that must be done by the next class. They are designed such that a student needs to re-derive material that was discussed previously in class (with a small new twist added). For example, if the block-down-to-incline problem was discussed in class, the short assignment asks them to redo the problem with a different choice of coordinate axes. This way, instead of doing an assignment at the last minute at the end of the week, the students are forced to work out what they just learned in class every day (meaningful retrieval)!

The second type of assignment, the project homework, is designed to develop an understanding of how topics in a given chapter relate to each other. There are as many project assignments as there are chapters. Students get two weeks to complete them.

At the end, the students get to solve approximately the same number of problems over the course of the semester.

For a professor, the introduction of short homework assignments changes the way class material is presented. Depending on how students performed on the previous short homework, I adjusted the material (both speed and volume) that we discussed in class. I also designed examples for the future sections in such a way that I could repeat parts of the topic that posed some difficulties in comprehension. Overall, instead of a usual “linear” propagation of the course, we moved along something akin to helical motion, returning and spending more time on topics that students found more difficult (hence “rapid-response or non-linear” teaching).

Other things were easy to introduce: for instance, using the Socratic method when doing examples. The lecture itself was an open discussion between the prof and students.


So, I have implemented this method in teaching the Classical Mechanics I class in the Fall 2017 semester. It was not an easy exercise, mostly because it was the first time I was teaching this class and had no grader help. I would say the results confirmed my expectations: introduction of short homework assignments helps students to perform better on the exams. Now, my statistics is still limited: I only had 20 students in my class. Yet, among the students there were several who decided to either largely ignore the short homework assignments or do them irregularly. They were given zero points for each missed short assignment. All students generally did well on their project assignments, yet there appears to be some correlation (see graph above) between the total number of points acquired on short homework assignments and exam performance (measured by the total score on the Final and two midterms). This makes me think that short assignments were beneficial for students. I plan to teach this course again next year, which will increase my statistics.

I was quite surprised that my students generally liked this way of teaching. In fact, they were disappointed that I decided not to apply this method for the Mechanics II class that I am teaching this semester. They also found that problems assigned in projects were considerably harder than the problems from the short assignments (this is how it was supposed to be).

For me, this was not an easy semester. I had to develop my set of lectures — so big thanks go to my colleagues Joern Putschke and Rob Harr who made their notes available. I spent a lot of time preparing this course, which, I think, affected my research outcome last semester. Yet, most difficulties were mainly Wayne State-specific: Wayne State does not provide TAs for small classes, so I had to not only design all homework assignments, but also grade them (on top of developing the lectures from the ground up). During the semester, it was important to grade short assignments on the same day I received them in order to re-tune the lectures; this did take a lot of my time. I would say TAs would certainly help to run this course — so I’ll be applying for some internal WSU educational grants to continue development of this method. I plan to employ it again next year to teach Classical Mechanics.


by apetrov at January 25, 2018 08:18 PM

January 22, 2018

Axel Maas - Looking Inside the Standard Model

Finding - and curing - disagreements
The topic of grand-unified theories came up in the blog several times, most recently last year in January. To briefly recap, such theories, called GUTs for short, predict that all three forces between elementary particles emerge from a single master force. That would explain a lot of unconnected observations we have in particle physics. For example, why atoms are electrically neutral. The latter we can describe, but not yet explain.

However, if such a GUT exists, then it must not only explain the forces, but also somehow explain why we see the numbers and kinds of elementary particles we observe in nature. And now things become complicated. As discussed in the last entry on GUTs, there may be a serious issue in how we determine which particles are actually described by such a theory.

To understand how this issue comes about, I need to put together many different things my research partners and I have worked on during the last couple of years. All of these issues are put into expert language in the review which I talked about in the previous entry. It is now finished, and if you're interested, you can get it for free here. But it is very technical.

So, let me explain it less technically.

Particle physics is actually superinvolved. If we wanted to write down a theory which describes what we see, and only what we see, it would be terribly complicated. It is much simpler to introduce redundancies in the description, so-called gauge symmetries. This makes life much easier, though still not easy. However, the most prominent feature is that we add auxiliary particles to the game. Of course, they cannot really be seen, as they are just auxiliary. Some of them are very obviously unphysical, and are therefore called ghosts. They can be taken care of comparatively simply. For others, this is less simple.

Now, it turns out that the weak interaction is a very special beast. In this case, there is a unique one-to-one identification between a really observable particle and an auxiliary particle. Thus, it is almost correct to identify both. But this is due to the very special structure of this part of particle physics.

Thus, a natural question is whether, even if it is special, it is justified to do the same for other theories. Well, in some cases, this seems to be the case. But we suspected that this may not be the case in general. And especially not in GUTs.

Now, recently, we went about this much more systematically. You can again access the (very, very technical) result for free here. There, we looked at a very generic class of such GUTs. Well, we actually looked at the most relevant part of them, and still by far not all of them. We also ignored a lot of stuff, e.g. what would become quarks and leptons, and concentrated only on the generalization of the weak interaction and the Higgs.

We then checked, based on our earlier experiences and methods, whether a one-to-one identification of experimentally accessible and auxiliary particles works. And it essentially never does. Visually, this result looks like this:

On the left, it is seen that everything works nicely with a one-to-one identification in the standard model. On the right, if the one-to-one identification worked in a GUT, everything would still be nice. But our more precise calculation shows that the actual situation, which would be seen in an experiment, is different. No one-to-one identification is possible. And thus the prediction of the GUT differs from what we already see in experiments. Thus, a previously good GUT candidate is no longer good.

Though more checks are needed, as always, this is a baffling, and at the same time very discomforting, result.

Baffling, because we originally expected to have problems only under very special circumstances. It now appears that the standard model of particles is actually the very special case, and having problems is the standard.

It is discomforting because in the powerful method of perturbation theory the one-to-one identification is essentially always made. As this tool is widely used, this seems to question the validity of many predictions on GUTs. That could have far-reaching consequences. Is this the case? Do we need to forget everything about GUTs we learned so far?

Well, not really, for two reasons. One is that we also showed that methods almost as easy to handle as perturbation theory can be used to fix the problems. This is good, because more powerful methods, like the simulations we used before, are much more cumbersome. However, this leaves us with the problem of having made wrong predictions so far. Well, this we cannot change. But this is just normal scientific progress. You try, you check, you fail, you improve, and then you try again.

And, in fact, this does not mean that GUTs are wrong. Just that we need to consider somewhat different GUTs, and make the predictions more carefully next time. Which GUTs we need to look at we still need to figure out, and that will not be simple. But, fortunately, the improved methods mentioned beforehand can use much of what has been done so far, so most technical results are still unbelievably useful. This will help enormously in finding GUTs which are applicable and yield a consistent picture without the one-to-one identification. GUTs are not dead. They likely just need a bit of changing.

This is indeed a dramatic development. But one which fits logically and technically to the improved understanding of the theoretical structures underlying particle physics, which were developed over the last decades. Thus, we are confident that this is just the next logical step in our understanding of how particle physics works.

by Axel Maas ( at January 22, 2018 04:54 PM

January 17, 2018

Sean Carroll - Preposterous Universe

Beyond Falsifiability

I have a backlog of fun papers that I haven’t yet talked about on the blog, so I’m going to try to work through them in reverse chronological order. I just came out with a philosophically-oriented paper on the thorny issue of the scientific status of multiverse cosmological models:

Beyond Falsifiability: Normal Science in a Multiverse
Sean M. Carroll

Cosmological models that invoke a multiverse – a collection of unobservable regions of space where conditions are very different from the region around us – are controversial, on the grounds that unobservable phenomena shouldn’t play a crucial role in legitimate scientific theories. I argue that the way we evaluate multiverse models is precisely the same as the way we evaluate any other models, on the basis of abduction, Bayesian inference, and empirical success. There is no scientifically respectable way to do cosmology without taking into account different possibilities for what the universe might be like outside our horizon. Multiverse theories are utterly conventionally scientific, even if evaluating them can be difficult in practice.

This is well-trodden ground, of course. We’re talking about the cosmological multiverse, not its very different relative the Many-Worlds interpretation of quantum mechanics. It’s not the best name, as the idea is that there is only one “universe,” in the sense of a connected region of space, but of course in an expanding universe there will be a horizon past which it is impossible to see. If conditions in far-away unobservable regions are very different from conditions nearby, we call the collection of all such regions “the multiverse.”

There are legitimate scientific puzzles raised by the multiverse idea, but there are also fake problems. Among the fakes is the idea that “the multiverse isn’t science because it’s unobservable and therefore unfalsifiable.” I’ve written about this before, but shockingly not everyone immediately agreed with everything I have said.

Back in 2014 the Edge Annual Question was “What Scientific Theory Is Ready for Retirement?”, and I answered Falsifiability. The idea of falsifiability, pioneered by philosopher Karl Popper and adopted as a bumper-sticker slogan by some working scientists, is that a theory only counts as “science” if we can envision an experiment that could potentially return an answer that was utterly incompatible with the theory, thereby consigning it to the scientific dustbin. Popper’s idea was to rule out so-called theories that were so fuzzy and ill-defined that they were compatible with literally anything.

As I explained in my short write-up, it’s not so much that falsifiability is completely wrong-headed, it’s just not quite up to the difficult task of precisely demarcating the line between science and non-science. This is well-recognized by philosophers; in my paper I quote Alex Broadbent as saying

It is remarkable and interesting that Popper remains extremely popular among natural scientists, despite almost universal agreement among philosophers that – notwithstanding his ingenuity and philosophical prowess – his central claims are false.

If we care about accurately characterizing the practice and principles of science, we need to do a little better — which philosophers work hard to do, while some physicists can’t be bothered. (I’m not blaming Popper himself here, nor even trying to carefully figure out what precisely he had in mind — the point is that a certain cartoonish version of his views has been elevated to the status of a sacred principle, and that’s a mistake.)

After my short piece came out, George Ellis and Joe Silk wrote an editorial in Nature, arguing that theories like the multiverse served to undermine the integrity of physics, which needs to be defended from attack. They suggested that people like me think that “elegance [as opposed to data] should suffice,” that sufficiently elegant theories “need not be tested experimentally,” and that I wanted to “to weaken the testability requirement for fundamental physics.” All of which is, of course, thoroughly false.

Nobody argues that elegance should suffice — indeed, I explicitly emphasized the importance of empirical testing in my very short piece. And I’m not suggesting that we “weaken” anything at all — I’m suggesting that we physicists treat the philosophy of science with the intellectual care that it deserves. The point is not that falsifiability used to be the right criterion for demarcating science from non-science, and now we want to change it; the point is that it never was, and we should be more honest about how science is practiced.

Another target of Ellis and Silk’s ire was Richard Dawid, a string theorist turned philosopher, who wrote a provocative book called String Theory and the Scientific Method. While I don’t necessarily agree with Dawid about everything, he does make some very sensible points. Unfortunately he coins the term “non-empirical theory confirmation,” which was an extremely bad marketing strategy. It sounds like Dawid is saying that we can confirm theories (in the sense of demonstrating that they are true) without using any empirical data, but he’s not saying that at all. Philosophers use “confirmation” in a much weaker sense than that of ordinary language, to refer to any considerations that could increase our credence in a theory. Of course there are some non-empirical ways that our credence in a theory could change; we could suddenly realize that it explains more than we expected, for example. But we can’t simply declare a theory to be “correct” on such grounds, nor was Dawid suggesting that we could.

In 2015 Dawid organized a conference on “Why Trust a Theory?” to discuss some of these issues, which I was unfortunately not able to attend. Now he is putting together a volume of essays, both from people who were at the conference and some additional contributors; it’s for that volume that this current essay was written. You can find other interesting contributions on the arxiv, for example from Joe Polchinski, Eva Silverstein, and Carlo Rovelli.

Hopefully with this longer format, the message I am trying to convey will be less amenable to misconstrual. Nobody is trying to change the rules of science; we are just trying to state them accurately. The multiverse is scientific in an utterly boring, conventional way: it makes definite statements about how things are, it has explanatory power for phenomena we do observe empirically, and our credence in it can go up or down on the basis of both observations and improvements in our theoretical understanding. Most importantly, it might be true, even if it might be difficult to ever decide with high confidence whether it is or not. Understanding how science progresses is an interesting and difficult question, and should not be reduced to brandishing bumper-sticker mottos to attack theoretical approaches to which we are not personally sympathetic.
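The bookkeeping behind credence going up or down is ordinary Bayesian updating. The formula below is the standard statement of Bayes' theorem, quoted for illustration rather than taken from the essay:

P(\text{theory} \mid \text{data}) = \frac{P(\text{data} \mid \text{theory})\, P(\text{theory})}{P(\text{data})}

Observations enter through the likelihood P(data | theory), while non-empirical considerations (for example, realizing a theory explains more than expected) can be thought of as shifting the prior P(theory).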

by Sean Carroll at January 17, 2018 04:44 PM

January 06, 2018

Jon Butterworth - Life and Physics

Atom Land: A Guided Tour Through the Strange (and Impossibly Small) World of Particle Physics

Book review in Publishers’ Weekly.

Butterworth (Most Wanted Particle), a CERN alum and professor of physics at University College London, explains everything particle physics from antimatter to Z bosons in this charming trek through a landscape of “the otherwise invisible.” His accessible narrative cleverly relates difficult concepts, such as wave-particle duality or electron spin, in bite-size bits. Readers become explorers on Butterworth’s metaphoric map… Read more.

by Jon Butterworth at January 06, 2018 05:13 PM

Jon Butterworth - Life and Physics

December 30, 2017

Cormac O’Raifeartaigh - Antimatter (Life in a puzzling universe)

A week’s research and a New Year resolution

If anyone had suggested a few years ago that I would forgo a snowsports holiday in the Alps for a week’s research, I would probably not have believed them. Yet here I am, sitting comfortably in the library of the Dublin Institute for Advanced Studies.

The School of Theoretical Physics at the Dublin Institute for Advanced Studies

It’s been a most satisfying week. One reason is that a change truly is as good as a rest – after a busy teaching term, it’s very enjoyable to spend some time in a quiet spot, surrounded by books on the history of physics. Another reason is that one can accomplish an astonishing amount in one week’s uninterrupted study. That said, I’m not sure I could do this all year round – I’d miss the teaching!

As regards a resolution for 2018, I’ve decided to focus on getting a book out this year. For some time, I have been putting together a small introductory book on the big bang theory, based on a public lecture I give to diverse audiences, from amateur astronomers to curious taxi drivers. The material is drawn from a course I teach at both Waterford Institute of Technology and University College Dublin and is almost in book form already. The UCD experience is particularly useful, as the module is aimed at first-year students from all disciplines.

Of course, there are already plenty of books out there on this topic. My students have a comprehensive reading list, which includes classics such as A Brief History of Time (Hawking), The First Three Minutes (Weinberg) and The Big Bang (Singh). However, I regularly get feedback to the effect that the books are too hard (Hawking) or too long (Singh) or out of date (Weinberg). So I decided a while ago to put together my own effort; a useful exercise if nothing else comes of it.

In particular, I intend to take a historical approach to the story. I’m a great believer in the ‘how-we-found-out’ approach to explaining scientific theories (think for example of that great BBC4 documentary on the discovery of oxygen). My experience is that a historical approach allows the reader to share the excitement of discovery and makes technical material much easier to understand. In addition, much of the work of the early pioneers remains relevant today. The challenge will be to present a story that is also concise – that’s the hard part!

by cormac at December 30, 2017 04:26 PM

December 28, 2017

Life as a Physicist

Christmas Project

Every Christmas I try to do some sort of project. Something new. Sometimes it turns into something real, and lasts for years. Sometimes it goes nowhere. Normally, I have an idea of what I’m going to attempt – usually it has been bugging me for months and I can’t wait till break to get it started. This year, I had none.

But, I arrived home at my parents’ house in New Jersey and there it was waiting for me. The house is old – more than 200 years old – and the steam furnace had just been replaced. For those of you unfamiliar with this method of heating a house: it is noisy! The furnace boils water, and the steam is forced up through the pipes to cast iron radiators. The radiators hiss through valves as the air is forced up – an iconic sound from my childhood. Eventually, after traveling sometimes four floors, the super hot steam reaches the end of a radiator and the valve shuts off. The valves are cool – heat sensitive! The radiator, full of hot steam, then warms the room – and rather effectively.

The bane of this system, however, is that it can leak. And you have no idea where the leak is in the whole house! The only way you know: the furnace reservoir needs refilling too often. So… the problem: how to detect when the reservoir needs refilling? Especially with this new modern furnace, which can automatically refill its reservoir.

Me: Oh, look, there is a little LED that comes on when the automatic refilling system comes on! I can watch that! Dad: Oh, look, there is a little light that comes on when the water level is low. We can watch that.

Dad’s choice of tools: a wifi cam that is triggered by noise. Me: A Raspberry Pi 3, a photo-resistor, and a capacitor. Hahahaha. Game on!

What’s funny? Neither of us has detected a water refill since we started this project. In the first picture, at the right, you can see both of our devices – in the foreground, taped to the gas input line, is the cam watching the water-refill light through a mirror, and in the background (look for the yellow tape) is the Pi taped to the refill controller (with the capacitor and sensor hanging down, looking at the LED on the bottom of the box).

I chose the Pi because I’ve used it once before – for a Spotify end-point. But never for anything that it is designed for. An Arduino is almost certainly better suited to this – but I wasn’t confident that I could get it up and running in the 3 days I had to make this (including time for ordering and shipping of all parts from Amazon). It was a lot of fun! And consumed a bunch of time. “Hey, where is Gordon? He needs to come for Christmas dinner!” “Wait, are you working on Christmas day?” – for once I could answer that last one with an honest no! Hahaha.

I learned a bunch:

  • I had to solder! It has been a loooong time since I’ve done that. My first graduate student, whom I made learn how to solder before I let him graduate, would have laughed at how rusty my skills were!
  • I was surprised to learn, at the start, that the Pi has no analog-to-digital converter. I stole a quick-and-dirty trick that lots of people have used to get around this problem: time how long it takes to charge a capacitor up with a photoresistor. This is probably the biggest source of noise in my system, but it does for crude measurements (see the first sketch after this list).
  • I got to write all my code in Python. Even interrupt handling (ok, no callbacks, but still!).
  • The Pi, by default, runs a full build of Linux. Also, Python 3! I made full use of this – all my code is in Python, plus a bit of bash to help it get going. I used things like cron and pip – they were either there, or trivial to install. Really, for this project, I was never conscious of the Pi being anything less than a full computer.
  • At first I tried to write auto-detection code – that would see any changes in the light levels and write them to a file… which was then served by a simple nginx webserver (seriously – that was about 2 lines of code to install). But between the noise in the system and the fact that we’ve not had a fill yet, I don’t know what my signal looks like… so that code will have to be revised.
  • In the end, I have to write a file with the raw data in it, and analyze that – at least, until I know what an actual signal looks like. So… how to get that data off the Pi – especially given that I can’t access it anymore now that I’ve left New Jersey? In the end I used some Python code to push the files to OneDrive (a rough sketch of the upload call is after this list). Other than figuring out how to deal with OAuth2, it was really easy (and I’m still not done fighting the authentication battle). What will happen if/when it fails? Well… I’ve recorded the commands my Dad will have to execute to get the new authentication files down there. Hopefully there isn’t going to be an expiration!
  • To analyze the raw data I’ve used new tools I’ve recently learned at work: numpy and Jupyter notebooks. They allow me to produce a plot like this one (the last sketch below shows the idea). The dip near the left-hand side of the plot is my Dad shining the flashlight at my sensors to see if I could actually see anything. The joker.
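
Here is a minimal sketch of the capacitor-charging trick mentioned in the list above – the wiring, pin number, and timings are illustrative guesses, not the ones from the actual build. The idea: drain the capacitor, then count how long the GPIO pin takes to read high again as the photoresistor lets it recharge; brighter light means lower resistance and a smaller count.

# Sketch only, not the project code. Assumes a photoresistor in series with a
# small capacitor, with the junction wired to a GPIO pin (BCM 18 is arbitrary).
import time
import RPi.GPIO as GPIO

SENSE_PIN = 18  # hypothetical pin; use whatever the LDR/capacitor junction is on

def read_light_level(pin=SENSE_PIN, timeout=1000000):
    """Count loop iterations until the capacitor charges past the logic-high
    threshold. Brighter light -> lower LDR resistance -> smaller count."""
    # Drain the capacitor by pulling the pin low for a moment.
    GPIO.setup(pin, GPIO.OUT)
    GPIO.output(pin, GPIO.LOW)
    time.sleep(0.1)
    # Switch to input and count until the pin reads high (or we give up).
    GPIO.setup(pin, GPIO.IN)
    count = 0
    while GPIO.input(pin) == GPIO.LOW and count < timeout:
        count += 1
    return count

if __name__ == "__main__":
    GPIO.setmode(GPIO.BCM)
    try:
        while True:
            print("%.1f %d" % (time.time(), read_light_level()))
            time.sleep(5)
    finally:
        GPIO.cleanup()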
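
And a rough sketch of the OneDrive step, assuming an OAuth2 access token has already been obtained by some other means (that being the hard part, as noted above). The folder and file names are made up; this just shows the simple put-the-bytes-at-this-path upload that the Microsoft Graph API offers for small files.

# Sketch only: push the raw log file to OneDrive via the Microsoft Graph
# simple-upload endpoint. Getting access_token -- the authentication battle --
# is not shown here.
import requests

UPLOAD_URL = ("https://graph.microsoft.com/v1.0/me/drive/root:"
              "/furnace/light_log.txt:/content")  # hypothetical path on OneDrive

def upload_log(access_token, local_path="light_log.txt"):
    with open(local_path, "rb") as f:
        resp = requests.put(
            UPLOAD_URL,
            headers={"Authorization": "Bearer " + access_token},
            data=f.read(),
        )
    resp.raise_for_status()
    return resp.json()  # metadata for the uploaded drive item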
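
Finally, the notebook analysis is not much more than loading the (time, count) pairs and plotting them – something like the snippet below, where the file name and column layout are assumptions about the log format rather than the real thing.

# Sketch of the offline analysis: load the raw readings and plot them, so a
# refill event (or a flashlight-wielding father) shows up as a dip.
import numpy as np
import matplotlib.pyplot as plt

data = np.loadtxt("light_log.txt")        # assumed columns: unix time, count
t, counts = data[:, 0], data[:, 1]

plt.plot((t - t[0]) / 3600.0, counts)
plt.xlabel("hours since start")
plt.ylabel("charge-time count (lower = brighter)")
plt.title("Furnace refill LED monitor")
plt.show()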

Pretty much the only thing I’d used before was Linux, and some very simple things with an older Raspberry Pi 2. If anyone is on the fence about this – I’d definitely recommend trying it out. It is very easy and there are thousands of web pages with step-by-step instructions for most things you’ll want to do!

    by gordonwatts at December 28, 2017 06:25 AM