Particle Physics Planet

October 22, 2014

The n-Category Cafe

Where Do Probability Measures Come From?

Guest post by Tom Avery

Tom (here Tom means me, not him — Tom) has written several times about a piece of categorical machinery that, when given an appropriate input, churns out some well-known mathematical concepts. This machine is the process of constructing the codensity monad of a functor.

In this post, I’ll give another example of a well-known concept that arises as a codensity monad, namely probability measures. This is something that I’ve just written a paper about.

The Giry monads

Write <semantics>Meas<annotation encoding="application/x-tex">\mathbf{Meas}</annotation></semantics> for the category of measurable spaces (sets equipped with a <semantics>σ<annotation encoding="application/x-tex">\sigma</annotation></semantics>-algebra of subsets) and measurable maps. I’ll also write <semantics>I<annotation encoding="application/x-tex">I</annotation></semantics> for the unit interval <semantics>[0,1]<annotation encoding="application/x-tex">[0,1]</annotation></semantics>, equipped with the Borel <semantics>σ<annotation encoding="application/x-tex">\sigma</annotation></semantics>-algebra.

Let <semantics>ΩMeas<annotation encoding="application/x-tex">\Omega \in \mathbf{Meas}</annotation></semantics>. There are lots of different probability measures we can put on <semantics>Ω<annotation encoding="application/x-tex">\Omega</annotation></semantics>; write <semantics>GΩ<annotation encoding="application/x-tex">G\Omega</annotation></semantics> for the set of all of them.

Is <semantics>GΩ<annotation encoding="application/x-tex">G\Omega</annotation></semantics> a measurable space? Yes: An element of <semantics>GΩ<annotation encoding="application/x-tex">G\Omega</annotation></semantics> is a function that sends measurable subsets of <semantics>Ω<annotation encoding="application/x-tex">\Omega</annotation></semantics> to numbers in <semantics>I<annotation encoding="application/x-tex">I</annotation></semantics>. Turning this around, we have, for each measurable <semantics>AΩ<annotation encoding="application/x-tex">A \subseteq \Omega</annotation></semantics>, an evaluation map <semantics>ev A:GΩI<annotation encoding="application/x-tex">ev_A \colon G\Omega \to I</annotation></semantics>. Let’s give <semantics>GΩ<annotation encoding="application/x-tex">G\Omega</annotation></semantics> the smallest <semantics>σ<annotation encoding="application/x-tex">\sigma</annotation></semantics>-algebra such that all of these are measurable.

Is <semantics>G<annotation encoding="application/x-tex">G</annotation></semantics> a functor? Yes: Given a measurable map <semantics>g:ΩΩ<annotation encoding="application/x-tex">g \colon \Omega \to \Omega'</annotation></semantics> and <semantics>πGΩ<annotation encoding="application/x-tex">\pi \in G\Omega</annotation></semantics>, we can define the pushforward <semantics>Gg(π)<annotation encoding="application/x-tex">G g(\pi)</annotation></semantics> of <semantics>π<annotation encoding="application/x-tex">\pi</annotation></semantics> along <semantics>g<annotation encoding="application/x-tex">g</annotation></semantics> by

<semantics>Gg(π)(A′)=π(g⁻¹A′)<annotation encoding="application/x-tex"> G g(\pi)(A') = \pi(g^{-1} A') </annotation></semantics>

for measurable <semantics>AΩ<annotation encoding="application/x-tex">A' \subseteq \Omega'</annotation></semantics>.

Is <semantics>G<annotation encoding="application/x-tex">G</annotation></semantics> a monad? Yes: Given <semantics>ωΩ<annotation encoding="application/x-tex">\omega \in \Omega</annotation></semantics> we can define <semantics>η(ω)GΩ<annotation encoding="application/x-tex">\eta(\omega) \in G\Omega</annotation></semantics> by

<semantics>η(ω)(A)=χ_A(ω)<annotation encoding="application/x-tex"> \eta(\omega)(A) = \chi_A (\omega) </annotation></semantics>

where <semantics>A<annotation encoding="application/x-tex">A</annotation></semantics> is a measurable subset of <semantics>Ω<annotation encoding="application/x-tex">\Omega</annotation></semantics> and <semantics>χ A<annotation encoding="application/x-tex">\chi_A</annotation></semantics> is its characteristic function. In other words <semantics>η(ω)<annotation encoding="application/x-tex">\eta(\omega)</annotation></semantics> is the Dirac measure at <semantics>ω<annotation encoding="application/x-tex">\omega</annotation></semantics>. Given <semantics>ρGGΩ<annotation encoding="application/x-tex">\rho \in G G\Omega</annotation></semantics>, let

<semantics>μ(ρ)(A)=∫_{GΩ} ev_A dρ<annotation encoding="application/x-tex"> \mu(\rho)(A) = \int_{G\Omega} ev_A \,\mathrm{d}\rho </annotation></semantics>

for measurable <semantics>A⊆Ω<annotation encoding="application/x-tex">A \subseteq \Omega</annotation></semantics>, where <semantics>ev_A:GΩ→I<annotation encoding="application/x-tex">ev_A \colon G\Omega \to I</annotation></semantics> is as above.

This is the Giry monad <semantics>𝔾=(G,η,μ)<annotation encoding="application/x-tex">\mathbb{G} = (G,\eta,\mu)</annotation></semantics>, first defined (unsurprisingly) by Giry in “A categorical approach to probability theory”.
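To make the three pieces of structure concrete, here is a minimal Python sketch restricted to finitely supported measures, where every integral is a finite sum. The dictionary representation and the names `unit`, `pushforward`, `measure_of` and `multiply` are my own illustrative choices, not notation from the paper, and the sketch ignores all σ-algebra subtleties.

```python
from fractions import Fraction

def unit(omega):
    """eta(omega): the Dirac measure at omega, stored as {point: probability}."""
    return {omega: Fraction(1)}

def pushforward(g, pi):
    """G g (pi): the measure A' |-> pi(g^{-1} A'), accumulated point by point."""
    out = {}
    for x, p in pi.items():
        y = g(x)
        out[y] = out.get(y, Fraction(0)) + p
    return out

def measure_of(pi, A):
    """pi(A) for a finitely supported measure pi and a set A."""
    return sum((p for x, p in pi.items() if x in A), Fraction(0))

def multiply(rho):
    """mu(rho): integrate ev_A against rho.

    Assumption: rho is a finitely supported measure on measures, given as a
    list of (inner_measure, weight) pairs, so the integral is a weighted sum.
    """
    out = {}
    for inner, weight in rho:
        for x, p in inner.items():
            out[x] = out.get(x, Fraction(0)) + weight * p
    return out
```

For example, mixing the Dirac measure at `"a"` (weight 1/3) with a fair coin on `"a"` and `"b"` (weight 2/3) and then applying `multiply` gives a measure assigning 2/3 to `{"a"}`.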

A finitely additive probability measure <semantics>π<annotation encoding="application/x-tex">\pi</annotation></semantics> is just like a probability measure, except that it is only well-behaved with respect to finite disjoint unions, rather than arbitrary countable disjoint unions. More precisely, rather than having

<semantics>π(⋃_{i=1}^∞ A_i)=∑_{i=1}^∞ π(A_i)<annotation encoding="application/x-tex"> \pi\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} \pi(A_i) </annotation></semantics>

for disjoint <semantics>A i<annotation encoding="application/x-tex">A_i</annotation></semantics>, we just have

<semantics>π(⋃_{i=1}^n A_i)=∑_{i=1}^n π(A_i)<annotation encoding="application/x-tex"> \pi\left(\bigcup_{i=1}^{n} A_i\right) = \sum_{i=1}^{n} \pi(A_i) </annotation></semantics>

for disjoint <semantics>A i<annotation encoding="application/x-tex">A_i</annotation></semantics>.

We could repeat the definition of the Giry monad with “probability measure” replaced by “finitely additive probability measure”; doing so would give the finitely additive Giry monad <semantics>𝔽=(F,η,μ)<annotation encoding="application/x-tex">\mathbb{F} = (F,\eta,\mu)</annotation></semantics>. Every probability measure is a finitely additive probability measure, but not all finitely additive probability measures are probability measures. So <semantics>𝔾<annotation encoding="application/x-tex">\mathbb{G}</annotation></semantics> is a proper submonad of <semantics>𝔽<annotation encoding="application/x-tex">\mathbb{F}</annotation></semantics>.

The Kleisli category of <semantics>𝔾<annotation encoding="application/x-tex">\mathbb{G}</annotation></semantics> is quite interesting. Its objects are just the measurable spaces, and the morphisms are a kind of non-deterministic map called a Markov kernel or conditional probability distribution. As a special case, a discrete space equipped with an endomorphism in the Kleisli category is a discrete-time Markov chain.
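As an illustration of Kleisli composition, here is a small Python sketch for finite state spaces, where a Markov kernel is just a function returning a dictionary of probabilities. The two-state weather chain and the names `kleisli_compose` and `step` are hypothetical examples of mine.

```python
from fractions import Fraction

def kleisli_compose(k2, k1):
    """Compose Markov kernels k1: X -> G Y and k2: Y -> G Z (finite support)."""
    def composite(x):
        out = {}
        for y, p in k1(x).items():
            for z, q in k2(y).items():
                # Sum over the intermediate state y.
                out[z] = out.get(z, Fraction(0)) + p * q
        return out
    return composite

def step(state):
    """A hypothetical endomorphism of {"sunny", "rainy"} in the Kleisli category."""
    if state == "sunny":
        return {"sunny": Fraction(3, 4), "rainy": Fraction(1, 4)}
    return {"sunny": Fraction(1, 2), "rainy": Fraction(1, 2)}

# Two steps of the chain: the Kleisli composite of step with itself.
two_steps = kleisli_compose(step, step)
```

Composing `step` with itself in this way is the same as squaring the chain's transition matrix.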

I’ll explain how the Giry monads arise as codensity monads, but first I’d like to mention a connection with another example of a codensity monad, namely the ultrafilter monad.

An ultrafilter <semantics>𝒰<annotation encoding="application/x-tex">\mathcal{U}</annotation></semantics> on a set <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics> is a set of subsets of <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics> satisfying some properties. So <semantics>𝒰<annotation encoding="application/x-tex">\mathcal{U}</annotation></semantics> is a subset of the powerset <semantics>𝒫X<annotation encoding="application/x-tex">\mathcal{P}X</annotation></semantics> of <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics>, and is therefore determined by its characteristic function, which takes values in <semantics>{0,1}I<annotation encoding="application/x-tex">\{0,1\} \subseteq I</annotation></semantics>. In other words, an ultrafilter on <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics> can be thought of as a special function

<semantics>𝒫X→I.<annotation encoding="application/x-tex"> \mathcal{P}X \to I. </annotation></semantics>

It turns out that “special function” here means “finitely additive probability measure defined on all of <semantics>𝒫X<annotation encoding="application/x-tex">\mathcal{P}X</annotation></semantics> and taking values in <semantics>{0,1}<annotation encoding="application/x-tex">\{0,1\}</annotation></semantics>”.
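On a finite set every ultrafilter is principal, so this claim can be checked by brute force. The following Python sketch does so; the helper names are my own, and the check only covers the finite case.

```python
from itertools import chain, combinations

def powerset(X):
    """All subsets of the finite set X, as frozensets."""
    xs = list(X)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(xs, r) for r in range(len(xs) + 1))]

def principal_ultrafilter(x):
    """Characteristic function of the ultrafilter of all subsets containing x."""
    return lambda A: 1 if x in A else 0

def is_finitely_additive_probability(m, X):
    """Check m(X) = 1, m(empty) = 0 and additivity on all disjoint pairs."""
    subsets = powerset(X)
    if m(frozenset(X)) != 1 or m(frozenset()) != 0:
        return False
    return all(m(A | B) == m(A) + m(B)
               for A in subsets for B in subsets if not (A & B))
```

By contrast, a {0,1}-valued function such as "A has at least two elements" on a three-element set fails the additivity check, and correspondingly it is not the characteristic function of an ultrafilter.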

So the ultrafilter monad on <semantics>Set<annotation encoding="application/x-tex">\mathbf{Set}</annotation></semantics> (which sends a set to the set of ultrafilters on it) is a primitive version of the finitely additive Giry monad. With this in mind, and given the fact that the ultrafilter monad is the codensity monad of the inclusion of the category of finite sets into the category of sets, it is not that surprising that the Giry monads are also codensity monads. In particular, we might expect <semantics>𝔽<annotation encoding="application/x-tex">\mathbb{F}</annotation></semantics> to be the codensity monad of some functor involving spaces that are “finite” in some sense, and for <semantics>𝔾<annotation encoding="application/x-tex">\mathbb{G}</annotation></semantics> we’ll need to include some information pertaining to countable additivity.

Integration operators

If you have a measure on a space then you can integrate functions on that space. The converse is also true: if you have a way of integrating functions on a space then you can extract a measure.

There are various ways of making this precise, the most famous of which is the Riesz-Markov-Kakutani Representation Theorem:

Theorem. Let <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics> be a compact Hausdorff space. Then the space of finite, signed Borel measures on <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics> is canonically isomorphic to

<semantics>NVS(Top(X,ℝ),ℝ)<annotation encoding="application/x-tex"> \mathbf{NVS}(\mathbf{Top}(X,\mathbb{R}),\mathbb{R}) </annotation></semantics>

as a normed vector space, where <semantics>Top<annotation encoding="application/x-tex">\mathbf{Top}</annotation></semantics> is the category of topological spaces, and <semantics>NVS<annotation encoding="application/x-tex">\mathbf{NVS}</annotation></semantics> is the category of normed vector spaces.

Given a finite, signed Borel measure <semantics>π<annotation encoding="application/x-tex">\pi</annotation></semantics> on <semantics>X<annotation encoding="application/x-tex">X</annotation></semantics>, the corresponding map <semantics>Top(X,)<annotation encoding="application/x-tex">\mathbf{Top}(X,\mathbb{R}) \to \mathbb{R}</annotation></semantics> sends a function to its integral with respect to <semantics>π<annotation encoding="application/x-tex">\pi</annotation></semantics>. There are various different versions of this theorem that go by the same name.

My paper contains the following more modest version, which is a correction of a claim by Sturtz.

Proposition. Finitely additive probability measures on a measurable space <semantics>Ω<annotation encoding="application/x-tex">\Omega</annotation></semantics> are canonically in bijection with functions <semantics>ϕ:Meas(Ω,I)I<annotation encoding="application/x-tex">\phi \colon \mathbf{Meas}(\Omega,I) \to I</annotation></semantics> that are

  • affine: if <semantics>f,gMeas(Ω,I)<annotation encoding="application/x-tex">f,g \in \mathbf{Meas}(\Omega,I)</annotation></semantics> and <semantics>rI<annotation encoding="application/x-tex">r \in I</annotation></semantics> then

<semantics>ϕ(rf+(1−r)g)=rϕ(f)+(1−r)ϕ(g),<annotation encoding="application/x-tex"> \phi(r f + (1-r)g) = r\phi(f) + (1-r)\phi(g), </annotation></semantics>


  • weakly averaging: if <semantics>r¯<annotation encoding="application/x-tex">\bar{r}</annotation></semantics> denotes the constant function with value <semantics>r<annotation encoding="application/x-tex">r</annotation></semantics> then <semantics>ϕ(r¯)=r<annotation encoding="application/x-tex">\phi(\bar{r}) = r</annotation></semantics>.

Call such a function a finitely additive integration operator. The bijection restricts to a correspondence between (countably additive) probability measures and functions <semantics>ϕ<annotation encoding="application/x-tex">\phi</annotation></semantics> that additionally

  • respect limits: if <semantics>f nMeas(Ω,I)<annotation encoding="application/x-tex">f_n \in \mathbf{Meas}(\Omega,I)</annotation></semantics> is a sequence of functions converging pointwise to <semantics>0<annotation encoding="application/x-tex">0</annotation></semantics> then <semantics>ϕ(f n)<annotation encoding="application/x-tex">\phi(f_n)</annotation></semantics> converges to <semantics>0<annotation encoding="application/x-tex">0</annotation></semantics>.

Call such a function an integration operator. The integration operator corresponding to a probability measure <semantics>π<annotation encoding="application/x-tex">\pi</annotation></semantics> sends a function <semantics>f<annotation encoding="application/x-tex">f</annotation></semantics> to

<semantics>∫_Ω f dπ,<annotation encoding="application/x-tex"> \int_{\Omega} f \,\mathrm{d}\pi, </annotation></semantics>

which justifies the name. In the other direction, given an integration operator <semantics>ϕ<annotation encoding="application/x-tex">\phi</annotation></semantics>, the value of the corresponding probability measure on a measurable set <semantics>AΩ<annotation encoding="application/x-tex">A \subseteq \Omega</annotation></semantics> is <semantics>ϕ(χ A)<annotation encoding="application/x-tex">\phi(\chi_A)</annotation></semantics>.
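Both directions of the correspondence can be sketched in Python for finitely supported measures, where the integral is a finite sum. The names `integration_operator`, `indicator` and `measure_from_operator` are illustrative choices of mine, and the sketch says nothing about the countably additive refinement.

```python
from fractions import Fraction

def integration_operator(pi):
    """phi(f) = integral of f d(pi), for a finitely supported measure pi."""
    return lambda f: sum((p * f(x) for x, p in pi.items()), Fraction(0))

def indicator(A):
    """The characteristic function chi_A, with values in {0, 1}."""
    return lambda x: Fraction(1) if x in A else Fraction(0)

def measure_from_operator(phi):
    """Recover the measure from phi via A |-> phi(chi_A)."""
    return lambda A: phi(indicator(A))
```

Starting from a measure, building its operator and then applying `measure_from_operator` returns the original values on every set, and the operator can be checked numerically to be affine and weakly averaging.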

These bijections are measurable (with respect to a natural <semantics>σ<annotation encoding="application/x-tex">\sigma</annotation></semantics>-algebra on the set of finitely additive integration operators) and natural in <semantics>Ω<annotation encoding="application/x-tex">\Omega</annotation></semantics>, so they define isomorphisms of endofunctors of <semantics>Meas<annotation encoding="application/x-tex">\mathbf{Meas}</annotation></semantics>. Hence we can transfer the monad structures across the isomorphisms, and obtain descriptions of the Giry monads in terms of integration operators.

The Giry monads via codensity monads

So far so good. But what does this have to do with codensity monads? First let’s recall the definition of a codensity monad. I won’t go into a great deal of detail; for more information see Tom’s first post on the topic.

Let <semantics>U:ℂ→ℳ<annotation encoding="application/x-tex">U \colon \mathbb{C} \to \mathcal{M}</annotation></semantics> be a functor. The codensity monad of <semantics>U<annotation encoding="application/x-tex">U</annotation></semantics> is the right Kan extension of <semantics>U<annotation encoding="application/x-tex">U</annotation></semantics> along itself. This consists of a functor <semantics>T^U:ℳ→ℳ<annotation encoding="application/x-tex">T^U \colon \mathcal{M} \to \mathcal{M}</annotation></semantics> satisfying a universal property, which equips <semantics>T^U<annotation encoding="application/x-tex">T^U</annotation></semantics> with a canonical monad structure. The codensity monad doesn’t always exist, but it will whenever <semantics>ℂ<annotation encoding="application/x-tex">\mathbb{C}</annotation></semantics> is small and <semantics>ℳ<annotation encoding="application/x-tex">\mathcal{M}</annotation></semantics> is complete. You can think of <semantics>T^U<annotation encoding="application/x-tex">T^U</annotation></semantics> as a generalisation of the monad induced by the adjunction between <semantics>U<annotation encoding="application/x-tex">U</annotation></semantics> and its left adjoint that makes sense when the left adjoint doesn’t exist. In particular, when the left adjoint does exist, the two monads coincide.

The end formula for right Kan extensions gives

<semantics>T^U m=∫_{c∈ℂ}[ℳ(m,Uc),Uc],<annotation encoding="application/x-tex"> T^U m = \int_{c \in \mathbb{C}} [\mathcal{M}(m,U c),U c], </annotation></semantics>

where <semantics>[ℳ(m,Uc),Uc]<annotation encoding="application/x-tex">[\mathcal{M}(m,U c),U c]</annotation></semantics> denotes the <semantics>ℳ(m,Uc)<annotation encoding="application/x-tex">\mathcal{M}(m,U c)</annotation></semantics> power of <semantics>Uc<annotation encoding="application/x-tex">U c</annotation></semantics> in <semantics>ℳ<annotation encoding="application/x-tex">\mathcal{M}</annotation></semantics>, i.e. the product of <semantics>ℳ(m,Uc)<annotation encoding="application/x-tex">\mathcal{M}(m,U c)</annotation></semantics> (a set) copies of <semantics>Uc<annotation encoding="application/x-tex">U c</annotation></semantics> (an object of <semantics>ℳ<annotation encoding="application/x-tex">\mathcal{M}</annotation></semantics>) in <semantics>ℳ<annotation encoding="application/x-tex">\mathcal{M}</annotation></semantics>.

It doesn’t matter too much if you’re not familiar with ends because we can give an explicit description of <semantics>T^U m<annotation encoding="application/x-tex">T^U m</annotation></semantics> in the case that <semantics>ℳ=Meas<annotation encoding="application/x-tex">\mathcal{M} = \mathbf{Meas}</annotation></semantics>: The elements of <semantics>T^UΩ<annotation encoding="application/x-tex">T^U\Omega</annotation></semantics> are families <semantics>α<annotation encoding="application/x-tex">\alpha</annotation></semantics> of functions

<semantics>α_c:Meas(Ω,Uc)→Uc<annotation encoding="application/x-tex"> \alpha_c \colon \mathbf{Meas}(\Omega, U c) \to U c </annotation></semantics>

that are natural in <semantics>c∈ℂ<annotation encoding="application/x-tex">c \in \mathbb{C}</annotation></semantics>. For each <semantics>c∈ℂ<annotation encoding="application/x-tex">c \in \mathbb{C}</annotation></semantics> and measurable <semantics>f:Ω→Uc<annotation encoding="application/x-tex">f \colon \Omega \to U c</annotation></semantics> we have <semantics>ev_f:T^UΩ→Uc<annotation encoding="application/x-tex">ev_f \colon T^U \Omega \to U c</annotation></semantics> mapping <semantics>α<annotation encoding="application/x-tex">\alpha</annotation></semantics> to <semantics>α_c(f)<annotation encoding="application/x-tex">\alpha_c (f)</annotation></semantics>. The <semantics>σ<annotation encoding="application/x-tex">\sigma</annotation></semantics>-algebra on <semantics>T^UΩ<annotation encoding="application/x-tex">T^U \Omega</annotation></semantics> is the smallest such that each of these maps is measurable.

All that’s left is to say what we should choose <semantics>ℂ<annotation encoding="application/x-tex">\mathbb{C}</annotation></semantics> and <semantics>U<annotation encoding="application/x-tex">U</annotation></semantics> to be in order to get the Giry monads.

A subset <semantics>c<annotation encoding="application/x-tex">c</annotation></semantics> of a real vector space <semantics>V<annotation encoding="application/x-tex">V</annotation></semantics> is convex if for any <semantics>x,yc<annotation encoding="application/x-tex">x,y \in c</annotation></semantics> and <semantics>rI<annotation encoding="application/x-tex">r \in I</annotation></semantics> the convex combination <semantics>rx+(1r)y<annotation encoding="application/x-tex">r x + (1-r)y</annotation></semantics> is also in <semantics>c<annotation encoding="application/x-tex">c</annotation></semantics>, and a map <semantics>h:cc<annotation encoding="application/x-tex">h \colon c \to c'</annotation></semantics> between convex sets is called affine if it preserves convex combinations. So there’s a category of convex sets and affine maps between them. We will be interested in certain full subcategories of this.

Let <semantics>d 0<annotation encoding="application/x-tex">d_0</annotation></semantics> be the (convex) set of sequences in <semantics>I<annotation encoding="application/x-tex">I</annotation></semantics> that converge to <semantics>0<annotation encoding="application/x-tex">0</annotation></semantics> (it is a subset of the vector space <semantics>c 0<annotation encoding="application/x-tex">c_0</annotation></semantics> of all real sequences converging to <semantics>0<annotation encoding="application/x-tex">0</annotation></semantics>). Now we can define the categories of interest:

  • Let <semantics>ℂ<annotation encoding="application/x-tex">\mathbb{C}</annotation></semantics> be the category whose objects are all finite powers <semantics>I^n<annotation encoding="application/x-tex">I^n</annotation></semantics> of <semantics>I<annotation encoding="application/x-tex">I</annotation></semantics>, with all affine maps between them.

  • Let <semantics>𝔻<annotation encoding="application/x-tex">\mathbb{D}</annotation></semantics> be the category whose objects are all finite powers of <semantics>I<annotation encoding="application/x-tex">I</annotation></semantics>, together with <semantics>d 0<annotation encoding="application/x-tex">d_0</annotation></semantics>, and all affine maps between them.

All the objects of <semantics>ℂ<annotation encoding="application/x-tex">\mathbb{C}</annotation></semantics> and <semantics>𝔻<annotation encoding="application/x-tex">\mathbb{D}</annotation></semantics> can be considered as measurable spaces (as subspaces of powers of <semantics>I<annotation encoding="application/x-tex">I</annotation></semantics>), and all the affine maps between them are then measurable, so we have (faithful but not full) inclusions <semantics>U:ℂ→Meas<annotation encoding="application/x-tex">U \colon \mathbb{C} \to \mathbf{Meas}</annotation></semantics> and <semantics>V:𝔻→Meas<annotation encoding="application/x-tex">V \colon \mathbb{D} \to \mathbf{Meas}</annotation></semantics>.

Theorem. The codensity monad of <semantics>U<annotation encoding="application/x-tex">U</annotation></semantics> is the finitely additive Giry monad, and the codensity monad of <semantics>V<annotation encoding="application/x-tex">V</annotation></semantics> is the Giry monad.

Why should this be true? Let’s start with <semantics>U<annotation encoding="application/x-tex">U</annotation></semantics>. An element of <semantics>T UΩ<annotation encoding="application/x-tex">T^U \Omega</annotation></semantics> is a family of functions

<semantics>α_{I^n}:Meas(Ω,I^n)→I^n.<annotation encoding="application/x-tex"> \alpha_{I^n} \colon\mathbf{Meas}(\Omega,I^n) \to I^n. </annotation></semantics>

But a map into <semantics>I n<annotation encoding="application/x-tex">I^n</annotation></semantics> is determined by its composites with the projections to <semantics>I<annotation encoding="application/x-tex">I</annotation></semantics>, and these projections are affine. This means that <semantics>α<annotation encoding="application/x-tex">\alpha</annotation></semantics> is completely determined by <semantics>α I<annotation encoding="application/x-tex">\alpha_{I}</annotation></semantics>, and the other components are obtained by applying <semantics>α I<annotation encoding="application/x-tex">\alpha_{I}</annotation></semantics> separately in each coordinate. In other words, an element of <semantics>T UΩ<annotation encoding="application/x-tex">T^U \Omega</annotation></semantics> is a special sort of function

<semantics>Meas(Ω,I)→I.<annotation encoding="application/x-tex"> \mathbf{Meas}(\Omega, I) \to I. </annotation></semantics>

Look familiar? As you might guess, the functions with the above domain and codomain that define elements of <semantics>T UΩ<annotation encoding="application/x-tex">T^U \Omega</annotation></semantics> are precisely the finitely additive integration operators.

The affine and weakly averaging properties of <semantics>α I<annotation encoding="application/x-tex">\alpha_{I}</annotation></semantics> are enforced by naturality with respect to certain affine maps. For example, the naturality square involving the affine map

<semantics>rπ_1+(1−r)π_2:I^2→I<annotation encoding="application/x-tex"> r\pi_1 + (1-r)\pi_2 \colon I^2 \to I </annotation></semantics>

(where <semantics>π i<annotation encoding="application/x-tex">\pi_i</annotation></semantics> are the projections) forces <semantics>α I<annotation encoding="application/x-tex">\alpha_I</annotation></semantics> to preserve convex combinations of the form <semantics>rf+(1r)g<annotation encoding="application/x-tex">r f + (1-r)g</annotation></semantics>. The weakly averaging condition comes from naturality with respect to constant maps.
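The naturality squares can be tested numerically in the finitely supported case. In the Python sketch below, `alpha_I` integrates against a fixed measure, `alpha_In` applies it in each coordinate, and `naturality_holds` checks one square for a given map out of a finite power of the unit interval. All three names, and the tuple representation of maps into a finite power, are assumptions of mine for illustration.

```python
from fractions import Fraction

def alpha_I(pi):
    """Component at I: integrate a function Omega -> I against pi."""
    return lambda f: sum((p * f(x) for x, p in pi.items()), Fraction(0))

def alpha_In(pi):
    """Component at I^n: apply alpha_I separately in each coordinate.

    A map Omega -> I^n is given as a tuple fs of n coordinate functions.
    """
    a = alpha_I(pi)
    return lambda fs: tuple(a(f) for f in fs)

def naturality_holds(pi, h, fs):
    """Check h(alpha_{I^n}(fs)) == alpha_I(h o fs) for a map h: I^n -> I."""
    lhs = h(alpha_In(pi)(fs))
    rhs = alpha_I(pi)(lambda x: h(tuple(f(x) for f in fs)))
    return lhs == rhs
```

For an affine `h` such as a convex combination of the two projections, the square commutes. Taking `n = 0` and `h` constant recovers the weakly averaging condition, and a non-affine `h` such as multiplication generally breaks the square.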

How is the situation different for <semantics>T V<annotation encoding="application/x-tex">T^V</annotation></semantics>? As before <semantics>αT VΩ<annotation encoding="application/x-tex">\alpha \in T^V \Omega</annotation></semantics> is determined by <semantics>α I<annotation encoding="application/x-tex">\alpha_I</annotation></semantics>, and <semantics>α d 0<annotation encoding="application/x-tex">\alpha_{d_0}</annotation></semantics> is obtained by applying <semantics>α I<annotation encoding="application/x-tex">\alpha_I</annotation></semantics> in each coordinate, thanks to naturality with respect to the projections. A measurable map <semantics>f:Ωd 0<annotation encoding="application/x-tex">f \colon \Omega \to d_0</annotation></semantics> is a sequence of maps <semantics>f n:ΩI<annotation encoding="application/x-tex">f_n \colon \Omega \to I</annotation></semantics> converging pointwise to <semantics>0<annotation encoding="application/x-tex">0</annotation></semantics>, and

<semantics>α_{d_0}(f)=(α_I(f_i))_{i=1}^∞.<annotation encoding="application/x-tex"> \alpha_{d_0}(f) = (\alpha_I(f_i))_{i=1}^{\infty}. </annotation></semantics>

But <semantics>α d 0(f)d 0<annotation encoding="application/x-tex">\alpha_{d_0}(f) \in d_0</annotation></semantics>, so <semantics>α I(f i)<annotation encoding="application/x-tex">\alpha_I(f_i)</annotation></semantics> must converge to <semantics>0<annotation encoding="application/x-tex">0</annotation></semantics>. So <semantics>α I<annotation encoding="application/x-tex">\alpha_I</annotation></semantics> is an integration operator!
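To see how respecting limits gives countable additivity, here is a short derivation sketch. It assumes the correspondence from the proposition above, under which the measure is defined by evaluating the operator on characteristic functions, and it takes a countable family of pairwise disjoint measurable sets with union A. The tail functions used here are my own choice of test sequence.

```latex
% Notation (assumed): \pi(B) := \alpha_I(\chi_B), and A = \bigcup_{i \ge 1} A_i
% with the A_i pairwise disjoint.
\begin{align*}
f_n &:= \chi_A - \sum_{i=1}^{n} \chi_{A_i}
      = \chi_{\bigcup_{i>n} A_i} \in \mathbf{Meas}(\Omega, I),
      \qquad f_n \to 0 \text{ pointwise}, \\
\alpha_I(f_n) &= \pi\Bigl(\bigcup_{i>n} A_i\Bigr)
      = \pi(A) - \sum_{i=1}^{n} \pi(A_i)
      \qquad \text{(finite additivity)}, \\
\alpha_I(f_n) \to 0 \;&\Longrightarrow\; \pi(A) = \sum_{i=1}^{\infty} \pi(A_i).
\end{align*}
```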

The rest of the proof consists of checking that these assignments <semantics>αα I<annotation encoding="application/x-tex">\alpha \mapsto \alpha_{I}</annotation></semantics> really do define isomorphisms of monads.

It’s natural to wonder how much you can alter the categories <semantics>ℂ<annotation encoding="application/x-tex">\mathbb{C}</annotation></semantics> and <semantics>𝔻<annotation encoding="application/x-tex">\mathbb{D}</annotation></semantics> without changing the codensity monads. Here’s a result to that effect:

Proposition. The categories <semantics>ℂ<annotation encoding="application/x-tex">\mathbb{C}</annotation></semantics> and <semantics>𝔻<annotation encoding="application/x-tex">\mathbb{D}</annotation></semantics> can be replaced by the monoids of affine endomorphisms of <semantics>I^2<annotation encoding="application/x-tex">I^2</annotation></semantics> and <semantics>d_0<annotation encoding="application/x-tex">d_0</annotation></semantics> respectively (regarded as 1-object categories, with the evident functors to <semantics>Meas<annotation encoding="application/x-tex">\mathbf{Meas}</annotation></semantics>) without changing the codensity monads.

This gives categories of convex sets that are minimal such that their inclusions into <semantics>Meas<annotation encoding="application/x-tex">\mathbf{Meas}</annotation></semantics> give rise to the Giry monads. Here I mean minimal in the sense that they contain the fewest objects with all affine maps between them. They are not uniquely minimal; there are other convex sets whose monoids of affine endomorphisms also give rise to the Giry monads.

This result gives yet another characterisation of (finitely and countably) additive probability measures: a probability measure on <semantics>Ω<annotation encoding="application/x-tex">\Omega</annotation></semantics> is an <semantics>End(d 0)<annotation encoding="application/x-tex">\mathrm{End}(d_0)</annotation></semantics>-set morphism

<semantics>Meas(Ω,d_0)→d_0,<annotation encoding="application/x-tex"> \mathbf{Meas}(\Omega,d_0) \to d_0, </annotation></semantics>

where <semantics>End(d 0)<annotation encoding="application/x-tex">\mathrm{End}(d_0)</annotation></semantics> is the monoid of affine endomorphisms of <semantics>d 0<annotation encoding="application/x-tex">d_0</annotation></semantics>. Similarly for finitely additive probability measures, with <semantics>d 0<annotation encoding="application/x-tex">d_0</annotation></semantics> replaced by <semantics>I 2<annotation encoding="application/x-tex">I^2</annotation></semantics>.

What about maximal categories of convex sets giving rise to the Giry monads? I don’t have a definitive answer to this question, but you can at least throw in all bounded, convex subsets of Euclidean space:

Proposition. Let <semantics>ℂ′<annotation encoding="application/x-tex">\mathbb{C}'</annotation></semantics> be the category of all bounded, convex subsets of <semantics>ℝ^n<annotation encoding="application/x-tex">\mathbb{R}^n</annotation></semantics> (where <semantics>n<annotation encoding="application/x-tex">n</annotation></semantics> varies) and affine maps. Let <semantics>𝔻′<annotation encoding="application/x-tex">\mathbb{D}'</annotation></semantics> be <semantics>ℂ′<annotation encoding="application/x-tex">\mathbb{C}'</annotation></semantics> but with <semantics>d_0<annotation encoding="application/x-tex">d_0</annotation></semantics> adjoined. Then replacing <semantics>ℂ<annotation encoding="application/x-tex">\mathbb{C}</annotation></semantics> by <semantics>ℂ′<annotation encoding="application/x-tex">\mathbb{C}'</annotation></semantics> and <semantics>𝔻<annotation encoding="application/x-tex">\mathbb{D}</annotation></semantics> by <semantics>𝔻′<annotation encoding="application/x-tex">\mathbb{D}'</annotation></semantics> does not change the codensity monads.

The definition of <semantics>𝔻′<annotation encoding="application/x-tex">\mathbb{D}'</annotation></semantics> is a bit unsatisfying; <semantics>d_0<annotation encoding="application/x-tex">d_0</annotation></semantics> feels (and literally is) tacked on. It would be nice to have a characterisation of all the subsets of <semantics>ℝ^ℕ<annotation encoding="application/x-tex">\mathbb{R}^{\mathbb{N}}</annotation></semantics> (or indeed all the convex sets) that can be included in <semantics>𝔻′<annotation encoding="application/x-tex">\mathbb{D}'</annotation></semantics>. But so far I haven’t found one.

by leinster at October 22, 2014 06:59 PM

The Great Beyond - Nature blog

AstraZeneca neither confirms nor denies that it will ditch antibiotics research

A computer image of a cluster of drug-resistant Mycobacterium tuberculosis.

US Centers for Disease Control and Prevention/ Melissa Brower

The fight against antibiotic-resistant microbes could suffer a major blow if widely circulated rumours are confirmed that pharmaceutical giant AstraZeneca is due to disband its in-house antibiotic development. The company called the rumours “highly speculative”, while not explicitly denying them.

On 23 October, drug-industry consultant David Shlaes wrote on his blog that AstraZeneca, a multinational behemoth headquartered in London, “has told its antibiotics researchers that they should make efforts to find other jobs in the near future”, and that in his opinion this heralds the end of in-house antibiotic development at the company. “As far as antibiotic discovery and development goes, this has to be the most disappointing news of the entire antibiotic era,” wrote Shlaes.

AstraZeneca would not directly address these claims when approached by Nature for comment. In its statement it said, in full:

The blog is highly speculative. We continue to be active in anti-infectives and have a strong pipeline of drugs in development. However, we have previously said on a number of occasions that as we focus on our core therapy areas (Oncology, CVMD [cardiovascular and metabolic diseases] and Respiratory, Inflammation and Autoimmune) we will continue to remain opportunity driven in infection and neuroscience, in particular exploring partnering opportunities to maximise the value of our pipeline and portfolio.

Research into antibiotics is notorious for its high cost and high failure rate. AstraZeneca has previously said that its main research focus would be on areas other than antibiotic development.

Public-health experts have been warning about a trend among large pharmaceutical companies to move away from antibiotics research — just as the World Health Organization and others have pointed to the rising threat of deadly multi-drug-resistant strains of bacteria such as Mycobacterium tuberculosis or Staphylococcus aureus (see ‘Antibiotic resistance: The last resort‘).

by Daniel Cressey at October 22, 2014 05:56 PM

Peter Coles - In the Dark

Cosmology, to be precise…

After an extremely busy morning I had the pleasant task this afternoon of talking to the participants of a collaboration meeting of the Dark Energy Survey that’s going on here at Sussex. Now there’s the even more pleasant task in front of me of having drinks and dinner with the crowd. At some point I’ll post the slides of my talk on here, but in the meantime here’s a pretty accurate summary...


by telescoper at October 22, 2014 05:28 PM

Emily Lakdawalla - The Planetary Society Blog

Herschel observations of Comet Siding Spring initiated by an amateur astronomer
The European satellite Herschel acquired images of Comet Siding Spring before its death in 2013 — thanks to an observing proposal from an amateur astronomer!

October 22, 2014 04:26 PM

Quantum Diaries

Have we detected Dark Matter Axions?

An interesting headline piqued my interest when browsing the social networking and news website Reddit the other day. It simply said:

“The first direct detection of dark matter particles may have been achieved.”

Well, that was news to me! 
Obviously, the key word here is “may”. Nonetheless, I was intrigued, not being aware of any direct detection experiments publishing such results around this time. As a member of LUX, I usually see collaboration-wide emails sent out when a big paper is published by a rival group, most recently the DarkSide-50 results. Often an email like this is followed by a chain of comments, both good and bad, from the senior members of our group. I can’t imagine there being a day where I think I could read a paper and instantly have intelligent criticisms to share like those guys – but maybe when I’ve been in the dark matter business for 20+ years I will!

It is useful to look at other work similar to our own. We can learn from the mistakes and successes of the other groups within our community, and most of the time rivalry is friendly and professional. 
So obviously I took a look at this claimed direct detection. Note that there are three methods to dark matter detection, see figure. To summarise quickly,

The three routes to dark matter detection

  • Direct detection is the observation of an interaction of a dark matter particle with a standard model one.
  • Indirect detection is the observation of annihilation products that have no apparent standard model source and so are assumed to be the products of dark matter annihilation.
  • Production is the measurement of missing energy and momentum in a particle interaction (generally a collider experiment) that could signify the creation of dark matter (this method must be very careful, as this is how the neutrinos are measured in collider experiments).

So I was rather surprised to find the article linked was about a space telescope – the XMM-Newton observatory. These sorts of experiments are usually for indirect detection. The replies on the Reddit link reflected my own doubt – aside from the personification of x-rays, this comment was also my first thought:

“If they detected x-rays who are produced by dark matter axions then it’s not direct detection.”

These x-rays supposedly come from a particle called an axion – a dark matter candidate. But to address the comment, I considered LUX, a direct dark matter detector, where what we are actually detecting is photons. These are produced by the recoil of a xenon nucleus that interacted with a dark matter particle, and yet we call it direct – because the dark matter has interacted with a standard model particle, the xenon.
So to determine whether this possible axion detection is direct, we need to understand the effect producing the x-rays. And for that, we need to know about axions.

I haven’t personally studied axions much at all. At the beginning of my PhD, I read a paper called “Expected Sensitivity to Galactic/Solar Axions and Bosonic Super-WIMPs based on the Axio-electric Effect in Liquid Xenon Dark Matter Detectors” – but I couldn’t tell you a single thing from that paper now, without re-reading it. After some research I have a bit more understanding under my belt, and for those of you that are physicists, I can summarise the idea:

  • The axion is a light boson, proposed by Roberto Peccei and Helen Quinn in 1977 to solve the strong CP problem (why does QCD not break CP-symmetry when there is no theoretical reason it shouldn’t?).
  • The introduction of the particle causes the strong CP violation to go to zero (by some fancy maths that I can’t pretend to understand!).
  • It has been considered as a cold dark matter candidate because it is neutral and very weakly interacting, and could have been produced with the right abundance.

Conversion of an axion to a photon within a magnetic field (Yamanaka, Masato et al)

For non-physicists, the key thing to understand is that the axion is a particle predicted by a separate theory (nothing to do with dark matter) that solves another problem in physics. It just so happens that its properties make it a suitable candidate for dark matter. Sounds good so far – the axion kills two birds with one stone. We could detect a dark matter axion via an effect that converts an axion to an x-ray photon within a magnetic field. The XMM-Newton observatory orbits the Earth and looks for x-rays produced by the conversion of an axion within the Earth’s magnetic field. Although there is no particular interaction with a standard model particle (one is produced), the axion is not annihilating to produce the photons, so I think it is fair to call this direct detection.

What about the actual results? What has actually been detected is a seasonal variation in the cosmic x-ray background. The conversion signal is expected to be greater in summer due to the changing visibility of the magnetic field region facing the sun, and that’s exactly what was observed. In the paper’s conclusion the authors state:

“On the basis of our results from XMM-Newton, it appears plausible that axions – dark matter particle candidates – are indeed produced in the core of the Sun and do indeed convert to soft X-rays in the magnetic field of the Earth, giving rise to a significant, seasonally-variable component of the 2-6 keV CXB”



Conversion of solar axions into photons within the Earth’s magnetic field (University of Leicester)

Note the language used – “it appears plausible”. This attitude of physicists to always be cautious and hold back from bold claims is a wise one – look what happened to BICEP2. It is something I am personally becoming familiar with, last week having come across a lovely LUX event that passed my initial cuts and looked very much like it could have been a WIMP. My project partner from my master’s degree at the University of Warwick is now a new PhD student at UCL – and he takes great joy in embarrassing me in whatever way he can. So after I shared my findings with him, he told everyone we came across that I had found WIMPs. Even upon running into my supervisor, he asked “Have you seen Sally’s WIMP?”. I was not pleased – that is not a claim I want to make as a mere second year PhD student. Sadly, but not unexpectedly, my “WIMP” has now been cut away. But not for one second did I truly believe it could have been one – surely there’s no way I’m going to be the one that discovers dark matter! (Universe, feel free to prove me wrong.)

These XMM-Newton results are nice, but tentative – they need confirming by more experiments. I can’t help but wonder how many big discoveries end up delayed or even discarded due to the cautiousness of physicists, who can scarcely believe they have found something so great. I look forward to the time when someone actually comes out and says “We did it – we found it.” with certainty. It would be extra nice if it were LUX. But realistically, to really convince anyone that dark matter has been found, detection via several different methods and in several different places is needed. There is a lot of work to do yet.

It’s an exciting time to be in this field, and papers like the XMM-Newton one keep us on our toes! LUX will be starting up again soon for what we hope will be a 300 day run, and an increase in sensitivity to WIMPs of around 5x. Maybe it’s time for me to re-read that paper on the axio-electric effect in liquid xenon detectors!

by Sally Shaw at October 22, 2014 04:07 PM

The Great Beyond - Nature blog

More than half of 2007-2012 research articles now free to read

More than half of all peer-reviewed research articles published from 2007 to 2012 are now free to download somewhere on the Internet, according to a report produced for the European Commission, published today. That is a step up from the situation last year, when only one year – 2011 – reached the 50% free mark. But the report also underlines how availability dips in the most recent year, because many papers are only made free after a delay.


“A substantial part of the material openly available is relatively old, or as some would say, outdated,” writes Science-Metrix, a consultancy in Montreal, Canada, which conducted the study, one of a series of reports on open access policies and open data.

The study (which has not been formally peer-reviewed) forms part of the European Commission’s efforts to track the evolution of open access. Science-Metrix uses automated software to search online for hundreds of thousands of papers from the Scopus database.

The company finds that the proportion of new papers published directly in open-access journals reached almost 13% in 2012. The bulk of the Internet’s free papers are available through other means – made open by publishers after a delay, or by authors archiving their manuscripts online. But their proportion of the total seems to have stuck at around 40% for the past few years. That apparent lack of impetus is partly because of a ‘backfilling’ effect, whereby the past is made to look more open as authors upload versions of older paywalled papers into online repositories, the report says. During this last year, for instance, close to 14,000 papers originally published in 1996 were made available for free.

“The fundamental problem highlighted by the Science-Metrix findings is timing,” writes Stevan Harnad, an open-access advocate and cognitive scientist at the University of Quebec in Montreal, Canada. “Over 50% of all articles published between 2007 and 2012 are freely available today. But the trouble is that their percentage in the most critical years, namely, the 1-2 years following publication, is far lower than that. This is partly because of publisher open access embargoes, partly because of author fears and sluggishness, but mostly because not enough strong, effective open access mandates have as yet been adopted by institutions and funders.”

The report’s conclusions are only estimates, as the automated software does not pick up every free paper, and this incompleteness must be adjusted for in the figures (typically adding around 5-6% to the total, a margin calculated by testing the software on a smaller, hand-checked sample of papers). And many of the articles, although free to read, do not meet formal definitions of open access – for example, they do not include details on whether readers can freely reuse the material. Éric Archambault, the founder and president of Science-Metrix, says it is still hard to track different kinds of open manuscripts, and when they became free to read.

The proportion of free papers also differs by country and by subject. Biomedical research (71% estimated free between 2011 and 2013) is far more open than chemistry (39%), for example. The study suggests that from 2008 to 2013, the world’s average was 54%, with Brazil (76%) and the Netherlands (74%) particularly high. The United Kingdom, where the nation’s main public funder, Research Councils UK, has set a 45% target for 2013-14, has already reached 64% in previous years, the report suggests.

The study comes during Open Access week, which is seeing events around the world promoting the ideas of open access to research. Yesterday saw the launch of the ‘Open Access Button’ in London – a website and app that allows users to find free research. If no free copy is available, the app promises to email authors asking them to upload a free version of their paper – with an explanation direct from the user who needs the manuscript. “We are trying to make open access personal – setting up a conversation between the author and the person who wants access,” says Joe McArthur, who co-founded the project and works at the Right to Research Coalition, an advocacy group in London.

by Richard Van Noorden at October 22, 2014 04:00 PM

Tommaso Dorigo - Scientificblogging

The Quote Of The Week - Shocked And Disappointed
"Two recent results from other experiments add to the excitement of Run II. The results from Brookhaven's g-minus-two experiments with muons have a straightforward interpretation as signs of supersymmetry. The increasingly interesting results from BABAR at the Stanford Linear Accelerator Center add to the importance of B physics in Run II, and also suggest new physics. I will be shocked and disappointed if we don't have at least one major discovery."


by Tommaso Dorigo at October 22, 2014 03:20 PM

Quantum Diaries

New high-speed transatlantic network to benefit science collaborations across the U.S.

This Fermilab press release came out on Oct. 20, 2014.

ESnet to build high-speed extension for faster data exchange between United States and Europe. Image: ESnet


Scientists across the United States will soon have access to new, ultra-high-speed network links spanning the Atlantic Ocean thanks to a project currently under way to extend ESnet (the U.S. Department of Energy’s Energy Sciences Network) to Amsterdam, Geneva and London. Although the project is designed to benefit data-intensive science throughout the U.S. national laboratory complex, heaviest users of the new links will be particle physicists conducting research at the Large Hadron Collider (LHC), the world’s largest and most powerful particle collider. The high capacity of this new connection will provide U.S. scientists with enhanced access to data at the LHC and other European-based experiments by accelerating the exchange of data sets between institutions in the United States and computing facilities in Europe.

DOE’s Brookhaven National Laboratory and Fermi National Accelerator Laboratory—the primary computing centers for U.S. collaborators on the LHC’s ATLAS and CMS experiments, respectively—will make immediate use of the new network infrastructure once it is rigorously tested and commissioned. Because ESnet, based at DOE’s Lawrence Berkeley National Laboratory, interconnects all national laboratories and a number of university-based projects in the United States, tens of thousands of researchers from all disciplines will benefit as well.

The ESnet extension will be in place before the LHC at CERN in Switzerland—currently shut down for maintenance and upgrades—is up and running again in the spring of 2015. Because the accelerator will be colliding protons at much higher energy, the data output from the detectors will expand considerably—to approximately 40 petabytes of raw data per year compared with 20 petabytes for all of the previous lower-energy collisions produced over the three years of the LHC first run between 2010 and 2012.
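To give a rough sense of scale, the yearly data volumes quoted above can be converted into an average sustained data rate. The sketch below is only a back-of-the-envelope illustration: it assumes decimal petabytes (10^15 bytes), a 365-day year and perfectly even transfers with no replication or protocol overhead. None of these assumptions comes from the press release, and the function name is purely illustrative.

```python
# Back-of-the-envelope: average rate needed to move a yearly data volume.
# Assumptions (not from the press release): 1 PB = 1e15 bytes, 365-day year,
# perfectly even transfer with no replication or protocol overhead.

SECONDS_PER_YEAR = 365 * 24 * 3600
BYTES_PER_PETABYTE = 1e15


def average_rate_gbps(petabytes_per_year: float) -> float:
    """Return the mean sustained rate in gigabits per second."""
    bits_per_year = petabytes_per_year * BYTES_PER_PETABYTE * 8
    return bits_per_year / SECONDS_PER_YEAR / 1e9


if __name__ == "__main__":
    run2 = average_rate_gbps(40)       # expected raw data per year after restart
    run1 = average_rate_gbps(20 / 3)   # 20 PB spread evenly over three years
    print(f"After restart: {run2:.1f} Gbit/s, first run: {run1:.1f} Gbit/s")
```

Under these assumptions, 40 petabytes per year corresponds to roughly 10 Gbit/s averaged over the whole year, about six times the average for the first run. Real traffic is bursty and the same data are copied to several sites, so peak requirements are considerably higher than this average.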

The cross-Atlantic connectivity during the first successful run for the LHC experiments, which culminated in the discovery of the Higgs boson, was provided by the US LHCNet network, managed by the California Institute of Technology. In recent years, major research and education networks around the world—including ESnet, Internet2, California’s CENIC, and European networks such as DANTE, SURFnet and NORDUnet—have increased their backbone capacity by a factor of 10, using sophisticated new optical networking and digital signal processing technologies. Until recently, however, higher-speed links were not deployed for production purposes across the Atlantic Ocean—creating a network “impedance mismatch” that can harm large, intercontinental data flows.

An evolving data model
This upgrade coincides with a shift in the data model for LHC science. Previously, data moved in a more predictable and hierarchical pattern strongly influenced by geographical proximity, but network upgrades around the world have now made it possible for data to be fetched and exchanged more flexibly and dynamically. This change enables faster science outcomes and more efficient use of storage and computational power, but it requires networks around the world to perform flawlessly together.

“Having the new infrastructure in place will meet the increased need for dealing with LHC data and provide more agile access to that data in a much more dynamic fashion than LHC collaborators have had in the past,” said physicist Michael Ernst of DOE’s Brookhaven National Laboratory, a key member of the team laying out the new and more flexible framework for exchanging data between the Worldwide LHC Computing Grid centers.

Ernst directs a computing facility at Brookhaven Lab that was originally set up as a central hub for U.S. collaborators on the LHC’s ATLAS experiment. A similar facility at Fermi National Accelerator Laboratory has played this role for the LHC’s U.S. collaborators on the CMS experiment. These computing resources, dubbed Tier 1 centers, have direct links to the LHC at the European laboratory CERN (Tier 0). The experts who run them will continue to serve scientists under the new structure. But instead of serving as hubs for data storage and distribution only among U.S.-based collaborators at Tier 2 and 3 research centers, the dedicated facilities at Brookhaven and Fermilab will be able to serve data needs of the entire ATLAS and CMS collaborations throughout the world. And likewise, U.S. Tier 2 and Tier 3 research centers will have higher-speed access to Tier 1 and Tier 2 centers in Europe.

“This new infrastructure will offer LHC researchers at laboratories and universities around the world faster access to important data,” said Fermilab’s Lothar Bauerdick, head of software and computing for the U.S. CMS group. “As the LHC experiments continue to produce exciting results, this important upgrade will let collaborators see and analyze those results better than ever before.”

Ernst added, “As centralized hubs for handling LHC data, our reliability, performance and expertise have been in demand by the whole collaboration, and now we will be better able to serve the scientists’ needs.”

An investment in science
ESnet is funded by DOE’s Office of Science to meet networking needs of DOE labs and science projects. The transatlantic extension represents a financial collaboration, with partial support coming from DOE’s Office of High Energy Physics (HEP) for the next three years. Although LHC scientists will get a dedicated portion of the new network once it is in place, all science programs that make use of ESnet will now have access to faster network links for their data transfers.

“We are eagerly awaiting the start of commissioning for the new infrastructure,” said Oliver Gutsche, Fermilab scientist and member of the CMS Offline and Computing Management Board. “After the Higgs discovery, the next big LHC milestones will come in 2015, and this network will be indispensable for the success of the LHC Run 2 physics program.”

This work was supported by the DOE Office of Science.
Fermilab is America’s premier national laboratory for particle physics and accelerator research. A U.S. Department of Energy Office of Science laboratory, Fermilab is located near Chicago, Illinois, and operated under contract by the Fermi Research Alliance, LLC. Visit Fermilab’s website and follow us on Twitter at @FermilabToday.

Brookhaven National Laboratory is supported by the Office of Science of the U.S. Department of Energy. The Office of Science is the single largest supporter of basic research in the physical sciences in the United States, and is working to address some of the most pressing challenges of our time.

One of ten national laboratories overseen and primarily funded by the Office of Science of the U.S. Department of Energy (DOE), Brookhaven National Laboratory conducts research in the physical, biomedical, and environmental sciences, as well as in energy technologies and national security. Brookhaven Lab also builds and operates major scientific facilities available to university, industry and government researchers. Brookhaven is operated and managed for DOE’s Office of Science by Brookhaven Science Associates, a limited-liability company founded by the Research Foundation for the State University of New York on behalf of Stony Brook University, the largest academic user of Laboratory facilities, and Battelle, a nonprofit applied science and technology organization.

Visit Brookhaven Lab’s electronic newsroom for links, news archives, graphics, and more, follow Brookhaven Lab on Twitter, or find us on Facebook.


Media contacts:

  • Karen McNulty-Walsh, Brookhaven Media and Communications Office, 631-344-8350
  • Kurt Riesselmann, Fermilab Office of Communication, 630-840-3351
  • Jon Bashor, Computing Sciences Communications Manager, Lawrence Berkeley National Laboratory, 510-486-5849

Computing contacts:

  • Lothar Bauerdick, Fermilab, US CMS software computing, 630-840-6804
  • Oliver Gutsche, Fermilab, CMS Offline and Computing Management Board, 630-840-8909

by Fermilab at October 22, 2014 03:15 PM

CERN Bulletin

CERN Bulletin Issue No. 43-44/2014
Link to e-Bulletin Issue No. 43-44/2014. Link to all articles in this issue.

October 22, 2014 02:58 PM

The Great Beyond - Nature blog

Outbreak of great quakes underscores Cascadia risk

Posted on behalf of Alexandra Witze.

The 18 great earthquakes that have struck Earth in the past decade hold ominous lessons for western North America, a top seismologist has warned. Many of these large quakes — including the 2004 Sumatra quake that spawned the Indian Ocean tsunami, and the 2011 Tohoku disaster in Japan — were surprisingly different from one another despite their similar geologic settings.

That variety implies that almost any scenario is possible in another part of the Pacific Rim where quake risk is thought to be high — along the Cascadia subduction zone offshore of Washington, Oregon, and other parts of the western United States and Canada.

“We do not fully understand the limits of what can happen,” says Thorne Lay, a seismologist at the University of California, Santa Cruz. “We have to be broadly prepared to respond.”

Lay spoke on 21 October at the Geological Society of America meeting in Vancouver, Canada, a city on the front lines of Cascadia earthquake risk.

The last great quake in the region happened in 1700. Conventional wisdom holds that the next one, perhaps as large as magnitude 9, could strike at any time in the next several hundred years. Geologically speaking, Cascadia is a classic subduction zone, where one plate of Earth’s crust plunges beneath another, building up stress and occasionally relieving it in large earthquakes.

The recent spate of great subduction-zone quakes, of magnitude 8 or larger, began with the 2004 Sumatra earthquake. On average, each year since then has brought 1.8 great quakes, more than twice the rate of the previous century.

In large part, they happened where and when seismologists expected them. “The quakes are basically filling in a deficiency of activity,” Lay says. But their details have been surprising.

The 2004 Sumatra quake, for instance, ruptured unexpected portions of a subduction zone off Indonesia, where the fault zone bends as opposed to running straight. That implies that areas in Cascadia with unusual geometry might also be at risk, Lay says.

In 2007, in Peru, a major earthquake began to happen, then essentially stopped for 60 seconds before picking up again and eventually generating a large tsunami. That start-stop-start pattern raises challenges for Cascadia because seismologists are trying to develop an accurate earthquake early warning system there.

And in April 2014, a Chilean quake ruptured a far shorter portion of a subduction zone than scientists had expected. That suggests that researchers can’t be complacent about thinking they know which parts of Cascadia might break, Lay says. (The worst-case scenario for Cascadia involves a rupture of approximately 1,000 kilometres.)

That’s not to say scientists aren’t preparing. The recently launched M9 project, coordinated out of the University of Washington in Seattle, aims to help officials cope with the risk of a great Cascadia quake. At the Vancouver meeting, Arthur Frankel of the US Geological Survey in Seattle showed early results of calculations of where the ground might shake the most. Enclosed basins, like Seattle, amplify the shaking, he reported.

by Lauren Morello at October 22, 2014 02:34 PM

Clifford V. Johnson - Asymptotia

I Dare!
Yes. I dare to show equations during public lectures. There'll be equations in my book too. If we do not show the tools we use, how can we give a complete picture of how science works? If we keep hiding the mathematics, won't people be even more afraid of this terrifying horror we are "protecting" them from? I started my Sunday Assembly talk reflecting upon the fact that next year will mark 100 years since Einstein published one of the most beautiful and far-reaching scientific works in history, General Relativity, describing how gravity works. In the first 30 seconds of the talk, I put up the equations. Just because they deserve to be seen, and to drive home the point that it's not just a bunch of words, but an actual method of computation, that allows you to do quantitative science about the largest physical object we know of - the entire universe! It was a great audience, who seemed to enjoy the 20 minute talk as part of [...]

by Clifford at October 22, 2014 02:29 PM

astrobites - astro-ph reader's digest

The Singles’ Club
Title: The Kepler dichotomy among the M dwarfs: half of systems contain five or more coplanar planets
Authors: Sarah Ballard & John Johnson
First author’s institution: University of Washington
Status: Submitted to ApJ

The Kepler dichotomy

The Kepler spacecraft hasn’t just found transiting exoplanets: it’s found transiting exoplanet systems. Hundreds of alternative solar systems have been spotted just a few hundred light years away, and Ballard & Johnson want to use them to describe the population of planetary systems across the entire galaxy.

Exoplanets only transit if they pass between their host star and the Earth, blocking out a little light once every orbital period. Sometimes we see more than one planet transit in the same system, which means they lie in the same plane and have small ‘mutual inclinations’ (the angles between exoplanet orbits within a planetary system). But how can you be sure that you detect all the planets in a system? If planets always orbited their stars in a single plane, with very small mutual inclinations, you’d see them all transit. But we know (through radial velocity measurements that reveal additional non-transiting planets in the systems, and through planets passing in front of star-spots, or in front of other planets) that planetary systems are often not well aligned and sometimes have large mutual inclinations.

Although Kepler has found lots of multiple planet systems (multis), it has also found lots of single planets (singletons – that’s what Ballard & Johnson call them). I mean LOTS of singletons. Is this large number of singletons what you would expect to see, given some distribution of modest mutual inclinations? Can the single population be explained by assuming that they have non-transiting friends? Or is there some mysterious process generating all these singletons?

This mystery has been investigated previously for Sun-like stars (Morton & Winn, 2014), where evidence was found for separate populations of singletons and multis. Ballard & Johnson apply the same logic to the M dwarfs, the small stars, to see whether the same dual population phenomenon exists in the mini-planetary systems.

The planet machine


Figure 1. The model is shown in red with its 1 and 2 \sigma intervals. The blue histogram represents the observations. Ballard & Johnson find that they cannot adequately reproduce the observations with a single population of planets.

Ballard & Johnson compare the exoplanets observed by Kepler to a fake set of planets. They generate thousands of M dwarf planetary systems with between 1 and 8 planets and mutual inclination scatter ranging from 0 to 10°. They then test whether their fake planetary systems are stable and get rid of any that would shake themselves apart through planet-planet interactions. For each system they record the number of planets that would be seen to transit and create a histogram of the number of transiting planets. This histogram is parameterised by N (the total number of planets per star) and \sigma (the scatter in mutual inclinations of the planets). They then determine which values of N and \sigma best describe the observations by comparing their fake-data histogram to the real-data histogram using a Poissonian likelihood function.
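To make the recipe above concrete, here is a heavily simplified toy version in Python. It is a sketch under stated assumptions, not the authors' code: orbits are circular, the scaled semi-major axes a/R_star are drawn log-uniformly between 10 and 100 (an arbitrary choice), inclinations are scattered about each system's plane with a Gaussian of width \sigma, and the stability check is skipped entirely. The function names are hypothetical.

```python
import numpy as np


def simulate_transit_counts(n_planets, sigma_deg, n_systems=20000, seed=0):
    """Histogram of the number of transiting planets per simulated system.

    Toy assumptions (not the paper's): circular orbits, a/R_star drawn
    log-uniformly from 10 to 100, Gaussian inclination scatter of width
    sigma_deg about the system plane, and no dynamical stability check.
    """
    rng = np.random.default_rng(seed)
    # Isotropic orientation of each system's reference plane.
    cos_i_ref = rng.uniform(0.0, 1.0, size=n_systems)
    i_ref = np.arccos(cos_i_ref)
    # Per-planet inclination offsets and orbital distances.
    offsets = np.radians(sigma_deg) * rng.standard_normal((n_systems, n_planets))
    a_over_r = 10 ** rng.uniform(1.0, 2.0, size=(n_systems, n_planets))
    # A planet transits when its impact parameter |a/R_star * cos i| < 1.
    impact = np.abs(a_over_r * np.cos(i_ref[:, None] + offsets))
    n_transiting = (impact < 1.0).sum(axis=1)
    # counts[k] = number of systems showing exactly k transiting planets.
    return np.bincount(n_transiting, minlength=n_planets + 1)


def expected_counts(n_planets, sigma_deg, n_observed_systems, max_k=8):
    """Model histogram for k = 1..max_k, scaled to the observed sample size."""
    counts = simulate_transit_counts(n_planets, sigma_deg)[1:]  # drop k = 0
    padded = np.zeros(max_k)
    padded[: counts.size] = counts
    return padded / padded.sum() * n_observed_systems


def poisson_log_likelihood(observed, expected):
    """Poisson log-likelihood of observed bin counts, up to a constant."""
    observed = np.asarray(observed, dtype=float)
    expected = np.clip(np.asarray(expected, dtype=float), 1e-9, None)
    return float(np.sum(observed * np.log(expected) - expected))
```

Given an observed histogram of systems with 1 to 8 transiting planets, one would evaluate `poisson_log_likelihood(observed, expected_counts(N, sigma, observed.sum()))` over a grid of N and \sigma and keep the best-scoring pair. Even in this toy version, increasing \sigma turns many would-be multis into apparent singletons.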

Two populations of planets . . .

Figure 2. The same as Figure 1 but this time two planet populations are used: one with 1 planet only and one with 2-8 planets.

Ballard & Johnson found that they couldn’t reproduce the observations with this simple model: they couldn’t create enough singletons to match the observations. This is shown in Figure 1 (above)—see how the model (in red) just can’t quite get up to the same height as the observations (the blue histogram) for the singletons.

Next, the authors tried generating two populations of planets: one of singletons only, and another with 2-8 planets and a range of mutual inclination scatters. The ratio of number-of-singletons to number-of-multis, f, was an additional parameter of their model. Ballard & Johnson generated thousands of planetary systems with varying numbers of planets, mutual inclinations and fs. Once again, they counted how many would transit and made a histogram, then found the values of N, \sigma and f that best reproduced the data. This time they were able to reproduce the observations (see Figure 2). The sharp upturn in the number of singletons seen in the observations (the blue histogram) is matched by the model (in red). They find a value of f that best reproduces the data: 0.55 (+0.23, −0.12), so around half of the systems are multis and half are singletons. For the multis, they find that there must be more than five planets per planet-hosting star, with small mutual inclinations, in order to reproduce the observations.
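The two-population idea can be sketched as a simple mixture fit. This is an illustration only, not the paper's procedure: the mixing weight below is applied directly to the histogram of detected systems (so it is not the same quantity as the paper's f), the shape of the multi-planet histogram is taken as a given input, and the weight is found by a brute-force grid search. All names are hypothetical.

```python
import numpy as np


def mixture_counts(weight_single, multi_shape, n_systems):
    """Expected histogram over k = 1..K transiting planets for a two-population mix.

    weight_single: fraction of detected systems drawn from the one-planet
        population (hypothetical parameter, not the paper's f).
    multi_shape: relative counts for k = 1..K from the multi-planet population.
    """
    multi_shape = np.asarray(multi_shape, dtype=float)
    multi_pdf = multi_shape / multi_shape.sum()
    single_pdf = np.zeros_like(multi_pdf)
    single_pdf[0] = 1.0  # a one-planet system can only show one transit
    return n_systems * (weight_single * single_pdf + (1 - weight_single) * multi_pdf)


def poisson_log_likelihood(observed, expected):
    """Poisson log-likelihood of observed bin counts, up to a constant."""
    observed = np.asarray(observed, dtype=float)
    expected = np.clip(np.asarray(expected, dtype=float), 1e-9, None)
    return float(np.sum(observed * np.log(expected) - expected))


def best_weight(observed, multi_shape, grid=np.linspace(0.0, 1.0, 101)):
    """Grid-search the mixing weight that maximises the Poisson likelihood."""
    n = float(np.sum(observed))
    scores = [
        poisson_log_likelihood(observed, mixture_counts(w, multi_shape, n))
        for w in grid
    ]
    return float(grid[int(np.argmax(scores))])
```

A quick self-consistency check is to build a histogram with `mixture_counts` for a known weight and confirm that `best_weight` recovers it. A real analysis would fit N, \sigma and the mixing parameter together and would propagate uncertainties rather than report a single best grid point.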

Trending hosts?

Having found that two populations of planets, one of singles and one of multis, best describe the data, Ballard & Johnson ask: is there a fundamental difference between the singleton hosts and the multi hosts? They looked at the host stars’ rotation periods, metallicities (the amount of elements heavier than helium, such as carbon, in the star) and positions in the galaxy and found that the multi-hosting stars tend to be rotating more rapidly, are more metal poor and are closer to the galaxy’s midplane. The rotation trend might actually be an age trend: old stars spin slower than young stars, so perhaps we’re seeing that young systems have lots of planets that get shed over time. These findings are consistent with previous studies.

In this paper, Ballard & Johnson show that with some smooth statistical moves, you can probe an underlying population of objects even though you only observe a fraction of them. They show that the Kepler dichotomy persists for the mini Solar systems, intensifying the mystery behind the singleton excess. Some process that we don’t yet understand is generating all these singletons. It turns out it’s a lonely existence for most exoplanets.


by Ruth Angus at October 22, 2014 12:57 PM

Tommaso Dorigo - Scientificblogging

ECFA Workshop: Planning For The High Luminosity LHC
I am spending a few days in Aix-les-Bains, a pleasant lakeside resort in the French southeast, to follow the works of the second ECFA workshop, titled "High-Luminosity LHC". ECFA stands for "European Committee for Future Accelerators" but this particular workshop is indeed centred on the future of the LHC, despite the fact that there are at present at least half a dozen international efforts toward the design of more powerful hadron colliders, more precise linear electron-positron colliders, or still other solutions.

by Tommaso Dorigo at October 22, 2014 11:04 AM

astrobites - astro-ph reader's digest

Newer Horizons Beyond Pluto
In the 1970s and 1980s, the Voyager probes visited the outer solar system, giving us some of the first close-up images of the giant planets at the edge of our solar system. Voyager 1 visited Jupiter and Saturn before beginning a journey out of the solar system, while Voyager 2 continued along the plane of the planets and visited Uranus and Neptune. Not for the last time, Pluto was left out.

To right this wrong (and to learn a lot about Pluto), NASA launched the New Horizons probe towards Pluto in 2006. After a nine year journey, it will reach Pluto in July, 2015. That sounds like a slow trip, but the distance to Pluto is huge: even light takes eight hours to get out there. New Horizons is actually moving away from the sun at more than 10 miles a second!

New Horizons before launch, including humans for scale.

The high speed of the probe is great for getting to Pluto, but terrible for staying at Pluto. To put a probe in orbit around a planet, we need to get the probe and the planet moving at almost exactly the same speed and in the same direction. And it gets worse: the smaller the planet, the smaller the gravitational field, so the closer you need to match the velocities. For the giant planets this is pretty easy: their gravitational pulls are so strong that just getting the probes there is a good start; once the spacecraft is near, the planet’s gravity will do a lot of the work.

For smaller bodies near us (like Mars), we use a special orbit called a Hohmann transfer orbit to send the probes to the planets. The Hohmann orbit places the spacecraft in an elliptical orbit, using the Sun’s gravity to slow down the probe on its way out of the solar system. This technique is optimized for minimizing the amount of fuel needed to put a probe in orbit around another planet, but is very slow. Earth-Mars transfers take about nine months; an Earth-Pluto transfer would take about 200 years!

Gravity transfers are out: we don’t want to wait 200 years to see Pluto. Our only other reasonable option is to fire New Horizons’ thrusters near Pluto to slow it down. This process would need a lot of fuel; we’re decelerating to zero from more than 10 miles per second, remember! Fuel isn’t light, and adding all this fuel would make the spacecraft significantly heavier. This is a problem: now that we’ve added weight, we need more fuel to even get New Horizons off the Earth. And this fuel takes up room, so we need to build bigger tanks. But this is more weight that we’ll have to decelerate, so we need more fuel on board to slow them down too. See the problem? For every pound you add in fuel for the thrusters, you really add way more than one pound of mission. New Horizons weighed about 1,000 pounds at launch; to use its thrusters to stop at Pluto, we would have needed to launch it with almost 70,000 pounds of fuel!
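
The runaway described above, where fuel is needed to carry fuel, is captured by the Tsiolkovsky rocket equation. Here is a minimal sketch of it. The exhaust velocity is an assumed round number for chemical thrusters and the function name is hypothetical, so the output will not reproduce the 70,000-pound figure quoted above, which depends on thruster details not given here. The sketch also ignores the extra tank mass.

```python
import math

def propellant_mass(final_mass, delta_v, exhaust_velocity):
    """Tsiolkovsky rocket equation: propellant needed to change speed by delta_v.

    final_mass is the mass left after the burn; delta_v and exhaust_velocity
    must use the same units.
    """
    mass_ratio = math.exp(delta_v / exhaust_velocity)  # initial mass / final mass
    return final_mass * (mass_ratio - 1.0)

# Assumed, illustrative values (not mission specifications).
FINAL_MASS_LB = 1000.0         # roughly the spacecraft mass quoted in the text
EXHAUST_VELOCITY_KM_S = 3.0    # assumed order of magnitude for chemical thrusters

for delta_v in (2.0, 4.0, 8.0, 16.0):  # 16 km/s is roughly 10 miles per second
    fuel = propellant_mass(FINAL_MASS_LB, delta_v, EXHAUST_VELOCITY_KM_S)
    print(f"delta-v {delta_v:4.1f} km/s -> about {fuel:,.0f} lb of propellant")
```

The point of the loop is the exponential: each doubling of the required change in speed squares the mass ratio, which is why braking from flyby speed is so much more expensive than a small course correction.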

We’re out of options. The only conclusion is that we can’t stop at Pluto. As a result, New Horizons is a flyby mission. It’s going to come within 6,000 miles of Pluto, but only once. It might seem like a waste to just go past Pluto once and end the mission; NASA agrees with you! Since launch, the plan has been for New Horizons to visit another Kuiper belt object after visiting Pluto. The problem is that we don’t know of many objects close enough to Pluto for New Horizons to visit. At launch, we didn’t know of any.

In 2011, the team started a search for new objects near Pluto to visit. They collected images from telescopes in Hawaii and Chile, where they were sensitive to objects larger than about 50 kilometers (30 miles) in size. While they found 50 objects, none of them were close enough to Pluto to be appropriate for New Horizons! New Horizons’ post-Pluto plans were on the precipice of peril.

This year, the astronomers turned to their last hope: the mighty Hubble Space Telescope. Being above the Earth’s atmosphere, Hubble is sensitive to even smaller objects than the ground-based telescopes were. In this case, Hubble came successfully to the rescue, finding three potential targets! Last week, the team announced the top choice (although not necessarily final selection) for a post-Pluto mission, the romantically-named Kuiper belt object “1110113Y.”

We know how bright the object is, but have to rely on models of its composition and reflectivity to estimate its size. The best estimate is that 1110113Y is about 40 kilometers (25 miles) across. Based on 1110113Y’s position and motion, New Horizons should speed along to it and visit in January, 2019.

So why do we care? New Horizons’ goal is to study the outer solar system, and these observations will give us close-up information on a Kuiper belt object like never before. Kuiper belt objects are believed to be the building blocks of Pluto and the most similar objects to the original planetesimals that formed the planets. Therefore, studying Kuiper belt objects really enables us to probe the Earth’s formation, giving us an initial condition to use when modeling the effects of 4.5 billion years of orbiting the Sun.

by Ben Montet at October 22, 2014 03:42 AM

The Great Beyond - Nature blog

Geologists face off over Yukon frontier

Posted on behalf of Alexandra Witze. 

The walls of the Geological Survey of Canada’s Vancouver office are, not surprisingly, plastered with maps. There’s one of the country of Canada, one of the province of British Columbia, and even a circumpolar Arctic map centered on the North Pole.


The Klondike schist of Canada (shown in green) stops at the border with the United States.

Alexandra Witze

All display that distinctive rainbow mélange so typical of professional geologic maps. Each major rock formation is represented by its own colour, so that pinks and purples and yellows swirl in great stretches representing mountain ranges, coastal plains, and every conceivable landscape in between.

But lying on the table of the survey’s main conference room is a much more problematic map. It shows part of the far northern boundary between the United States and Canada, along a stretch between Alaska and the Yukon territory. And the two sides, on either side of the international border, do not match.

It’s not a question of Canada using one set of colours for its map and the United States using another. The geology simply does not line up. To the east, Canadian mappers have sketched a formation called the Klondike schist, which is associated with the gold-rich rocks that fueled the Klondike gold rush in the late 1890s. To the west, US maps show nothing like it.

“We don’t know why,” says Jamey Jones, a geologist with the US Geological Survey (USGS) in Anchorage, Alaska. “We have got to figure out why these aren’t matching.”

He and two dozen scientists from both sides of the border — but clad equally in plaid shirts and hiking boots — met in Vancouver on 20 October to try to hammer out the discrepancies. For two hours they compared mapping strategies, laid out who needed to explore what next, and swapped tips about the best ways to get helicopters in the region.

The last frontier

At one level, the differing maps are a relatively minor academic point to sort out. Such glitches are fairly common whenever geologists have to match one ‘quadrangle’ mapped from one era or with one technique against another from a different time. And it’s not unusual for geology to not quite line up across international borders.

But American and Canadian geologists have reconciled their maps along nearly the entire northern stretch where Alaska and the Yukon meet, says Frederic “Ric” Wilson, a geologist with the USGS in Anchorage. This last bit is the only one that does not match — and it may well be because the Canadian maps are four years old, while the American ones are four decades old.

The US maps stretch back to the days of legendary geologist Helen Foster, who mapped large parts of Alaska after making her name as a post-war military geologist in former Japanese territories. “With her, you walked every single ridge,” recalls Wilson. “Every single ridge.”

All that walking produced maps of huge stretches of the remote Alaskan landscape. They include the 1970 quadrangle map now in question, which abuts a much newer Canadian quadrangle to the east. Together the maps span part of a massive geological feature known as the Yukon-Tanana Terrane, a collection of rocks caught up in the mighty smearing crush where the Pacific crustal plate collides against North America.

The Canadian side of the map is in good shape. Prompted in part by intense mining interest, geologists there have mapped the Klondike in modern detail.  “I’m willing to integrate any piece of data that comes in,” says Mo Colpron, a geologist with the Yukon Geological Survey. “If you guys come up with things that affect how our side of the border works, then we can sit down and talk and try to mesh it.”

That leaves the burden of work on the US side, to update the Foster maps. “The reconciliation project is what it’s called,” says Rick Saltus, a geologist with the USGS in Denver, Colorado, who served as meeting emcee. “We’re taking a three-year look at cross-border tectonic connections, because things look a little different from one side to the other.”

This summer, Jones and his colleagues hired a helicopter to take them everywhere the Foster maps ran up against the Klondike formation. “We’ve seen a lot of rocks we didn’t anticipate seeing,” he says. That data will go into the new and improved US maps.

There is, however, only so much scientists can do. Citing border regulations, Jones says, the helicopter pilot was unwilling to take them just a tiny bit over into Canada so they could see the geology on the Yukon side.

by Lauren Morello at October 22, 2014 12:41 AM

October 21, 2014

Christian P. Robert - xi'an's og

delayed acceptance [alternative]

In a comment on our Accelerating Metropolis-Hastings algorithms: Delayed acceptance with prefetching paper, Philip commented that he had experimented with an alternative splitting technique retaining the right stationary measure: the idea behind his alternative acceleration is again (a) to divide the target into bits and (b) run the acceptance step by parts, towards a major reduction in computing time. The difference with our approach is to represent the overall acceptance probability as

\min_{k=0,..,d}\left\{\prod_{j=1}^k \rho_j(\eta,\theta),1\right\}

and, even more surprisingly than in our case, this representation remains associated with the right (posterior) target!!! Provided the ordering of the terms is random with a symmetric distribution on the permutation. This property can be directly checked via the detailed balance condition.
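
To make the two acceptance rules concrete, here is a minimal sketch. It assumes the usual factorised form for delayed acceptance, a product of min(1, ρ_j), which is not spelled out above, and it uses made-up ratio values. All function names are hypothetical.

```python
import random

def min_prefix_acceptance(ratios):
    """min over k = 0..d of the product of the first k ratios (empty product = 1)."""
    best = 1.0
    running = 1.0
    for rho in ratios:
        running *= rho
        best = min(best, running)
    return best

def delayed_acceptance(ratios):
    """Product of min(1, rho_j): the assumed factorised delayed-acceptance rule."""
    prob = 1.0
    for rho in ratios:
        prob *= min(1.0, rho)
    return prob

def accept_with_random_order(ratios, rng=random):
    """One accept/reject decision with the factors in a uniformly random order."""
    order = list(ratios)
    rng.shuffle(order)      # uniform permutation, hence a symmetric distribution
    u = rng.random()
    running = 1.0
    for rho in order:
        running *= rho
        if running < u:     # a prefix product fell below u: reject early
            return False
    return True

ratios = [2.0, 0.25, 3.0]   # hypothetical factors rho_j(eta, theta)
print(min_prefix_acceptance(ratios))  # 0.5
print(delayed_acceptance(ratios))     # 0.25
```

The early exit in `accept_with_random_order` is where the computing gain would come from, and the shuffle is the price paid for detailed balance: the cheap terms can no longer be forced to go first.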

In a toy example, I compared the acceptance rates (acrat) for our delayed solution (letabin.R), for this alternative (letamin.R), and for a non-delayed reference (letabaz.R), when considering more and more fractured decompositions of a Bernoulli likelihood.

> system.time(source("letabin.R"))
user system elapsed
225.918 0.444 227.200
> acrat
[1] 0.3195 0.2424 0.2154 0.1917 0.1305 0.0958
> system.time(source("letamin.R"))
user system elapsed
340.677 0.512 345.389
> acrat
[1] 0.4045 0.4138 0.4194 0.4003 0.3998 0.4145
> system.time(source("letabaz.R"))
user system elapsed
49.271 0.080 49.862
> acrat
[1] 0.6078 0.6068 0.6103 0.6086 0.6040 0.6158

A very interesting outcome since the acceptance rate does not change with the number of terms in the decomposition for the alternative delayed acceptance method… Even though it logically takes longer than our solution. However, the drawback is that detailed balance implies picking the order at random, hence losing on the gain in computing the cheap terms first. If reversibility could be bypassed, then this alternative would definitely get very appealing!

by xi'an at October 21, 2014 10:14 PM

ZapperZ - Physics and Physicists

Scientific Evidence Points To A Designer?
We have had these types of anthropic universe arguments before, and I don't see this being settled anytime soon, unless we encounter an alien life form or something that dramatic.

Apparently, this physicist has been making the rounds giving talks on scientific evidence that points to a designer. Unfortunately, this claim is highly misleading. There are several issues that need to be clarified here:

1. This so-called evidence has many varying interpretations. In the hands of Stephen Hawking, it is evidence that we do NOT need a designer for the universe to exist. So to claim that it points to a designer is highly misleading, because obviously there are very smart people out there who think the opposite.

2. Scientific evidence has varying degrees of certainty. The evidence that Niobium undergoes a superconducting transition at 9.3 K is a lot more certain than many of the astrophysical parameters that we have gathered so far. It is just the nature of the study and the field.

3. It is also interesting to note that even if the claim is true, it conflicts significantly with many orthodox religious views of the origin of the universe, including the fact that it allows for significant time for speciation and evolution.

4. The argument that the universe has been fine-tuned for us to live in is very weak in my book. Who is to say that, if any of these parameters were different, a different type of universe couldn't appear in which different types of life forms would dominate? Our knowledge of how different types of universes could form is still in its infancy, which is one of the arguments that Hawking used when he invoked the multiverse scenario. So unless there is a convincing argument that our universe is the one and only universe that can exist, and nothing else can, this argument falls very flat.

I find that this type of seminar can't be very productive unless there is a panel discussion presenting both sides. People who listen to this may not be aware of the holes in such arguments, and I would say the same of any talk by those on the opposite side. It would have been better if they had invited two scientists with opposing views, who could show the public how the same set of evidence leads to different conclusions. This is what happens when the full set of evidence needed to paint a clear picture isn't available.


by ZapperZ at October 21, 2014 06:38 PM

Quantum Diaries

I feel it mine

On Saturday, 4 October, Nikhef – the Dutch National Institute for Subatomic Physics where I spend long days and efforts – opened its doors, labs and facilities to the public. In addition to Nikhef, all the other institutes located in the so-called “Science Park” – the scientific district located in the east part of Amsterdam – welcomed people all day long.

It’s the second “Open Day” that I’ve attended, both as a guest and as guide. Together with my fellow theoreticians we provided answers and explanations to people’s questions and curiosities, standing in the “Big Bang Theory Corner” of the main hall. Each department in Nikhef arranged its own stand and activities, and there were plenty of things to be amazed at to cover the entire day.

The research institutes in Science Park (and outside it) offer a good overview of the concept of research, looking for what is beyond the current status of knowledge. “Verder kijken”, or looking further, is the motto of Vrije Universiteit Amsterdam, my Dutch alma mater.

I deeply like this attitude of research, the willingness to investigate what’s around the corner. As they like to define themselves, Dutch people are “future oriented”: this is manifest in several things, from the way they read the clock (“half past seven” becomes “half before eight” in Dutch) to some peculiarities of the city itself, like the presence of a lot of cultural and research institutes.

This abundance of institutes, museums, exhibitions, public libraries, music festivals, art spaces, and independent cinemas makes this city feel like a cultural place to me. People interact with culture in its many manifestations and are connected to it in a more dynamic way than if they were only surrounded by historical and artistic.

Back to the Open Day and Nikhef, I was pleased to see lots of people, families with kids running here and there, checking out delicate instruments with their curious hands, and groups of guys and girls (also someone who looked like he had come straight from a skate-park) stopping by and looking around as if it were their own courtyard.

The following pictures give some examples of the ongoing activities:

We had a model of the ATLAS detector built with Legos: amazing!


Copyright Nikhef

And not only toy models. We also had real detectors, like a cloud chamber that allowed visitors to see the traces of particles passing by!


Copyright Nikhef

Weak force and anti-matter are also cool, right?


Copyright Nikhef

The majority of people here (not me) are blond and/or tall, but not tall enough to see cosmic rays with just their eyes… So, please ask the experts!


Copyright Nikhef

I think I can summarize the huge impact and the benefit of such a cool day with the words of one man who stopped by one of the experimental setups. He listened to the careful (but a bit fuzzy) explanation provided by one of the students, and said “Thanks. Now I feel it mine too.”

Many more photos are available here: enjoy!

by Andrea Signori at October 21, 2014 05:23 PM

Peter Coles - In the Dark

A Dirge

Rough Wind, that moanest loud
Grief too sad for song;
Wild wind, when sullen cloud
Knells all the night long;
Sad storm, whose tears are vain,
Bare woods, whose branches strain,
Deep caves and dreary main,
Wail, for the world’s wrong!

by Percy Bysshe Shelley



by telescoper at October 21, 2014 04:09 PM

John Baez - Azimuth

Network Theory Seminar (Part 3)


This time we use the principle of minimum power to determine what a circuit made of resistors actually does. Its ‘behavior’ is described by a functor sending circuits to linear relations between the potentials and currents at the input and output terminals. We call this the ‘black box’ functor, since it takes a circuit:

and puts a metaphorical ‘black box’ around it:

hiding the circuit’s internal details and letting us see only how it acts as viewed ‘from outside’.

For more, see the lecture notes here:

Network theory (part 32).


by John Baez at October 21, 2014 03:17 PM

Symmetrybreaking - Fermilab/SLAC

Costumes to make zombie Einstein proud

These physics-themed Halloween costume ideas are sure to entertain—and maybe even educate. Terrifying, we know.

So you haven’t picked a Halloween costume, and the big night is fast approaching. If you’re looking for something a little funny, a little nerdy and sure to impress fellow physics fans, look no further. We’ve got you covered.

1. Dark energy

This is an active costume, perfect for the party-goer who plans to consume a large quantity of sugar. Suit up in all black or camouflage, then spend your evening squeezing between people and pushing them apart.

Congratulations! You’re dark energy: a mysterious force causing the accelerating expansion of the universe, intriguing in the lab and perplexing on the dance floor.

2. Cosmic inflation

Theory says that a fraction of a second after the big bang, the universe grew exponentially, expanding so that tiny fluctuations were stretched into the seeds of entire galaxies.

But good luck getting that costume through the door.

Instead, take a simple yellow life vest and draw the cosmos on it: stars, planets, asteroids, whatever you fancy. When friends pull on the emergency tab, the universe will grow.

3. Heisenberg Uncertainty Principle

Here’s a great excuse to repurpose your topical Breaking Bad costume from last year.

Walter White—aka “Heisenberg”—may have been a chemistry teacher, but the Heisenberg Uncertainty Principle is straight out of physics. Named after Werner Heisenberg, a German physicist credited with the creation of quantum mechanics, the Heisenberg Uncertainty Principle states that the more accurately you know the position of a particle, the less information you know about its momentum.

Put on Walter White’s signature hat and shades (or his yellow suit and respirator), but then add some uncertainty by pasting Riddler-esque question marks to your outfit.

4. Bad neutrino

A warning upfront: Only the ambitious and downright extroverted should attempt this costume.

Neutrinos are ghostly particles that pass through most matter undetected. In fact, trillions of neutrinos pass through your body every second without your knowledge.

But you aren’t going to go as any old neutrino. Oh no. You’re a bad neutrino—possibly the worst one in the universe—so you run into everything: lampposts, trees, haunted houses and yes, people. Don a simple white sheet and spend the evening interacting with everyone and everything.

5. Your favorite physics experiment

You physics junkies know that there are a lot of experiments with odd acronyms and names that are ripe for Halloween costumes. You can go as ATLAS (experiment at the Large Hadron Collider / character from Greek mythology), DarkSide (dark matter experiment at Gran Sasso National Laboratory / good reason to repurpose your Darth Vader costume), PICASSO (dark matter experiment at SNOLAB / creator of Cubism), MINERvA (Fermilab neutrino experiment / Roman goddess of wisdom), or the Dark Energy Survey (dark energy camera located at the Blanco Telescope in Chile / good opportunity for a pun).

Physics-loving parents can go as explorer Daniel Boone, while the kids go as neutrino experiments MicroBooNE and MiniBooNE. The kids can wear mini fur hats of their own or dress as detector tanks to be filled with candy.

6. Feynman diagram

You might know that a Feynman diagram is a drawing that uses lines and squiggles to represent a particle interaction. But have you ever noticed that they sometimes look like people? Try out this new take on the black outfit/white paint skeleton costume. Bonus points for going as a penguin diagram.

7. Antimatter

Break out the bell-bottoms and poster board. In bold letters, scrawl the words of your choosing: “I hate things!,” “Stuff is awful!,” and “Down with quarks!” will all do nicely. Protest from house to house and declare with pride that you are antimatter. It’s a fair critique: Physicists still aren’t sure why matter dominates the universe when equal amounts of matter and antimatter should have been created in the big bang.

Fortunately, you don’t have to solve this particular puzzle on your quest for candy. Just don’t high five anyone; you might annihilate.

8. Entangled particles

Einstein described quantum entanglement as “spooky action at a distance”—the perfect costume for Halloween. Entangled particles are extremely strange. Measuring one automatically determines the state of the other, instantaneously.

Find someone you are extremely in tune with and dress in opposite colors, like black and white. When no one is observing you, you can relax. But when interacting with people, be sure to coordinate movements. They spin to the left, you spin to the right. They wave with the right hand? You wave with the left. You get the drill.

You can also just wrap yourselves together in a net. No one said quantum entanglement has to be hard.

9. Holographic you(niverse)

The universe may be like a hologram, according to a theory currently being tested at Fermilab’s Holometer experiment. If so, information about spacetime is chunked into 2-D bits that only appear three-dimensional from our perspective.

Help others imagine this bizarre concept by printing out a photo of yourself and taping it to your front. You’ll still technically be 3-D, but that two-dimensional picture of your face will still start some interesting discussions. Perhaps best not to wear this if you have a busy schedule or no desire to discuss the nature of time and space while eating a Snickers.

10. Your favorite particle

There are many ways to dress up as a fundamental particle. Bring a lamp along to trick-or-treat to go as the photon, carrier of light. Hand out cookies to go as the Higgs boson, giver of mass. Spend the evening attaching things to people to go as a gluon.

To branch out beyond the Standard Model of particle physics, go as a supersymmetric particle, or sparticle: Wear a gladiator costume and shout, “I am Sparticle!” whenever someone asks about your costume.

Or grab a partner to become a meson, a particle made of a quark and antiquark. Mesons are typically unstable, so whenever you unlink arms, be sure to decay in a shower of electrons and neutrinos—or candy corn.


by Lauren Biron at October 21, 2014 02:51 PM

Jester - Resonaances

Dark matter or pulsars? AMS hints it's neither.
Yesterday AMS-02 updated their measurement of cosmic-ray positron and electron fluxes. The newly published data extend to positron energies of 500 GeV, compared to 350 GeV in the previous release. The central value of the positron fraction in the highest energy bin is one third of the error bar lower than the central value of the next-to-highest bin. This allows the collaboration to conclude that the positron fraction has a maximum and starts to decrease at high energies :] The sloppy presentation and unnecessary hype obscure the fact that AMS actually found something non-trivial. Namely, it is interesting that the positron fraction, after a sharp rise between 10 and 200 GeV, seems to plateau at higher energies at a value around 15%. This sort of behavior, although not expected by popular models of cosmic ray propagation, was actually predicted a few years ago, well before AMS was launched.

Before I get to the point, let's have a brief summary. In 2008 the PAMELA experiment observed a steep rise of the cosmic ray positron fraction between 10 and 100 GeV. Positrons are routinely produced by scattering of high energy cosmic rays (secondary production), but the rise was not predicted by models of cosmic ray propagation. This prompted speculations of another (primary) source of positrons: from pulsars, supernovae or other astrophysical objects, to dark matter annihilation. The dark matter explanation is unlikely for many reasons. On the theoretical side, the large annihilation cross section required is difficult to achieve, and it is difficult to produce a large flux of positrons without producing an excess of antiprotons at the same time. In particular, the MSSM neutralino entertained in the last AMS paper certainly cannot fit the cosmic-ray data for these reasons. When theoretical obstacles are overcome by skillful model building, constraints from gamma ray and radio observations disfavor the relevant parameter space. Even if these constraints are dismissed due to large astrophysical uncertainties, the models poorly fit the shape of the electron and positron spectrum observed by PAMELA, AMS, and FERMI (see the addendum of this paper for a recent discussion). Pulsars, on the other hand, are a plausible but handwaving explanation: we know they are all around and we know they produce electron-positron pairs in the magnetosphere, but we cannot calculate the spectrum from first principles.

But maybe primary positron sources are not needed at all? The old paper by Katz et al. proposes a different approach. Rather than starting with a particular propagation model, it assumes the high-energy positrons observed by PAMELA are secondary, and attempts to deduce from the data the parameters controlling the propagation of cosmic rays. The logic is based on two premises. Firstly, while production of cosmic rays in our galaxy contains many unknowns, the production of different particles is strongly correlated, with the relative ratios depending on nuclear cross sections that are measurable in laboratories. Secondly, different particles propagate in the magnetic field of the galaxy in the same way, depending only on their rigidity (momentum divided by charge). Thus, from an observed flux of one particle, one can predict the production rate of other particles. This approach is quite successful in predicting the cosmic antiproton flux based on the observed boron flux. For positrons, the story is more complicated because of large energy losses (cooling) due to synchrotron and inverse-Compton processes. However, in this case one can do the exercise of computing the positron flux assuming no losses at all. The result corresponds to roughly 20% positron fraction above 100 GeV. Since in the real world cooling can only suppress the positron flux, the value computed assuming no cooling represents an upper bound on the positron fraction.

Now, at lower energies, the observed positron flux is a factor of a few below the upper bound. This is already intriguing, as hypothetical primary positrons could in principle have an arbitrary flux, orders of magnitude larger or smaller than this upper bound. The rise observed by PAMELA can be interpreted as the suppression due to cooling decreasing as positron energy increases. This is not implausible: the suppression depends on the interplay of the cooling time and mean propagation time of positrons, both of which are unknown functions of energy. Once the cooling time exceeds the propagation time the suppression factor is completely gone. In such a case the positron fraction should saturate the upper limit. This is what seems to be happening at the energies 200-500 GeV probed by AMS, as can be seen in the plot. Already the previous AMS data were consistent with this picture, and the latest update only strengthens it.

So, it may be that the mystery of cosmic ray positrons has a simple down-to-galactic-disc explanation. If further observations show the positron flux climbing above the upper limit or dropping suddenly, then the secondary production hypothesis would be invalidated. But, for the moment, the AMS data seem to be consistent with no primary sources, just assuming that the cooling time of positrons is shorter than predicted by the state-of-the-art propagation models. So, instead of dark matter, AMS might have discovered that models of cosmic-ray propagation need a fix. That's less spectacular, but still worthwhile.

Thanks to Kfir for the plot and explanations. 

by Jester at October 21, 2014 08:49 AM

October 20, 2014

Christian P. Robert - xi'an's og

control functionals for Monte Carlo integration

This new arXival by Chris Oates, Mark Girolami, and Nicolas Chopin (warning: they all are colleagues & friends of mine!, at least until they read those comments…) is a variation on control variates, but with a surprising twist, namely that the inclusion of a control variate functional may produce a sub-root-n (i.e., faster than √n) convergence rate in the resulting estimator. Surprising as I did not know one could get to sub-root-n rates..! Now I had forgotten that Anne Philippe and I used the score in an earlier paper of ours, as a control variate for Riemann sum approximations, with faster convergence rates, but this is indeed a new twist, in particular because it produces an unbiased estimator.

The control variate writes

\psi_\phi (x) = \nabla_x \cdot \phi(x) + \phi(x)\cdot \nabla_x \log \pi(x)

where π is the target density and φ is a free function to be optimised. (Under the constraint that πφ is integrable. Then the expectation of ψφ is indeed zero.) The “explanation” for the sub-root-n behaviour is that ψφ is chosen as an L2 regression. When looking at the sub-root-n convergence proof, the explanation is more of a Rao-Blackwellisation type, assuming a first level convergent (or consistent) approximation to the integrand [of the above form ψφ] can be found. The optimal φ is the solution of a differential equation that needs estimating and the paper concentrates on approximating strategies. This connects with Antonietta Mira’s zero variance control variates, but in a non-parametric manner, adopting a Gaussian process as the prior on the unknown φ. And this is where the huge innovation in the paper resides, I think, i.e. in assuming a Gaussian process prior on the control functional and in managing to preserve unbiasedness. As in many of its implementations, modelling by Gaussian processes offers nice features, like ψφ being itself a Gaussian process. Except that it cannot be shown to lead to consistency on a theoretical basis. Even though it appears to hold in the examples of the paper. Apart from this theoretical difficulty, the potential hardship with the method seems to be in the implementation, as there are several parameters and functionals to be calibrated, hence calling for cross-validation which may often be time-consuming. The gains are humongous, so the method should be adopted whenever the added cost in implementing it is reasonable, a cost whose evaluation is not clearly provided by the paper. In the toy Gaussian example where everything can be computed, I am surprised at the relatively poor performance of a Riemann sum approximation to the integral, wondering at the level of quadrature involved therein. The paper also interestingly connects with O’Hagan’s (1991) Bayes-Hermite [polynomials] quadrature and quasi-Monte Carlo [obviously!].
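To make the zero-mean property concrete, here is a minimal one-dimensional sketch in Python. It is not the Gaussian process construction of the paper: the target is assumed to be a standard normal, the integrand f(x) = x² and the choice φ(x) = x are hand-picked for illustration, and the control variate is built from the score d/dx log π(x), which is the form for which integration by parts gives a zero expectation. With this particular φ the corrected integrand is constant, in the spirit of the zero-variance idea mentioned above.

```python
import random
import statistics

def score_std_normal(x):
    # d/dx log pi(x) for the assumed target pi = N(0, 1)
    return -x

def control_variate(phi, dphi, x):
    # One-dimensional version: psi(x) = phi'(x) + phi(x) * d/dx log pi(x)
    return dphi(x) + phi(x) * score_std_normal(x)

def f(x):
    # Integrand chosen for illustration; E[f] = 1 under N(0, 1)
    return x * x

def phi(x):
    # Hand-picked phi (hypothetical choice, not an optimised one)
    return x

def dphi(x):
    return 1.0

rng = random.Random(0)
xs = [rng.gauss(0.0, 1.0) for _ in range(2000)]

plain = [f(x) for x in xs]
psi_values = [control_variate(phi, dphi, x) for x in xs]
corrected = [fx + px for fx, px in zip(plain, psi_values)]

print("plain estimate     :", statistics.fmean(plain))
print("mean of psi        :", statistics.fmean(psi_values))
print("corrected estimate :", statistics.fmean(corrected))
print("corrected variance :", statistics.pvariance(corrected))
```

Here ψ(x) = 1 − x², so f + ψ equals 1 for every draw and the corrected estimator has (numerically) zero variance. In realistic problems no such closed-form φ is available, which is where the approximation strategies discussed in the paper come in.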

Filed under: Books, Statistics, University life Tagged: control variate, convergence rate, Gaussian processes, Monte Carlo Statistical Methods, simulation, University of Warwick

by xi'an at October 20, 2014 10:14 PM

John Baez - Azimuth

Network Theory (Part 32)

Okay, today we will look at the ‘black box functor’ for circuits made of resistors. Very roughly, this takes a circuit made of resistors with some inputs and outputs:

and puts a ‘black box’ around it:

forgetting the internal details of the circuit and remembering only how it behaves as viewed from outside. As viewed from outside, all the circuit does is define a relation between the potentials and currents at the inputs and outputs. We call this relation the circuit’s behavior. Lots of different choices of the resistances R_1, \dots, R_6 would give the same behavior. In fact, we could even replace the whole fancy circuit by a single edge with a single resistor on it, and get a circuit with the same behavior!

The idea is that when we use a circuit to do something, all we care about is its behavior: what it does as viewed from outside, not what it’s made of.

Furthermore, we’d like the behavior of a system made of parts to depend in a simple way on the external behaviors of its parts. We don’t want to have to ‘peek inside’ the parts to figure out what the whole will do! Of course, in some situations we do need to peek inside the parts to see what the whole will do. But in this particular case we don’t—at least in the idealization we are considering. And this fact is described mathematically by saying that black boxing is a functor.

So, how do circuits made of resistors behave? To answer this we first need to remember what they are!


Remember that for us, a circuit made of resistors is a mathematical structure like this:

It’s a cospan where:

\Gamma is a graph labelled by resistances. So, it consists of a finite set N of nodes, a finite set E of edges, two functions

s, t : E \to N

sending each edge to its source and target nodes, and a function

r : E \to (0,\infty)

that labels each edge with its resistance.

i: I \to \Gamma is a map of graphs labelled by resistances, where I has no edges. A labelled graph with no edges has nothing but nodes! So, the map i is just a trick for specifying a finite set of nodes called inputs and mapping them to N. Thus i picks out some nodes of \Gamma and declares them to be inputs. (However, i may not be one-to-one! We’ll take advantage of that subtlety later.)

o: O \to \Gamma is another map of graphs labelled by resistances, where O again has no edges, and we call its nodes outputs.
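As a rough illustration of this definition, a circuit made of resistors can be written down as a small data structure. The class and field names below (`Edge`, `Circuit`, `inputs`, `outputs`) are my own choices for this sketch, not notation from the series; the example circuit is two resistors in series, with two inputs deliberately mapped to the same node to show that i need not be one-to-one.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Edge:
    source: str        # s(e)
    target: str        # t(e)
    resistance: float  # r(e), required to lie in (0, infinity)

@dataclass(frozen=True)
class Circuit:
    nodes: frozenset   # the finite set N
    edges: tuple       # the finite set E, each edge carrying s, t and r
    inputs: dict       # i : I -> N, given as {input name: node}
    outputs: dict      # o : O -> N, given as {output name: node}

    def __post_init__(self):
        # Check that the data really describes a cospan of labelled graphs.
        for e in self.edges:
            assert e.source in self.nodes and e.target in self.nodes
            assert e.resistance > 0
        assert set(self.inputs.values()) <= self.nodes
        assert set(self.outputs.values()) <= self.nodes

# Two resistors in series: a --2 ohm--> m --3 ohm--> b
series = Circuit(
    nodes=frozenset({"a", "m", "b"}),
    edges=(Edge("a", "m", 2.0), Edge("m", "b", 3.0)),
    inputs={"in1": "a", "in2": "a"},  # i is not one-to-one here
    outputs={"out1": "b"},
)
print(series.inputs, series.outputs)
```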

The principle of minimum power

So what does a circuit made of resistors do? This is described by the principle of minimum power.

Recall from Part 27 that when we put it to work, our circuit has a current I_e flowing along each edge e \in E. This is described by a function

I: E \to \mathbb{R}

It also has a voltage across each edge. The word ‘across’ is standard here, but don’t worry about it too much; what matters is that we have another function

V: E \to \mathbb{R}

describing the voltage V_e across each edge e.

Resistors heat up when current flows through them, so they eat up electrical power and turn this power into heat. How much? The power is given by

\displaystyle{ P = \sum_{e \in E} I_e V_e }

So far, so good. But what does it mean to minimize power?

To understand this, we need to manipulate the formula for power using the laws of electrical circuits described in Part 27. First, Ohm’s law says that for linear resistors, the current is proportional to the voltage. More precisely, for each edge e \in E,

\displaystyle{ I_e = \frac{V_e}{r_e} }

where r_e is the resistance of that edge. So, the bigger the resistance, the less current flows: that makes sense. Using Ohm’s law we get

\displaystyle{ P = \sum_{e \in E} \frac{V_e^2}{r_e} }

Now we see that power is always nonnegative! Now it makes more sense to minimize it. Of course we could minimize it simply by setting all the voltages equal to zero. That would work, but that would be boring: it gives a circuit with no current flowing through it. The fun starts when we minimize power subject to some constraints.

For this we need to remember another law of electrical circuits: a spinoff of Kirchhoff’s voltage law. This says that we can find a function called the potential

\phi: N \to \mathbb{R}

such that

V_e = \phi_{s(e)} - \phi_{t(e)}

for each e \in E. In other words, the voltage across each edge is the difference of potentials at the two ends of this edge.

Using this, we can rewrite the power as

\displaystyle{ P = \sum_{e \in E} \frac{1}{r_e} (\phi_{s(e)} - \phi_{t(e)})^2 }
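As a quick sanity check of this formula, here is a small Python sketch that evaluates the power for a given potential. The representation of edges as `(source, target, resistance)` triples and the sample values are assumptions made just for this example.

```python
def power(edges, phi):
    # P = sum over edges of (phi_s - phi_t)^2 / r_e
    return sum((phi[s] - phi[t]) ** 2 / r for s, t, r in edges)

# Two resistors in series: a --2 ohm--> m --3 ohm--> b
edges = [("a", "m", 2.0), ("m", "b", 3.0)]
phi = {"a": 5.0, "m": 3.0, "b": 0.0}

# (5 - 3)^2 / 2 + (3 - 0)^2 / 3 = 2 + 3
p = power(edges, phi)
print(p)
```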

Now we’re really ready to minimize power! Our circuit made of resistors has certain nodes called terminals:

T \subseteq N

These are the nodes that are either inputs or outputs. More precisely, they’re the nodes in the image of

i: I \to \Gamma

or

o: O \to \Gamma

The principle of minimum power says that:

If we fix the potential \phi on all terminals, the potential at other nodes will minimize the power

\displaystyle{ P(\phi) = \sum_{e \in E} \frac{1}{r_e} (\phi_{s(e)} - \phi_{t(e)})^2 }

subject to this constraint.
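Here is a minimal numerical sketch of the principle, under some assumptions of mine: edges are `(source, target, resistance)` triples, every non-terminal node is connected to some terminal, and the minimization is done by simple coordinate descent (repeatedly replacing each free potential by the conductance-weighted average of its neighbours, which minimizes P in that one variable). This is just one convenient way to find the minimum, not a method taken from the series.

```python
def power(edges, phi):
    return sum((phi[s] - phi[t]) ** 2 / r for s, t, r in edges)

def minimize_power(nodes, edges, fixed, sweeps=200):
    # 'fixed' holds the potentials at the terminals; all other nodes are free.
    phi = {n: fixed.get(n, 0.0) for n in nodes}
    for _ in range(sweeps):
        for n in nodes:
            if n in fixed:
                continue
            num = 0.0
            den = 0.0
            for s, t, r in edges:
                if s == t:
                    continue  # a self-loop has zero voltage across it
                if s == n:
                    num += phi[t] / r
                    den += 1.0 / r
                elif t == n:
                    num += phi[s] / r
                    den += 1.0 / r
            if den > 0:
                # With the other potentials held fixed, P is quadratic in
                # phi[n] and is minimized at this weighted average.
                phi[n] = num / den
    return phi

# Two resistors in series, terminals a and b, one free node m.
nodes = ["a", "m", "b"]
edges = [("a", "m", 2.0), ("m", "b", 3.0)]
fixed = {"a": 5.0, "b": 0.0}

phi_min = minimize_power(nodes, edges, fixed)
print(phi_min["m"], power(edges, phi_min))
```

For this example the free potential settles at 3 and the power at 5; nudging the potential at m in either direction increases the power.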

This should remind you of all the other minimum or maximum principles you know, like the principle of least action, or the way a system in thermodynamic equilibrium maximizes its entropy. All these principles—or at least, most of them—are connected. I could talk about this endlessly. But not now!

Now let’s just use the principle of minimum power. Let’s see what it tells us about the behavior of an electrical circuit.

Let’s imagine changing the potential \phi by adding some multiple of a function

\psi: N \to \mathbb{R}

If this other function vanishes at the terminals:

\forall n \in T \; \; \psi(n) = 0

then \phi + x \psi doesn’t change at the terminals as we change the number x.

Now suppose \phi obeys the principle of minimum power. In other words, suppose it minimizes power subject to the constraint of taking the values it does at the terminals. Then we must have

\displaystyle{ \frac{d}{d x} P(\phi + x \psi)\Big|_{x = 0} = 0 }

whenever

\forall n \in T \; \; \psi(n) = 0

This is just the first derivative test for a minimum. But the converse is true, too! The reason is that our power function is a sum of nonnegative quadratic terms. Its graph will look like a paraboloid. So, the power has no points where its derivative vanishes except minima, even when we constrain \phi by making it lie on a linear subspace.

We can go ahead and start working out the derivative:

\displaystyle{ \frac{d}{d x} P(\phi + x \psi) \! = \! \frac{d}{d x} \sum_{e \in E} \frac{1}{r_e} (\phi_{s(e)} - \phi_{t(e)} + x(\psi_{s(e)} -\psi_{t(e)}))^2  }

To work out the derivative of these quadratic terms at x = 0, we only need to keep the part that’s proportional to x. The rest gives zero. So:

\begin{array}{ccl} \displaystyle{ \frac{d}{d x} P(\phi + x \psi)\Big|_{x = 0} } &=& \displaystyle{ \frac{d}{d x} \sum_{e \in E} \frac{2x}{r_e} (\phi_{s(e)} - \phi_{t(e)}) (\psi_{s(e)} - \psi_{t(e)}) \Big|_{x = 0} } \\ \\  &=&   \displaystyle{ 2 \sum_{e \in E} \frac{1}{r_e} (\phi_{s(e)} - \phi_{t(e)}) (\psi_{s(e)} - \psi_{t(e)}) }  \end{array}

The principle of minimum power says this is zero whenever \psi : N \to \mathbb{R} is a function that vanishes at terminals. By linearity, it’s enough to consider functions \psi that are zero at every node except one node n that is not a terminal. By linearity we can also assume \psi(n) = 1.

Given this, the only nonzero terms in the sum

\displaystyle{ \sum_{e \in E} \frac{1}{r_e} (\phi_{s(e)} - \phi_{t(e)}) (\psi_{s(e)} - \psi_{t(e)}) }

will be those involving edges whose source or target is n. We get

\begin{array}{ccc} \displaystyle{ \frac{d}{d x} P(\phi + x \psi)\Big|_{x = 0} } &=& \displaystyle{ 2 \sum_{e: \; s(e) = n}  \frac{1}{r_e} (\phi_{s(e)} - \phi_{t(e)})}  \\  \\        && -\displaystyle{ 2 \sum_{e: \; t(e) = n}  \frac{1}{r_e} (\phi_{s(e)} - \phi_{t(e)}) }   \end{array}

So, the principle of minimum power says precisely

\displaystyle{ \sum_{e: \; s(e) = n}  \frac{1}{r_e} (\phi_{s(e)} - \phi_{t(e)}) = \sum_{e: \; t(e) = n}  \frac{1}{r_e} (\phi_{s(e)} - \phi_{t(e)}) }

for all nodes n that aren’t terminals.

What does this mean? You could just say it’s a set of linear equations that must be obeyed by the potential \phi. So, the principle of minimum power says that fixing the potential at terminals, the potential at other nodes must be chosen in a way that obeys a set of linear equations.

But what do these equations mean? They have a nice meaning. Remember, Kirchhoff’s voltage law says

V_e = \phi_{s(e)} - \phi_{t(e)}

and Ohm’s law says

\displaystyle{ I_e = \frac{V_e}{r_e} }

Putting these together,

\displaystyle{ I_e = \frac{1}{r_e} (\phi_{s(e)} - \phi_{t(e)}) }

so the principle of minimum power merely says that

\displaystyle{ \sum_{e: \; s(e) = n} I_e = \sum_{e: \; t(e) = n}  I_e }

for any node n that is not a terminal.

This is Kirchhoff’s current law: for any node except a terminal, the total current flowing into that node must equal the total current flowing out! That makes a lot of sense. We allow current to flow in or out of our circuit at terminals, but ‘inside’ the circuit charge is conserved, so if current flows into some other node, an equal amount has to flow out.
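To see this in a concrete case, here is a small Python check on an example of my own choosing: two resistors in series with terminals a and b, using the potential that minimizes power for the chosen terminal values. The helper names are invented for this sketch. It confirms that the derivative of the power in the direction of a function ψ vanishing at the terminals is numerically zero, and that the currents into and out of the non-terminal node balance.

```python
# Two resistors in series: a --2 ohm--> m --3 ohm--> b, terminals a and b.
edges = [("a", "m", 2.0), ("m", "b", 3.0)]
phi = {"a": 5.0, "m": 3.0, "b": 0.0}  # the power-minimizing potential here

def power(edges, phi):
    return sum((phi[s] - phi[t]) ** 2 / r for s, t, r in edges)

def current(edge, phi):
    s, t, r = edge
    return (phi[s] - phi[t]) / r  # Ohm's law with V_e = phi_s - phi_t

def net_current(n, edges, phi):
    # Sum of I_e over edges with source n, minus the sum over edges with target n.
    out_of = sum(current(e, phi) for e in edges if e[0] == n)
    into = sum(current(e, phi) for e in edges if e[1] == n)
    return out_of - into

def directional_derivative(edges, phi, psi, h=1e-6):
    # Central difference approximation to d/dx P(phi + x psi) at x = 0.
    plus = {n: phi[n] + h * psi[n] for n in phi}
    minus = {n: phi[n] - h * psi[n] for n in phi}
    return (power(edges, plus) - power(edges, minus)) / (2 * h)

psi = {"a": 0.0, "m": 1.0, "b": 0.0}  # vanishes at the terminals

print(directional_derivative(edges, phi, psi))  # numerically zero
print(net_current("m", edges, phi))             # Kirchhoff's current law at m
print(net_current("a", edges, phi), net_current("b", edges, phi))
```

The last line prints the same quantity at the two terminals, where it has no reason to vanish: one amp enters the circuit at a and leaves at b.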

In short: the principle of minimum power implies Kirchhoff’s current law! Conversely, we can run the whole argument backward and derive the principle of minimum power from Kirchhoff’s current law. (In both the forwards and backwards versions of this argument, we use Kirchhoff’s voltage law and Ohm’s law.)

When the node n is a terminal, the quantity

\displaystyle{  \sum_{e: \; s(e) = n} I_e \; - \; \sum_{e: \; t(e) = n}  I_e }

need not be zero. But it has an important meaning: it’s the amount of current flowing into that terminal!

We’ll call this I_n, the current at the terminal n \in T. This is something we can measure even when our circuit has a black box around it:

So is the potential \phi_n at the terminal n. It’s these currents and potentials at terminals that matter when we try to describe the behavior of a circuit while ignoring its inner workings.

Black boxing

Now let me quickly sketch how black boxing becomes a functor.

A circuit made of resistors gives a linear relation between the potentials and currents at terminals. A relation is something that can hold or fail to hold. A ‘linear’ relation is one defined using linear equations.

A bit more precisely, suppose we choose potentials and currents at the terminals:

\psi : T \to \mathbb{R}

J : T \to \mathbb{R}

Then we seek potentials and currents at all the nodes and edges of our circuit:

\phi: N \to \mathbb{R}

I : E \to \mathbb{R}

that are compatible with our choice of \psi and J. Here compatible means that

\psi_n = \phi_n

and

J_n = \displaystyle{  \sum_{e: \; s(e) = n} I_e \; - \; \sum_{e: \; t(e) = n}  I_e }

whenever n \in T, but also

\displaystyle{ I_e = \frac{1}{r_e} (\phi_{s(e)} - \phi_{t(e)}) }

for every e \in E, and

\displaystyle{  \sum_{e: \; s(e) = n} I_e \; = \; \sum_{e: \; t(e) = n}  I_e }

whenever n \in N - T. (The last two equations combine Kirchhoff’s laws and Ohm’s law.)

There either exist I and \phi making all these equations true, in which case we say our potentials and currents at the terminals obey the relation… or they don’t exist, in which case we say the potentials and currents at the terminals don’t obey the relation.
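Here is a rough Python sketch of this membership test, under assumptions of my own: edges are `(source, target, resistance)` triples, every non-terminal node is connected to a terminal (so the interior potentials are uniquely determined by the terminal potentials), and the interior potentials are found by a simple relaxation rather than an exact linear solve. It also illustrates the earlier remark that different circuits can have the same behavior: two resistors of 2 and 3 ohms in series obey the same relation as a single 5 ohm resistor.

```python
def solve_interior(nodes, edges, psi, sweeps=500):
    # Potentials at terminals are given by psi; relax the others until
    # Kirchhoff's current law holds at every non-terminal node.
    phi = {n: psi.get(n, 0.0) for n in nodes}
    for _ in range(sweeps):
        for n in nodes:
            if n in psi:
                continue
            num = 0.0
            den = 0.0
            for s, t, r in edges:
                if s == t:
                    continue
                if s == n:
                    num += phi[t] / r
                    den += 1.0 / r
                elif t == n:
                    num += phi[s] / r
                    den += 1.0 / r
            if den > 0:
                phi[n] = num / den
    return phi

def terminal_currents(edges, phi, terminals):
    # J_n = sum of I_e over edges with source n minus sum over edges with target n
    currents = {}
    for n in terminals:
        out_of = sum((phi[s] - phi[t]) / r for s, t, r in edges if s == n)
        into = sum((phi[s] - phi[t]) / r for s, t, r in edges if t == n)
        currents[n] = out_of - into
    return currents

def obeys_relation(nodes, edges, psi, J, tol=1e-9):
    phi = solve_interior(nodes, edges, psi)
    found = terminal_currents(edges, phi, psi.keys())
    return all(abs(found[n] - J[n]) <= tol for n in psi)

series = (["a", "m", "b"], [("a", "m", 2.0), ("m", "b", 3.0)])
single = (["a", "b"], [("a", "b", 5.0)])

psi = {"a": 5.0, "b": 0.0}
good_J = {"a": 1.0, "b": -1.0}
bad_J = {"a": 2.0, "b": -2.0}

for nodes, edges in (series, single):
    print(obeys_relation(nodes, edges, psi, good_J),
          obeys_relation(nodes, edges, psi, bad_J))
```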

The relation is clearly linear, since it’s defined by a bunch of linear equations. With a little work, we can make it into a linear relation between potentials and currents in

\mathbb{R}^I \oplus \mathbb{R}^I

and potentials and currents in

\mathbb{R}^O \oplus \mathbb{R}^O

Remember, I is our set of inputs and O is our set of outputs.

In fact, this process of getting a linear relation from a circuit made of resistors defines a functor:

\blacksquare : \mathrm{ResCirc} \to \mathrm{LinRel}

Here \mathrm{ResCirc} is the category where morphisms are circuits made of resistors, while \mathrm{LinRel} is the category where morphisms are linear relations.

More precisely, here is the category \mathrm{ResCirc}:

• an object of \mathrm{ResCirc} is a finite set;

• a morphism from I to O is an isomorphism class of circuits made of resistors:

having I as its set of inputs and O as its set of outputs;

• we compose morphisms in \mathrm{ResCirc} by composing isomorphism classes of cospans.

(Remember, circuits made of resistors are cospans. This lets us talk about isomorphisms between them. If you forget how isomorphisms between cospans work, you can review it in Part 31.)

And here is the category \mathrm{LinRel}:

• an object of \mathrm{LinRel} is a finite-dimensional real vector space;

• a morphism from U to V is a linear relation R \subseteq U \times V, meaning a linear subspace of the vector space U \times V;

• we compose a linear relation R \subseteq U \times V and a linear relation S \subseteq V \times W in the usual way we compose relations, getting:

SR = \{(u,w) \in U \times W : \; \exists v \in V \; (u,v) \in R \mathrm{\; and \;} (v,w) \in S \}
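The composition formula is the usual one for relations, so it can be tried out on small finite relations. The Python sketch below does exactly that; note that the finite sets of pairs here are only a stand-in chosen for illustration, since genuine linear relations are subspaces and would need a different representation (say, by spanning vectors).

```python
def compose(R, S):
    # SR = {(u, w) : there exists v with (u, v) in R and (v, w) in S}
    return {(u, w) for (u, v) in R for (v2, w) in S if v == v2}

R = {(0, 0), (1, 2), (2, 4)}
S = {(0, 0), (2, 1), (2, 3)}

SR = compose(R, S)
print(sorted(SR))
```

The pair (2, 4) in R contributes nothing, because no pair in S starts with 4, while (1, 2) contributes two pairs, because S relates 2 to both 1 and 3.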

Next steps

So far I’ve set up most of the necessary background but not precisely defined the black boxing functor

\blacksquare : \mathrm{ResCirc} \to \mathrm{LinRel}

There are some nuances I’ve glossed over, like the difference between inputs and outputs as elements of I and O and their images in N. If you want to see the precise definition and the proof that it’s a functor, read our paper:

• John Baez and Brendan Fong, A compositional framework for passive linear networks.

The proof is fairly long: there may be a much quicker one, but at least this one has the virtue of introducing a lot of nice ideas that will be useful elsewhere.

Perhaps next time I will clarify the nuances by doing an example.

by John Baez at October 20, 2014 10:00 PM

Clifford V. Johnson - Asymptotia

Secrets of the Earth
My guess is that most of you don't know that you can find original science programming on the Weather Channel. (Just like, say, 8 years ago most of you would not have been tuning to the History Channel for original science programming about how the Universe works, but many of you know better - (and thanks for watching The Universe!)) Well, this week one of the series that they have that does do some science, Secrets of the Earth, comes back for a new season. I made some contributions to several of the episodes, and I think I appear in at least two of them as a guest. So look at the whole season for some tasty bits of science about the world around you, and if inclined to, do [...]

by Clifford at October 20, 2014 08:21 PM

Emily Lakdawalla - The Planetary Society Blog

When Good Rockets Go Bad: Orion's Launch Abort System
One of the tricky parts of launching humans into space is deciding what to do if something goes wrong. And that's where Orion's Launch Abort System comes in.

October 20, 2014 06:30 PM

Emily Lakdawalla - The Planetary Society Blog

Status update: All Mars missions fine after Siding Spring flyby
All seven Mars spacecraft are doing perfectly fine after comet Siding Spring's close encounter with Mars.

October 20, 2014 05:07 PM

arXiv blog

How An Intelligent Text Message Service Aims To Tackle Ebola In Western Africa

A computer-controlled text message service could direct Ebola cases to appropriate medical facilities and track the spread of the disease in the process–provided it can raise the necessary funding.

Back in July, Cedric Moro started a crowdsourced mapping service to keep track of the spread of Ebola in Sierra Leone, Liberia and Guinea. Moro is a risk consultant who has created several crowdsourced maps of this kind using the openStreetMap project Umap.

October 20, 2014 03:49 PM

Emily Lakdawalla - The Planetary Society Blog

Collaboration Between OSIRIS-REx and Hayabusa-2
The University of Arizona (UA) hosted representatives of the Hayabusa-2 asteroid sample return mission to explore opportunities for collaboration with the OSIRIS-REx team.

October 20, 2014 03:35 PM

ATLAS Experiment

Defending Your Life (Part 2)

I’ve been working on our simulation software for a long time, and I’m often asked “what on earth is that?” This is my attempt to help you love simulation as much as I do. This is a follow up to Part 1, which told you all about the first step of good simulation software, called “event generation”. In that step, we had software that gave us a list of stable particles that our detector might be able to see. And we’re trying to find some “meons” that our friend the theorist dreamed up.

One little problem with those wonderful event generators is that they don’t know anything about our experiment, ATLAS. We need a different piece of software to take those particles and move them through the detector one by one, helping model the detector’s response to each one of the particles as it goes. There are a few pieces of software that can do that, but the one that we use most is called Geant4. Geant4 is publicly available, and is described as a “toolkit” on their webpage. What that means is that it knows about basic concepts, but it doesn’t do specifics. Like building a giant lego house out of a bag of bricks, you have to figure out what fits where, and often throw out things that don’t fit.

One of the detector layouts that we simulate

The first part of a good detector simulation is the detector description. Every piece of the detector has to be put together, with the right material assigned to each. We have a detector description with over five million (!) volumes and about 400 different materials (from Xenon to Argon to Air to Aerogel and Kapton Cable). There are a few heroes of ATLAS who spend a lot of time taking technical drawings (and photographs, because the technical drawings aren’t always right!) of the detector and translating them into something Geant4 can use. You can’t put every wire and pipe in – the simulation would take an eternity! – so you have to find shortcuts sometimes. It’s a painstaking process that’s still ongoing today. We continuously refine and improve our description, adding pieces that weren’t important at the beginning several years ago but are starting to be important now (like polyboron neutron shielding in our forward region; few people thought early on that we would be able to model low-energy neutron flux in our detector with Geant4, because it’s really complex nuclear physics, but we’re getting so close to being able to do so that we’ve gone back to re-check that our materials’ neutron capture properties are correct). And sometimes we go back and revise things that were done approximately in the beginning because we think we can do better. This part also involves making a detailed magnetic field map. We can’t measure the field everywhere in the detector (like deep in the middle of the calorimeter), and it takes too much time to constantly simulate the currents flowing through the magnets and their effect on the particles moving through the detector, so we do that simulation once and save the magnetic field that results.

A simulated black hole event. But what do meons look like?

Next is a good set of physics models. Geant4 has a whole lot of them that you can use and (fortunately!) they have a default that works pretty well for us. Those physics models describe each process (the photoelectric effect, Compton scattering, bremsstrahlung, ionization, multiple scattering, decays, nuclear interactions, etc) for each particle. Some are very, very complicated, as you can probably imagine. You have to choose, at this point, what physics you’re interested in. Geant4 can be used for simulation of space, simulation of cells and DNA, and simulations of radioactive environments. If we used the most precise models for everything, our simulation would never finish running! Instead, we take the fastest model whose results we can’t really distinguish from the most detailed models. That is, we turn off everything that we don’t really notice in our detector anyway. Sometimes we don’t get that right and have to go back and adjust things further – but usually we’ve erred on the side of a slower, more accurate simulation.

The last part is to “teach” Geant4 what you want to save. All Geant4 cares about is particles and materials – it doesn’t inherently know the difference between some silicon that is a part of a computer chip somewhere in the detector and the silicon that makes up the sensors in much of our inner detector. So we have to say “these are the parts of the detector that we care about most” (called “sensitive” detectors). There are a lot of technical tricks to optimizing the storage, but in the end we want to write files with all the little energy deposits that Geant4 has made, their time and location – and sometimes information (that we call “truth”) about what really happened in the simulation, so later we can find out how good our reconstruction software was at correctly identifying photons and their conversions into electron-positron pairs, for example.

The fun part of working on the simulation software is that you have to learn everything about the experiment. You have to know how much time after the interaction every piece of the detector is sensitive, so that you can avoid wasting time simulating particles long after that time. You get to learn when things were installed incorrectly or are misaligned, because you need those effects in the simulation. When people want to upgrade a part of the detector, you have to learn what they have in mind, and then (often) help them think of things they haven’t dealt with yet that might affect other parts of the detector (like cabling behind their detector, which we often have to think hard about). You also have to know about the physics that each detector is sensitive to, what approximations are reasonable, and what approximations you’re already making that they might need to check on.

That also brings us back to our friend’s meons. If they decay very quickly into Standard Model particles, then the event generator will do all the hard work. But if they stick around long enough to interact with the detector, then we have to ask our friend for a lot more information, like how they interact with different materials. For some funny theoretical particles like magnetic monopoles, R-hadrons, and stable charginos, we have to write our own Geant4 physics modules, with a lot of help from theorists.

The detector simulation is a great piece of software to work on – but that’s not the end of it! After the simulation comes the final step, “digitization”, which I’ll talk about next time – and we’ll find out the fate of our buddy’s meon theory.

Zach Marshall is a Divisional Fellow at the Lawrence Berkeley National Laboratory in California. His research is focused on searches for supersymmetry and jet physics, with a significant amount of time spent working on software and trying to help students with physics and life in ATLAS.

by Zachary Marshall at October 20, 2014 03:21 PM

Jester - Resonaances

Weekend Plot: Bs mixing phase update
Today's featured plot was released last week by the LHCb collaboration:

It shows the CP violating phase in Bs meson mixing, denoted as φs, versus the difference of the decay widths between the two Bs meson eigenstates. The interest in φs comes from the fact that it's one of the precious observables that 1) is allowed by the symmetries of the Standard Model, and 2) is severely suppressed due to the CKM structure of flavor violation in the Standard Model. Such observables are a great place to look for new physics (other observables in this family include Bs/Bd→μμ, K→πνν, ...). New particles, even too heavy to be produced directly at the LHC, could produce measurable contributions to φs as long as they don't respect the Standard Model flavor structure. For example, a new force carrier with a mass as large as 100-1000 TeV and order 1 flavor- and CP-violating coupling to b and s quarks would be visible given the current experimental precision. Similarly, loops of supersymmetric particles with 10 TeV masses could show up, again if the flavor structure in the superpartner sector is not aligned with that in the Standard Model.

The phase φs can be measured in certain decays of neutral Bs mesons where the process involves an interference of direct decays and decays through oscillation into the anti-Bs meson. Several years ago measurements at Tevatron's D0 and CDF experiments suggested a large new physics contribution. The mild excess has gone away since, like many other such hints. The latest value quoted by LHCb is φs = - 0.010 ± 0.040, which combines earlier measurements of the Bs → J/ψ π+ π- and Bs → Ds+ Ds- decays with the brand new measurement of the Bs → J/ψ K+ K- decay. The experimental precision is already comparable to the Standard Model prediction of φs = - 0.036. Further progress is still possible, as the Standard Model prediction can be computed to a few percent accuracy. But the room for new physics here is getting tighter and tighter.
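For a feel of the numbers, here is a back-of-the-envelope comparison in Python using only the values quoted above. It assumes the quoted ±0.040 is a one-sigma Gaussian uncertainty and neglects the (much smaller) theory uncertainty on the Standard Model prediction; both are simplifications made for this sketch.

```python
phi_s_measured = -0.010  # LHCb combination quoted above
sigma_measured = 0.040   # quoted experimental uncertainty
phi_s_sm = -0.036        # Standard Model prediction quoted above

# Distance between measurement and prediction in units of the uncertainty.
pull = (phi_s_measured - phi_s_sm) / sigma_measured
# Size of the uncertainty relative to the predicted value.
relative_precision = sigma_measured / abs(phi_s_sm)

print(f"pull = {pull:.2f} sigma")
print(f"uncertainty / |SM value| = {relative_precision:.2f}")
```

The measurement sits well within one standard deviation of the prediction, and the uncertainty is about the same size as the predicted value itself, which is what "comparable" means here.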

by Jester at October 20, 2014 02:20 PM

Symmetrybreaking - Fermilab/SLAC

Transatlantic data-transfer gets a boost

New links will improve the flow of data from the Large Hadron Collider to US institutions.

Scientists across the US will soon have access to new, ultra high-speed network links spanning the Atlantic Ocean.

A new project is currently underway to extend the US Department of Energy’s Energy Sciences Network, or ESnet, to London, Amsterdam and Geneva.

Although the project is designed to benefit data-intensive science throughout the US national laboratory complex, the heaviest users of the new links will be particle physicists conducting research at the Large Hadron Collider, the world’s largest and most powerful particle collider. The high capacity of this new connection will provide US-based scientists with enhanced access to data at the LHC and other European-based experiments by accelerating the exchange of data sets between institutions in the US and computing facilities in Europe.

“After the Higgs discovery, the next big LHC milestones will come in 2015,” says Oliver Gutsche, Fermilab scientist and member of the CMS Offline and Computing Management Board. “And this network will be indispensable for the success of the [next LHC physics program].”

DOE’s Brookhaven National Laboratory and Fermi National Accelerator Laboratory—the primary computing centers for US collaborators on the LHC’s ATLAS and CMS experiments, respectively—will make immediate use of the new network infrastructure, once it is rigorously tested and commissioned. Because ESnet, based at DOE’s Lawrence Berkeley National Laboratory, interconnects all national laboratories and a number of university-based projects in the US, tens of thousands of researchers from other disciplines will benefit as well. 

The ESnet extension will be in place before the LHC at CERN in Switzerland—currently shut down for maintenance and upgrades—is up and running again in the spring of 2015. Because the accelerator will be colliding protons at much higher energy, the data output from the detectors will expand considerably to approximately 40 petabytes of RAW data per year, compared with 20 petabytes for all of the previous lower-energy collisions produced over the three years of the LHC’s first run between 2010 and 2012.

The cross-Atlantic connectivity during the first successful run for the LHC experiments was provided by the US LHCNet network, managed by the California Institute of Technology. In recent years, major research and education networks around the world—including ESnet, Internet2, California’s CENIC, and European networks such as DANTE, SURFnet and NORDUnet—have increased their backbone capacity by a factor of 10, using sophisticated new optical networking and digital signal processing technologies. Until recently, however, higher-speed links were not deployed for production purposes across the Atlantic Ocean. 

Courtesy of: Brookhaven/Fermilab

An evolving data model

This upgrade coincides with a shift in the data model for LHC science. Previously, data moved in a more predictable and hierarchical pattern strongly influenced by geographical proximity, but network upgrades around the world have now made it possible for data to be fetched and exchanged more flexibly and dynamically. This change enables faster science outcomes and more efficient use of storage and computational power, but it requires networks around the world to perform flawlessly together. 

“Having the new infrastructure in place will meet the increased need for dealing with LHC data and provide more agile access to that data in a much more dynamic fashion than LHC collaborators have had in the past,” says physicist Michael Ernst of Brookhaven National Laboratory, a key member of the team laying out the new and more flexible framework for exchanging data between the Worldwide LHC Computing Grid centers. 

Ernst directs a computing facility at Brookhaven Lab that was originally set up as a central hub for US collaborators on the LHC’s ATLAS experiment. A similar facility at Fermi National Accelerator Laboratory has played this role for the LHC’s US collaborators on the CMS experiment. These computing resources, dubbed “Tier 1” centers, have direct links to the LHC at Europe’s CERN laboratory (Tier 0).

The experts who run them will continue to serve scientists under the new structure. But instead of serving only as hubs for data storage and distribution among US-based collaborators at Tier 2 and 3 research centers, the dedicated facilities at Brookhaven and Fermilab will also be able to serve data needs of the entire ATLAS and CMS collaborations throughout the world. And likewise, US Tier 2 and Tier 3 research centers will have higher-speed access to Tier 1 and Tier 2 centers in Europe. 

“This new infrastructure will offer LHC researchers at laboratories and universities around the world faster access to important data,” says Fermilab’s Lothar Bauerdick, head of software and computing for the US CMS group. “As the LHC experiments continue to produce exciting results, this important upgrade will let collaborators see and analyze those results better than ever before.”

Ernst adds, “As centralized hubs for handling LHC data, our reliability, performance, and expertise have been in demand by the whole collaboration and now we will be better able to serve the scientists’ needs.”

Fermilab published a version of this article as a press release.



October 20, 2014 01:00 PM

Peter Coles - In the Dark

Controlled Nuclear Fusion: Forget about it


You’ve probably heard that Lockheed Martin has generated a lot of excitement with a recent announcement about a “breakthrough” in nuclear fusion technology. Here’s a pessimistic post from last year. I wonder if it will be proved wrong?

Originally posted on Protons for Breakfast Blog:

Man or woman adjusting the ‘target positioner’ (I think) within the target chamber of the US Lawrence Livermore National Laboratory.

The future is very difficult to predict. But I am prepared to put on record my belief that controlled nuclear fusion as a source of power on Earth will never be achieved.

This is not something I want to believe. And the intermittent drip of news stories about ‘progress‘ and ‘breakthroughs‘ might make one think that the technique would eventually yield to humanity’s collective ingenuity.

But in fact that just isn’t going to happen. Let me explain just some of the problems and you can judge for yourself whether you think it will ever work.

One option for controlled fusion is called Inertial Fusion Energy, and the centre of research is the US National Ignition Facility. Here the most powerful laser…

by telescoper at October 20, 2014 12:17 PM

astrobites - astro-ph reader's digest

Gravitational waves and the need for fast galaxy surveys

Gravitational waves are ripples in space-time that occur, for example, when two very compact celestial bodies merge. Their direct detection would allow scientists to characterize these mergers and understand the physics of systems undergoing strong gravitational interactions. Perhaps that era is not so distant; gravitational wave detectors such as advanced LIGO and Virgo are expected to come online by 2016. While this is a very exciting prospect, gravitational wave detectors have limited resolution; they can constrain the location of the source to within an area of 10 to 1000 deg² on the sky, depending on the number of detectors and the strength of the signal.

An artist’s rendering of two white dwarfs coalescing and producing gravitational wave emission.

To understand the nature of the source of gravitational waves, scientists hope to be able to locate it more accurately by searching for its electromagnetic counterpart immediately after the gravitational wave is detected. How can telescopes help in this endeavor? The authors of this paper explore the possibility of performing very fast galaxy surveys to identify and characterize the birthplace of gravitational waves.

Gravitational waves from the merger of two neutron stars can be detected out to 200 Mpc, which is roughly 250 times the distance to the Andromeda galaxy. It is expected that LIGO-Virgo will detect about 40 of these events per year. There are roughly 8 galaxies per deg² within 200 Mpc; that is 800 candidate galaxies in an area of 100 deg². Hence, a quick survey that would pinpoint those possible galaxy counterparts to the gravitational wave emission would be very useful. After potential hosts are identified, they could be followed up with telescopes with smaller fields of view to measure the light emitted by the source of gravitational waves.
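The candidate count quoted above is a simple product of galaxy density and localization area. As a rough sketch, the same arithmetic can be repeated for the whole range of localization areas mentioned earlier. The function name is made up for illustration, and the density of 8 galaxies per square degree is the figure quoted in the post, assumed here to be uniform over the sky.

```python
# Density quoted in the post: about 8 galaxies per square degree within 200 Mpc.
GALAXIES_PER_DEG2 = 8


def candidate_galaxies(area_deg2, density=GALAXIES_PER_DEG2):
    """Number of candidate host galaxies inside a localization area.

    Assumes a uniform surface density, which is only a rough approximation.
    """
    return area_deg2 * density


if __name__ == "__main__":
    # Localization areas spanning the 10 to 1000 square degree range.
    for area in (10, 100, 1000):
        print(area, "deg^2 ->", candidate_galaxies(area), "candidate galaxies")
```

Even the best-case localization leaves dozens of candidate hosts, which is why a fast pre-selection of galaxies matters.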

The electromagnetic emission following the gravitational wave detection only lasts for short periods of time (for a kilonova, the timescale is approximately a week), and this drives the need for fast surveys. To devise an efficient search strategy, the authors suggest looking for galaxies with high star formation rates. It is expected that those galaxies will have higher chances of hosting a gravitational wave event. (Although they clarify that the rate of mergers of compact objects might be better correlated with the mass of the galaxy rather than its star formation activity.) A good proxy for star formation in a galaxy is the light it emits in the red H-alpha line, coming from hydrogen atoms in clouds of gas that act as stellar nurseries. The issue is whether current telescopes can survey large areas fast enough to find a good fraction of all star forming galaxies within the detection area of LIGO-Virgo.

The authors consider a 2m-size telescope and estimate the typical observing time needed to identify a typical star forming galaxy up to a distance of 200 Mpc. This ranges from 40 to 80 seconds depending on the observing conditions. It would take this type of telescope a week to cover 100 deg². This result matches very well the expected duration of the visible light signal from these events! Mergers of black holes and neutron stars could be detected out to larger distances (~450 Mpc). To find possible galaxy hosts out to these distances, a 2m-class telescope would cover 30 deg² in a week. Without a doubt, the exciting prospect of gravitational wave detection will spur more detailed searches for the best strategies to locate their sources.

by Elisa Chisari at October 20, 2014 08:44 AM

Lubos Motl - string vacua and pheno

ETs, hippies, loons introduce Andrew Strominger
...or a yogi and another nude man?

Exactly one week ago, Andrew Strominger of Harvard gave a Science and Cocktails talk in Christiania – a neighborhood of Copenhagen, Denmark.

The beginning of this 64-minute lecture on "Black Holes, String Theory and the Fundamental Laws of Nature" is rather extraordinary and if you only want to see the weirdest introduction of a fresh winner of the Dirac Medal, just listen to the first three minutes of the video.

However, you will obviously be much more spiritually enriched if you continue to watch for another hour – even though some people who have seen similar popular talks by Andy may feel that some of the content is redundant and similar to what they have heard.

After the introduction, you may appreciate how serious and credible Andy's and his daughter's illustrations are (sorry, I can't distinguish these two artists!) in comparison with the mainstream culture in the Danish capital.

At the beginning, Andy said that it's incredible how much we already know about the Universe. We may design a space probe and land it on Mars and predict the landing within a second. We are even able to feed roast beef to Andrew Strominger and make him talk as a consequence of the food, and even predict that he would talk.

It's equally shocking when we may find something clear we don't understand – something that looks like a contradiction. Such paradoxes have been essential in the history of physics. Einstein was wondering what he would see in the mirror if he were running at the speed of light (or faster than that) and looking at his image in the mirror in front of him. Newton's and Maxwell's theories gave different answers. Einstein was bothered by that.

The puzzle was solved... there is a universal speed limit, special relativity, and all this stuff. About 6 other steps in physics are presented as resolutions to paradoxes of similar types. If we don't understand, it's not a problem: it's an opportunity.

Soon afterwards, Andy focuses on general relativity, spacetime curvature etc. The parabolic trajectories of freely falling objects are actually the straight(est) lines in the curved spacetime. After a few words, he gets to the uncertainty principle and also emphasizes that everything has to be subject to the principle – it's not possible to give "exceptions" to anyone. And the principle has to apply to the space's geometry, too.

There is a cookbook for how to "directly quantize" any theory, and this procedure is amazingly tested. If you apply the cookbook to gravity, GR, you get garbage, which is great because it's a lot of fun. ;-) He says we will need "time" to figure out whether the solution we have, string theory, is right in Nature. However, already now, some more basic Harvard courses have to be fixed by some insights from the string course.

Suddenly he mentions Hawking and Bekenstein's ideas about black holes. What do black holes have to do with these issues? They have everything to do with them, it surprisingly turns out. An introduction to black holes follows. Lots of matter, escape velocity, surpasses the speed of light – the basic logic of this introduction is identical to my basic school talk in the mountains a few months ago. ;-) The talks would remain identical even when Andy talks about the ability of Karl Schwarzschild to exactly solve Einstein's equations that Einstein considered unsolvably difficult. Einstein had doubts about the existence of the black holes for quite some time but in the 1960s, the confusion disappeared. Sgr A* is his (and my) key example of a real-world black hole.

Andy says that there's less than nothing, namely nothing nothing, inside black holes. I am not 100% sure what he actually means by that. Probably some topological issues – the Euclidean black hole has no geometry for \(r\lt r_0\) at all. OK, what happens in the quantum world? Particles tunnel out of the nothing nothing and stuff comes out as the black body radiation – at Hawking's temperature. Andy calls this single equation for the temperature "the Hawking's contribution to science" which slightly belittles Hawking and it's surely partly Andy's goal but OK.

He switches to thermodynamics, the science done by those people who were playing with water and fire and the boiling point of carbon dioxide without knowing about molecules. Ludwig Boltzmann beautifully derived those phenomenologically found laws from the assumption that matter is composed of molecules that may be traced using probabilistic reasoning. He found the importance of entropy/information. Andy wisely presents entropy in units of kilobytes or gigabytes - because that's what ordinary people sort of know today.

Andy counts the Hawking-Bekenstein entropy formula among the five most fundamental formulae in physics, and perhaps the most interesting one because we don't understand it. That's a bit bizarre because whenever I was telling him about the general derivations of this formula I was working on, aside from other things, Andy would tell me that we didn't need such a derivation! ;-)

Amusingly and cleverly, he explains the holographic entropy bounds by talking about Moore's law (thanks, Luke), which must inevitably break down at some point. Of course, in the real world, it will break down long before that... Now, he faces the tension between two pictures of black holes: something with the "nothing nothing" inside; or the most complicated (highest-entropy) objects we may have.

Around 41:00, he begins to talk about string theory, its brief history, and its picture of elementary particles. On paper, string theory is capable of unifying all the forces as well as QM with GR, and it addresses the black hole puzzle. String theory has grown by having eaten almost all the competitors (a picture of a hungry boy eating some trucks, of course). The term "string theory" is used for the big body of knowledge even today.

I think that at this point, he's explaining the Strominger-Vafa paper – and its followups – although the overly popular language makes me "slightly" uncertain about that. But soon, he switches to a much newer topic, his and his collaborators' analysis of the holographic dual of the rotating Kerr black holes.

Andy doesn't fail to mention that without seeing and absorbing the mathematics, the beauty of the story is as incomplete as someone's verbal story about his visit to the Grand Canyon whose pictures can't be seen by the recipient of the story. The equation-based description of these insights is much more beautiful for the theoretical physicists than the Grand Canyon. Hooray.

Intense applause.

The last nine minutes are dedicated to questions.

The first question is not terribly original and you could guess that. What kind of experiments can we make to decide whether string theory is correct? Andy says that the question is analogous to asking Magellan, in the middle of his trip around the Earth, when he will complete the trip. We don't know what comes next.

Now, I exploded in laughter because Andy's wording of this idea almost exactly mimics what I am often saying in such contexts. "You know, the understanding of Nature isn't a five-year plan." Of course, I like to say such a thing because 1) I was sort of fighting against the planned economy and similar excesses already as a child, 2) some people, most notably Lee Smolin, openly claimed that they think that science should be done according to five-year plans. It's great that Andy sees it identically. We surely don't have a proposal for an experiment that could say Yes or No but we work with things that are accessible and not just dreamed about, Andy says, and the work on the black hole puzzle is therefore such an important part of the research.

The second question was so great that one might even conjecture that the author knew something about the answer: Why do the entropy and the bounds scale like the area and not the volume? So Andy says that the black hole doesn't really have the volume. We "can't articulate it well" – he slightly looks like he is struggling and desperately avoiding the word "holography" for reasons I don't fully understand. OK, now he said the word.

In the third question, a girl asks how someone figured out that there should be black holes. Andy says that physicists solve things in baby steps or smaller ones. Well, they first try to solve everything exactly and they usually fail. So they try to find special solutions and Schwarzschild did find one. Amazingly, it took decades to understand what the solution meant. Every wrong thing has been tried before the right thing was arrived at.

Is a black hole needed for every galaxy? Is a black hole everywhere? He thinks that it is an empirical question. Andy says that he doesn't have an educated guess himself. Astronomers tend to believe that a black hole is in every galaxy. Of course, I would say that this question depends on the definition of a galaxy. The "galaxies" without a black hole inside are probably low-density "galaxies", and one may very well say that such diluted ensembles don't deserve the name "galaxy".

In twenty years, Andy will be able to answer the question – which he wouldn't promise for the "egg or chicken first" question.

I didn't understand the last question about some character of string theory. Andy answered that string theory will be able to explain that, whatever "that" means. ;-)

Another round of intense applause with colorful lights. Extraterrestrial sounds conclude the talk.

by Luboš Motl at October 20, 2014 06:10 AM

October 19, 2014

Christian P. Robert - xi'an's og

Shravan Vasishth at Bayes in Paris this week

Taking advantage of his visit to Paris this month, Shravan Vasishth, from the University of Potsdam, Germany, will give a talk at 10.30am, next Friday, October 24, at ENSAE on:

Using Bayesian Linear Mixed Models in Psycholinguistics: Some open issues

With the arrival of the probabilistic programming language Stan (and JAGS), it has become relatively easy to fit fairly complex Bayesian linear mixed models. Until now, the main tool that was available in R was lme4. I will talk about how we have fit these models in recently published work (Husain et al 2014, Hofmeister and Vasishth 2014). We are trying to develop a standard approach for fitting these models so that graduate students with minimal training in statistics can fit such models using Stan.

I will discuss some open issues that arose in the course of fitting linear mixed models. In particular, one issue is: should one assume a full variance-covariance matrix for random effects even when there is not enough data to estimate all parameters? In lme4, one often gets convergence failure or degenerate variance-covariance matrices in such cases and so one has to back off to a simpler model. But in Stan it is possible to assume vague priors on each parameter, and fit a full variance-covariance matrix for random effects. The advantage of doing this is that we faithfully express in the model how the data were generated—if there is not enough data to estimate the parameters, the posterior distribution will be dominated by the prior, and if there is enough data, we should get reasonable estimates for each parameter. Currently we fit full variance-covariance matrices, but we have been criticized for doing this. The criticism is that one should not try to fit such models when there is not enough data to estimate parameters. This position is very reasonable when using lme4; but in the Bayesian setting it does not seem to matter.
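The claim that the posterior is dominated by the prior when data are scarce, and by the data when they are plentiful, can be illustrated with a much simpler model than the ones in the abstract. The sketch below is not a linear mixed model and does not use Stan or lme4; it is a toy conjugate normal model with known noise standard deviation, and the function name and all numbers are assumptions chosen for illustration.

```python
def normal_posterior(prior_mean, prior_sd, data, noise_sd):
    """Posterior mean and sd of a normal mean with known noise sd.

    Conjugate update: precisions (inverse variances) add, and the posterior
    mean is the precision-weighted average of prior mean and sample mean.
    """
    n = len(data)
    prior_precision = 1.0 / prior_sd ** 2
    data_precision = n / noise_sd ** 2
    posterior_precision = prior_precision + data_precision
    sample_mean = sum(data) / n if n else 0.0
    posterior_mean = (
        prior_precision * prior_mean + data_precision * sample_mean
    ) / posterior_precision
    return posterior_mean, posterior_precision ** -0.5


if __name__ == "__main__":
    # Same prior, same true signal of 5.0, increasing amounts of data.
    for n in (0, 1, 10, 1000):
        mean, sd = normal_posterior(0.0, 1.0, [5.0] * n, noise_sd=1.0)
        print(f"n={n:5d}  posterior mean={mean:.3f}  posterior sd={sd:.3f}")
```

With no observations the posterior is exactly the prior; with a thousand observations the prior contributes almost nothing. The open issue raised in the abstract is whether this graceful behaviour carries over to a full variance-covariance matrix for random effects.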

Filed under: Books, Statistics, University life Tagged: Bayesian linear mixed models., Bayesian modelling, JAGS, linear mixed models, lme4, prior domination, psycholinguistics, STAN, Universität Potsdam

by xi'an at October 19, 2014 10:14 PM

Michael Schmitt - Collider Blog

Enhanced Higgs to tau+tau- Search with Deep Learning

“Enhanced Higgs to tau+tau- Search with Deep Learning” – that is the title of a new article posted to the arXiv this week by Daniel Whiteson and two collaborators from the Computer Science Department at UC Irvine (1410.3469). While the title may be totally obscure to someone outside of collider physics, it caught my immediate attention because I am working on a similar project (to be released soon).

Briefly: the physics motivation comes from the need for a stronger signal for Higgs decays to τ+τ-, which are important for testing the Higgs couplings to fermions (specifically, leptons). The scalar particle with a mass of 125 GeV looks very much like the standard model Higgs boson, but tests of couplings, which are absolutely crucial, are not very precise yet. In fact, indirect constraints are stronger than direct ones at the present time. So boosting the sensitivity of the LHC data to Higgs decays to fermions is an important task.

The meat of the article concerns the comparisons of shallow artificial neural networks, which contain only one or two hidden layers, and deep artificial neural networks, which have many. Deep networks are harder to work with than shallow ones, so the question is: does one really gain anything? The answer is: yes, it’s like increasing your luminosity by 25%.

This case study considers final states with two oppositely-charged leptons (e or μ) and missing transverse energy. The Higgs signal must be separated from the Drell-Yan production of τ pairs, especially Z→τ+τ-, on a statistical basis. It appears that no other backgrounds (such as W pair or top pair production) were considered, so this study is a purely technical one. Nonetheless, there is plenty to be learned from it.

Whiteson, Baldi and Sadowski make a distinction between low-level variables, which include the basic kinematic observables for the leptons and jets, and the high-level variables, which include derived kinematic quantities such as invariant masses, differences in angles and pseudorapidity, sphericity, etc. I think this distinction and the way they compare the impact of the two sets is interesting.

The question is: if a sophisticated artificial neural network is able to develop complex functions of the low-level variables through training and optimization, isn’t it redundant to provide derived kinematic quantities as additional inputs? More sharply: does the neural network need “human assistance” to do its job?

The answer is clear: human assistance does help the performance of even a deep neural network with thousands of neurons and millions of events for training. Personally I am not surprised by this, because there is physics insight behind most if not all of the high-level variables — they are not just arbitrary functions of the low-level variables. So these specific functions carry physics meaning and fall somewhere between arbitrary functions of the input variables and brand new information (or features). I admit, though, that “physics meaning” is a nebulous concept and my statement is vague…

Comparison of the performance of shallow networks and deep networks, and also of low-level and high-level variables

The authors applied state-of-the-art techniques for this study, including optimization with respect to hyperparameters, i.e., the parameters that concern the details of the training of the neural network (learning speed, ‘velocity’ and network architecture). A lot of computer cycles were burnt to carry out these comparisons!

Deep neural networks might seem like an obvious way to go when trying to isolate rare signals. There are real, non-trivial stumbling blocks, however. An important one is the vanishing gradient problem. If the number of hidden nodes is large (imagine eight layers with 500 neurons each) then training by back-propagation fails because it cannot find a significantly non-zero gradient with respect to the weights and offsets of all the neurons. If the gradient vanishes, then the neural network cannot figure out which way to evolve so that it performs well. Imagine a vast flat space with a minimum that is localized and far away. How can you figure out which way to go to get there if the region where you are is nearly perfectly flat?
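The vanishing gradient can be seen in a toy calculation. The sketch below is not the network from the paper: it assumes sigmoid activations, random Gaussian weights, no offsets, and a loss equal to the sum of the outputs, and it uses a width of 20 rather than 500 so that it runs quickly in plain Python. All names are made up for illustration.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer_gradient_norms(depth=8, width=20, seed=0):
    """Norm of the back-propagated gradient at each layer, first layer first."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(width)
    weights = [
        [[rng.gauss(0.0, scale) for _ in range(width)] for _ in range(width)]
        for _ in range(depth)
    ]

    # Forward pass, storing the activations of every layer.
    a = [rng.gauss(0.0, 1.0) for _ in range(width)]
    activations = []
    for W in weights:
        a = [sigmoid(sum(w * x for w, x in zip(row, a))) for row in W]
        activations.append(a)

    # Backward pass for loss = sum of outputs, so the initial gradient is 1.
    grad = [1.0] * width
    norms = []
    for W, act in zip(reversed(weights), reversed(activations)):
        # Sigmoid derivative s * (1 - s) is at most 0.25, shrinking the signal.
        grad_z = [g * s * (1.0 - s) for g, s in zip(grad, act)]
        norms.append(math.sqrt(sum(g * g for g in grad_z)))
        # Propagate to the previous layer's activations: W^T grad_z.
        grad = [
            sum(W[i][j] * grad_z[i] for i in range(width)) for j in range(width)
        ]
    norms.reverse()
    return norms


if __name__ == "__main__":
    for layer, norm in enumerate(layer_gradient_norms(), start=1):
        print(f"layer {layer}: gradient norm {norm:.2e}")
```

Each sigmoid layer multiplies the back-propagated signal by a factor of at most 0.25, so the earliest layers receive a gradient that is orders of magnitude smaller than the last one. That is the "nearly perfectly flat" landscape described above.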

The power of a neural network can be assessed on the basis of the receiver operating characteristic (ROC) curve by integrating the area beneath the curve. For particle physicists, however, the common coinage is the expected statistical significance of a hypothetical signal, so Whiteson & co translate the performance of their networks into a discovery significance defined by a number of standard deviations. Notionally, a shallow neural network working only with low-level variables would achieve a significance of 2.57σ, while adding in the high-level variables increases the significance to 3.02σ. In contrast, the deep neural networks achieve 3.16σ with low-level, and 3.37σ with all variables.
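Both figures of merit can be sketched in a few lines. The first function computes the area under the ROC curve from classifier scores, using the fact that this area equals the probability that a random signal event scores higher than a random background event. The second converts a gain in significance into an equivalent gain in luminosity, assuming that significance grows as the square root of the integrated luminosity (a statistics-limited analysis). That scaling is an assumption made here, not something stated in the paper, and the function names are hypothetical.

```python
def roc_auc(signal_scores, background_scores):
    """Area under the ROC curve via pairwise comparison of scores.

    Counts the fraction of (signal, background) pairs in which the signal
    event has the higher score; ties count as half.
    """
    wins = 0.0
    for s in signal_scores:
        for b in background_scores:
            if s > b:
                wins += 1.0
            elif s == b:
                wins += 0.5
    return wins / (len(signal_scores) * len(background_scores))


def equivalent_luminosity_gain(sigma_before, sigma_after):
    """Luminosity factor giving the same gain, if significance ~ sqrt(L)."""
    return (sigma_after / sigma_before) ** 2


if __name__ == "__main__":
    print(roc_auc([0.6, 0.4], [0.5, 0.3]))
    # Shallow vs deep network, both using all variables (numbers from the post).
    print(equivalent_luminosity_gain(3.02, 3.37))
```

Under that square-root assumption, going from 3.02σ to 3.37σ corresponds to a luminosity factor of about 1.25, which is consistent with the "25%" figure quoted earlier.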

Some conclusions are obvious: deep is better than shallow. Also, adding in the high-level variables helps in both cases. (Whiteson et al. point out that the high-level variables incorporate the τ mass, which otherwise is unavailable to the neural networks.) The deep network with low-level variables is better than a shallow network with all variables, and the authors conclude that the deep artificial neural network is learning something that is not embodied in the human-inspired high-level variables. I am not convinced of this claim since it is not clear to me that the improvement is not simply due to the inadequacy of the shallow network to the task. By way of an analogy, if we needed to approximate an exponential curve by a linear one, we would not succeed unless the range was very limited; we should not be surprised if a quadratic approximation is better.
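The analogy with approximating an exponential can be made concrete. The sketch below uses truncated Taylor series around zero as stand-ins for the "linear" and "quadratic" approximations; that choice, like the function names, is an assumption made for illustration, since the text does not say how the approximations would be constructed.

```python
import math


def taylor_exp(x, order):
    """Taylor polynomial of exp(x) around 0, truncated at the given order."""
    return sum(x ** k / math.factorial(k) for k in range(order + 1))


def max_error(order, x_max, samples=1001):
    """Largest absolute error of the truncated series on [0, x_max]."""
    points = (x_max * i / (samples - 1) for i in range(samples))
    return max(abs(math.exp(x) - taylor_exp(x, order)) for x in points)


if __name__ == "__main__":
    for x_max in (0.1, 1.0):
        linear = max_error(1, x_max)
        quadratic = max_error(2, x_max)
        print(f"range [0, {x_max}]: linear {linear:.4f}, quadratic {quadratic:.4f}")
```

On a very limited range both approximations are adequate, while on a wider range the linear one fails badly and the quadratic one does noticeably better. A gain from a more flexible model therefore does not by itself show that it has learned something new.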

In any case, since I am working on similar things, I find this article very interesting. It is clear that the field is moving in the direction of very advanced numerical techniques, and this is one fruitful direction to go in.

by Michael Schmitt at October 19, 2014 02:19 PM

Peter Coles - In the Dark

What’s the point of conferences?

Well, here I am back in the office making a start on my extensive to-do list. Writing it, I mean. Not actually doing any of it.

It was nice to get away for a couple of weeks, to meet up with some old friends I haven’t seen for a while and also to catch up on some of the developments in my own field and other related areas. We do have a pretty good seminar series here at Sussex which should in principle allow me to keep up to date with developments in my own research area, but unfortunately the timing of these events often clashes with other meetings that I’m obliged to attend as Head of School. Escaping to a conference is a way of focussing on research for a while without interruption. At least that’s the idea.

While at the meeting, however, I was struck by a couple of things. First was that during the morning plenary lectures given by invited speakers almost everyone in the audience was spending much more time working on their laptops than listening to the talk. This has been pretty standard at every meeting I’ve been to for the last several years. Now that everyone uses powerpoint (or equivalent) for such presentations nobody in the audience feels the need to take notes so to occupy themselves they spend the time answering emails or pottering about on Facebook. That behaviour does not depend on the quality of the talk, either. Since nobody seems to listen very much the question naturally arises as to whether the presentations have any intrinsic value at all. It often seems to me that the conference talk has turned into a kind of ritual that persists despite nobody really knowing what it’s for or how it originated. An hour is too long to talk if you really want people to listen, but we go on doing it.

The part of a conference session that’s more interesting is the discussion after each talk. Sometimes there’s a genuine discussion from which you learn something quite significant or get an idea for a new study. There’s often also a considerable amount of posturing, preening and point-scoring which is less agreeable but in its own way I suppose fairly interesting.

At the meeting I was attending the afternoons were devoted to discussion sessions for which we split into groups. I was allocated to “Gravitation and Cosmology”; others were on “Cosmic Rays”, “Neutrino Physics and Astrophysics”, and so on. The group I was in, of about 25 people, was a nice size for discussion. These sessions were generally planned around short “informal” presentations intended to stimulate discussion, but generally these presentations were about the same length as the plenary talks and also given in Powerpoint. There was discussion, but the format turned out to be less different from the morning sessions than I’d hoped for. I’m even more convinced than ever that Powerpoint presentations used in this way stifle rather than stimulate discussion and debate. The pre-prepared presentation is often used as a crutch by a speaker reluctant to adopt a more improvisatory approach that would probably be less polished but arguably more likely to generate new thoughts.

I don’t know whether the rise of Powerpoint is itself to blame for our collective unwillingness or inability to find other ways of talking about science, but I’d love to try organizing a workshop or conference along lines radically different from the usual “I talk, you listen” format in which the presenter is active and the audience passive for far too long.

All this convinced me that the answer to the question “What is the point of conferences?” has very little to do with the formal programme and more with the informal parts, especially the conversations over coffee and at dinner. Perhaps I should try arranging a conference that has nothing but dinner and coffee breaks on the schedule?

by telescoper at October 19, 2014 02:02 PM

October 18, 2014

Sean Carroll - Preposterous Universe

How to Communicate on the Internet

Let’s say you want to communicate an idea X.

You would do well to simply say “X.”

Also acceptable is “X. Really, just X.”

A slightly riskier strategy, in cases where miscomprehension is especially likely, would be something like “X. This sounds a bit like A, and B, and C, but I’m not saying those. Honestly, just X.” Many people will inevitably start arguing against A, B, and C.

Under no circumstances should you say “You might think Y, but actually X.”

Equally bad, perhaps worse: “Y. Which reminds me of X, which is what I really want to say.”

For examples see the comment sections of the last couple of posts, or indeed any comment section anywhere on the internet.

It is possible these ideas may be of wider applicability in communication situations other than the internet.

(You may think this is just grumping but actually it is science!)

by Sean Carroll at October 18, 2014 04:32 PM

Peter Coles - In the Dark

Bagaglio Mancante

I should have known something would go wrong.

When my flight landed at Gatwick yesterday, I was quickly off the plane, through passport control and into the Baggage Reclaim. And there I waited. My baggage never arrived.

After almost an hour waiting in vain I went to the counter and filed a missing baggage report before getting the train back to Brighton.

By then my phone battery was flat but the charger was in my lost bag so I was unable to receive the text message I was told I would get when my bag was located. This morning I had to buy another charger and when I recharged my phone I discovered the bag had arrived at London Gatwick at 0800 this morning and a Courier would call to arrange delivery.

Great, I thought. Gatwick is only 30 minutes away from Brighton so I would soon get my stuff.

Wrong. Using the online tracking system I found the bag had been sent to Heathrow and had sat there until after 2pm before being loaded onto a vehicle for delivery.

There’s nobody answering phones at the courier company so I guess I just have to wait in the flat until they decide to deliver it.

I don’t know how BA managed to lose a bag on a direct flight in the first place, but their idiotic courier has added at least half a day’s delay in returning it.

UPDATE: My bag finally arrived at 1940. It seems it was never put on the plane I flew on.


by telescoper at October 18, 2014 02:12 PM

October 17, 2014

The Great Beyond - Nature blog

White House suspends enhanced pathogen research

Past research has made the H5N1 virus transmissible in ferrets.

Sara Reardon

As the US public frets about the recent transmission of Ebola to two Texas health-care workers, the US government has turned an eye on dangerous viruses that could become much more widespread if they were to escape from the lab. On 17 October, the White House Office of Science and Technology Policy (OSTP) announced a mandatory moratorium on research aimed at making pathogens more deadly, known as gain-of-function research.

Under the moratorium, government agencies will not fund research that attempts to make natural pathogens more transmissible through the air or more deadly in the body. Researchers who have already been funded to do such projects are asked to voluntarily pause work while two non-regulatory bodies, the National Science Advisory Board for Biosecurity (NSABB) and the National Research Council, assess its risks. The ban specifically mentions research that would enhance influenza, severe acute respiratory syndrome (SARS) and Middle East respiratory syndrome (MERS). Other types of research on naturally occurring strains of these viruses would still be funded.

This is the second time that gain-of-function research has been suspended. In 2012, 39 scientists working on influenza agreed to a voluntary moratorium after the publication of two papers demonstrating that an enhanced H5N1 influenza virus could be transmitted between mammals through respiratory droplets. The publications drew a storm of controversy centred around the danger that they might give terrorists the ability to create highly effective bioweapons, or that the viruses might accidentally escape the lab. Research resumed after regulatory agencies and entities such as the World Health Organization laid out guidelines for ensuring the safety and security of flu research.

The OSTP’s moratorium, by contrast, is mandatory and affects a much broader array of viruses. “I think it’s really excellent news,” says Marc Lipsitch of Harvard University in Cambridge, Massachusetts, who has long called for more oversight of risky research. “I think it’s common sense to deliberate before you act.”

Virologist Yoshihiro Kawaoka of the University of Wisconsin–Madison, who conducted one of the controversial H5N1 gain-of-function studies in an effort to determine how the flu virus could evolve to become more transmissible in mammals, says that he plans to “comply with the government’s directives” on those experiments that are considered to be gain-of-function under OSTP’s order. “I hope that the issues can be discussed openly and constructively so that important research will not be delayed indefinitely,” he says.

The NSABB, which has not met since 2012, was called back into action in July, apparently in response to a set of lab accidents at the US Centers for Disease Control and Prevention in which lab workers were exposed to anthrax and inadvertently shipped H5N1 virus without proper safety precautions. The NSABB will spend most of its next meeting on 22 October discussing gain-of-function research, and the National Research Council plans to hold a workshop on a date that has not yet been set. Lipsitch, who will speak at the NSABB meeting, says that he plans to advocate for the use of an objective risk-assessment tool to weigh the potential benefits of each research project against the probability of a lab accident and the pathogen’s contagiousness, and to consider whether the knowledge gained by studying a risky pathogen could be gained in a safer way.

Correction: This post has been changed to specify that Yoshihiro Kawaoka’s 2012 gain-of-function research increased the transmissibility of H5N1.

by Sara Reardon at October 17, 2014 10:52 PM

Emily Lakdawalla - The Planetary Society Blog

Watching Siding Spring's encounter with Mars
The nucleus of comet Siding Spring passes close by Mars on Sunday, October 19, at 18:27 UTC. Here are links to webcasts and websites that should have updates throughout the encounter.

October 17, 2014 09:11 PM

astrobites - astro-ph reader's digest

UR #16: Star Cluster Evolution
The undergrad research series is where we feature the research that you’re doing. If you’ve missed the previous installments, you can find them under the “Undergraduate Research” category here.

Did you finish a senior thesis this summer? Or maybe you’re just getting started on an astro research project this semester? If you, too, have been working on a project that you want to share, we want to hear from you! Think you’re up to the challenge of describing your research carefully and clearly to a broad audience, in only one paragraph? Then send us a summary of it!

You can share what you’re doing by clicking on the “Your Research” tab above (or by clicking here) and using the form provided to submit a brief (fewer than 200 words) write-up of your work. The target audience is one familiar with astrophysics but not necessarily your specific subfield, so write clearly and try to avoid jargon. Feel free to also include either a visual regarding your research or else a photo of yourself.

We look forward to hearing from you!


Bhawna Motwani
Indian Institute of Technology Roorkee, Uttarakhand, India

Bhawna is a final year Integrated Masters student of Physics at IIT Roorkee. This work is a part of her summer research in 2013 with Prof. Pavel Kroupa and Dr. Sambaran Banerjee at the Argelander Institut für Astronomie, Bonn, Germany.

Dynamical Evolution of Young Star Clusters

The much-debated classical scenario of star-cluster formation, first delineated by Hills (1980), suggests that the collapse of a proto-stellar core within a parent molecular gas cloud gives rise to an infant gas-embedded cluster. Subsequently, the residual gas is driven out of the cluster by kinetic energy from stellar winds and radiation, thereby diluting the gravitational potential of the cluster. However, for a star-formation efficiency $\epsilon < 50\%$ (Kroupa 2008) and slow gas expulsion, the cluster remains fractionally bound and ultimately regains dynamical equilibrium. Using the NBODY6 code (Aarseth 1999), we perform N-body simulations to examine the time evolution of the confinement radii ($R_f$) for various mass fractions $f$ of such emerging clusters. From this, we infer the cluster re-virialization times ($\tau_{vir}$) and bound-mass fractions for a range of initial cluster masses and half-mass radii. We relate the above properties to stellar evolution and initial mass segregation in the simulation and find that primordially segregated systems virialize faster and possess a lower bound-mass fraction on account of mass loss from heavy stars and 2-body+3-body interactions, whereas stellar evolution does not have a significant effect. This research is the first instance where a realistic IMF (Kroupa 2001) has been utilized to perform an extended parameter scan for an N-body cluster model.

The figure depicts a typical evolution profile of the Lagrange radii $R_{f}$ for a computed N-body model with initial mass $3 \times 10^{4}\,M_\odot$ and half-mass radius 0.5 pc. From bottom to top, the curves represent mass fractions from 5% to 99% in steps of 5%. The dark-red lines represent $R_{10}$, $R_{50}$ and $R_{80}$, respectively. The delay time (the time after which gas expulsion starts) is 0.6 Myr.
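For readers unfamiliar with Lagrange radii, here is a minimal sketch of how $R_f$ can be computed from a single snapshot of particle masses and positions. It is an illustration only, not the NBODY6 routine: the function name `lagrange_radii` is hypothetical, and measuring distances from the centre of mass is an assumption made for simplicity (production codes may use a different cluster centre).

```python
import math

def lagrange_radii(masses, positions, fractions):
    """Return {f: R_f}, the radius enclosing a fraction f of the total mass.

    Assumption: radii are measured from the centre of mass of the snapshot.
    """
    total = sum(masses)
    # Centre of mass, one coordinate at a time.
    com = [sum(m * p[k] for m, p in zip(masses, positions)) / total
           for k in range(3)]
    # Particles sorted by distance from the centre.
    by_radius = sorted((math.dist(p, com), m)
                       for m, p in zip(masses, positions))
    radii = {}
    for f in fractions:
        target = f * total
        enclosed = 0.0
        for r, m in by_radius:
            enclosed += m
            # Small tolerance guards against floating-point round-off.
            if enclosed >= target - 1e-12 * total:
                radii[f] = r
                break
    return radii


# Toy snapshot (made-up numbers): four equal-mass stars on the x-axis.
masses = [1.0, 1.0, 1.0, 1.0]
positions = [(1, 0, 0), (-1, 0, 0), (2, 0, 0), (-2, 0, 0)]
print(lagrange_radii(masses, positions, [0.5, 0.75, 1.0]))
```

Evaluating this at every output time of a simulation and plotting $R_f$ against time for each $f$ would give curves of the kind described in the caption.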


by Astrobites at October 17, 2014 09:07 PM

Sean Carroll - Preposterous Universe

Does Santa Exist?

There’s a claim out there — one that is about 95% true, as it turns out — that if you pick a Wikipedia article at random, then click on the first (non-trivial) link, and keep clicking on the first link of each subsequent article, you will end up at Philosophy. More specifically, you will end up at a loop that runs through Reality, Existence, Awareness, Consciousness, and Quality (philosophy), as well as Philosophy itself. It’s not hard to see why. These are the Big Issues, concerning the fundamental nature of the universe at a deep level. Almost any inquiry, when pressed to ever-greater levels of precision and abstraction, will get you there.

Take, for example, the straightforward-sounding question “Does Santa Exist?” You might be tempted to say “No” and move on. (Or you might be tempted to say “Yes” and move on, I don’t know — a wide spectrum of folks seem to frequent this blog.) But even to give such a common-sensical answer is to presume some kind of theory of existence (ontology), not to mention a theory of knowledge (epistemology). So we’re allowed to ask “How do you know?” and “What do you really mean by exist?”

These are the questions that underlie an entertaining and thought-provoking new book by Eric Kaplan, called Does Santa Exist?: A Philosophical Investigation. Eric has a resume to be proud of: he is a writer on The Big Bang Theory, and has previously written for Futurama and other shows, but he is also a philosopher, currently finishing his Ph.D. from Berkeley. In the new book, he uses the Santa question as a launching point for a rewarding tour through some knotty philosophical issues. He considers not only a traditional attack on the question, using Logic and the beloved principles of reason, but sideways approaches based on Mysticism as well. (“The Buddha ought to be able to answer our questions about the universe for like ten minutes, and then tell us how to be free of suffering.”) His favorite, though, is the approach based on Comedy, which is able to embrace contradiction in a way that other approaches can’t quite bring themselves to do.

Most people tend to have a pre-existing take on the Santa question. Hence, the book trailer for Does Santa Exist? employs a uniquely appropriate method: Choose-Your-Own-Adventure. Watch and interact, and you will find the answers you seek.

by Sean Carroll at October 17, 2014 04:34 PM

CERN Bulletin

CHIS – Letter from French health insurance authorities "Assurance Maladie" and “frontalier” status

Certain members of the personnel residing in France have recently received a letter, addressed to themselves and/or their spouse, from the French health insurance authorities (Assurance Maladie) on the subject of changes in the health insurance coverage of “frontalier” workers.


It should be recalled that employed members of personnel (MPE) are not affected by the changes made by the French authorities to the frontalier  workers' "right to choose" (droit d'option) in matters of health insurance (see the CHIS website for more details), which took effect as of 1 June 2014, as they are not considered to be frontalier workers. Associated members of the personnel (MPA) are not affected either, unless they live in France and are employed by a Swiss institute.

For the small number of MPAs in the latter category who might be affected, as well as for family members who do have frontalier status, CERN is still in discussion with the authorities of the two Host States regarding the health insurance coverage applicable to them.

We hope to receive more information in the coming weeks and will keep you informed via the CHIS web site and the CERN Bulletin.

HR Department

October 17, 2014 04:10 PM

The n-Category Cafe

New Evidence of the NSA Deliberately Weakening Encryption

One of the most high-profile ways in which mathematicians are implicated in mass surveillance is in the intelligence agencies’ deliberate weakening of commercially available encryption systems — the same systems that we rely on to protect ourselves from fraud, and, if we wish, to ensure our basic human privacy.

We already knew quite a lot about what they’ve been doing. The NSA’s 2013 budget request asked for funding to “insert vulnerabilities into commercial encryption systems”. Many people now know the story of the Dual Elliptic Curve pseudorandom number generator, used for online encryption, which the NSA aggressively and successfully pushed to become the industry standard, and which has weaknesses that are widely agreed by experts to be a back door. Reuters reported last year that the NSA arranged a secret $10 million contract with the influential American security company RSA (yes, that RSA), who became the most important distributor of that compromised algorithm.

In the August Notices of the AMS, longtime NSA employee Richard George tried to suggest that this was baseless innuendo. But new evidence published in The Intercept makes that even harder to believe than it already was. For instance, we now know about the top secret programme Sentry Raven, which

works with specific US commercial entities … to modify US manufactured encryption systems to make them exploitable for SIGINT [signals intelligence].

(page 9 of this 2004 NSA document).

The Intercept article begins with a dramatic NSA-drawn diagram of the hierarchy of secrecy levels. Each level is colour-coded. Top secret is red, and above top secret (these guys really give it 110%) are the “core secrets” — which, as you’d probably guess, are in black. From the article:

the NSA’s “core secrets” include the fact that the agency works with US and foreign companies to weaken their encryption systems.

(The source documents themselves are linked at the bottom of the article.)

It’s noted that there is “a long history of overt NSA involvement with American companies, especially telecommunications and technology firms”. Few of us, I imagine, would regard that as a bad thing in itself. It’s the nature of the involvement that’s worrying. The aim is not just to crack the encrypted messages of particular criminal suspects, but the wholesale compromise of all widely used encryption methods:

The description of Sentry Raven, which focuses on encryption, provides additional confirmation that American companies have helped the NSA by secretly weakening encryption products to make them vulnerable to the agency.

The documents also appear to suggest that NSA staff are planted inside American security, technology or telecomms companies without the employer’s knowledge. Chris Soghoian, principal technologist at the ACLU, notes that “As more and more communications become encrypted, the attraction for intelligence agencies of stealing an encryption key becomes irresistible … It’s such a juicy target.”

Unsurprisingly, the newly-revealed documents don’t say anything specific about the role played by mathematicians in weakening digital encryption. But they do make it that bit harder for defenders of the intelligence agencies to maintain that their cryptographic efforts are solely directed against the “bad guys” (a facile distinction, but one that gets made).

In other words, there is now extremely strong documentary evidence that the NSA and its partners make strenuous efforts to compromise, undermine, degrade and weaken all commonly-used encryption software. As the Reuters article puts it:

The RSA deal shows one way the NSA carried out what Snowden’s documents describe as a key strategy for enhancing surveillance: the systematic erosion of security tools.

The more or less explicit aim is that no human being is able to send a message to any other human being that the NSA cannot read.

Let that sink in for a while. There is less hyperbole than there might seem when people say that the NSA’s goal is the wholesale elimination of privacy.

This evening, I’m going to see Laura Poitras’s film Citizenfour (trailer), a documentary about Edward Snowden by one of the two journalists to whom he gave the full set of documents. But before that, I’m going to a mathematical colloquium by Trevor Wooley, Strategic Director of the Heilbronn Institute — which is the University of Bristol’s joint venture with GCHQ. I wonder how mathematicians like him, or young mathematicians now considering working for the NSA or GCHQ, feel about the prospect of a world where it is impossible for human beings to communicate in private.

by leinster at October 17, 2014 03:05 PM

arXiv blog

Urban "Fingerprints" Finally Reveal the Similarities (and Differences) Between American and European Cities

Travelers have long noticed that some American cities “feel” more European than others. Now physicists have discovered a way to measure the “fingerprint” of a city that captures this sense.

Travel to any European city and the likelihood is that it will look and feel substantially different to modern American cities such as Los Angeles, San Diego, or Miami.

October 17, 2014 03:05 PM

Lubos Motl - string vacua and pheno

Lorentz violation: zero or 10 million times smaller than previously thought
One of the research paradigms that I consider insanely overrated is the idea that the fundamental theory of Nature may break the Lorentz symmetry – the symmetry underlying the special theory of relativity – and that the theorist may pretty much ignore the requirement that the symmetry should be preserved.

The Super-Kamiokande collaboration has published a new test of the Lorentz violation that used over a decade of observations of atmospheric neutrinos:
Test of Lorentz Invariance with Atmospheric Neutrinos
The Lorentz-violating terms whose existence they were trying to discover are some bilinear terms modifying the oscillations of the three neutrino species, \(\nu_e,\nu_\mu,\nu_\tau\), by treating the temporal and spatial directions of the spacetime differently.

They haven't found any evidence that these coefficients are nonzero, which allowed them to impose new upper bounds. Some of them, in some parameterization, are 10 million times more constraining than the previous best upper bounds!

I don't want to annoy you with the technical details of this good piece of work because I am not terribly interested in them myself, being more sure about the result than about any other experiment by a wealthy enough particle-physics-like collaboration. But I can't resist reiterating a general point.

The people who are playing with would-be fundamental theories that don't preserve the Lorentz invariance exactly (like most of the "alternatives" of string theory meant to describe quantum gravity) must hope that "the bad things almost exactly cancel" so that the resulting effective theory is almost exactly Lorentz-preserving which is needed for the agreement with the Super-Kamiokande search – as well as a century of less accurate experiments in different sectors of physics.

But in the absence of an argument why the resulting effective theory should be almost exactly Lorentz-preserving, one must assume that it's not and that the Lorentz-violating coefficients are pretty much uniformly distributed in a certain interval.

Before this new paper, they were allowed to be between \(0\) and a small number \(\epsilon\) and if one assumes that they were nonzero, there was no theoretical reason to think that the value was much smaller than \(\epsilon\). But a new observation shows that the new value of \(\epsilon\) is 10 million times smaller than the previous one. The Lorentz-breaking theories just don't have any explanation for this strikingly accurate observation, so they should be disfavored.

The simplest estimate of what happens to the "Lorentz symmetry is slightly broken" theories is, of course, that their probability decreased 10 million times when this paper was published! Needless to say, it's not the first time that the plausibility of such theories has dramatically decreased. But even if this were the first observation, it should mean that one lines up 10,000,001 of the likes of Lee Smolin who are promoting similar theories and kills 10,000,000 of them.

(OK, their names don't have to be "Lee Smolin". Using millions of his fans would be pretty much OK with me. The point is that the research into these possibilities should substantially decrease.)
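The arithmetic behind the "10 million times" estimate can be spelled out as a rough Bayesian sketch. The labels are introduced here for illustration and are not in the paper: \(H_0\) is an exactly Lorentz-invariant theory, \(H_{\rm LV}\) a Lorentz-violating one, \(c\) a single Lorentz-violating coefficient, and \(\epsilon_{\rm old}\), \(\epsilon_{\rm new}\) the previous and new upper bounds. The uniform prior is the assumption stated above.

```latex
% Assumed labels (not from the paper): H_0, H_LV, c, eps_old, eps_new.
\begin{align*}
p(c \mid H_{\mathrm{LV}}) &= \frac{1}{\epsilon_{\mathrm{old}}},
  \qquad 0 \le c \le \epsilon_{\mathrm{old}}, \\
P(c \le \epsilon_{\mathrm{new}} \mid H_{\mathrm{LV}})
  &= \frac{\epsilon_{\mathrm{new}}}{\epsilon_{\mathrm{old}}} = 10^{-7}, \\
P(c \le \epsilon_{\mathrm{new}} \mid H_{0}) &= 1, \\
\frac{P(H_{\mathrm{LV}} \mid \text{data})}{P(H_{0} \mid \text{data})}
  &= 10^{-7} \times \frac{P(H_{\mathrm{LV}})}{P(H_{0})}.
\end{align*}
```

In words: if the coefficient is uniformly distributed below the old bound, the chance that it happens to fall below the new bound is the ratio of the two bounds, so the odds of the Lorentz-violating theory relative to the exactly symmetric one drop by that same factor.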

Because nothing remotely similar to this sensible procedure is taking place, it seems to me that too many people just don't care about the empirical data at all. They don't care about the mathematical cohesiveness of the theories, either. Both the data and the mathematics seem to unambiguously imply that the Lorentz symmetry of the fundamental laws of Nature is exact and a theory that isn't shown to exactly preserve this symmetry – or to be a super-tiny deformation of an exactly Lorentz-preserving theory – is just ruled out.

Most of the time, they hide their complete denial of this kind of experiment behind would-be fancy words. General relativity always breaks the Lorentz symmetry because the spacetime is curved, and so on. But this breaking is spontaneous and there are still several extremely important ways in which the Lorentz symmetry underlying the original laws of physics constrains all phenomena in the spacetime, whether it is curved or not. The Lorentz symmetry still has to hold "locally", in small regions that always resemble regions of a flat Minkowski space, and it must also hold in "large regions" that resemble the flat space if the objects inside (which may even be black holes, highly curved objects) may be represented as local disturbances inside a flat spacetime.

One may misunderstand the previous sentences – or pretend that he misunderstands the previous sentences – but it is still a fact that a fundamentally Lorentz-violating theory makes a prediction (at least a rough, qualitative prediction) about experiments such as the experiment in this paper and this prediction clearly disagrees with the observations.

By the way, a few days ago, Super-Kamiokande published another paper with limits, those for the proton lifetime (in PRD). Here the improvement is small, if any, and theories naturally giving these long lifetimes obviously exist and still seem "most natural". But yes, I also think that theories with a totally stable proton may exist and should be considered.

by Luboš Motl at October 17, 2014 02:50 PM

CERN Bulletin

Computer Security: Our life in symbiosis*

Do you recall our Bulletin articles on control system cyber-security (“Hacking control systems, switching lights off!” and “Hacking control systems, switching... accelerators off?”) from early 2013? Let me shed some light on this issue from a completely different perspective.


I was raised in Europe during the 80s. With all the conveniences of a modern city, my environment made me a cyborg - a human entangled with technology - supported by but also dependent on software and hardware. Since my childhood, I have eaten food packaged by machines and shipped through a sophisticated network of ships and lorries, keeping it fresh or frozen until it arrives in supermarkets. I heat my house with the magic of nuclear energy provided to me via a complicated electrical network. In fact, many of the amenities and gadgets I use are based on electricity and I just need to tap a power socket. When on vacation, I travel by taxi, train and airplane. And I enjoy the beautiful weather outside thanks to the air conditioning system located in the basement of the CERN IT building.

This air conditioning system, a process control system (PCS), monitors the ambient room temperature through a distributed network of sensors. A smart central unit - the Programmable Logic Controller (PLC) - compares the measured temperature values with a set of thresholds and subsequently calculates a new setting for heating or cooling. On top of this temperature control loop (monitor - calculate - set), a small display (a simple SCADA (supervisory control and data acquisition) system) attached to the wall allows me to read the current room temperature and to manipulate its set-points. Depending on the size of the building and the number of processes controlled, many (different) sensors, PLCs, actuators and SCADA systems can be combined and inter-connected to build a larger and more complex PCS.
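The monitor - calculate - set loop can be sketched in a few lines of Python. This is a toy model for illustration only, not how a real PLC is programmed: the function names `plc_step` and `simulate`, the thresholds and the heating/cooling rate are all hypothetical.

```python
from statistics import mean

def plc_step(readings, low, high):
    """One pass of the control loop: monitor, calculate, set.

    readings: temperatures (deg C) from the sensor network
    low, high: hypothetical thresholds; heat below `low`, cool above `high`
    """
    measured = mean(readings)      # monitor: combine the sensor values
    if measured < low:             # calculate: compare with thresholds
        command = "heat"
    elif measured > high:
        command = "cool"
    else:
        command = "hold"
    return measured, command       # set: command handed to the actuator


def simulate(room_temp, low, high, steps, rate=0.5):
    """Run the loop against a crude room model (assumed, purely illustrative)."""
    history = []
    for _ in range(steps):
        # Three sensors that scatter slightly around the true temperature.
        readings = [room_temp - 0.1, room_temp, room_temp + 0.1]
        _, command = plc_step(readings, low, high)
        if command == "heat":
            room_temp += rate
        elif command == "cool":
            room_temp -= rate
        history.append((round(room_temp, 2), command))
    return history


for temperature, command in simulate(25.0, low=20.0, high=22.2, steps=10):
    print(temperature, command)
```

Changing `low` and `high` between runs plays the role of the wall display: it is the set-point manipulation that the SCADA layer exposes to the user.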

In a similar way, all our commodities and amenities depend on many different, complex PCSs e.g. a PCS for water and waste management, for electricity production and transmission, for public and private transport, for communication, for production of oil and gas but also cars, food, and pharmaceuticals. Today, many people live in symbiosis with those PCSs which make their lives cosy and comfortable, and industry depends on them. The variety of PCSs has become a piece of “critical infrastructure”, providing the fundamental basis for their general survival.

So what would happen if part or all of this critical infrastructure failed? How would your life change without clean tap water and proper waste disposal, without electricity, without fresh and frozen food? The cool air in the lecture hall would get hot and become uncomfortable. On a wider scale, with no drinking water from the tap, we would have to go back to local wells or collect and heat rain water in order to purify it. Failure of the electricity system would halt public life: frozen goods in supermarkets would warm up and become inedible, fuel pumps would not work anymore, life-preservation systems in hospitals would stop once the local diesel generators ran out of fuel… (this is nicely depicted in the novel “Blackout” by M. Elsberg).

We rely on our critical infrastructure, we rely on PCSs and we rely on the technologies behind PCSs. In the past, PCSs, PLCs and SCADA systems and their hardware and software components were proprietary, custom-built, and stand-alone. Expertise was centralised with a few system engineers who knew their system by heart. That has changed in recent decades. Pressure for consolidation and cost-effectiveness has pushed manufacturers to open up. Today, modern PCSs employ the same technological means that have been used for years in computer centres, in offices and at home: Microsoft’s Windows operating system to run SCADA systems; web browsers as user interfaces; laptops and tablets replacing paper checklists; emails to disseminate status information and alerts; the IP protocol to communicate among different parts of a PCS; the Internet for remote access for support personnel and experts...

Unfortunately, while benefitting from standard information technology, PCSs have also inherited its drawbacks: design flaws in hardware, bugs in software components and applications, and vulnerabilities in communication protocols. Exploiting these drawbacks, malicious cyber-attackers and benign IT researchers have probed many different kinds of hardware, software and protocols for many years. Today, computer centres, office systems and home computers are permanently under attack. With their new technological basis, PCSs underwent scrutiny, too. The sophisticated “Stuxnet” attack by the US and Israel against the control system of Iranian uranium enrichment facilities in 2010 is just one of the more publicised cases. New vulnerabilities affecting PCSs are regularly published on certain web pages, and recipes for malicious attacks circulate widely on the Internet. The damage caused may be enormous.

Therefore, “Critical Infrastructure Protection” (CIP) becomes a must. But protecting PCSs in the same way as computer centres (patching them, running anti-virus software on them, and controlling access to them) is much more difficult than attacking them. PCSs are built for use-cases. Malicious abuse is rarely considered during their design and implementation phase. For example, rebooting a SCADA PC will temporarily interrupt monitoring, while updating PLC firmware usually requires thorough re-testing and probably even re-certification. Both are non-trivial and costly tasks that cannot be done in line with the monthly patch cycles of firms like Microsoft.

Ergo, a fraction (if not many) of today’s PCSs are vulnerable to common cyber-attacks. Not without reason, the former advisor to the US president, Richard Clarke, said “that the US might be able to blow up a nuclear plant somewhere, or a terrorist training centre somewhere, but a number of countries could strike back with a cyber-attack and the entire [US] economic system could be crashed in retaliation … because we can’t defend it today.” (AP 2011) We need to raise our cyber-defences now. Without CIP, without protected SCADA systems, our modern symbiotic life is at risk.

*To be published in the annual yearbook of the World Federation of Scientists.

Check out our website for further information, answers to your questions and help, or e-mail

If you want to learn more about computer security incidents and issues at CERN, just follow our Monthly Report.

Access the entire collection of Computer Security articles here.

October 17, 2014 02:10 PM

ZapperZ - Physics and Physicists

Iranian Physicist Omid Kokabee To Receive A New Trial
This type of prosecution used to happen in the iron-fisted rule of the Soviet Union. But there is a sign of optimism in the case of physicist Omid Kokabee as the Iranian Supreme Court ordered a new trial. This after Kokabee has spent 4 years in prison for a charge that many in the world considered to be flimsy at best.

"Acceptance of the retrial request means that the top judicial authority has deemed Dr. Omid Kokabee's [initial] verdict against the law," Kokabee's lawyer, Saeed Khalili was quoted as saying on the website of the International Campaign for Human Rights in Iran. "The path has been paved for a retrial in his case, and God willing, proving his innocence."

Kokabee, a citizen of Iran who at the time was studying at the University of Texas, Austin, was first arrested at the Tehran airport in January 2011. After spending 15 months in prison waiting for a trial, including more than a month in solitary confinement, he was convicted by Iran's Revolutionary Court of "communicating with a hostile government" and receiving "illegitimate funds" in the form of his college loans. He was sentenced to ten years in prison without ever talking to his lawyer or being allowed testimony in his defense.

He received stipends as part of his graduate assistantship, and these were considered to be "illegitimate funds", which is utterly ridiculous. My characterization of such an accusation is that it can only come from an extremely stupid and moronic group of people. There, I've said it!


by ZapperZ at October 17, 2014 01:41 PM

Symmetrybreaking - Fermilab/SLAC

High schoolers try high-powered physics

The winners of CERN's Beam Line for Schools competition conducted research at Europe’s largest physics laboratory.

Many teenagers dream about getting the keys to their first car. Last month, a group of high schoolers got access to their first beam of accelerated particles at CERN.

As part of its 60th anniversary celebration, CERN invited high school students from around the world to submit proposals for how they would use a beam of particles at the laboratory. Of the 292 teams that submitted the required “tweet of intent,” 1000-word proposal and one-minute video, CERN chose not one but two groups of winners: one from Dominicus College in Nijmegen, the Netherlands, and another from the Varvakios Pilot School in Athens, Greece.

The teams travelled to Switzerland in early September.

“Just being at CERN was fantastic,” says Nijmegen student Lisa Biesot. “The people at CERN were very enthusiastic that we were there. They helped us very much, and we all worked together.”

The Beam Line for Schools project was the brainchild of CERN physicist Christoph Rembser, who also coordinated the project. He and others at CERN didn’t originally plan for more than one team to win. But it made sense, as the two groups easily merged their experiments: Dominicus College students constructed a calorimeter that was placed within the Varvakios Pilot School’s experiment, which studied one of the four fundamental forces, the weak force.

“These two strong experiments fit so well together, and having an international collaboration, just like what we have at CERN, was great,” says Kristin Kaltenhauser of CERN’s international relations office, who worked with the students.

Over the summer the Nijmegen team grew crystals from potassium dihydrogen phosphate, a technique not used before at CERN, to make their own calorimeter, a piece of equipment that measures the energy of different particles.

At CERN, the unified team cross-calibrated the Nijmegen calorimeter with a calorimeter at CERN.

“We were worried if it would work,” says Nijmegen teacher Rachel Crane. “But then we tested our calorimeter on the beam with a lot of particles—positrons, electrons, pions and muons—and we really saw the difference. That was really amazing.”

The Athens team modeled their proposal on one of CERN’s iconic early experiments, conducted at the laboratory's first accelerator in 1958 to study an aspect of the weak force, which powers the thermonuclear reactions that cause the sun to shine.

Whereas the 1958 experiment had used a beam made completely of particles called pions, the students’ experiment used a higher energy beam containing a mixture of pions, kaons, protons, electrons and muons. They are currently analyzing the data.

CERN physicists Saime Gurbuz and Cenk Yildiz, who assisted the two teams, say they and other CERN scientists were very impressed with the students. “They were like real physicists,” Gurbuz says. “They were professional and eager to take data and analyze it.”

The students and their teachers agree that working together enriched both their science and their overall experience. “We were one team,” says Athens student Nikolas Plaskovitis. “The collaboration was great and added so much to the experiment.” 

The students, teachers and CERN scientists have stayed in touch since the trip.

Before Nijmegen student Olaf Leender started working on the proposal, he was already interested in science, he says. “Now after my visit to CERN and this awesome experience, I am definitely going to study physics.”

Andreas Valadakis, who teaches the Athens group, says that his students now serve as science mentors to their fellow students. “This experience was beyond what we imagined,” he says.

Plaskovitis agrees with his teacher. “When we ran the beam line at CERN, just a few meters away behind the wall was the weak force at work. Just like the sun. And we were right there next to it.” 

Kaltenhauser says that CERN plans to hold another Beam Line for Schools competition in the future.


by Rich Blaustein at October 17, 2014 01:27 PM

The n-Category Cafe

'Competing Foundations?' Conference

FINAL CFP and EXTENDED DEADLINE: SoTFoM II `Competing Foundations?’, 12-13 January 2015, London.

The focus of this conference is on different approaches to the foundations of mathematics. The interaction between set-theoretic and category-theoretic foundations has had significant philosophical impact, and represents a shift in attitudes towards the philosophy of mathematics. This conference will bring together leading scholars in these areas to showcase contemporary philosophical research on different approaches to the foundations of mathematics. To accomplish this, the conference has the following general aims and objectives. First, to bring to a wider philosophical audience the different approaches that one can take to the foundations of mathematics. Second, to elucidate the pressing issues of meaning and truth that turn on these different approaches. And third, to address philosophical questions concerning the need for a foundation of mathematics, and whether or not either of these approaches can provide the necessary foundation.

Date and Venue: 12-13 January 2015 - Birkbeck College, University of London.

Confirmed Speakers: Sy David Friedman (Kurt Goedel Research Center, Vienna), Victoria Gitman (CUNY), James Ladyman (Bristol), Toby Meadows (Aberdeen).

Call for Papers: We welcome submissions from scholars (in particular, young scholars, i.e. early career researchers or post-graduate students) on any area of the foundations of mathematics (broadly construed). While we welcome submissions from all areas concerned with foundations, particularly desired are submissions that address the role of and compare different foundational approaches. Applicants should prepare an extended abstract (maximum 1,500 words) for blind review, and send it to sotfom [at] gmail [dot] com, with subject `SOTFOM II Submission’.

Submission Deadline: 31 October 2014

Notification of Acceptance: Late November 2014

Scientific Committee: Philip Welch (University of Bristol), Sy-David Friedman (Kurt Goedel Research Center), Ian Rumfitt (University of Birmingham), Carolin Antos-Kuby (Kurt Goedel Research Center), John Wigglesworth (London School of Economics), Claudio Ternullo (Kurt Goedel Research Center), Neil Barton (Birkbeck College), Chris Scambler (Birkbeck College), Jonathan Payne (Institute of Philosophy), Andrea Sereni (Universita Vita-Salute S. Raffaele), Giorgio Venturi (CLE, Universidade Estadual de Campinas)

Organisers: Sy-David Friedman (Kurt Goedel Research Center), John Wigglesworth (London School of Economics), Claudio Ternullo (Kurt Goedel Research Center), Neil Barton (Birkbeck College), Carolin Antos-Kuby (Kurt Goedel Research Center)

Conference Website: sotfom [dot] wordpress [dot] com

Further Inquiries: please contact Carolin Antos-Kuby (carolin [dot] antos-kuby [at] univie [dot] ac [dot] at) Neil Barton (bartonna [at] gmail [dot] com) Claudio Ternullo (ternulc7 [at] univie [dot] ac [dot] at) John Wigglesworth (jmwigglesworth [at] gmail [dot] com)

The conference is generously supported by the Mind Association, the Institute of Philosophy, British Logic Colloquium, and Birkbeck College.

by david at October 17, 2014 01:13 PM

CERN Bulletin

Emilio Picasso (1927-2014)

Many people in the high-energy physics community will be deeply saddened to learn that Emilio Picasso passed away on Sunday 12 October after a long illness. His name is closely linked in particular with the construction of CERN’s Large Electron-Positron (LEP) collider.


Emilio studied physics at the University of Genoa. He came to CERN in 1964 as a research associate to work on the ‘g-2’ experiments, which he was to lead when he became a staff member in 1966. These experiments spanned two decades at two different muon storage rings and became famous for their precision studies of the muon and tests of quantum electrodynamics.

In 1979, Emilio became responsible for the coordination of work by several institutes, including CERN, on the design and construction of superconducting RF cavities for LEP. Then, in 1981, the Director-General, Herwig Schopper, appointed him as a CERN director and LEP project leader. Emilio immediately set up a management board of the best experts at CERN and together they went on to lead the construction of LEP, the world’s largest electron synchrotron, in the 27-km tunnel that now houses the LHC.

LEP came online just over 25 years ago on 14 July 1989 and ran for 11 years. Its experiments went on to perform high-precision tests of the Standard Model, a true testament to Emilio’s skills as a physicist and as a project leader.

We send our deepest condolences to his wife and family.

A full obituary will appear in a later edition of the Bulletin.

See also the CERN Courier, in which Emilio talks about the early days of the LEP project and its start-up.

October 17, 2014 01:10 PM

CERN Bulletin

UK school visit: Alfriston School for girls

Pupils with learning disabilities from Alfriston School in the UK visited the CMS detector last week. This visit was funded by the UK's Science and Technology Facilities Council (STFC) as part of a grant awarded to support activities that will help to build the girls’ self-esteem and interest in physics.


Alfriston School students at CMS.

On Friday, 10 October, pupils from Alfriston School – a UK secondary school catering for girls with a wide range of special educational needs and disabilities – paid a special visit to CERN.

Dave Waterman, a science teacher at the school, recently received a Public Engagement Small Award from the STFC, which enabled the group of girls and accompanying teachers to travel to Switzerland and visit CERN. The awards form part of a project to boost the girls’ confidence and interest in physics. The aim is to create enthusiastic role models with first-hand experience of science who can inspire their peers back home.

By building pupils' self-esteem with regards to learning science, the project further aims to encourage students to develop the confidence to go on to study subjects related to science or engineering when they leave school.

Waterman first visited CERN as part of the UK Teachers Programme in December 2013, which was when the idea of bringing his pupils over for a visit was first suggested. "The main challenge with a visit of this kind is finding how to engage the pupils who don’t have much knowledge of maths," said Waterman. Dave Barney, a member of the CMS collaboration, rose to the challenge, hitting the level spot on with a short and engaging introductory talk just before the detector visit. Chemical-engineering student Olivia Bailey, who recently completed a year-long placement at CERN, accompanied the students on the visit. "Being involved in this outreach project was really fun," she said. "It was a great way of using my experience at CERN and sharing it with others."

For one pupil – Laura – this was her first journey out of England and her first time on a plane. "The whole trip has been so exciting," she said. "My highlight was seeing the detector because it was so much bigger than what I thought." Other students were similarly impressed, expressing surprise and awe as they entered the detector area.

October 17, 2014 01:10 PM

Clifford V. Johnson - Asymptotia

Sunday Assembly – Origin Stories
Sorry about the slow posting this week. It has been rather a busy time the last several days, with all sorts of deadlines and other things taking up lots of time. This includes things like being part of a shooting of a new TV show, writing and giving a midterm to my graduate electromagnetism class, preparing a bunch of documents for my own once-every-3-years evaluation (almost forgot to do that one until the last day!), and so on and so forth. Well, the other thing I forgot to do is announce that I'll be doing the local Sunday Assembly sermon (for want of a better word) this coming Sunday. I've just taken a step aside from writing it to tell you about it. You'll have maybe heard of Sunday Assembly since it has been featured a lot in the news as a secular alternative (or supplement) to a Sunday Church gathering, in many cities around the world (more here). Instead of a sermon they have someone come along and talk about a topic, and they cover a lot of interesting topics. They sound like a great bunch of people to hang out with, and I strongly [..] Click to continue reading this post

by Clifford at October 17, 2014 12:24 AM

October 16, 2014

John Baez - Azimuth

Network Theory Seminar (Part 2)


This time I explain more about how ‘cospans’ represent gadgets with two ends, an input end and an output end:

I describe how to glue such gadgets together by composing cospans. We compose cospans using a category-theoretic construction called a ‘pushout’, so I also explain pushouts. At the end, I explain how this gives us a category where the morphisms are electrical circuits made of resistors, and sketch what we’ll do next: study the behavior of these circuits.

These lecture notes provide extra details:

Network theory (part 31).

by John Baez at October 16, 2014 08:59 PM

Lubos Motl - string vacua and pheno

An overlooked paper discovering axions gets published
What's the catch?

Sam Telfer has noticed and tweeted about a Royal Astronomical Society press release promoting today's publication (in Monthly Notices of RAS: link goes live next Monday) of a paper we should (or could) have discussed back in March 2014, when it was sent to the arXiv. Except that no one has discussed it, and the paper has no follow-ups at this moment:
Potential solar axion signatures in X-ray observations with the XMM-Newton observatory by George Fraser and 4 co-authors
The figures are at the end of the paper, after the captions. Unfortunately, Prof Fraser died in March, two weeks after this paper was sent to the arXiv. This can make the story about the discovery, if it is real, dramatic; alternatively, you may view it as a compassionate piece of evidence that the discovery isn't real.

Yes, this photograph of five axions was posted on the blog of the science adviser of The Big Bang Theory. It is no bazinga.

This French-English paper takes some data from XMM-Newton, the X-ray Multi-Mirror Mission that ESA launched into orbit on an Ariane 5 rocket. My understanding is that the authors more or less assume that the orientation of this X-ray telescope is "randomly changing" relative to both the Earth and the Sun (which may be a problematic assumption but they study some details about the changing orientation, too).

With this disclaimer, they look at the amount of X-rays with energies between \(0.2\) and \(10\,\mathrm{keV}\) and notice that the flux has a rather clear seasonal dependence. The significance of these effects is claimed to be 4, 5, and 11 sigma (!!!), depending on some details. Seasonal signals are potentially clever but possibly tricky, too: recall that DAMA and (later) CoGeNT have "discovered" WIMP dark matter using the seasonal signals, too.
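To get a feeling for what 4, 5, and 11 sigma mean, here is a minimal sketch that converts a significance into a tail probability. It assumes a Gaussian test statistic and the one-sided convention that is common in particle physics; the paper may well use a different convention, so the function name and the choice of tail are my assumptions, not the authors' method.

```python
import math

def one_sided_p_value(n_sigma):
    """Upper-tail probability of a standard normal beyond n_sigma.

    Assumption: Gaussian statistics and a one-sided test.
    """
    return 0.5 * math.erfc(n_sigma / math.sqrt(2.0))

# The three significances quoted in the text.
p_values = {n: one_sided_p_value(n) for n in (4, 5, 11)}

for n, p in p_values.items():
    print(f"{n} sigma -> p = {p:.2e}")
```

A tiny p-value only says that pure statistical noise is an unlikely explanation; it says nothing about systematic effects, which is exactly where seasonal signals tend to be tricky.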

What is changing as a function of the season (date) is mainly the relative orientation of the Sun and the Earth. If you ignore the Sun, the Earth is just a gyroscope that rotates in the same way during the year, far away from stars etc., so seasons shouldn't matter. If you ignore the Earth, the situation should be more or less axially symmetric, although I wouldn't claim it too strongly, so there should also be no seasonal dependence.

What I want to say, and what is reasonable although not guaranteed, is that the seasonal dependence of a signal seen from an orbiting rocket probably needs to depend both on the Sun and the Earth. Their interpretation is that axions are actually coming from the Sun, and they are later processed by the geomagnetic field.

The birth of the solar axions is either from a Compton-like process
\[
e^- + \gamma \to e^- + a
\]
or the (or more precisely: die) Bremsstrahlung-like process
\[
e^- + Z \to e^- + Z + a,
\]
where the electrons and gauge bosons are taken from the mundane thermal havoc within the Sun's core, unless I am wrong. These axions \(a\) are created and some of them fly towards the Earth. And in the part of the geomagnetic field pointing towards the Sun, the axions \(a\) are converted to photons \(\gamma\) via the axion-to-photon conversion or the Primakoff effect (again: this process only works in the external magnetic field). The strength and relevance of the relevant geomagnetic field is season-dependent.

Their preferred picture is that there is the axion \(a\) with a mass comparable to a few microelectronvolts and it couples both to electrons and photons. The product of these two coupling constants is said to be \(2.2\times 10^{-22}\,\mathrm{GeV}^{-1}\) because the authors love to repeat the word "two". Their hypothesis (or interpretation of the signal) probably makes some specific predictions about the spectrum of the X-rays. These should be checked, and the authors have tried to do so, but I don't see too many successes of these checks after the first super-quick analysis of the paper.

There are lots of points and arguments and possible loopholes and problems over here that I don't fully understand at this point. You are invited to teach me (and us) or think loudly if you want to think about this bold claim at all.

Clearly, if the signal were real, it would be an extremely important discovery. Dark matter could be made out of these axions. The existence of axions would have far-reaching consequences not just for CP-violation in QCD but also for the scenarios within string theory, thanks to the axiverse and related paradigms.

The first news outlets that posted stories about the paper today were The Guardian, Phys.ORG, EurekAlert, and Fellowship for ET aliens.

by Luboš Motl at October 16, 2014 04:47 PM

ZapperZ - Physics and Physicists

No Women Physics Nobel Prize Winner In 50 Years
This article reports on the possible reasons why there has been no Physics Nobel Prize for a woman in 50 years.

But there's also, of course, the fact that the prize is awarded to scientists whose discoveries have stood the test of time. If you're a theorist, your theory must be proven true, which knocks various people out of the running. One example is Helen Quinn, whose theory with Roberto Peccei predicts a new particle called the axion. But the axion hasn't been discovered yet, and therefore they can't win the Nobel Prize.
Age is important to note. Conrad tells Mashable that more and more women are entering the field of physics, but as a result, they're still often younger than what the committee seems to prefer. According to the Nobel Prize website, the average age of Nobel laureates has even increased since the 1950s.
But the Nobel Prize in Physics isn't a lifetime achievement award — it honors a singular accomplishment, which can be tricky for both men and women.

"Doing Nobel Prize-worthy research is a combination of doing excellent science and also getting lucky," Conrad says. "Discoveries can only happen at a certain place and time, and you have to be lucky to be there then. These women coming into the field are as excellent as the men, and I have every reason to think they will have equal luck. So, I think in the future you will start to see lots of women among the Nobel Prize winners. I am optimistic."

The article mentioned the names of 4 women who are the leading candidates for the Nobel prize: Deborah Jin, Lene Hau, Vera Rubin, and Margaret Murnane. If you noticed, I mentioned Jin and Hau way back when already, and I consider them to have done Nobel caliber work. I can only hope that, during my lifetime, we will see a woman win this again after so long.


by ZapperZ at October 16, 2014 12:40 PM

ZapperZ - Physics and Physicists

Lockheed Fusion "Breakthrough" - The Skeptics Are Out
Barely a day after Lockheed Martin announced their "fusion breakthrough" in designing a workable and compact fusion reactor, the skeptics are already weighing in with their opinions, even though the details of Lockheed's design have not been clearly described.

"The nuclear engineering clearly fails to be cost effective," Tom Jarboe told Business Insider in an email. Jarboe is a professor of aeronautics and astronautics, an adjunct professor in physics, and a researcher with the University of Washington's nuclear fusion experiment.
"This design has two doughnuts and a shell so it will be more than four times as bad as a tokamak," Jarboe said, adding that, "Our concept [at the University of Washington] has no coils surrounded by plasma and solves the problem."

Like I said earlier, from the sketchy detail that I've read, they are using a familiar technique for confinement, etc., something that has been used and studied extensively before. So unless they are claiming to have found something that almost everyone has overlooked, this claim of theirs will need to be very convincing for others to accept. As stated in the article, Lockheed hasn't published anything yet, and they probably won't until they get patent approval of their design. That is what a commercial entity will typically do when they want to protect their design and investment.

There's a lot more work left to do for this to be demonstrated.


by ZapperZ at October 16, 2014 12:26 PM

Tommaso Dorigo - Scientificblogging

No Light Dark Matter In ATLAS Search
Yesterday the ATLAS collaboration published the results of a new search for dark matter particles produced in association with heavy quarks by proton-proton collisions at the CERN Large Hadron Collider. Not seeing a signal, ATLAS produced very tight upper limits on the cross section for interactions of the tentative dark matter particle with nucleons, which is the common quantity on which dark matter search results are reported. The cross section is in fact directly proportional to the rate at which one would expect to see the hypothetical particle scatter off ordinary matter, which is what one directly looks for in many of today's dark matter search experiments.

by Tommaso Dorigo at October 16, 2014 10:22 AM

ZapperZ - Physics and Physicists

Lockheed Martin Claims Fusion Breakthrough
As always, we should reserve our judgement until we get this independently verified. Still, Lockheed Martin, out of the company's Skunk Works program (which was responsible for the Stealth technology), has made the astounding claim of potentially producing a working fusion reactor by 2017.

Tom McGuire, who heads the project, told Reuters that his team had been working on fusion energy at Lockheed’s Skunk Works program for the past four years, but decided to go public with the news now to recruit additional partners in industry and government to support their work.

Last year, while speaking at Google’s Solve for X program, Charles Chase, a research scientist at Skunk Works, described Lockheed’s effort to build a trailer-sized fusion power plant that turns cheap and plentiful hydrogen (deuterium and tritium) into helium plus enough energy to power a small city.

“It’s safe, it’s clean, and Lockheed is promising an operational unit by 2017 with assembly line production to follow, enabling everything from unlimited fresh water to engines that take spacecraft to Mars in one month instead of six,” Evan Ackerman wrote in a post about Chase’s Google talk on Dvice.

What isn't clear to me is the nature of the breakthrough that would allow them to do this, because what was written in the piece about using a magnetic bottle isn't new at all. This technique has been around for decades. I even saw one in the basement of the Engineering Research building at the University of Wisconsin-Madison back in the early 80's when they were doing extensive research work in this area. So what exactly did they do that they think will be successful that others over many years couldn't?

I guess that is a trade secret for them right now and we will just have to wait for the details to trickle out later.


by ZapperZ at October 16, 2014 01:24 AM

October 15, 2014

Quantum Diaries

Let there be beam!

It’s been a little while since I’ve posted anything, but I wanted to write a bit about some of the testbeam efforts at CERN right now. In the middle of July this year, the Proton Synchrotron, or PS, the second ring of boosters/colliders which are used to get protons up to speed to collide in the LHC, saw its first beam since the shutdown at the end of Run I of the LHC. In addition to providing beam to experiments like CLOUD, the beam can also be used to create secondary particles of up to 15 GeV/c momentum, which are then used for studies of future detector technology. Such a beam is called a testbeam, and all I can say is WOOT, BEAM! I must say that being able to take accelerator data is amazing!

The next biggest milestone is the testbeams from the SPS, which started on the 6th of October. This is the last ring before the LHC. If you’re unfamiliar with the process used to get protons up to the energies of the LHC, a great video can be found at the bottom of the page.

Just to be clear, test beams aren’t limited to CERN. Keep your eyes out for a post by my friend Rebecca Carney in the near future.

I was lucky enough to be part of the test beam effort of LHCb, which was testing both new technology for the VELO and for the upgrade of the TT station, called the Upstream Tracker, or UT. I worked mainly with the UT group, testing a sensor technology which will be used in the 2019 upgraded detector. I won’t go too much into the technology of the upgrade right now, but if you are interested in the nitty-gritty of it all, I will instead point you to the Technical Design Report itself.

I just wanted to take a bit to talk about my experience with the test beam in July, starting with walking into the experimental area itself. The first sight you see upon entering the building is a picture reminding you that you are entering a radiation zone.


The Entrance!!

Then, as you enter, you see a large wall of radioactive concrete.


Don’t lick those!

This is where the beam is dumped. Following along here, you get to the control room, which is where all the data taking stuff is set up outside the experimental area itself. Lots of people are always working in the control room, focused and making sure to take as much data as possible. I didn’t take their picture since they were working so hard.

Then there’s the experimental area itself.


The Setup! To find the hardhat, look for the orange and green racks, then follow them towards the top right of the picture.

Ah, beautiful. :)

There are actually 4 setups here, but I think only three were being used at this time. We occupied the area where the guy with the hardhat is.

Now the idea behind a tracker testbeam is pretty straightforward. A charged particle flies by, and many very sensitive detector planes record where the charged particle passed. These planes together form what’s called a “telescope.” The setup is completed when you add a detector to be tested either in the middle of the telescope or at one end.

Cartoon of a test beam setup. The blue indicates the "telescope", the orange is the detector under test, and the red is the trajectory of a charged particle.


From timing information and from signals from these detectors, a trajectory of the particle can be determined. Now, you compare the position which your telescope gives you to the position you record in the detector you want to test, and voila, you have a way to understand the resolution and abilities of your tested detector. After that, the game is statistics. Ideally, you want to be in the middle of the telescope, so you have the information on where the charged particle passed on either side of your detector as this information gives the best resolution, but it can work if you’re on one side or the other, too.
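The comparison described above can be sketched in a few lines. This is a toy, one-dimensional illustration and not the actual LHCb testbeam software: the function names, plane positions, and hit values are all made up for the example. It fits a straight line through the telescope hits, interpolates the track to the position of the detector under test (DUT), and takes the difference between the predicted and measured hit.

```python
def fit_track(z_planes, x_hits):
    """Least-squares straight line x(z) = slope * z + intercept through telescope hits."""
    n = len(z_planes)
    z_mean = sum(z_planes) / n
    x_mean = sum(x_hits) / n
    s_zz = sum((z - z_mean) ** 2 for z in z_planes)
    s_zx = sum((z - z_mean) * (x - x_mean) for z, x in zip(z_planes, x_hits))
    slope = s_zx / s_zz
    intercept = x_mean - slope * z_mean
    return slope, intercept

def dut_residual(z_planes, x_hits, z_dut, x_dut):
    """Measured DUT hit minus the position predicted by the telescope track."""
    slope, intercept = fit_track(z_planes, x_hits)
    x_predicted = slope * z_dut + intercept
    return x_dut - x_predicted

# Hypothetical numbers (mm): four telescope planes with the DUT in the middle.
z_planes = [0.0, 50.0, 150.0, 200.0]
x_hits = [1.00, 1.05, 1.15, 1.20]
z_dut, x_dut = 100.0, 1.13

residual = dut_residual(z_planes, x_hits, z_dut, x_dut)
print(f"residual = {residual:.3f} mm")
```

Repeating this for many tracks and looking at the width of the residual distribution is the "statistics game": that width combines the resolution of the tested detector with the pointing precision of the telescope, which is why sitting in the middle of the telescope (interpolating rather than extrapolating) helps.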

This is the setup which we have been using for the testbeam at the PS.  We’ll be using a similar setup for the testbeam at the SPS next week! I’ll try to write a follow up post on that when we finish!

And finally, here is the promised video.


by Adam Davis at October 15, 2014 08:37 PM

Quantum Diaries

Top quark still raising questions

This article appeared in symmetry on Oct. 15, 2014.

Why are scientists still interested in the heaviest fundamental particle nearly 20 years after its discovery? Photo: Reidar Hahn, Fermilab

“What happens to a quark deferred?” the poet Langston Hughes may have asked, had he been a physicist. If scientists lost interest in a particle after its discovery, much of what it could show us about the universe would remain hidden. A niche of scientists, therefore, stay dedicated to intimately understanding its properties.

Case in point: Top 2014, an annual workshop on top quark physics, recently convened in Cannes, France, to address the latest questions and scientific results surrounding the heavyweight particle discovered in 1995 (early top quark event pictured above).

Top and Higgs: a dynamic duo?
A major question addressed at the workshop, held from September 29 to October 3, was whether top quarks have a special connection with Higgs bosons. The two particles, weighing in at about 173 and 125 billion electronvolts, respectively, dwarf other fundamental particles (the bottom quark, for example, has a mass of about 4 billion electronvolts and a whole proton sits at just below 1 billion electronvolts).

Prevailing theory dictates that particles gain mass through interactions with the Higgs field, so why do top quarks interact so much more with the Higgs than do any other known particles?
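To put a number on "so much more", here is a minimal sketch using the standard tree-level relation between a fermion's mass and its coupling to the Higgs field, y = sqrt(2) * m / v. The relation and the vacuum expectation value v of about 246 GeV are textbook Standard Model inputs rather than something stated in the article, and the masses are the rounded values quoted above.

```python
import math

# Assumption: Higgs vacuum expectation value in GeV (textbook value, not from the article).
HIGGS_VEV_GEV = 246.22

def yukawa_coupling(mass_gev):
    """Tree-level Standard Model Yukawa coupling y = sqrt(2) * m / v."""
    return math.sqrt(2.0) * mass_gev / HIGGS_VEV_GEV

# Rounded masses from the text, in GeV (billion electronvolts).
masses_gev = {"top": 173.0, "bottom": 4.0}
couplings = {name: yukawa_coupling(m) for name, m in masses_gev.items()}

for name, y in couplings.items():
    print(f"{name}: y = {y:.3f}")
```

With these inputs the top quark's coupling comes out very close to 1, while the bottom quark's is a few percent. The formula only restates the puzzle in other terms: it does not explain why the top coupling is so large.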

Direct measurements of top-Higgs interactions depend on recording collisions that produce the two side-by-side. This hasn’t happened yet at high enough rates to be seen; these events theoretically require higher energies than the Tevatron or even the LHC’s initial run could supply. But scientists are hopeful for results from the next run at the LHC.

“We are already seeing a few tantalizing hints,” says Martijn Mulders, staff scientist at CERN. “After a year of data-taking at the higher energy, we expect to see a clear signal.” No one knows for sure until it happens, though, so Mulders and the rest of the top quark community are waiting anxiously.

A sensitive probe to new physics

Top and antitop quark production at colliders, measured very precisely, started to reveal some deviations from expected values. But in the last year, theorists have responded by calculating an unprecedented layer of mathematical corrections, which refined the expectation and promise to realign the slightly rogue numbers.

Precision is an important, ongoing effort. If researchers aren’t able to reconcile such deviations, the logical conclusion is that the difference represents something they don’t know about — new particles, new interactions, new physics beyond the Standard Model.

The challenge of extremely precise measurements can also drive the formation of new research alliances. Earlier this year, the first Fermilab-CERN joint announcement of collaborative results set a world standard for the mass of the top quark.

Such accuracy hones methods applied to other questions in physics, too, the same way that research on W bosons, discovered in 1983, led to the methods Mulders began using to measure the top quark mass in 2005. In fact, top quark production is now so well controlled that it has become a tool itself to study detectors.

Forward-backward synergy

With the upcoming restart in 2015, the LHC will produce millions of top quarks, giving researchers troves of data to further physics. But scientists will still need to factor in the background noise and data-skewing inherent in the instruments themselves, called systematic uncertainty.

“The CDF and DZero experiments at the Tevatron are mature,” says Andreas Jung, senior postdoc at Fermilab. “It’s shut down, so the understanding of the detectors is very good, and thus the control of systematic uncertainties is also very good.”

Jung has been combing through the old data with his colleagues and publishing new results, even though the Tevatron hasn’t collided particles since 2011. The two labs combined their respective strengths to produce their joint results, but scientists still have much to learn about the top quark, and a new arsenal of tools to accomplish it.

“DZero published a paper in Nature in 2004 about the measurement of the top quark mass that was based on 22 events,” Mulders says. “And now we are working with millions of events. It’s incredible to see how things have evolved over the years.”

Troy Rummler

by Fermilab at October 15, 2014 07:04 PM

Axel Maas - Looking Inside the Standard Model

Challenging subtleties
I have just published a conference proceeding in which I return to an idea of how the standard model of particle physics could be extended. It is an idea I have already briefly written about: The idea is concerned with the question of what would happen if there were twice as many Higgs particles as there are in nature. The model describing this idea is therefore called the 2-Higgs(-doublet)-model, or 2HDM for short. The word doublet in the official name is rather technical. It has something to do with how the second Higgs connects to the weak interaction.

As fascinating as the model itself may be, I do not want to write about its general properties. Given its popularity, you will find many things about it already on the web. No, here I want to write about what I want to learn about this theory in particular. And this is a peculiar subtlety. It connects to the research I am doing on the situation with just the single Higgs.

To understand what is going on, I have to dig deep into the theory stuff, but I will try to keep it not too technical.

The basic question is: What can we observe, and what can we not observe? One of the things a theoretician learns early on is that it may be quite helpful to have some dummies. This means that she or he adds something in a calculation just for the sake of making the calculation simpler. Of course, she or he has to make very sure that this does not affect the result. But if done properly, this can be of great help. The technical term for this trick is an auxiliary quantity.

Now, when we talk about the weak interactions, something amazing happens. If we assume that everything is indeed very weak, we can calculate results using so-called perturbation theory. And then it appears as if the auxiliary quantities are real, and we can observe them. It is, and can only be, some kind of illusion. This is indeed true, something I have been working on for a long time, and others before me. It just turns out that the true thing and the auxiliary quantities have the same properties, and therefore it does not matter which we take for our calculation. This is far from obvious, and pretty hard to explain without very much technical stuff. But since this is not the point I would like to make in this entry, let me skip these details.

That this is the case is actually a consequence of a number of 'lucky' coincidences in the standard model. Some particles have just the right mass. Some particles appear just in the right ratio of numbers. Some particles are just inert enough. Of course, as a theoretician, my experience is that there is no such thing as 'lucky'. But that is a different story (I know, I say this quite often this time).

Now, I finally return to the starting point: The 2HDM. In this theory, one can do the same kind of tricks with auxiliary quantities and perturbation theory and so on. If you assume that everything is just like in the standard model, this is fine. But is this really so? In the proceedings, I look at this question. In particular, I check whether perturbation theory should work. And what I find is: This may be possible, but it is very unlikely to happen in all the circumstances where one would like this to be true. In several scenarios in which one would like to have this property, it could indeed be failing. E.g., in some scenarios this theory could have twice as many weak gauge bosons, so-called W and Z bosons, as we see in experiment. That would be bad, as this would contradict experiment, and therefore invalidate these scenarios.

This is not the final word, of course not - proceedings are just status reports, not final answers. But that there may be, just may be, a difference is enough to require us (and, in this case, me) to make sure what is going on. That will be challenging. But this time such a subtlety may make a huge difference.

by Axel Maas at October 15, 2014 05:03 PM

Symmetrybreaking - Fermilab/SLAC

Top quark still raising questions

Why are scientists still interested in the heaviest fundamental particle nearly 20 years after its discovery?

“What happens to a quark deferred?” the poet Langston Hughes may have asked, had he been a physicist. If scientists lost interest in a particle after its discovery, much of what it could show us about the universe would remain hidden. A niche of scientists, therefore, stay dedicated to intimately understanding its properties.

Case in point: Top 2014, an annual workshop on top quark physics, recently convened in Cannes, France, to address the latest questions and scientific results surrounding the heavyweight particle discovered in 1995.

Top and Higgs: a dynamic duo?

A major question addressed at the workshop, held from September 29 to October 3, was whether top quarks have a special connection with Higgs bosons. The two particles, weighing in at about 173 and 125 billion electronvolts, respectively, dwarf other fundamental particles (the bottom quark, for example, has a mass of about 4 billion electronvolts and a whole proton sits at just below 1 billion electronvolts).

Prevailing theory dictates that particles gain mass through interactions with the Higgs field, so why do top quarks interact so much more with the Higgs than do any other known particles?

Direct measurements of top-Higgs interactions depend on recording collisions that produce the two side-by-side. This hasn’t happened yet at high enough rates to be seen; these events theoretically require higher energies than the Tevatron or even the LHC’s initial run could supply. But scientists are hopeful for results from the next run at the LHC.

“We are already seeing a few tantalizing hints,” says Martijn Mulders, staff scientist at CERN. “After a year of data-taking at the higher energy, we expect to see a clear signal.” No one knows for sure until it happens, though, so Mulders and the rest of the top quark community are waiting anxiously.

A sensitive probe to new physics

Top and anti-top quark production at colliders, measured very precisely, started to reveal some deviations from expected values. But in the last year, theorists have responded by calculating an unprecedented layer of mathematical corrections, which refine the expectation and promise to realign the slightly rogue numbers.

Precision is an important, ongoing effort. If researchers aren’t able to reconcile such deviations, the logical conclusion is that the difference represents something they don’t know about—new particles, new interactions, new physics beyond the Standard Model.

The challenge of extremely precise measurements can also drive the formation of new research alliances. Earlier this year, the first Fermilab-CERN joint announcement of collaborative results set a world standard for the mass of the top quark.

Such accuracy hones methods applied to other questions in physics, too, the same way that research on W bosons, discovered in 1983, led to the methods Mulders began using to measure the top quark mass in 2005. In fact, top quark production is now so well controlled that it has become a tool itself to study detectors.

Forward-backward synergy

With the upcoming restart in 2015, the LHC will produce millions of top quarks, giving researchers troves of data to further physics. But scientists will still need to factor in the background noise and data-skewing inherent in the instruments themselves, called systematic uncertainty.

“The CDF and DZero experiments at the Tevatron are mature,” says Andreas Jung, senior postdoc at Fermilab. “It’s shut down, so the understanding of the detectors is very good, and thus the control of systematic uncertainties is also very good.”

Jung has been combing through the old data with his colleagues and publishing new results, even though the Tevatron hasn’t collided particles since 2011. The two labs combined their respective strengths to produce their joint results, but scientists still have much to learn about the top quark, and a new arsenal of tools to accomplish it.

“DZero published a paper in Nature in 2004 about the measurement of the top quark mass that was based on 22 events,” Mulders says.  “And now we are working with millions of events. It’s incredible to see how things have evolved over the years.”



by Troy Rummler at October 15, 2014 03:48 PM

arXiv blog

Emerging Evidence Shows How Computer Messaging Helps Autistic Adults Communicate

Anecdotal reports suggest that autistic adults benefit from computer-based communication. Now the scientific evidence is building.

The conventional view of people with autism is that they are loners with little interest in initiating or maintaining relationships with other people. But that attitude is changing rapidly not least because of the growing evidence that exactly the opposite is true.

October 15, 2014 03:19 PM

Lubos Motl - string vacua and pheno

A good popular text on gravitons and its limitations
In the last 24 hours, I saw a couple of news reports and popular articles about particle physics that were at least fine. For example, Physics World wrote about an experiment looking for WISP dark matter (it's like WIMP but "massive" is replaced by "sub-eV", and axions are the most famous WISPs). The Wall Street Journal wrote something about the RHIC experiment – unfortunately, the text only attracted one comment. The lack of interest in such situations is mostly due to the missing "controversy" and to the technical character of the information.

But I want to mention a text by a "daily explainer", Esther Inglis-Arkell,
What are Gravitons and Why Can't We See Them?
which is pretty good, especially if one realizes that the author doesn't seem to be trained in these issues. Before I tell you about some flaws of the article, I want to focus on what I consider good about it because that may be more important in this case.

First, the article is a product by an "explainer". Its goal is to "explain" some background. This activity is hugely missing in the "news stories" about physics, especially cutting-edge physics. Physics is like a pyramid with many floors built on top of each other and the newest discoveries almost never "rebuild" the basement. They reorganize and sometimes rebuild the top floor and maybe the floor beneath it.

Everyone who wants to understand a story about this reconstruction of the near-top floor simply has to know something about some of the floors beneath it. Unfortunately, most of the science journalists are pretending that the news stories may be directly relevant for someone who has no background, who doesn't know about simpler and more fundamental "cousins" of the latest events. It is not possible.

A related point is that this article tries to present a picture that is coherent. The storyline isn't too different from the structure of an article that an expert could construct. What is important is that it doesn't distract the readers with topics that are clearly distractions.

In particular, it tells you that the problem with the non-renormalizability of gravitons is solved by string theory, and why, without misleading "obligatory" comments about loop quantum gravity and similar "alternative" viewpoints on vaguely related matters. Most articles unfortunately try to squeeze as much randomly collected rubbish analogous to loop quantum gravity in the few paragraphs as possible so that the result is unavoidably incoherent. Almost all readers are trying to build an opinion that "combines" or "interpolates" in between all these ideas.

But if you try to "combine" valid string theory with "components" imported from some alternative "sciences", you may end up with an even more illogical pile of rubbish than if you just parrot the alternative "science" separately. Those things simply should be segregated. Even if there were some real doubts that string theory is the only framework that avoids the logical problems with the gravitons' self-interactions, and there aren't really any doubts, there should be articles that only talk about one paradigm or another – just like actual scientists aren't jumping from one framework to another every minute. And the articles shouldn't focus on the "spicy interactions" between the vastly different research directions because that's not what the good scientists are actually spending their time with, or what is needed to understand Nature.

Some people might argue that things like loop quantum gravity shouldn't be "obligatory" in similar articles because the people researching it professionally represent something like 10% of the researchers in quantum gravity, write 5% of the articles, and receive about 1% of the citations. So these are small numbers which is a reason to neglect those things. But I don't actually believe in such an approach or in such a justification of the omission. It is perfectly OK to investigate things that represent a minority of researchers, papers, and citations. What's more important is that someone thinking about physics must preserve some consistency and focus on the content – and constant distractions by sociologically flavored metaphysical debates are no good for genuine progress in physics.

OK, let me now make a few comments about the flaws of the IO9 article on gravitons.

It says that the gravitons are particles that self-interact, unlike photons and like gluons, and this leads to the non-renormalizability of classical GR, and string theory solves the problem by replacing the point-like gravitons by strings which makes the self-replication of gravitons manageable. The LHC has been and will be looking for gravitons in warped geometry scenarios in the form of missing transverse energy.

The writer offers some rather usual comments about the forces and virtual particles that mediate them. Somewhere around this paragraph I started to have some problems:
What we call "force" at the macro level seems to be conveyed by particles at the micro level. The graviton should be one of these particles. The trouble with gravitons - or, more precisely, the first of many troubles with gravitons - is that gravity isn't supposed to be a force at all. General relativity indicates that gravity is a warp in spacetime. General relativity does allow for gravitational waves, though. It's possible that these waves could come in certain precise wavelengths the way photons do, and that these can be gravitons.
The first thing that I don't like about these comments is that they suggest that there is a contradiction between "gravity is a force" and "gravity is a warp in spacetime". There is no contradiction and there has never been one. "A warp in spacetime" is a more detailed explanation how the force works, much like some biology of muscle contractions "explains" where the force of our hands comes from.

In this context, I can't resist making a much more general remark about laymen's logic. When you say that "a graviton is an XY" and "a graviton is a UV", they deduce that either there is a contradiction, or "UV" is the same thing as "XY". But this ain't the case. The sentences say that the "set of gravitons" is a subset of the "set of XYs" or the "set of UVs". All these sets may be different and "XY" may still be a different set than "UV" while the propositions about the subsets may still hold. "XY" and "UV" may overlap – and sometimes, one of them may be a subset of the other. Many laymen (and I don't really mean the author of the article who seems much deeper) just seem to be lacking this "basic layer of structured thinking". They seem to understand only one meaning of the phrase "A is B", namely "A is completely synonymous to a different word B". But no non-trivial science could ever be built if this were the only allowed type of "is". If it were so, science would be reduced to the translation of several pre-existing objects to different languages or dialects.
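A toy illustration of this point, as a minimal Python sketch. The two sets and their tiny membership lists are assumptions chosen for the example, not a complete physics classification; they stand in for "XY" and "UV".

```python
# Toy sets for illustration only: a deliberately tiny sample of particles.
massless = {"photon", "gluon", "graviton"}            # plays the role of "XY"
self_interacting = {"gluon", "graviton", "W boson"}   # plays the role of "UV"

gravitons = {"graviton"}

# "A graviton is massless" and "a graviton is self-interacting" are both
# subset statements, and both hold at once.
both_true = gravitons <= massless and gravitons <= self_interacting

# Neither statement forces the two larger sets to coincide.
sets_equal = massless == self_interacting

# The larger sets merely overlap; neither is contained in the other.
overlap = massless & self_interacting

print(both_true, sets_equal, sorted(overlap))
```

Here "is" corresponds to the subset test `<=`, not to the equality test `==`, which is the distinction the paragraph above draws.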

I also have problems with the last sentence of the paragraph that "gravitons could come with precise wavelengths" just like photons. In the Universe, both gravitons and photons are demonstrably allowed to have any real positive value of the wavelength (the Doppler shift arising from the change of the inertial system is the simplest way to change the wavelength of a photon or a graviton to any value you want) although particular photons and gravitons resulting from some process may have a specific wavelength (or a specific distribution of wavelengths). Moreover, she talks about gravitons although she should be logically talking about gravitational waves – which are coherent states of many gravitons in the same state, as she doesn't seem to explain at all.

In the section about gravitons and string theory, she writes that gravitons are technically "gauge bosons". It is a matter of terminology whether gravitons are gauge bosons. Conceptually, they sort of are, but precisely when we add the word "technically", I think that most physicists would say that gravitons are technically not gauge bosons because the term "gauge bosons" is only used for spin-1 particles. She says lots of correct things about the spins herself, however.

Then she describes a "recursive process" of production of new photons and (using the normal experts' jargon) addition of loops to the Feynman diagrams. Things are sort of OK but at one point we learn
Although this burst of particles may get hectic, it doesn't produce an endless branching chain of photons.
It actually does. Loop diagrams with an unlimited number of loops (and virtual photons) contribute. The number of terms is infinite. The point is that one may sum this infinite collection of terms and get a finite result. And the finite result isn't really a "direct outcome" of the procedure. A priori, the (approximately) geometric series is actually divergent. However, the quotient (I guess that the right English word is the common ratio, and I will use the latter) that is greater than one (naively infinite) may be "redefined" to be a finite number smaller than one, and that's why the (approximately) geometric series converges after this process and yields a finite result.

This is the process of renormalization and the theory is renormalizable if we only need to "redefine" a finite number of "types" of divergent common ratios or objects.

(A special discussion would be needed for infrared divergences. When it comes to very low-energy photons, one literally produces an infinite number of very soft photons if two charged objects repel one another, for example. And this infinite number of photons is no illusion and is never "reduced" to any finite number. Calculations of quantities that are "truly measurable by real-world devices with their limitations" can still be done and yield finite results so the infinities encountered as infrared divergences are harmless if we are really careful about what is a measurable question and what is not.)

Concerning the renormalizability, she writes:
Because of this, photons and electron interactions are said to be renormalizable. They can get weird, but they can't become endless.
Again, they can and do become endless, but it's not a problem. It may be a good idea to mention Zeno's paradoxes such as Achilles and the turtle. Zeno believed that Achilles could never catch up with the turtle because the path to the hypothetical point where he catches up may be divided into infinitely many pieces. And Zeno was implicitly assuming that the sum of an infinite number of terms (durations) had to be divergent. That wasn't the case. Infinite series often converge.
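The convergence in the Achilles example can be checked numerically. The following is a minimal sketch with made-up numbers (a 100 m head start, speeds of 10 m/s and 1 m/s, all hypothetical): each Zeno stage lasts one tenth as long as the previous one, so the durations form a geometric series with common ratio 1/10, and their partial sums approach the finite catch-up time 100/(10 - 1) seconds.

```python
from fractions import Fraction


def stage_durations(head_start, v_achilles, v_turtle, stages):
    """Yield the duration of each Zeno stage: the time Achilles needs to
    reach the point where the turtle was when the stage began."""
    gap = Fraction(head_start)
    for _ in range(stages):
        duration = gap / v_achilles
        yield duration
        gap = duration * v_turtle  # the turtle's new lead after this stage


def total_time(head_start, v_achilles, v_turtle, stages):
    """Sum the durations of the first `stages` Zeno stages exactly."""
    return sum(stage_durations(head_start, v_achilles, v_turtle, stages))


# Hypothetical numbers, chosen only for illustration.
exact = Fraction(100, 10 - 1)          # closed form: head_start / (v_a - v_t)
partial = total_time(100, 10, 1, 30)   # first 30 of infinitely many stages

print(float(partial), float(exact))
print(exact - partial)                 # tiny, and shrinking with more stages
```

Exact fractions are used instead of floats so that the remaining gap between the partial sum and the limit is visible rather than lost to rounding.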

When mathematicians started to become smarter than Zeno and his soulmates, they saw that Zeno's paradoxes weren't paradoxes at all and some of Zeno's assumptions (or hidden assumptions) were simply wrong. Similarly, when Isaac Newton and his enemy developed the calculus, they were already sure that Zeno's related paradox, "the arrow paradox", wasn't a paradox, either. Zeno used to argue that an arrow cannot move because the trajectory may be divided into infinitesimal pieces and the arrow is static in each infinitesimal subinterval. Therefore, he reasoned, the arrow must always be static. Well, of course, we know that if you divide the path into infinitely many infinitesimal pieces, you may get and you often do get a finite total distance.

In this sense, mathematicians were able to see that many previous paradoxes aren't really paradoxes. We may continue and present the renormalization as another solution to a previous would-be Zeno-like paradox. Not only is it OK if there are infinitely many terms in the expansion into Feynman diagrams. It is even OK if this expansion is naively divergent – as long as the number of the "types of divergences" that have to be redefined to a finite number is finite.

The author of the article similarly discusses the loops with many gravitons and concludes:
That huge amount of energy causes the newly-created graviton to create yet another graviton. This endless cycle of graviton production makes gravitons nonrenormalizable.
This is of course deeply misleading. As the article didn't mention (even though it claimed to discuss the gluons as well), the gluons self-interact in the same sense as gravitons do. But the gluons' self-interactions – which may also involve an arbitrary number of virtual gluons in multiloop diagrams – are renormalizable and therefore harmless because the series we have to resum is closer to a geometric series and it is enough to renormalize the naively divergent common ratio in order to tame the whole sum.

In the case of the gravitons, the common ratio is "more divergent" because the high-energy virtual gravitons have stronger interactions (gravity couples to energy and not just the constant charge) and the series is further from a geometric one because the common ratios are not so common – the ratios increase with the number of loops. That's why we face an infinite number of "types of divergences", an infinite spectrum of something that looks like a common ratio of a geometric series but it is not really common. To determine a finite result of this sum, we would need to insert an infinite amount of information to the theory – to redefine an infinite number of distinct divergent objects to finite values. And in the absence of a hierarchy that would make most of these divergences "inconsequential", this process renders the theory unpredictive because it's simply not possible to measure or otherwise determine the value of the "infinitely many different divergent integrals".

To summarize, the author has oversimplified the situation and said that the nonrenormalizability arises whenever the processes have contributions from arbitrarily complicated loop diagrams. But they always do and it is not a problem yet. The real problem of nonrenormalizability only arises if a sequence of "cures" that are analogous to Newton's cure of Zeno's arrow paradox is applied and fails, anyway.

She says that strings are extended which cures the problem and...
That bit of wiggle room keeps the creation of a graviton from being so energetic that it necessitates the creation of yet another graviton, and makes the theory renormalizable.
Well, again, string theory doesn't change anything about the fact that arbitrarily complicated multiloop diagrams contribute to the total amplitude. But strings cure the problems with nonrenormalizability. But is it quite right to say that "strings make the theory renormalizable"?

Not really. First of all, strings aren't "surgeons" that would cure a field theory. Instead, strings replace it with a different theory that isn't a quantum field theory in the conventional sense (if we're strict about its definition – and if we overlook conceptually difficult dualities that show that string theory and field theories are really inseparable and piecewise equivalent as classes of physical theories). So strings don't do anything with "the theory". Instead, they tell us to use a different theory, string theory!

Second, it isn't quite right to say that string theory (as a theory in the spacetime) is renormalizable. In fact, string theory as a theory in the spacetime is completely finite so divergences from short-distance processes never arise in the first place. So this "problem" or "disease" that arises in almost all quantum field theories doesn't arise in string theory at all – which also means that it doesn't have to be cured. (String theory changes nothing about the emergence of infrared divergences in many situations or vacua – they were real physics in quantum field theory and have to remain real physics in any valid theory going beyond field theory.) String theory still involves some calculations on the world sheet that have intermediate divergences that have to be treated and yes, the theory on the world sheet is a field theory and it is a renormalizable one. But because the dynamics of string theory in the spacetime isn't a field theory, it doesn't even make sense to ask whether it is renormalizable. The adjective "renormalizable" is only well-defined for field theories.

Finally, she talks about the possible detection of gravitons. She is aware that there are facilities such as LIGO that should look for gravitational waves but if you want to see a graviton, an individual particle, you need something as good as the LHC. I am adding something here because as I have mentioned, the article hasn't really clarified the relationship between gravitational waves and gravitons at all.

Her comments about the LHC are referring to the theories with large or warped extra dimensions that could make some effects involving gravitons observable. I think that it is "much less likely than 50%" that the LHC will observe something like that and I would surely mention this expectation in an article I would write. But there is really no rock-solid argument against these scenarios so the LHC may observe these things.

The only other complaint I have against this part of the text is that she used the word "hole" for the missing transverse energy – a potential experimental sign indicating that the gravitons were sent in the extra dimensions. I had to spend half a minute to figure out what the hole was supposed to mean – I was naturally thinking about "holes in the Dirac sea" analogous to positrons as "holes in the sea of electrons". There's some sense in which a "hole" is just fine as a word for the "missing transverse energy" but its usage is non-standard and confusing. Physicists imagine rather specific things if you say a "hole" or a "missing energy" – if you know what these phrases mean, you should appreciate how much harder it is for the laymen who are imagining "holes" in the billiard table or "missing energy" after a small breakfast, or something like that. It can't be surprising that they're often led to completely misunderstand some texts about physics even though the texts look "perfectly fine" to those who know the right meaning of the words in the context.

I've mentioned many flaws of the article but my final point is that those are unavoidable for an article that was more ambitious than the typical popular ones. And if one wanted to "grade" the article according to these flaws, he shouldn't forget about the context – and the context is that the author actually decided to write a much more technical, detailed, less superficial article than what you may see elsewhere. Writers should be encouraged to write similar things even if there are similar technical problems. They can get fixed as the community of writers and their readership get more knowledgeable about all the issues.

But if writers decide not to write anything except for superficial – and usually sociological and "spicy" – issues, there is nothing to improve and the readers won't really ever have any tools to converge to any semi-qualified let alone qualified opinions about the physics itself.

by Luboš Motl at October 15, 2014 11:37 AM

The n-Category Cafe

The Atoms of the Module World

In many branches of mathematics, there is a clear notion of “atomic” or “indivisible” object. Examples are prime numbers, connected spaces, transitive group actions, and ergodic dynamical systems.

But in the world of modules, things aren’t so clear. There are at least two competing notions of “atomic” object: simple modules and, less obviously, projective indecomposable modules. Neither condition implies the other, even when the ring we’re over is a nice one, such as a finite-dimensional algebra over a field.

So it’s a wonderful fact that when we’re over a nice ring, there is a canonical bijection between $\{$isomorphism classes of simple modules$\}$ and $\{$isomorphism classes of projective indecomposable modules$\}$.

Even though neither condition implies the other, modules that are “atoms” in one sense correspond one-to-one with modules that are “atoms” in the other. And the correspondence is defined in a really easy way: a simple module $S$ corresponds to a projective indecomposable module $P$ exactly when $S$ is a quotient of $P$.

This fact is so wonderful that I had to write a short expository note on it (update — now arXived). I’ll explain the best bits here — including how it all depends on one of my favourite things in linear algebra, the eventual image.

It’s clear how the simple modules might be seen as “atomic”. They’re the nonzero modules that have no nontrivial submodules.

But what claim do the projective indecomposables have to be the “atoms” of the module world? Indecomposability, the nonexistence of a nontrivial direct summand, is a weaker condition than simplicity. And what does being projective have to do with it?

The answer comes from the Krull-Schmidt theorem. This says that over a finite enough ring $A$, every finitely generated module is isomorphic to a finite direct sum of indecomposable modules, uniquely up to reordering and isomorphism.

In particular, we can decompose the $A$-module $A$ as a sum $P_1 \oplus \cdots \oplus P_n$ of indecomposables. Now the $A$-module $A$ is projective (being free), and each $P_i$ is a direct summand of $A$, from which it follows that each $P_i$ is projective indecomposable. We’ve therefore decomposed $A$, uniquely up to isomorphism, as a direct sum of projective indecomposables.

But that’s not all. The Krull-Schmidt theorem also implies that every projective indecomposable $A$-module appears on this list $P_1, \ldots, P_n$. That’s not immediately obvious, but you can find a proof in my note, for instance. And in this sense, the projective indecomposables are exactly the “pieces” or “atoms” of $A$.

Here and below, I’m assuming that $A$ is a finite-dimensional algebra over a field. And in case any experts are reading this, I’m using “atomic” in an entirely informal way (hence the quotation marks). Inevitably, someone has given a precise meaning to “atomic module”, but that’s not how I’m using it here.

One of the first things we learn in linear algebra is the rank-nullity formula. This says that for an endomorphism $\theta$ of a finite-dimensional vector space $V$, the dimensions of the image and kernel are complementary:

$$\dim \operatorname{im} \theta + \dim \ker \theta = \dim V.$$

Fitting’s lemma says that when you raise $\theta$ to a high enough power, the image and kernel themselves are complementary:

$$\operatorname{im} \theta^n \oplus \ker \theta^n = V \qquad (n \gg 0).$$

I’ve written about this before, calling <semantics>imθ n<annotation encoding="application/x-tex">im \theta^n</annotation></semantics> the eventual image, <semantics>im θ<annotation encoding="application/x-tex">im^\infty \theta</annotation></semantics>, and calling <semantics>kerθ n<annotation encoding="application/x-tex">ker\theta^n</annotation></semantics> the eventual kernel, <semantics>ker θ<annotation encoding="application/x-tex">ker^\infty \theta</annotation></semantics>, for <semantics>n0<annotation encoding="application/x-tex">n \gg 0</annotation></semantics>. (They don’t change once <semantics>n<annotation encoding="application/x-tex">n</annotation></semantics> gets high enough.) But what I hadn’t realized is that Fitting’s lemma is incredibly useful in the representation theory of finite-dimensional algebras.
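Here is a small numerical illustration of the contrast between rank-nullity and Fitting's lemma. It is only a sketch: the matrix `theta` and the helper names are made up for this example, and it uses the standard fact that the power n = dim V is always high enough.

```python
import numpy as np

def column_space(m, tol=1e-10):
    """Orthonormal basis (as columns) of the image of m, computed via SVD."""
    u, s, _ = np.linalg.svd(m)
    rank = int((s > tol).sum())
    return u[:, :rank]

def null_space(m, tol=1e-10):
    """Orthonormal basis (as columns) of the kernel of m, computed via SVD."""
    _, s, vh = np.linalg.svd(m)
    rank = int((s > tol).sum())
    return vh[rank:].T

def image_and_kernel(theta, power):
    """Bases of im(theta^power) and ker(theta^power)."""
    p = np.linalg.matrix_power(theta, power)
    return column_space(p), null_space(p)

# Hypothetical example: a nilpotent 2x2 Jordan block plus an invertible 1x1 block.
theta = np.array([[0.0, 1.0, 0.0],
                  [0.0, 0.0, 0.0],
                  [0.0, 0.0, 2.0]])
dim_v = theta.shape[0]

# Power 1: the dimensions add up (rank-nullity), but image and kernel overlap.
im1, ker1 = image_and_kernel(theta, 1)
span1 = np.linalg.matrix_rank(np.hstack([im1, ker1]))

# Power dim V: eventual image and eventual kernel together span all of V.
im_inf, ker_inf = image_and_kernel(theta, dim_v)
span_inf = np.linalg.matrix_rank(np.hstack([im_inf, ker_inf]))

print(im1.shape[1] + ker1.shape[1], span1)           # dimensions sum to 3, span is smaller
print(im_inf.shape[1] + ker_inf.shape[1], span_inf)  # dimensions sum to 3, span is all of V
```

For this matrix, the image and kernel of `theta` itself share the first basis vector, so they are not complementary even though their dimensions add up to 3. After raising to the third power the overlap disappears.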

For instance, Fitting’s lemma can be used to show that every projective indecomposable module is finitely generated — and indeed, cyclic (that is, generated as a module by a single element). Simple modules are cyclic too, since the submodule generated by any nonzero element must be the module itself. So, both projective indecomposable and simple modules are “small”, in the sense of being generated by a single element. In other words:

Atoms are small.

Whatever “atom” means, they should certainly be small!

But also, “atoms” shouldn’t have much internal structure. For instance, an atom shouldn’t have enough complexity that it admits lots of interesting endomorphisms. There are always going to be some, namely, multiplication by any scalar, and this means that the endomorphism ring <semantics>End(M)<annotation encoding="application/x-tex">End(M)</annotation></semantics> of a nonzero module <semantics>M<annotation encoding="application/x-tex">M</annotation></semantics> always contains a copy of the ground field <semantics>K<annotation encoding="application/x-tex">K</annotation></semantics>. But it’s a fact that when <semantics>M<annotation encoding="application/x-tex">M</annotation></semantics> is atomic in either of the two senses I’m talking about, <semantics>End(M)<annotation encoding="application/x-tex">End(M)</annotation></semantics> isn’t too much bigger than <semantics>K<annotation encoding="application/x-tex">K</annotation></semantics>.

Let me explain that first for simple modules, since that’s, well, simpler.

A basic fact about simple modules is:

Every endomorphism of a simple module is invertible or zero.

Why? Because the kernel of such an endomorphism is a submodule, so it’s either zero or the whole module. So the endomorphism is either zero or injective. But it’s a linear endomorphism of a finite-dimensional vector space, so “injective” and “surjective” and “invertible” all mean the same thing.

Assume from now on that <semantics>K<annotation encoding="application/x-tex">K</annotation></semantics> is algebraically closed. Let <semantics>S<annotation encoding="application/x-tex">S</annotation></semantics> be a simple module and <semantics>θ<annotation encoding="application/x-tex">\theta</annotation></semantics> an endomorphism of <semantics>S<annotation encoding="application/x-tex">S</annotation></semantics>. Then <semantics>θ<annotation encoding="application/x-tex">\theta</annotation></semantics> has an eigenvalue, <semantics>λ<annotation encoding="application/x-tex">\lambda</annotation></semantics>, say. But then <semantics>(θλid)<annotation encoding="application/x-tex">(\theta - \lambda\cdot id)</annotation></semantics> is not invertible, and must therefore be zero.

What we’ve just shown is that the only endomorphisms of a simple module are the rescalings <semantics>λid<annotation encoding="application/x-tex">\lambda\cdot id</annotation></semantics> (which are always there for any module). So <semantics>End(S)=K<annotation encoding="application/x-tex">End(S) = K</annotation></semantics>:

A simple module has as few endomorphisms as could be.

Now let’s do it for projective indecomposables. Fitting’s lemma can be used to show:

Every endomorphism of an indecomposable finitely generated module is invertible or nilpotent.

That’s easy to see: writing <semantics>M<annotation encoding="application/x-tex">M</annotation></semantics> for the module and <semantics>θ<annotation encoding="application/x-tex">\theta</annotation></semantics> for the endomorphism, we can find <semantics>n1<annotation encoding="application/x-tex">n \geq 1</annotation></semantics> such that <semantics>imθ nkerθ n=M<annotation encoding="application/x-tex">im \theta^n \oplus ker \theta^n = M</annotation></semantics>. Since <semantics>M<annotation encoding="application/x-tex">M</annotation></semantics> is indecomposable, <semantics>imθ n<annotation encoding="application/x-tex">im \theta^n</annotation></semantics> is either <semantics>0<annotation encoding="application/x-tex">0</annotation></semantics>, in which case <semantics>θ<annotation encoding="application/x-tex">\theta</annotation></semantics> is nilpotent, or <semantics>M<annotation encoding="application/x-tex">M</annotation></semantics>, in which case <semantics>θ<annotation encoding="application/x-tex">\theta</annotation></semantics> is surjective and therefore invertible. Done!

I said earlier that (by Fitting’s lemma) every projective indecomposable is finitely generated. So, every endomorphism of a projective indecomposable is invertible or nilpotent.

Let’s try to classify all the endomorphisms of a projective indecomposable module <semantics>P<annotation encoding="application/x-tex">P</annotation></semantics>. We’re hoping there aren’t many.

Exactly the same argument as for simple modules — the one with the eigenvalues — shows that every endomorphism of a projective indecomposable module is of the form <semantics>λid+ε<annotation encoding="application/x-tex">\lambda\cdot id + \varepsilon</annotation></semantics>, where <semantics>λ<annotation encoding="application/x-tex">\lambda</annotation></semantics> is a scalar and <semantics>ε<annotation encoding="application/x-tex">\varepsilon</annotation></semantics> is a nilpotent endomorphism. So if you’re willing to regard nilpotents as negligible (and why else would I have used an <semantics>ε<annotation encoding="application/x-tex">\varepsilon</annotation></semantics>?):

A projective indecomposable module has nearly as few endomorphisms as could be.

(If you want to be more precise about it, <semantics>End(P)<annotation encoding="application/x-tex">End(P)</annotation></semantics> is a local ring with residue field <semantics>K<annotation encoding="application/x-tex">K</annotation></semantics>. All that’s left to prove here is that <semantics>End(P)<annotation encoding="application/x-tex">End(P)</annotation></semantics> is local, or equivalently that for every endomorphism <semantics>θ<annotation encoding="application/x-tex">\theta</annotation></semantics>, either <semantics>θ<annotation encoding="application/x-tex">\theta</annotation></semantics> or <semantics>idθ<annotation encoding="application/x-tex">id - \theta</annotation></semantics> is invertible. We can prove this by contradiction. If neither is invertible, both are nilpotent. But they commute, and the sum of two commuting nilpotents is again nilpotent, so their sum, the identity, would be nilpotent. That’s impossible for a nonzero module.)

So all in all, what this means is that for “atoms” in either of our two senses, there are barely more endomorphisms than the rescalings. More poetically:

Atoms have very little internal structure.

My note covers a few more things than I’ve mentioned here, but I’ll mention just one more. There is, as I’ve said, a canonical bijection between isomorphism classes of projective indecomposable modules and isomorphism classes of simple modules. But how big are these two sets of isomorphism classes?

The answer is that they’re finite. In other words, there are only finitely many “atoms”, in either sense.

Why? Well, I mentioned earlier that as a consequence of the Krull-Schmidt theorem, the <semantics>A<annotation encoding="application/x-tex">A</annotation></semantics>-module <semantics>A<annotation encoding="application/x-tex">A</annotation></semantics> is a finite direct sum <semantics>P 1P n<annotation encoding="application/x-tex">P_1 \oplus \cdots \oplus P_n</annotation></semantics> of projective indecomposables, and that every projective indecomposable appears somewhere on this list (up to iso, of course). So, there are only finitely many projective indecomposables. It follows that there are only finitely many simple modules too.

An alternative argument comes in from the opposite direction. The Jordan-Hölder theorem tells us that the <semantics>A<annotation encoding="application/x-tex">A</annotation></semantics>-module <semantics>A<annotation encoding="application/x-tex">A</annotation></semantics> has a well-defined set-with-multiplicity <semantics>S 1,,S r<annotation encoding="application/x-tex">S_1, \ldots, S_r</annotation></semantics> of composition factors, which are simple modules, and that every simple module appears somewhere on this list. So, there are only finitely many simple modules. It follows that there are only finitely many projective indecomposables too.

by leinster at October 15, 2014 01:31 AM

October 14, 2014

Symmetrybreaking - Fermilab/SLAC

Jokes for nerds

Webcomic artist Zach Weinersmith fuels ‘Saturday Morning Breakfast Cereal’ with grad student humor and almost half of a physics degree.

Zach Weinersmith, creator of popular webcomic “Saturday Morning Breakfast Cereal,” doesn’t know all the things you think he knows—but he’s working on it.

Reading certain SMBC comics, you could be forgiven for assuming Weinersmith (his married name) possesses a deep knowledge of math, biology, psychology, mythology, philosophy, economics or physics—even if that knowledge is used in service of a not-so-academic punch line.

In reality the artist behind the brainy comic simply loves to read. “I think I’m a very slow learner,” Weinersmith says. “I just work twice as hard.”

Around 2007, before SMBC took off, Weinersmith was working in Hollywood, producing closed captioning for television programs. He was taken with a sudden desire to understand how DNA works, so he bought a stack of textbooks and started researching in his spare time.

“Before that, my comic was straight comedy,” he says. He began to inject some of what he was learning into his writing. It was a relief, he found. “It’s much harder to make funny jokes than it is to talk about things.”

That year, SMBC was recognized at the Web Cartoonists’ Choice Awards and became popular enough for Weinersmith to quit his job and write full time. But he started to get bored.

“Imagine being 25 and self-employed,” he says.

What better way to cure boredom than to pursue a degree in physics? He took a few semesters of classes at San Jose State until he realized he was stretching himself too thin.

“I have three-eighths of a physics degree,” he says, which is probably perfect. “If you say three things about a topic, people assume you know the rest of it.

“I really think there’s this sweet spot. Right when I’m learning something, I have all these hilarious ideas. Once you’re a wizened gray-beard, nothing works.”

That hasn’t soured Weinersmith on scholarship. Last year he hosted his first live event, the Festival of Bad Ad Hoc Hypotheses, in which he invites speakers to compete to give the best serious argument for a completely ridiculous idea. It was inspired by a comic arguing the evolutionary benefits of aerodynamic babies.

Weinersmith runs the festival with a panel of judges and his wife, biologist Kelly Weinersmith, whose trials and tribulations in academia inspire much of his writing.

The appeal of BAHFest can be hard to explain, he says. “People see the video [of last year’s event] and say, ‘What the hell is the audience laughing about? That was barely a joke.’”

The key, he says, is to get rid of the jokes entirely. “It’s not stand-up; it’s play-acting,” he says. “Let this thing you’re doing be the joke.”

BAHFest will take place October 19 at MIT in Boston and October 25 at the Castro Theatre in San Francisco.




by Kathryn Jepsen at October 14, 2014 01:00 PM

John Baez - Azimuth

El Niño Project (Part 8)

So far we’ve rather exhaustively studied a paper by Ludescher et al which uses climate networks for El Niño prediction. This time I’d like to compare another paper:

• Y. Berezin, Avi Gozolchiani, O. Guez and Shlomo Havlin, Stability of climate networks with time, Scientific Reports 2 (2012).

Some of the authors are the same, and the way they define climate networks is very similar. But their goal here is different: they want to see how stable climate networks are over time. This is important, since the other paper wants to predict El Niños by changes in climate networks.

They divide the world into 9 zones:

For each zone they construct several climate networks. Each one is an array of numbers W_{l r}^y, one for each year y and each pair of grid points l, r in that zone. They call W_{l r}^y a link strength: it’s a measure of how correlated the weather is at those two grid points during that year.

I’ll say more later about how they compute these link strengths. In Part 3 we explained one method for doing it. This paper uses a similar but subtly different method.

The paper’s first big claim is that W_{l r}^y doesn’t change much from year to year, “in complete contrast” to the pattern of local daily air temperature and pressure fluctuations. In simple terms: the strength of the correlation between weather at two different points tends to be quite stable.

Moreover, the definition of link strength involves an adjustable time delay, \tau. We can measure the correlation between the weather at point l at any given time and point r at a time \tau days later. The link strength is computed by taking a maximum over time delays \tau. Naively speaking, the value of \tau that gives the maximum correlation is “how long it typically takes for weather at point l to affect weather at point r”. Or the other way around, if \tau is negative.

This is a naive way of explaining the idea, because I’m mixing up correlation with causation. But you get the idea, I hope.

Their second big claim is that when the link strength between two points l and r is big, the value of \tau that gives the maximum correlation doesn’t change much from year to year. In simple terms: if the weather at two locations is strongly correlated, the amount of time it takes for weather at one point to reach the other point doesn’t change very much.

The data

How do Berezin et al define their climate network?

They use data obtained from here:

NCEP-DOE Reanalysis 2.

This is not exactly the same data set that Ludescher et al use, namely:

NCEP/NCAR Reanalysis 1.

“Reanalysis 2” is a newer attempt to reanalyze and fix up the same pile of data. That’s a very interesting issue, but never mind that now!

Berezin et al use data for:

• the geopotential height for six different pressures


• the air temperature at those different heights

The geopotential height for some pressure says roughly how high you have to go for air to have that pressure. Click the link if you want a more precise definition! Here’s the geopotential height field for the pressure of 500 millibars on some particular day of some particular year:

The height is in meters.

Berezin et al use daily values for this data for:

• locations world-wide on a grid with a resolution of 5° × 5°,


• the years from 1948 to 2006.

They divide the globe into 9 zones, and separately study each zone:

So, they’ve got twelve different functions of space and time, where space is a rectangle discretized using a 5° × 5° grid, and time is discretized in days. From each such function they build a ‘climate network’.

How do they do it?

The climate networks

Berezin et al’s method of defining a climate network is similar to Ludescher et al’s, but different. Compare Part 3 if you want to think about this.

Let \tilde{S}^y_l(t) be any one of their functions, evaluated at the grid point l on day t of year y.

Let S_l^y(t) be \tilde{S}^y_l(t) minus its climatological average. For example, if t is June 1st and y is 1970, we average the temperature at location l over all June 1sts from 1948 to 2006, and subtract that from \tilde{S}^y_l(t) to get S^y_l(t). In other words:

\displaystyle{  S^y_l(t) = \tilde{S}^y_l(t) - \frac{1}{N} \sum_y \tilde{S}^y_l(t)  }

where N is the number of years considered.
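As a rough sketch of this step, suppose the raw values at one grid point are stored in an array indexed by year and by day of the year. The array layout, the function name and the random test data are assumptions made for illustration, not something taken from Berezin et al.

```python
import numpy as np

def climatological_anomaly(raw):
    """Subtract from each value the average over all years for the same calendar day.

    raw[y, t] is assumed to hold the raw value on day t of year y at one grid point.
    """
    climatology = raw.mean(axis=0, keepdims=True)  # one average per calendar day
    return raw - climatology

# Made-up data: 59 years (1948 to 2006) of 365 daily values.
rng = np.random.default_rng(0)
raw = 280.0 + 10.0 * rng.standard_normal((59, 365))
anomaly = climatological_anomaly(raw)

# By construction, the anomalies for each calendar day average to zero over the years.
largest_residual = np.abs(anomaly.mean(axis=0)).max()
```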

For any function of time f, let \langle f^y(t) \rangle be the average of the function over all days in year y. This is different from the ‘running average’ used by Ludescher et al, and I can’t even be 100% sure that Berezin et al mean what I just said: they use the notation \langle f^y(t) \rangle.

Let l and r be two grid points, and \tau any number of days in the interval [-\tau_{\mathrm{max}}, \tau_{\mathrm{max}}]. Define the cross-covariance function at time t by:

\Big(f_l(t) - \langle f_l(t) \rangle\Big) \; \Big( f_r(t + \tau) - \langle f_r(t + \tau) \rangle \Big)

I believe Berezin mean to consider this quantity, because they mention two grid points l and r. Their notation omits the subscripts l and r so it is impossible to be completely sure what they mean! But what I wrote is the reasonable quantity to consider here, so I’ll assume this is what they meant.

They normalize this quantity and take its absolute value, forming:

\displaystyle{ X_{l r}^y(\tau) = \frac{\Big|\Big(f_l(t) - \langle f_l(t) \rangle\Big) \; \Big( f_r(t + \tau) - \langle f_r(t + \tau) \rangle \Big)\Big|}   {\sqrt{\Big\langle \Big(f_l(t)      - \langle f_l(t)\rangle \Big)^2 \Big\rangle  }  \; \sqrt{\Big\langle \Big(f_r(t+\tau) - \langle f_r(t+\tau)\rangle\Big)^2 \Big\rangle  } }  }

They then take the maximum value of X_{l r}^y(\tau) over delays \tau \in [-\tau_{\mathrm{max}}, \tau_{\mathrm{max}}], subtract its mean over delays in this range, and divide by the standard deviation. They write something like this:

\displaystyle{ W_{l r}^y = \frac{\mathrm{MAX}\Big( X_{l r}^y - \langle X_{l r}^y\rangle \Big) }{\mathrm{STD} X_{l r}^y} }

and say that the maximum, mean and standard deviation are taken over the (not written) variable \tau \in [-\tau_{\mathrm{max}}, \tau_{\mathrm{max}}].

Each number W_{l r}^y is called a link strength. For each year, the matrix of numbers W_{l r}^y where l and r range over all grid points in our zone is called a climate network.
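The recipe for a single link strength can be sketched as follows. This is one possible reading, not necessarily what Berezin et al actually compute: it assumes the product in the numerator is averaged over days, and it handles the shift by \tau by using only the days where both time series are available. The function names and the synthetic data are made up for illustration.

```python
import numpy as np

def lagged_correlation(f_l, f_r, tau):
    """|correlation| between f_l(t) and f_r(t + tau), over the days where both exist."""
    n = len(f_l)
    if tau >= 0:
        a, b = f_l[:n - tau], f_r[tau:]
    else:
        a, b = f_l[-tau:], f_r[:n + tau]
    a = a - a.mean()
    b = b - b.mean()
    # Assumption: the cross-covariance is averaged over days before normalizing.
    return abs((a * b).mean()) / np.sqrt((a ** 2).mean() * (b ** 2).mean())

def link_strength(f_l, f_r, tau_max):
    """Return (W, best_tau): the normalized maximum over delays and the delay attaining it."""
    taus = np.arange(-tau_max, tau_max + 1)
    x = np.array([lagged_correlation(f_l, f_r, int(tau)) for tau in taus])
    w = (x.max() - x.mean()) / x.std()
    return w, int(taus[x.argmax()])

# Synthetic test: the series at r is the series at l delayed by 5 days.
rng = np.random.default_rng(1)
base = rng.standard_normal(370)
f_l = base[5:370]   # f_l(t) = base(t + 5)
f_r = base[0:365]   # f_r(t + 5) = base(t + 5) = f_l(t)
w, best_tau = link_strength(f_l, f_r, tau_max=10)
print(w, best_tau)
```

In this toy example the delay that maximizes the correlation is the built-in 5-day shift. Repeating the computation for every pair of grid points and every year would give the arrays W_{l r}^y.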

We can think of a climate network as a weighted complete graph with the grid points l as nodes. Remember, an undirected graph is one without arrows on the edges. A complete graph is an undirected graph with one edge between any pair of nodes:

A weighted graph is an undirected graph where each edge is labelled by a number called its weight. But right now we’re also calling the weight the ‘link strength’.

A lot of what’s usually called ‘network theory’ is the study of weighted graphs. You can learn about it here:

• Ernesto Estrada, The Structure of Complex Networks: Theory and Applications, Oxford U. Press, Oxford, 2011.

Suffice it to say that given a weighted graph, there are a lot of quantities you can compute from it, which are believed to tell us interesting things!

The conclusions

I will not delve into the real meat of the paper, namely what they actually do with their climate networks! The paper is free online, so you can read this yourself.

I will just quote their conclusions and show you a couple of graphs.

The conclusions touch on an issue that’s important for the network-based approach to El Niño prediction. If climate networks are ‘stable’, not changing much in time, why would we use them to predict a time-dependent phenomenon like the El Niño Southern Oscillation?

We have established the stability of the network of connections between the dynamics of climate variables (e.g. temperatures and geopotential heights) in different geographical regions. This stability stands in fierce contrast to the observed instability of the original climatological field pattern. Thus the coupling between different regions is, to a large extent, constant and predictable. The links in the climate network seem to encapsulate information that is missed in analysis of the original field.

The strength of the physical connection, W_{l r}, that each link in this network represents, changes only between 5% to 30% over time. A clear boundary between links that represent real physical dependence and links that emerge due to noise is shown to exist. The distinction is based on both the high link average strength \overline{W_{l r}} and on the low variability of time delays \mathrm{STD}(T_{l r}).

Recent studies indicate that the strength of the links in the climate network changes during the El Niño Southern Oscillation and the North Atlantic Oscillation cycles. These changes are within the standard deviation of the strength of the links found here. Indeed in Fig. 3 it is clearly seen that the coefficient of variation of links in the El Niño basin (zone 9) is larger than other regions such as zone 1. Note that even in the El Niño basin the coefficient of variation is relatively small (less than 30%).

Beside the stability of single links, also the hierarchy of the link strengths in the climate network is preserved to a large extent. We have shown that this hierarchy is partially due to the two dimensional space in which the network is embedded, and partially due to pure physical coupling processes. Moreover the contribution of each of these effects, and the level of noise was explicitly estimated. The spatial effect is typically around 50% of the observed stability, and the noise reduces the stability value by typically 5%–10%.

The network structure was further shown to be consistent across different altitudes, and a monotonic relation between the altitude distance and the correspondence between the network structures is shown to exist. This yields another indication that the observed network structure represents effects of physical coupling.

The stability of the network and the contributions of different effects were summarized in specific relation to different geographical areas, and a clear distinction between equatorial and off–equatorial areas was observed. Generally, the network structure of equatorial regions is less stable and more fluctuative.

The stability and consistence of the network structure during time and across different altitudes stands in contrast to the known unstable variability of the daily anomalies of climate variables. This contrast indicates an analogy between the behavior of nodes in the climate network and the behavior of coupled chaotic oscillators. While the fluctuations of each coupled oscillators are highly erratic and unpredictable, the interactions between the oscillators is stable and can be predicted. The possible outreach of such an analogy lies in the search for known behavior patterns of coupled chaotic oscillators in the climate system. For example, existence of phase slips in coupled chaotic oscillators is one of the fingerprints for their cooperated behavior, which is evident in each of the individual oscillators. Some abrupt changes in climate variables, for example, might be related to phase slips, and can be understood better in this context.

On the basis of our measured coefficient of variation of single links (around 15%), and the significant overall network stability of 20–40%, one may speculatively assess the extent of climate change. However, for this assessment our current available data is too short and does not include enough time from periods before the temperature trends. An assessment of the relation between the network stability and climate change might be possible mainly through launching of global climate model “experiments” realizing other climate conditions, which we indeed intend to perform.

A further future outreach of our work can be a mapping between network features (such as network motifs) and known physical processes. Such a mapping was previously shown to exist between an autonomous cluster in the climate network and El Niño. Further structures without such a climate interpretation might point towards physical coupling processes which were not observed earlier.

(I have expanded some acronyms and deleted some reference numbers.)

Finally, here are two nice graphs showing the average link strength as a function of distance. The first is based on four climate networks for Zone 1, the southern half of South America:

The second is based on four climate networks for Zone 9, a big patch of the Pacific north of the Equator which roughly corresponds to the ‘El Niño basin’:

As we expect, temperatures and geopotential heights get less correlated at points further away. But the rate at which the correlation drops off conveys interesting information! Graham Jones has made some interesting charts of this for the rectangle of the Pacific that Ludescher et al use for El Niño prediction, and I’ll show you those next time.

The series so far

El Niño project (part 1): basic introduction to El Niño and our project here.

El Niño project (part 2): introduction to the physics of El Niño.

El Niño project (part 3): summary of the work of Ludescher et al.

El Niño project (part 4): how Graham Jones replicated the work by Ludescher et al, using software written in R.

El Niño project (part 5): how to download R and use it to get files of climate data.

El Niño project (part 6): Steve Wenner’s statistical analysis of the work of Ludescher et al.

El Niño project (part 7): the definition of El Niño.

El Niño project (part 8): Berezin et al on the stability of climate networks.

by John Baez at October 14, 2014 12:07 AM

October 13, 2014

John Baez - Azimuth

Network Theory (Part 31)

Last time we came up with a category of labelled graphs and described circuits as ‘cospans’ in this category.

Cospans may sound scary, but they’re not. A cospan is just a diagram consisting of an object with two morphisms going into it:

We can talk about cospans in any category. A cospan is an abstract way of thinking about a ‘chunk of stuff’ \Gamma with two ‘ends’ I and O. It could be any sort of stuff: a set, a graph, an electrical circuit, a network of any kind, or even a piece of matter (in some mathematical theory of matter).

We call the object \Gamma the apex of the cospan and call the morphisms i: I \to \Gamma, o : O \to \Gamma the legs of the cospan. We sometimes call the objects I and O the feet of the cospan. We call I the input and O the output. We say the cospan goes from I to O, though the direction is just a convention: we can flip a cospan and get a cospan going the other way!

If you’re wondering about the name ‘cospan’, it’s because a span is a diagram like this:

Since a ‘span’ is another name for a bridge, and this looks like a bridge from I to O, category theorists called it a span! And category theorists use the prefix ‘co-’ when they turn all the arrows around. Spans came first historically, and we will use those too at times. But now let’s think about how to compose cospans.

Composing cospans is supposed to be like gluing together chunks of stuff by attaching the output of the first to the input of the second. So, we say two cospans are composable if the output of the first equals the input of the second, like this:

We then compose them by forming a new cospan going all the way from X to Z:

The new object \Gamma +_Y \Gamma' and the new morphisms i'', o'' are built using a process called a ‘pushout’ which I’ll explain in a minute. The result is a cospan from X to Z, called the composite of the cospans we started with. Here it is:

So how does a pushout work? It’s a general construction that you can define in any category, though it only exists if the category is somewhat nice. (Ours always will be.) You start with a diagram like this:

and you want to get a commuting diamond like this:

which is in some sense ‘the best’ given the diagram we started with. For example, suppose we’re in the category of sets and Y is a set included in both \Gamma and \Gamma'. Then we’d like A to be the union of \Gamma and \Gamma'. There are other choices of A that would give a commuting diamond, but the union is the best. Something similar is happening when we compose circuits, but instead of the category of sets we’re using the category of labelled graphs we discussed last time.

How do we make precise the idea that A is ‘the best’? We consider any other potential solution to this problem, that is, some other commuting diamond:

Then A is ‘the best’ if there exists a unique morphism q from A to the ‘competitor’ Q making the whole combined diagram commute:

This property is called a universal property: instead of saying that A is the ‘best’, grownups say it is universal.

When A has this universal property we call it the pushout of the original diagram, and we may write it as \Gamma +_Y \Gamma'. Actually we should call the whole diagram

the pushout, or a pushout square, because the morphisms i'', o'' matter too. The universal property is not really a property just of A, but of the whole pushout square. But often we’ll be sloppy and call just the object A the pushout.

Puzzle 1. Suppose we have a diagram in the category of sets

where Y = \Gamma \cap \Gamma' and the maps i, o' are the inclusions of this intersection in the sets \Gamma and \Gamma'. Prove that A = \Gamma \cup \Gamma' is the pushout, or more precisely the diagram

is a pushout square, where i'', o'' are the inclusions of \Gamma and \Gamma' in the union A = \Gamma \cup \Gamma'.

More generally, a pushout in the category of sets is a way of gluing together sets \Gamma and \Gamma' with some ‘overlap’ given by the maps

And this works for labelled graphs, too!
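For finite sets, this gluing can be carried out mechanically: take the disjoint union of \Gamma and \Gamma', then identify the two images of each element of Y. Here is a minimal sketch. The function name, the argument names `f` and `g` for the two maps out of Y, and the representation of the glued set as a set of frozensets are all choices made for this illustration.

```python
def pushout(gamma, gamma_prime, y, f, g):
    """Pushout of gamma <-f- y -g-> gamma_prime in the category of finite sets.

    f and g are dicts sending each element of y into gamma and gamma_prime.
    Returns (apex, to_apex, to_apex_prime): the glued set, whose elements are
    equivalence classes of tagged elements, and the two maps into it.
    """
    # Tag elements so that the two sets become disjoint.
    elements = [(0, a) for a in gamma] + [(1, b) for b in gamma_prime]
    parent = {e: e for e in elements}

    def find(e):
        # Union-find lookup with path halving.
        while parent[e] != e:
            parent[e] = parent[parent[e]]
            e = parent[e]
        return e

    # Glue: identify the image of each point of y in gamma with its image in gamma_prime.
    for point in y:
        root_a, root_b = find((0, f[point])), find((1, g[point]))
        if root_a != root_b:
            parent[root_a] = root_b

    classes = {}
    for e in elements:
        classes.setdefault(find(e), set()).add(e)
    block = {e: frozenset(members) for members in classes.values() for e in members}

    apex = set(block.values())
    to_apex = {a: block[(0, a)] for a in gamma}
    to_apex_prime = {b: block[(1, b)] for b in gamma_prime}
    return apex, to_apex, to_apex_prime

# The situation of Puzzle 1: y is the intersection, f and g are the inclusions.
gamma, gamma_prime = {1, 2, 3}, {3, 4}
y = gamma & gamma_prime
f = {p: p for p in y}
g = {p: p for p in y}
apex, to_apex, to_apex_prime = pushout(gamma, gamma_prime, y, f, g)
print(len(apex), len(gamma | gamma_prime))
```

In this example the glued set has as many elements as the union, as Puzzle 1 predicts, and the two ways of going from Y to the glued set agree. This checks that the square commutes in one case; it does not prove the universal property.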

Puzzle 2. Suppose we have two circuits of resistors that are composable, like this:

and this:

These give cospans in the category L\mathrm{Graph} where

L = (0,\infty)

(Remember from last time that L\mathrm{Graph} is the category of graphs with edges labelled by elements of some set L.) Show that if we compose these cospans we get a cospan corresponding to this circuit:

If you’re a mathematician you might find it easier to solve this kind of problem in general, which requires pondering how pushouts work in L\mathrm{Graph}. Alternatively, you might find it easier to think about this particular example: then you can just check that the answer we want has the desired property of a pushout!

If this stuff seems complicated, well, just know that category theory is a very general, powerful tool and I’m teaching you just the microscopic fragment of it that we need right now. Category theory ultimately seems very simple: I can’t really think of any math that’s simpler! It only seems complicated when it’s unfamiliar and you have a fragmentary view of it.

So where are we? We know that circuits made of resistors are a special case of cospans. We know how to compose cospans. So, we know how to compose circuits… and in the last puzzle, we saw this does just what we want.

The advantage of this rather highbrow approach is that a huge amount is known about composing cospans! In particular, suppose we have any category C where pushouts exist: that is, where we can always complete any diagram like this:

to a pushout square. Then we can form a category \mathrm{Cospan}(C) where:

• an object is an object of C

• a morphism from an object I \in C to an object O \in C is an equivalence class of cospans from I to O:

• we compose cospans in the manner just described.

Why did I say ‘equivalence class’? It’s because the pushout is not usually unique. It’s unique only up to isomorphism. So, composing cospans would be ill-defined unless we work with some kind of equivalence class of cospans.

To be precise, suppose we have two cospans from I to O:

Then a map of cospans from one to the other is a commuting diagram like this:

We say that this is an isomorphism of cospans if f is an isomorphism.

This gives our equivalence relation on cospans! It’s an old famous theorem in category theory—so famous that it’s hard to find a reference for the proof—that whenever C is a category with pushouts, there’s a category \mathrm{Cospan}(C) where:

• an object is an object of C

• a morphism from an object I \in C to an object O \in C is an isomorphism class of cospans from I to O.

• we compose isomorphism classes of cospans by picking representatives, composing them and then taking the isomorphism class.

This takes some work to prove, but it’s true, so this is how we get our category of circuits!

Next time we’ll do something with this category. Namely, we’ll cook up a category of ‘behaviors’. The behavior of a circuit made of resistors just says which currents and potentials its terminals can have. If we put a circuit in a metaphorical ‘black box’ and refuse to peek inside, all we can see is its behavior.

Then we’ll cook up a functor from the category of circuits to the category of behaviors. We’ll call this the ‘black box functor’. Saying that it’s a functor mainly means that

\blacksquare(f g) = \blacksquare(f) \blacksquare(g)

Here f and g are circuits that we can compose, and f g is their composite. The black square is the black box functor, so \blacksquare(fg) is the behavior of the circuit f g. There’s a way to compose behaviors, too, and the equation above says that the behavior of the composite circuit is the composite of their behaviors!

This is very important, because it says we can figure out what a big circuit does if we know what its pieces do. And this is one of the grand themes of network theory: understanding big complicated networks by understanding their pieces. We may not always be able to do this, in practice! But it’s something we’re always concerned with.

by John Baez at October 13, 2014 08:09 PM