Quantum Times Book Reviews

Following Tuesday’s post, here is the second piece I wrote for the latest issue of the Quantum Times. It is a review of two recent popular science books on quantum computing by John Gribbin and Jonathan Dowling. Jonathan Dowling has the now obligatory book author’s blog, which you should also check out.

Book Review

  • Title: Computing With Quantum Cats: From Colossus To Qubits
  • Author: John Gribbin
  • Publisher: Bantam, 2013

  • Title: Schrödinger’s Killer App: Race To Build The World’s First Quantum Computer
  • Author: Jonathan Dowling
  • Publisher: CRC Press, 2013

The task of writing a popular book on quantum computing is a daunting
one. In order to get it right, you need to explain the subtleties of
theoretical computer science, at least to the point of understanding
what makes some problems hard and some easy to tackle on a classical
computer. You then need to explain the subtle distinctions between
classical and quantum physics. Both of these topics could, and indeed
have, filled entire popular books on their own. Gribbin’s strategy is
to divide his book into three sections of roughly equal length, one on
the history of classical computing, one on quantum theory, and one on
quantum computing. The advantage of this is that it makes the book
well paced, as the reader is not introduced to too many new ideas at
the same time. The disadvantage is that there is relatively little
space dedicated to the main topic of the book.

In order to weave the book together into a narrative, Gribbin
dedicates each chapter except the last to an individual prominent
scientist, specifically: Turing, von Neumann, Feynman, Bell and
Deutsch. This works well as it allows him to interleave the science
with biography, making the book more accessible. The first two
sections on classical computing and quantum theory display Gribbin’s
usual adeptness at popular writing. In the quantum section, my usual
pet peeves about things being described as “in two states at the same
time” and undue prominence being given to the many-worlds
interpretation apply, but no more than to any other popular treatment
of quantum theory. The explanations are otherwise very good. I
would, however, quibble with some of the choice of material for the
classical computing section. It seems to me that the story of how we
got from abstract Turing machines to modern day classical computers,
which is the main topic of the von Neumann chapter, is tangential to
the main topic of the book, and Gribbin fails to discuss more relevant
topics such as the circuit model and computational complexity in this
section. Instead these topics are squeezed in very briefly into the
quantum computing section, and Gribbin flubs the description of
computational complexity. For example, see if you can spot the
problems with the following three quotes:

“…problems that can be solved by efficient algorithms belong to a
category that mathematicians call `complexity class P’…”

“Another class of problem, known as NP, are very difficult to
solve…”

“All problems in P are, of course, also in NP.”

The last chapter of Gribbin’s book is a tour of the proposed
experimental implementations of quantum computing and the success
achieved so far. This chapter tries to cover too much material too
quickly and is rather credulous about the prospects of each
technology. Gribbin also persists with the device of including potted
biographies of the main scientists involved. The total effect is like
running at high speed through unfamiliar woods while someone slaps
you in the face rapidly with CVs and scientific papers. I think the
inclusion of such a detailed chapter was a mistake, especially since
it will seem badly out of date in just a year or two. Finally,
Gribbin includes an epilogue about the controversial issue of discord
in non-universal models of quantum computing. This is a bold
inclusion, which will either seem prescient or silly after the debate
has died down. My own preference would have been to focus on
well-established theory.

In summary, Gribbin has written a good popular book on quantum
computing, perhaps the best so far, but it is not yet a great one. It
is not quite the book you should give to your grandmother to explain
what you do. I fear she will unjustly come out of it thinking she is
not smart enough to understand, whereas in fact the failure is one of
unclear explanation in a few areas on the author’s part.

Dowling’s book is a different kettle of fish from Gribbin’s. He
claims to be aiming for the same audience of scientifically curious
lay readers, but I am afraid they will struggle. Dowling covers more
or less everything he is interested in and I think the rapid fire
topic changes would leave the lay reader confused. However, we all
know that popular science books written by physicists are really meant
to be read by other physicists rather than by the lay reader. From
this perspective, there is much valuable material in Dowling’s book.

Dowling is really on form when he is discussing his personal
experience. This mainly occurs in chapters 4 and 5, which are about
the experimental implementation of quantum computing and other quantum
technologies. There is also a lot of material about the internal
machinations of military and intelligence funding agencies, which
Dowling has copious experience of on both sides of the fence. Much of
this material is amusing and will be of value to those interested in
applying for such funding. As you might expect, Dowling’s assessment
of the prospects of the various proposed technologies is much more
accurate and conservative than Gribbin’s. In particular his treatment
of the cautionary tale of NMR quantum computing is masterful and his
assessment of non fully universal quantum computers, such as the D-Wave
One, is insightful. Dowling also gives an excellent account of quantum
technologies beyond quantum computing and cryptography, such as
quantum metrology, which are often neglected in popular treatments.

Chapter 6 is also interesting, although it is a bit of a hodge-podge
of different topics. It starts with a debunking of David Kaiser’s
thesis that the “hippies” of the Fundamental Fysiks group in Berkeley
were instrumental in the development of quantum information via their
involvement in the no-cloning theorem. Dowling rightly points out
that the origins of quantum cryptography are independent of this,
going back to Wiesner in the 1970s, and that the no-cloning theorem
would probably have been discovered as a result of this. This section
is only missing a discussion of the role of Wheeler, since he was
really the person who made it OK for mainstream physicists to think
about the foundations of quantum theory again, and who encouraged his
students and postdocs to do so in information theoretic terms. Later
in the chapter, Dowling moves into extremely speculative territory,
arguing for “the reality of Hilbert space” and discussing what quantum
artificial intelligence might be like. I disagree with about as much
as I agree with in this section, but it is stimulating and
entertaining nonetheless.

You may notice that I have avoided talking about the first few
chapters of the book so far. Unfortunately, I do not have many
positive things to say about them.

The first couple of chapters cover the EPR experiment, Bell’s theorem,
and entanglement. Here, Dowling employs the all too common device of
psychoanalysing Einstein. As usual in such treatments, there is a
thin caricature of Einstein’s actual views followed by a lot of
comments along the lines of “Einstein wouldn’t have liked this” and
“tough luck Einstein”. I personally hate this sort of narrative with
a passion, particularly since Einstein’s response to quantum theory
was perfectly rational at the time he made it and who knows what he
would have made of Bell’s theorem? Worse than this, Dowling’s
treatment perpetuates the common myth that determinism is one of the
assumptions of both the EPR argument and Bell’s theorem. Of course,
CHSH does not assume this, but even EPR and Bell’s original argument
only use it when it can be derived from the quantum predictions.
Thus, there is not the option of “uncertainty” for evading the
consequences of these theorems, as Dowling maintains throughout the
book.

However, the worst feature of these chapters is the poor choice of
analogy. Dowling insists on using a single analogy to cover
everything, that of an analog clock or wristwatch. This analogy is
quite good for explaining classical common cause correlations,
e.g. Alice and Bob’s watches will always be anti-correlated if they
are located in timezones with a six hour time difference, and for
explaining the use of modular arithmetic in Shor’s algorithm.
However, since Dowling has earlier placed such great emphasis on the
interpretation of the watch readings in terms of actual time, it falls
flat when describing entanglement in which we have to imagine that the
hour hand randomly points to an hour that has nothing to do with time.
I think this is confusing and that a more abstract analogy,
e.g. colored balls in boxes, would have been better.

There are also a few places where Dowling makes flatly incorrect
statements. For example, he says that the OR gate does mod 2 addition
and he says that the state |00> + |01> + |10> + |11> is entangled. I
also found Dowling’s criterion for when something should be called an
ENT gate (his terminology for the CNOT gate) confusing. He says that
something is not an ENT gate unless it outputs an entangled state, but
of course this depends on what the input state is. For example, he
says that NMR quantum computers have no ENT gates, whereas I think
they do have them, but they just cannot produce the pure input states
needed to generate entanglement from them.
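
To make these three points concrete, here is a minimal sketch in plain Python. The function names and the convention of listing amplitudes in the order |00>, |01>, |10>, |11> are my own choices for illustration, not anything taken from Dowling's book. It uses the standard fact that a two-qubit pure state with amplitudes a00, a01, a10, a11 is a product state exactly when a00*a11 - a01*a10 = 0.

```python
import math

def or_gate(a, b):
    """Logical OR of two bits."""
    return a | b

def xor_gate(a, b):
    """Addition mod 2 of two bits, i.e. XOR."""
    return (a + b) % 2

def is_entangled(amps, tol=1e-12):
    """amps = [a00, a01, a10, a11]. The state is a product state
    exactly when a00*a11 - a01*a10 vanishes, so it is entangled otherwise."""
    a00, a01, a10, a11 = amps
    return abs(a00 * a11 - a01 * a10) > tol

def cnot(amps):
    """CNOT with the first qubit as control: swaps the |10> and |11> amplitudes."""
    a00, a01, a10, a11 = amps
    return [a00, a01, a11, a10]

s = 1 / math.sqrt(2)
uniform = [0.5, 0.5, 0.5, 0.5]    # (|00> + |01> + |10> + |11>)/2
zero_zero = [1.0, 0.0, 0.0, 0.0]  # |00>
plus_zero = [s, 0.0, s, 0.0]      # (|0> + |1>)|0>/sqrt(2)

# OR and mod 2 addition disagree on the input (1, 1).
print("OR(1,1) =", or_gate(1, 1), " XOR(1,1) =", xor_gate(1, 1))
# The uniform superposition factorizes as (|0>+|1>)(|0>+|1>)/2.
print("uniform entangled?", is_entangled(uniform))
# Whether CNOT outputs an entangled state depends on the input.
print("CNOT|00> entangled?", is_entangled(cnot(zero_zero)))
print("CNOT|+0> entangled?", is_entangled(cnot(plus_zero)))
```

The same gate gives a product state on one input and an entangled state on another, which is why a criterion based on the output alone cannot decide whether a gate is a CNOT.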

The most annoying thing about this book is that it is in dire need of
a good editor. There are many typos and basic fact-checking errors.
For example, John Bell is apparently Scottish and at one point a D-Wave
computer costs a mere $10,000. There is also far too much repetition.
For example, the tale of how funding for classical optical computing
dried up after Conway and Mead instigated VLSI design for silicon
chips, but then the optical technology was reused to build the
internet, is told in reasonable detail at least three different times.
The first time it is an insightful comment, but by the third it is
like listening to an older relative with a limited stock of stories.
There are also whole sections that are so tangentially related to the
main topic that they should have been omitted, such as the long anti
string-theory rant in chapter six.

Dowling has a cute and geeky sense of humor, which comes through well
most of the time, but on occasion the humor gets in the way of clear
exposition. For example, in a rather silly analogy between Shor’s
algorithm and a fruitcake, the following occurs:

“We dive into the molassified rum extract of the classical core of the
Shor algorithm fruitcake and emerge (all sticky) with a theorem proved
in the 1760s…”

If he were a writing student, Dowling would surely get kicked out of
class for that. Finally, unless your name is David Foster Wallace, it
is not a good idea to put things that are essential to following the
plot in the footnotes. If you are not a quantum scientist then it is
unlikely that you know who Charlie Bennett and Dave Wineland are or
what NIST is, but then the quirky names chosen in the first few
chapters will be utterly confusing. They are explained in the main
text, but only much later. Otherwise, you have to hope that the
reader is not the sort of person who ignores footnotes. Overall,
having a sense of humor is a good thing, but there is such a thing as
being too cute.

Despite these criticisms, I would still recommend Dowling’s book to
physicists and other academics with a professional interest in quantum
technology. I think it is a valuable resource on the history of the
subject. I would steer the genuine lay reader more in the direction
of Gribbin’s book, at least until a better option becomes available.

Quantum Times Article about Surveys on the Foundations of Quantum Theory

A new edition of The Quantum Times (newsletter of the APS topical group on Quantum Information) is out and I have two articles in it. I am posting the first one here today and the second, a book review of two recent books on quantum computing by John Gribbin and Jonathan Dowling, will be posted later in the week. As always, I encourage you to download the newsletter itself because it contains interesting articles and announcements other than my own. In particular, I would like to draw your attention to the fact that Ian Durham, current editor of The Quantum Times, is stepping down as editor at some point before the March meeting. If you are interested in getting more involved in the topical group, I would encourage you to put yourself forward. Details can be found at the end of the newsletter.

Upon reformatting my articles for the blog, I realized that I have reached almost Miguel Navascues levels of crankiness. I guess this might be because I had a stomach bug when I was writing them. Today’s article is a criticism of the recent “Snapshots of Foundational Attitudes Toward Quantum Mechanics” surveys that appeared on the arXiv and generated a lot of attention. The article is part of a point-counterpoint, with Nathan Harshman defending the surveys. Here, I am only posting my part in its original version. The newsletter version is slightly edited from this, most significantly in the removal of my carefully constructed title.

Lies, Damned Lies, and Snapshots of Foundational Attitudes Toward Quantum Mechanics

Q1. Which of the following questions is best resolved by taking a straw
poll of physicists attending a conference?

A. How long ago did the big bang happen?

B. What is the correct approach to quantum gravity?

C. Is nature supersymmetric?

D. What is the correct way to understand quantum theory?

E. None of the above.

By definition, a scientific question is one that is best resolved by
rational argument and appeal to empirical evidence.  It does not
matter if definitive evidence is lacking, so long as it is conceivable
that evidence may become available in the future, possibly via
experiments that we have not conceived of yet.  A poll is not a valid
method of resolving a scientific question.  If you answered anything
other than E to the above question then you must think that at least
one of A-D is not a scientific question, and the most likely culprit
is D.  If so, I disagree with you.

It is possible to legitimately disagree on whether a question is
scientific.  Our imaginations cannot conceive of all possible ways,
however indirect, that a question might get resolved.  The lesson from
history is that we are often wrong in declaring questions beyond the
reach of science.  For example, when big bang cosmology was first
introduced, many viewed it as unscientific because it was difficult to
conceive of how its predictions might be verified from our lowly
position here on Earth.  We have since gone from a situation in which
many people thought that the steady state model could not be
definitively refuted, to a big bang consensus with wildly fluctuating
estimates of the age of the universe, and finally to a precision value
of 13.77 +/- 0.059 billion years from the WMAP data.

Traditionally, many physicists separated quantum theory into its
“practical part” and its “interpretation”, with the latter viewed as
more a matter of philosophy than physics.  John Bell refuted this by
showing that conceptual issues have experimental consequences.  The
more recent development of quantum information and computation also
shows the practical value of foundational thinking.  Despite these
developments, the view that “interpretation” is a separate
unscientific subject persists.  Partly this is because we have a
tendency to redraw the boundaries.  “Interpretation” is then a
catch-all term for the issues we cannot resolve, such as whether
Copenhagen, Bohmian mechanics, many-worlds, or something else is the
best way of looking at quantum theory.  However, the lesson of big
bang cosmology cautions against labelling these issues unscientific.
Although interpretations of quantum theory are constructed to yield
the same or similar enough predictions to standard quantum theory,
this need not be the case when we move beyond the experimental regime
that is now accessible.  Each interpretation is based on a different
explanatory framework, and each suggests different ways of modifying
or generalizing the theory.  If we think that quantum theory is not
our final theory then interpretations are relevant in constructing its
successor.  This may happen in quantum gravity, but it may equally
happen at lower energies, since we do not yet have an experimentally
confirmed theory that unifies the other three forces.  The need to
change quantum theory may happen sooner than you expect, and whichever
explanatory framework yields the next theory will then be proven
correct.  It is for this reason that I think question D is scientific.

Regardless of the status of question D, straw polls, such as the three
that recently appeared on the arXiv [1-3], cannot help us to resolve
it, and I find it puzzling that we choose to conduct them for this
question, but not for other controversial issues in physics.  Even
during the decades in which the status of big bang cosmology was
controversial, I know of no attempts to poll cosmologists’ views on
it.  Such a poll would have been viewed as meaningless by those who
thought cosmology was unscientific, and as the wrong way to resolve
the question by those who did think it was scientific.  The same is
true of question D, and the fact that we do nevertheless conduct polls
suggests that the question is not being treated with the same respect
as the others on the list.

Admittedly, polls about controversial scientific questions are
relevant to the sociology of science, and they might be useful to the
beginning graduate student who is more concerned with their career
prospects than following their own rational instincts.  From this
perspective, it would be just as interesting to know what percentage
of physicists think that supersymmetry is on the right track as it is
to know about their views on quantum theory.  However, to answer such
questions, polls need careful design and statistical analysis.  None
of the three polls claims to be scientific and none of them contain
any error analysis.  What then is the point of them?

The three recent polls are based on a set of questions designed by
Schlosshauer, Kofler and Zeilinger, who conducted the first poll at a
conference organized by Zeilinger [1].  The questions go beyond just
asking for a preferred interpretation of quantum theory, but in the
interests of brevity I will focus on this aspect alone.  In the
Schlosshauer et al. poll, Copenhagen comes out top, closely followed
by “information-based/information-theoretical” interpretations.  The
second comes from a conference called “The Philosophy of Quantum
Mechanics” [2].  There was a larger proportion of self-identified
philosophers amongst those surveyed and “I have no preferred
interpretation” came out as the clear winner, not so closely followed
by de Broglie-Bohm theory, which had obtained zero votes in the poll
of Schlosshauer et al.  Copenhagen is in joint third place along with
objective collapse theories.  The third poll comes from “Quantum
theory without observers III” [3], at which de Broglie-Bohm got a
whopping 63% of the votes, not so closely followed by objective
collapse.

What we can conclude from this is that people who went to a meeting
organized by Zeilinger are likely to have views similar to Zeilinger.
People who went to a philosophy conference are less likely to be
committed, but are much more likely to pick a realist interpretation
than those who hang out with Zeilinger.  Finally, people who went to a
meeting that is mainly about de Broglie-Bohm theory, organized by the
world’s most prominent Bohmians, are likely to be Bohmians.  What have
we learned from this that we did not know already?

One thing I find especially amusing about these polls is how easy it
would have been to obtain a more representative sample of physicists’
views.  It is straightforward to post a survey on the internet for
free.  Then all you have to do is write a letter to Physics Today
asking people to complete the survey and send the URL to a bunch of
mailing lists.  The sample so obtained would still be self-selecting
to some degree, but much less so than at a conference dedicated to
some particular approach to quantum theory.  The sample would also be
larger by at least an order of magnitude.  The ease with which this
could be done only illustrates the extent to which these surveys
should not even be taken semi-seriously.
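
To give a rough sense of what an order of magnitude in sample size buys, here is a minimal sketch of the sampling error on a poll percentage. The sample sizes of 30 and 300 are hypothetical values chosen for illustration, not the actual numbers of respondents in any of the three polls, and the normal approximation assumes a simple random sample, so it says nothing about the self-selection problem discussed above.

```python
import math

def proportion_standard_error(p, n):
    """Normal-approximation standard error of a sample proportion p from n responses."""
    return math.sqrt(p * (1 - p) / n)

def approx_95_interval(p, n):
    """Approximate 95% interval for the true proportion, clipped to [0, 1]."""
    half_width = 1.96 * proportion_standard_error(p, n)
    return max(0.0, p - half_width), min(1.0, p + half_width)

# Hypothetical sample sizes: a conference-sized poll versus one ten times larger.
for n in (30, 300):
    low, high = approx_95_interval(0.63, n)
    print(f"n = {n}: a 63% result is compatible with roughly {low:.0%} to {high:.0%}")
```

The half-width shrinks like one over the square root of the sample size, so ten times as many respondents narrows the interval by a factor of about three.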

I could go on about the bad design of the survey questions and about
how the error bars would be huge if you actually bothered to calculate
them.  It is amusing how willing scientists are to abandon the
scientific method when they address questions outside their own field.
However, I think I have taken up enough of your time already.  It is
time we recognized these surveys for the nonsense that they are.

References

[1] M. Schlosshauer, J. Kofler and A. Zeilinger, A Snapshot of
Foundational Attitudes Toward Quantum Mechanics, arXiv:1301.1069
(2013).

[2] C. Sommer, Another Survey of Foundational Attitudes Towards
Quantum Mechanics, arXiv:1303.2719 (2013).

[3] T. Norsen and S. Nelson, Yet Another Snapshot of Foundational
Attitudes Toward Quantum Mechanics, arXiv:1306.4646 (2013).

FQXi Essay Contest

I wrote an essay for the FQXi essay contest.  This year’s theme is “It from bit or bit from it?” and I decided to write about the extent to which Wheeler’s “it from bit” helps us to understand the origin of quantum probabilities from a subjective Bayesian point of view.   You can go here to read and rate the essay and it would be especially great if any fellow FQXi members would do that.

Quantum Times Article on the PBR Theorem

I recently wrote an article (pdf) for The Quantum Times (Newsletter of the APS Topical Group on Quantum Information) about the PBR theorem. There is some overlap with my previous blog post, but the newsletter article focuses more on the implications of the PBR result, rather than the result itself. Therefore, I thought it would be worth reproducing it here. Quantum types should still download the original newsletter, as it contains many other interesting things, including an article by Charlie Bennett on logical depth (which he has also reproduced over at The Quantum Pontiff). APS members should also join the TGQI, and if you are at the March meeting this week, you should check out some of the interesting sessions they have organized.

Note: Due to the appearance of this paper, I would weaken some of the statements in this article if I were writing it again. The results of the paper imply that the factorization assumption is essential to obtain the PBR result, so this is an additional assumption that needs to be made if you want to prove things like Bell’s theorem directly from psi-ontology rather than using the traditional approach. When I wrote the article, I was optimistic that a proof of the PBR theorem that does not require factorization could be found, in which case teaching PBR first and then deriving other results like Bell as a consequence would have been an attractive pedagogical option. However, due to the necessity for stronger assumptions, I no longer think this.

OK, without further ado, here is the article.

PBR, EPR, and all that jazz

In the past couple of months, the quantum foundations world has been abuzz about a new preprint entitled “The Quantum State Cannot be Interpreted Statistically” by Matt Pusey, Jon Barrett and Terry Rudolph (henceforth known as PBR). Since I wrote a blog post explaining the result, I have been inundated with more correspondence from scientists and more requests for comment from science journalists than at any other point in my career. Reaction to the result amongst quantum researchers has been mixed, with many people reacting negatively to the title, which can be misinterpreted as an attack on the Born rule. Others have managed to read past the title, but are still unsure whether to credit the result with any fundamental significance. In this article, I would like to explain why I think that the PBR result is the most significant constraint on hidden variable theories that has been proved to date. It provides a simple proof of many other known theorems, and it supercharges the EPR argument, converting it into a rigorous proof of nonlocality that has the same status as Bell’s theorem. Before getting to this though, we need to understand the PBR result itself.

What are Quantum States?

One of the most debated issues in the foundations of quantum theory is the status of the quantum state. On the ontic view, quantum states represent a real property of quantum systems, somewhat akin to a physical field, albeit one with extremely bizarre properties like entanglement. The alternative to this is the epistemic view, which sees quantum states as states of knowledge, more akin to the probability distributions of statistical mechanics. A psi-ontologist (as supporters of the ontic view have been dubbed by Chris Granade) might point to the phenomenon of interference in support of their view, and also to the fact that pretty much all viable realist interpretations of quantum theory, such as many-worlds or Bohmian mechanics, include an ontic state. The key argument in favor of the epistemic view is that it dissolves the measurement problem, since the fact that states undergo a discontinuous change in the light of measurement results does not then imply the existence of any real physical process. Instead, the collapse of the wavefunction is more akin to the way that classical probability distributions get updated by Bayesian conditioning in the light of new data.

Many people who advocate a psi-epistemic view also adopt an anti-realist or neo-Copenhagen point of view on quantum theory in which the quantum state does not represent knowledge about some underlying reality, but rather it only represents knowledge about the consequences of measurements that we might make on the system. However, there remained the nagging question of whether it is possible in principle to construct a realist interpretation of quantum theory that is also psi-epistemic, or whether the realist is compelled to think that quantum states are real. PBR have answered this question in the negative, at least within the standard framework for hidden variable theories that we use for other no go results such as Bell’s theorem. As with Bell’s theorem, there are loopholes, so it is better to say that PBR have placed a strong constraint on realist psi-epistemic interpretations, rather than ruling them out entirely.

The PBR Result

To properly formulate the result, we need to know a bit about how quantum states are represented in a hidden variable theory. In such a theory, quantum systems are assumed to have real pre-existing properties that are responsible for determining what happens when we make a measurement. A full specification of these properties is what we mean by an ontic state of the system. In general, we don’t have precise control over the ontic state so a quantum state corresponds to a probability distribution over the ontic states. This framework is illustrated below.

Figure: Representation of a quantum state in an ontic model. In an ontic model, a quantum state (indicated heuristically on the left as a vector in the Bloch sphere) is represented by a probability distribution over ontic states, as indicated on the right.

A hidden variable theory is psi-ontic if knowing the ontic state of the system allows you to determine the (pure) quantum state that was prepared uniquely. Equivalently, the probability distributions corresponding to two distinct pure states do not overlap. This is illustrated below.

Figure: Representation of a pair of quantum states in a psi-ontic model.

A hidden variable theory is psi-epistemic if it is not psi-ontic, i.e. there must exist an ontic state that is possible for more than one pure state, or, in other words, there must exist two nonorthogonal pure states with corresponding distributions that overlap. This is illustrated below.

Figure: Representation of nonorthogonal states in a psi-epistemic model.

These definitions of psi-ontology and psi-epistemicism may seem a little abstract, so a classical analogy may be helpful. In Newtonian mechanics the ontic state of a particle is a point in phase space, i.e. a specification of its position and momentum. Other ontic properties of the particle, such as its energy, are given by functions of the phase space point, i.e. they are uniquely determined by the ontic state. Likewise, in a hidden variable theory, anything that is a unique function of the ontic state should be regarded as an ontic property of the system, and this applies to the quantum state in a psi-ontic model. The definition of a psi-epistemic model as the negation of this is very weak, e.g. it could still be the case that most ontic states are only possible in one quantum state and just a few are compatible with more than one. Nonetheless, even this very weak notion is ruled out by PBR.
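
The two definitions can also be written as a simple check on supports. The sketch below is a toy only: it assumes a finite set of ontic states labelled by integers and two made-up "quantum states" called psi and phi, so it illustrates the definitions rather than any model that reproduces quantum predictions. All names are hypothetical.

```python
from itertools import combinations

def support(dist):
    """Set of ontic states that a distribution assigns nonzero probability."""
    return {lam for lam, prob in dist.items() if prob > 0}

def is_psi_ontic(model):
    """model maps a label for each pure state to its distribution over ontic states.
    The model is psi-ontic if distinct states have non-overlapping distributions."""
    return all(
        support(model[a]).isdisjoint(support(model[b]))
        for a, b in combinations(model, 2)
    )

# Disjoint supports: the ontic state tells you which quantum state was prepared.
psi_ontic_toy = {"psi": {0: 0.5, 1: 0.5}, "phi": {2: 1.0}}

# Ontic state 1 is possible under both preparations, so the model is psi-epistemic,
# even though most of the probability mass does not overlap.
psi_epistemic_toy = {"psi": {0: 0.5, 1: 0.5}, "phi": {1: 0.1, 2: 0.9}}

print(is_psi_ontic(psi_ontic_toy))
print(is_psi_ontic(psi_epistemic_toy))
```

The second toy shows how weak the psi-epistemic condition is: a single shared ontic state with small probability is enough to satisfy it.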

The proof of the PBR result is quite simple, but I will not review it here because it is summarized in my blog post and the original paper is also very readable. Instead, I want to focus on its implications.

Size of the Ontic State Space

A trivial consequence of the PBR result is that the cardinality of the ontic state space of any hidden variable theory, even for just a qubit, must be infinite, in fact continuously so. This is because there must be at least one ontic state for each quantum state, and there is a continuous infinity of the latter. The fact that there must be infinitely many ontic states was previously proved by Lucien Hardy under the name “Ontological Excess Baggage theorem”, but we can now view it as a corollary of PBR. If you think about it, this property is quite surprising because we can only extract one or two bits from a qubit (depending on whether we count superdense coding) so it would be natural to assume that a hidden variable state could be specified by a finite amount of information.

Hidden variable theories provide one possible method of simulating a quantum computer on a classical computer by simply tracking the value of the ontic state at each stage in the computation. This enables us to sample from the probability distribution of any quantum measurement at any point during the computation. Another method is to simply store a representation of the quantum state at each point in time. This second method is clearly inefficient, as the number of parameters required to specify a quantum state grows exponentially with the number of qubits. The PBR theorem tells us that the hidden variable method cannot be any better, as it requires an ontic state space that is at least as big as the set of quantum states. This conclusion was previously drawn by Alberto Montina using different methods, but again it now becomes a corollary of PBR. This result falls short of saying that any classical simulation of a quantum computer must have exponential space complexity, since we usually only have to simulate the outcome of one fixed measurement at the end of the computation and our simulation does not have to track the slice-by-slice causal evolution of the quantum circuit. Indeed, pretty much the first nontrivial result in quantum computational complexity theory, proved by Bernstein and Vazirani, showed that quantum circuits can be simulated with polynomial memory resources. Nevertheless, this result does reaffirm that we need to go beyond slice-by-slice simulations of quantum circuits in looking for efficient classical algorithms.
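
To put numbers on the exponential growth mentioned above, here is a back-of-the-envelope sketch of the memory needed to store a full state vector. It assumes one complex amplitude per computational basis state, stored as two 64-bit floats (16 bytes); the function name and that storage format are my assumptions for illustration.

```python
def statevector_bytes(n_qubits, bytes_per_amplitude=16):
    """Memory needed to store all 2**n_qubits complex amplitudes of a pure state."""
    return (2 ** n_qubits) * bytes_per_amplitude

for n in (10, 30, 50):
    print(f"{n} qubits: {statevector_bytes(n):,} bytes")
```

Under these assumptions 10 qubits fit in 16 KiB, 30 qubits need 16 GiB, and 50 qubits need 16 PiB, since each extra qubit doubles the requirement.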

Supercharged EPR Argument

As emphasized by Harrigan and Spekkens, a variant of the EPR argument favoured by Einstein shows that any psi-ontic hidden variable theory must be nonlocal. Thus, prior to Bell’s theorem, the only open possibility for a local hidden variable theory was a psi-epistemic theory. Of course, Bell’s theorem rules out all local hidden variable theories, regardless of the status of the quantum state within them. Nevertheless, the PBR result now gives an arguably simpler route to the same conclusion by ruling out psi-epistemic theories, allowing us to infer nonlocality directly from EPR.

A sketch of the argument runs as follows. Consider a pair of qubits in the singlet state. When one of the qubits is measured in an orthonormal basis, the other qubit collapses to one of two orthogonal pure states. By varying the basis that the first qubit is measured in, the second qubit can be made to collapse in any basis we like (a phenomenon that Schrödinger called “steering”). If we restrict attention to two possible choices of measurement basis, then there are four possible pure states that the second qubit might end up in. The PBR result implies that the sets of possible ontic states for the second system for each of these pure states must be disjoint. Consequently, the sets of possible ontic states corresponding to the two distinct choices of basis are also disjoint. Thus, the ontic state of the second system must depend on the choice of measurement made on the first system and this implies nonlocality because I can decide which measurement to perform on the first system at spacelike separation from the second.
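
To make the steering step concrete, take the two measurement bases to be \(\{\Ket{0},\Ket{1}\}\) and \(\{\Ket{+},\Ket{-}\}\), where \(\Ket{\pm} = \frac{1}{\sqrt{2}}\left(\Ket{0} \pm \Ket{1}\right)\). This particular pair of bases is an illustrative assumption; the argument does not single it out. The singlet state can be expanded in either basis:

\[\Ket{\Psi^-} = \frac{1}{\sqrt{2}} \left ( \Ket{0}\otimes\Ket{1} - \Ket{1}\otimes\Ket{0} \right ) = \frac{1}{\sqrt{2}} \left ( \Ket{-}\otimes\Ket{+} - \Ket{+}\otimes\Ket{-} \right ).\]

Reading off the terms, a measurement of the first qubit in the \(\{\Ket{0},\Ket{1}\}\) basis leaves the second qubit in \(\Ket{1}\) or \(\Ket{0}\), whereas a measurement in the \(\{\Ket{+},\Ket{-}\}\) basis leaves it in \(\Ket{-}\) or \(\Ket{+}\). These are the four pure states whose sets of ontic states must be pairwise disjoint in this example.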

PBR as a proto-theorem

We have seen that the PBR result can be used to establish some known constraints on hidden variable theories in a very straightforward way. There is more to this story than I can possibly fit into this article, and I suspect that every major no-go result for hidden variable theories may fall under the rubric of PBR. Thus, even if you don’t care a fig about fancy distinctions between ontic and epistemic states, it is still worth devoting a few braincells to the PBR result. I predict that it will become viewed as the basic result about hidden variable theories, and that we will end up teaching it to our students even before such stalwarts as Bell’s theorem and Kochen-Specker.

Further Reading

For further details of the PBR theorem see:

For constraints on the size of the ontic state space see:

For the early quantum computational complexity results see:

For a fully rigorous version of the PBR+EPR nonlocality argument see:

Can the quantum state be interpreted statistically?

A new preprint entitled The Quantum State Cannot be Interpreted Statistically by Pusey, Barrett and Rudolph (henceforth known as PBR) has been generating a significant amount of buzz in the last couple of days. Nature posted an article about it on their website, Scott Aaronson and Lubos Motl blogged about it, and I have been seeing a lot of commentary about it on Twitter and Google+. In this post, I am going to explain the background to this theorem and outline exactly what it entails for the interpretation of the quantum state. I am not going to explain the technicalities in great detail, since these are explained very clearly in the paper itself. The main aim is to clear up misconceptions.

First up, I would like to say that I find the use of the word “Statistically” in the title to be a rather unfortunate choice. It is liable to make people think that the authors are arguing against the Born rule (Lubos Motl has fallen into this trap in particular), whereas in fact the opposite is true. The result is all about reproducing the Born rule within a realist theory. The question is whether a scientific realist can interpret the quantum state as an epistemic state (state of knowledge) or whether it must be an ontic state (state of reality). The theorem seems to show that only the ontic interpretation is viable, but, in my view, this is a bit too quick. On careful analysis, it does not really rule out any of the positions that are advocated by contemporary researchers in quantum foundations. However, it does answer an important question that was previously open, and confirms an intuition that many of us already held. Before going into more detail, I also want to say that I regard this as the most important result in quantum foundations in the past couple of years, well deserving of a good amount of hype if anything is. I am not sure I would go as far as Antony Valentini, who is quoted in the Nature article saying that it is the most important result since Bell’s theorem, or David Wallace, who says that it is the most significant result he has seen in his career. Of course, these two are likely to be very happy about the result, since they already subscribe to interpretations of quantum theory in which the quantum state is ontic (de Broglie-Bohm theory and many-worlds respectively) and perhaps they believe that it poses more of a dilemma for epistemicists like myself than it actually does.

Classical Ontic States

Before explaining the result itself, it is important to be clear on what this epistemic/ontic state business is all about and why it matters. It is easiest to introduce the distinction via a classical example, for which the interpretation of states is clear. Therefore, consider the Newtonian dynamics of a single point particle in one dimension. The trajectory of the particle can be determined by specifying initial conditions, which in this case consist of a position \(x(t_0)\) and momentum \(p(t_0)\) at some initial time \(t_0\). These specify a point in the particle’s phase space, which consists of all possible pairs \((x,p)\) of positions and momenta.

Classical Ontic State

The ontic state space for a single classical particle, with the initial ontic state marked.

Then, assuming we know all the relevant forces, we can compute the position and momentum \((x(t),p(t))\) at some other time \(t\) using Newton’s laws or, equivalently, Hamilton’s equations. At any time \(t\), the phase space point \((x(t),p(t))\) can be thought of as the instantaneous state of the particle. It is clearly an ontic state (state of reality), since the particle either does or does not possess that particular position and momentum, independently of whether we know that it possesses those values[1]. The same goes for more complicated systems, such as multiparticle systems and fields. In all cases, I can derive a phase space consisting of configurations and generalized momenta. This is the space of ontic states for any classical system.

Classical Epistemic States

Although the description of classical mechanics in terms of ontic phase space trajectories is clear and unambiguous, we are often, indeed usually, more interested in tracking what we know about a system. For example, in statistical mechanics, we may only know some macroscopic properties of a large collection of systems, such as pressure or temperature. We are interested in how these quantities change over time, and there are many different possible microscopic trajectories that are compatible with this. Generally speaking, our knowledge about a classical system is determined by assigning a probability distribution over phase space, which represents our uncertainty about the actual point occupied by the system.

A classical epistemic state

An epistemic state of a single classical particle. The ellipses represent contour lines of constant probability.

We can track how this probability distribution changes using Liouville’s equation, which is derived by applying Hamilton’s equations weighted with the probability assigned to each phase space point. The probability distribution is pretty clearly an epistemic state. The actual system only occupies one phase space point and does not care what probability we have assigned to it. Crucially, the ontic state occupied by the system would be regarded as possible by us in more than one probability distribution, in fact it is compatible with infinitely many.

Overlapping epistemic states

Epistemic states can overlap, so each ontic state is possible in more than one epistemic state. In this diagram, the two phase space axes have been schematically compressed into one, so that we can sketch the probability density graphs of epistemic states. The ontic state marked with a cross is possible in both epistemic states sketched on the graph.

Quantum States

We have seen that there are two clear notions of state in classical mechanics: ontic states (phase space points) and epistemic states (probability distributions over the ontic states). In quantum theory, we have a different notion of state — the wavefunction — and the question is: should we think of it as an ontic state (more like a phase space point), an epistemic state (more like a probability distribution), or something else entirely?

Here are three possible answers to this question:

  1. Wavefunctions are epistemic and there is some underlying ontic state. Quantum mechanics is the statistical theory of these ontic states in analogy with Liouville mechanics.
  2. Wavefunctions are epistemic, but there is no deeper underlying reality.
  3. Wavefunctions are ontic (there may also be additional ontic degrees of freedom, which is an important distinction but not relevant to the present discussion).

I will call options 1 and 2 psi-epistemic and option 3 psi-ontic. Advocates of option 3 are called psi-ontologists, in an intentional pun coined by Chris Granade. Options 1 and 3 share a conviction of scientific realism, which is the idea that there must be some description of what is going on in reality that is independent of our knowledge of it. Option 2 is broadly anti-realist, although there can be some subtleties here[2].

The theorem in the paper attempts to rule out option 1, which would mean that scientific realists should become psi-ontologists. I am pretty sure that no theorem on Earth could rule out option 2, so that is always a refuge for psi-epistemicists, at least if their psi-epistemic conviction is stronger than their realist one.

I would classify the Copenhagen interpretation, as represented by Niels Bohr[3], under option 2. One of his famous quotes is:

There is no quantum world. There is only an abstract physical description. It is wrong to think that the task of physics is to find out how nature is. Physics concerns what we can say about nature…[4]

and “what we can say” certainly seems to imply that we are talking about our knowledge of reality rather than reality itself. Various contemporary neo-Copenhagen approaches also fall under this option, e.g. the Quantum Bayesianism of Carlton Caves, Chris Fuchs and Ruediger Schack; Anton Zeilinger’s idea that quantum physics is only about information; and the view presently advocated by the philosopher Jeff Bub. These views are safe from refutation by the PBR theorem, although one may debate whether they are desirable on other grounds, e.g. the accusation of instrumentalism.

Pretty much all of the well-developed interpretations that take a realist stance fall under option 3, so they are in the psi-ontic camp. This includes the Everett/many-worlds interpretation, de Broglie-Bohm theory, and spontaneous collapse models. Advocates of these approaches are likely to rejoice at the PBR result, as it apparently rules out their only realist competition, and they are unlikely to regard anti-realist approaches as viable.

Perhaps the best known contemporary advocate of option 1 is Rob Spekkens, but I also include myself and Terry Rudolph (one of the authors of the paper) in this camp. Rob gives a fairly convincing argument that option 1 characterizes Einstein’s views in this paper, which also gives a lot of technical background on the distinction between options 1 and 2.

Why be a psi-epistemicist?

Why should the epistemic view of the quantum state be taken seriously in the first place, at least seriously enough to prove a theorem about it? The most naive argument is that, generically, quantum states only predict probabilities for observables rather than definite values. In this sense, they are unlike classical phase space points, which determine the values of all observables uniquely. However, this argument is not compelling because determinism is not the real issue here. We can allow there to be some genuine stochasticity in nature whilst still maintaining realism.

An argument that I personally find motivating is that quantum theory can be viewed as a noncommutative generalization of classical probability theory, as was first pointed out by von Neumann. My own exposition of this idea is contained in this paper. Even if we don’t always realize it, we are always using this idea whenever we generalize a result from classical to quantum information theory. The idea is so useful, i.e. it has such great explanatory power, that it would be very puzzling if it were a mere accident, but it does appear to be just an accident in most psi-ontic interpretations of quantum theory.  For example, try to think about why quantum theory should be formally a generalization of probability theory from a many-worlds point of view.  Nevertheless, this argument may not be compelling to everyone, since it mainly entails that mixed states have to be epistemic. Classically, the pure states are the extremal probability distributions, i.e. they are just delta functions on a single ontic state. Thus, they are in one-to-one correspondence with the ontic states. The same could be true of pure quantum states without ruining the analogy[5].

A more convincing argument concerns the instantaneous change that occurs after a measurement — the collapse of the wavefunction. When we acquire new information about a classical epistemic state (probability distribution) say by measuring the position of a particle, it also undergoes an instantaneous change. All the weight we assigned to phase space points that have positions that differ from the measured value is rescaled to zero and the rest of the probability distribution is renormalized. This is just Bayesian conditioning. It represents a change in our knowledge about the system, but no change to the system itself. It is still occupying the same phase space point as it was before, so there is no change to the ontic state of the system. If the quantum state is epistemic, then instantaneous changes upon measurement are unproblematic, having a similar status to Bayesian conditioning. Therefore, the measurement problem is completely dissolved within this approach.
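
The classical conditioning step described above can be sketched on a toy discretized phase space. The four-point grid, the prior weights and the function name are all hypothetical, chosen only to illustrate the rescale-and-renormalize rule.

```python
from fractions import Fraction


def condition_on_position(dist, measured_x):
    """Bayesian conditioning of a distribution over points (x, p) on a position reading.

    Points whose position differs from the measured value get weight zero
    (they are dropped), and the remaining weights are renormalized.
    """
    kept = {pt: w for pt, w in dist.items() if pt[0] == measured_x}
    total = sum(kept.values())
    if total == 0:
        raise ValueError("measured position has zero prior probability")
    return {pt: w / total for pt, w in kept.items()}


# Hypothetical prior over a 2 x 2 grid of phase space points (x, p).
prior = {
    (0, 0): Fraction(1, 2),
    (0, 1): Fraction(1, 4),
    (1, 0): Fraction(1, 8),
    (1, 1): Fraction(1, 8),
}
ontic_state = (0, 1)  # the point the particle actually occupies

posterior = condition_on_position(prior, measured_x=ontic_state[0])
print(posterior)    # the epistemic state changes discontinuously
print(ontic_state)  # the ontic state is never touched by the update
```

Note that `ontic_state` is only read, never modified: the update acts on the distribution alone, which is the sense in which collapse is unproblematic for an epistemic state.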

Finally, if we allow a more sophisticated analogy between quantum states and probabilities, in particular by allowing constraints on how much may be known and allowing measurements to locally disturb the ontic state, then we can qualitatively explain a large number of phenomena that are puzzling for a psi-ontologist very simply within a psi-epistemic approach. These include: teleportation, superdense coding, and much of the rest of quantum information theory. Crucially, it also includes interference, which is often held as a convincing reason for psi-ontology. This was demonstrated in a very convincing way by Rob Spekkens via a toy theory, which is recommended reading for all those interested in quantum foundations. In fact, since this paper contains the most compelling reasons for being a psi-epistemicist, you should definitely make sure you read it so that you can be more shocked by the PBR result.

Ontic models

If we accept that the psi-epistemic position is reasonable, then it would be superficially reasonable to pick option 1 and try to maintain scientific realism. This leads us into the realm of ontic models for quantum theory, otherwise known as hidden variable theories[6]. A pretty standard framework for discussing such models has existed since John Bell’s work in the 1960s, and almost everyone adopts the same definitions that were laid down then. The basic idea is that systems have properties. There is some space \(\Lambda\) of ontic states, analogous to the phase space of a classical theory, and the system has a value \(\lambda \in \Lambda\) that specifies all its properties, analogous to the phase space points. When we prepare a system in some quantum state \(\Ket{\psi}\) in the lab, what is really happening is that an ontic state \(\lambda\) is sampled from a probability distribution \(\mu(\lambda)\) over \(\Lambda\) that depends on \(\Ket{\psi}\).

Representation of a quantum state in an ontic model

In an ontic model, a quantum state (indicated heuristically on the left as a vector in the Bloch sphere) is represented by a probability distribution over ontic states, as indicated on the right.

We also need to know how to represent measurements in the model[7]. For each possible measurement that we could make on the system, the model must specify the outcome probabilities for each possible ontic state. Note that we are not assuming determinism here. The measurement is allowed to be stochastic even given a full specification of the ontic state. Thus, for each measurement \(M\), we need a set of functions \(\xi^M_k(\lambda)\), where \(k\) labels the outcome. \(\xi^M_k(\lambda)\) is the probability of obtaining outcome \(k\) in a measurement of \(M\) when the ontic state is \(\lambda\). In order for these probabilities to be well defined, the functions \(\xi^M_k\) must be nonnegative and they must satisfy \(\sum_k \xi^M_k(\lambda) = 1\) for all \(\lambda \in \Lambda\). This normalization condition is very important in the proof of the PBR theorem, so please memorize it now.

Overall, the probability of obtaining outcome \(k\) in a measurement of \(M\) when the system is prepared in state \(\Ket{\psi}\) is given by

\[\mbox{Prob}(k|M,\Ket{\psi}) = \int_{\Lambda} \xi^M_k(\lambda) \mu(\lambda) d\lambda, \]
which is just the average of the outcome probabilities over the ontic state space.

If the model is going to reproduce the predictions of quantum theory, then these probabilities must match the Born rule.  Suppose that the \(k\)th outcome of \(M\) corresponds to the projector \(P_k\).  Then, this condition boils down to

\[\Bra{\psi} P_k \Ket{\psi} = \int_{\Lambda} \xi^M_k(\lambda) \mu(\lambda) d\lambda,\]

and this must hold for all quantum states, and all outcomes of all possible measurements.
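
As a sanity check that this condition can be met at all, here is a sketch of about the simplest model that satisfies it, in which the ontic state is just the quantum state itself. The specific choices of \(\Lambda\), \(\mu_\psi\) and \(\xi^M_k\) below are an illustrative assumption, not something taken from the PBR paper, and the subscript on \(\mu_\psi\) is only there to record which quantum state was prepared:

\[\Lambda = \{\mbox{pure states } \Ket{\lambda}\}, \qquad \mu_\psi(\lambda) = \delta(\lambda - \psi), \qquad \xi^M_k(\lambda) = \Bra{\lambda} P_k \Ket{\lambda}.\]

The response functions are nonnegative and normalized, since \(\sum_k \xi^M_k(\lambda) = \Bra{\lambda} \left ( \sum_k P_k \right ) \Ket{\lambda} = 1\), and the Born rule follows immediately:

\[\int_{\Lambda} \xi^M_k(\lambda) \mu_\psi(\lambda) d\lambda = \Bra{\psi} P_k \Ket{\psi}.\]

In this model, distinct quantum states are represented by delta functions at distinct points, so no ontic state is compatible with more than one quantum state.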

Constraints on Ontic Models

Even disregarding the PBR paper, we already know that ontic models expressible in this framework have to have a number of undesirable properties. Bell’s theorem implies that they have to be nonlocal, which is not great if we want to maintain Lorentz invariance, and the Kochen-Specker theorem implies that they have to be contextual. Further, Lucien Hardy’s ontological excess baggage theorem shows that the ontic state space for even a qubit would have to have infinite cardinality. Following this, Montina proved a series of results, which culminated in the claim that there would have to be an object satisfying the Schrödinger equation present within the ontic state (see this paper). This latter result is close to the implication of the PBR theorem itself.

Given these constraints, it is perhaps not surprising that most psi-epistemicists have already opted for option 2, denouncing scientific realism entirely. Those of us who cling to realism have mostly decided that the ontic state must be a different type of object than it is in the framework described above. We could discard the idea that individual systems have well-defined properties, or the idea that the probabilities that we assign to those properties should depend only on the quantum state. Spekkens advocates the first possibility, arguing that only relational properties are ontic. On the other hand, I, following Huw Price, am partial to the idea of epistemic hidden variable theories with retrocausal influences, in which case the probability distributions over ontic states would depend on measurement choices as well as which quantum state is prepared. Neither of these possibilities is ruled out by the previous results, and they are not ruled out by PBR either. This is why I say that their result does not rule out any position that is seriously held by any researchers in quantum foundations. Nevertheless, until the PBR paper, there remained the question of whether a conventional psi-epistemic model was possible even in principle. Such a theory could at least have been a competitor to Bohmian mechanics. This possibility has now been ruled out fairly convincingly, and so we now turn to the basic idea of their result.

The Result

Recall from our classical example that each ontic state (phase space point) occurs in the support of more than one epistemic state (Liouville distribution), in fact infinitely many. This is just because probability distributions can have overlapping support. Now, consider what would happen if we restricted the theory to only allow epistemic states with disjoint support. For example, we could partition phase space into a number of disjoint cells and only consider probability distributions that are uniform over one cell and zero everywhere else.

Restricted classical theory

A restricted classical theory in which only the distributions indicated are allowed as epistemic states. In this case, each ontic state is only possible in one epistemic state, so it is more accurate to say that the epistemic states represent a property of the ontic state.

Given this restriction, the ontic state determines the epistemic state uniquely. If someone tells you the ontic state, then you know which cell it is in, so you know what the epistemic state must be. Therefore, in this restricted theory, the epistemic state is not really epistemic. Its image is contained in the ontic state, and it would be better to say that we were talking about a property of the ontic state, rather than something that represents knowledge. According to the PBR result, this is exactly what must happen in any ontic model of quantum theory within the Bell framework.
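
The difference between the unrestricted and restricted classical theories can be sketched as follows. The six-point phase space and the particular supports are hypothetical, and each epistemic state is represented only by its support, i.e. the set of ontic states to which it assigns nonzero probability.

```python
def compatible_states(ontic_state, epistemic_states):
    """Names of the epistemic states whose support contains the given ontic state."""
    return [name for name, support in epistemic_states.items()
            if ontic_state in support]


# Hypothetical one-dimensional phase space with points 0..5.
overlapping = {"A": {0, 1, 2, 3}, "B": {2, 3, 4, 5}}        # supports overlap
restricted = {"cell_1": {0, 1, 2}, "cell_2": {3, 4, 5}}     # disjoint cells

# With overlapping supports, knowing the ontic state does not fix the epistemic state.
print(compatible_states(2, overlapping))

# With disjoint cells, every ontic state picks out exactly one epistemic state.
print(compatible_states(2, restricted))
```

In the restricted case, the lookup is a function from ontic states to epistemic states, which is why the epistemic state is better regarded as a property of the ontic state.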

Here is the analog of this in ontic models of quantum theory.  Suppose that two nonorthogonal quantum states \(\Ket{\psi_1}\) and \(\Ket{\psi_2}\) are represented as follows in an ontic model:

Psi-epistemic model

Representation of nonorthogonal states in a psi-epistemic model

Because the distributions overlap, there are ontic states that are compatible with more than one quantum state, so this is a psi-epistemic model.

In contrast, if, for every pair of quantum states \(\Ket{\psi_1},\Ket{\psi_2}\), the probability distributions do not overlap, i.e. the representation of each pair looks like this

Psi-ontic model

Representation of a pair of quantum states in a psi-ontic model

then the quantum state is uniquely determined by the ontic state, and it is therefore better regarded as a property of \(\lambda\) rather than a representation of knowledge.  Such a model is psi-ontic.  The PBR theorem states that all ontic models that reproduce the Born rule must be psi-ontic.

Sketch of the proof

In order to establish the result, PBR make use of the following idea. In an ontic model, the ontic state \(\lambda\) determines the probabilities for the outcomes of any possible measurement via the functions \(\xi^M_k\). The Born rule probabilities must be obtained by averaging these conditional probabilities with respect to the probability distribution \(\mu(\lambda)\) representing the quantum state. Suppose there is some measurement \(M\) that has an outcome \(k\) to which the quantum state \(\Ket{\psi}\) assigns probability zero according to the Born rule. Then, it must be the case that \(\xi^M_k(\lambda) = 0\) for every \(\lambda\) in the support of \(\mu(\lambda)\). Now consider two quantum states \(\Ket{\psi_1}\) and \(\Ket{\psi_2}\) and suppose that we can find a two outcome measurement such that the first state gives zero Born rule probability to the first outcome and the second state gives zero Born rule probability to the second outcome. Suppose also that there is some \(\lambda\) that is in the support of both the distributions, \(\mu_1\) and \(\mu_2\), that represent \(\Ket{\psi_1}\) and \(\Ket{\psi_2}\) in the ontic model. Then, we must have \(\xi^M_1(\lambda) = \xi^M_2(\lambda) = 0\), which contradicts the normalization assumption \(\xi^M_1(\lambda) + \xi^M_2(\lambda) = 1\).

Now, it is fairly easy to see that there is no such measurement for a pair of nonorthogonal states, because this would mean that they could be distinguished with certainty, so we do not have a result quite yet. The trick to get around this is to consider multiple copies. Consider then, the four states \(\Ket{\psi_1}\otimes\Ket{\psi_1}, \Ket{\psi_1}\otimes\Ket{\psi_2}, \Ket{\psi_2}\otimes\Ket{\psi_1}\) and \(\Ket{\psi_2}\otimes\Ket{\psi_2}\) and suppose that there is a four outcome measurement such that \(\Ket{\psi_1}\otimes\Ket{\psi_1}\) gives zero probability to the first outcome, \(\Ket{\psi_1}\otimes\Ket{\psi_2}\) gives zero probability to the second outcome, and so on. In addition to this, we make an independence assumption that the probability distributions representing these four states must satisfy. Let \(\lambda\) be the ontic state of the first system and let \(\lambda'\) be the ontic state of the second. The independence assumption states that the probability densities representing the four quantum states in the ontic model are \(\mu_1(\lambda)\mu_1(\lambda'), \mu_1(\lambda)\mu_2(\lambda'), \mu_2(\lambda)\mu_1(\lambda')\) and \(\mu_2(\lambda)\mu_2(\lambda')\). This is a reasonable assumption because there is no entanglement between the two systems and we could do completely independent experiments on each of them. Assuming there is an ontic state \(\lambda\) in the support of both \(\mu_1\) and \(\mu_2\), there will be some nonzero probability that both systems occupy this ontic state whenever any of the four states is prepared. But, in this case, all four functions \(\xi^M_1,\xi^M_2,\xi^M_3\) and \(\xi^M_4\) must have value zero when both systems are in this state, which contradicts the normalization \(\sum_k \xi^M_k = 1\).

This argument works for the pair of states \(\Ket{\psi_1} = \Ket{0}\) and \(\Ket{\psi_2} = \Ket{+} = \frac{1}{\sqrt{2}} \left ( \Ket{0} + \Ket{1}\right )\). In this case, the four outcome measurement is a measurement in the basis:

\[\Ket{\phi_1} = \frac{1}{\sqrt{2}} \left ( \Ket{0}\otimes\Ket{1} + \Ket{1} \otimes \Ket{0} \right )\]
\[\Ket{\phi_2} = \frac{1}{\sqrt{2}} \left ( \Ket{0}\otimes\Ket{-} + \Ket{1} \otimes \Ket{+} \right )\]
\[\Ket{\phi_3} = \frac{1}{\sqrt{2}} \left ( \Ket{+}\otimes\Ket{1} + \Ket{-} \otimes \Ket{0} \right )\]
\[\Ket{\phi_4} = \frac{1}{\sqrt{2}} \left ( \Ket{+}\otimes\Ket{-} + \Ket{-} \otimes \Ket{+} \right ),\]

where \(\Ket{-} = \frac{1}{\sqrt{2}} \left ( \Ket{0} - \Ket{1}\right )\). It is easy to check that \(\Ket{\phi_1}\) is orthogonal to \(\Ket{0}\otimes\Ket{0}\), \(\Ket{\phi_2}\) is orthogonal to \(\Ket{0}\otimes\Ket{+}\), \(\Ket{\phi_3}\) is orthogonal to \(\Ket{+}\otimes\Ket{0}\), and \(\Ket{\phi_4}\) is orthogonal to \(\Ket{+}\otimes\Ket{+}\). Therefore, the argument applies and there can be no overlap in the probability distributions representing \(\Ket{0}\) and \(\Ket{+}\) in the model.
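
The orthogonality claims can also be checked numerically. The following is a minimal sketch in plain Python, with state vectors stored as lists of real amplitudes in the computational basis; the helper names are made up for this example.

```python
from math import sqrt, isclose


def kron(a, b):
    """Tensor product of two state vectors given as lists of amplitudes."""
    return [x * y for x in a for y in b]


def add(a, b):
    """Entrywise sum of two vectors of equal length."""
    return [x + y for x, y in zip(a, b)]


def scale(c, a):
    """Multiply every amplitude of a vector by the scalar c."""
    return [c * x for x in a]


def inner(a, b):
    """Inner product; all amplitudes here are real, so no conjugation is needed."""
    return sum(x * y for x, y in zip(a, b))


s = 1 / sqrt(2)
zero, one = [1.0, 0.0], [0.0, 1.0]
plus, minus = [s, s], [s, -s]

# The four measurement basis vectors |phi_1>, ..., |phi_4>.
phi = [
    scale(s, add(kron(zero, one), kron(one, zero))),
    scale(s, add(kron(zero, minus), kron(one, plus))),
    scale(s, add(kron(plus, one), kron(minus, zero))),
    scale(s, add(kron(plus, minus), kron(minus, plus))),
]
# The four product preparations |0>|0>, |0>|+>, |+>|0>, |+>|+>.
preparations = [kron(zero, zero), kron(zero, plus), kron(plus, zero), kron(plus, plus)]

# Gram matrix of the phi vectors: the identity if they form an orthonormal basis.
gram = [[inner(a, b) for b in phi] for a in phi]
# Born rule probability that preparation k yields outcome k: zero in each case.
forbidden = [inner(phi[k], preparations[k]) ** 2 for k in range(4)]

print(gram)
print(forbidden)
```

Up to floating point rounding, `gram` is the identity matrix and every entry of `forbidden` vanishes, which is exactly what the argument requires.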

To establish psi-ontology, we need a similar argument for every pair of states \(\Ket{\psi_1}\) and \(\Ket{\psi_2}\). PBR establish that such an argument can always be made, but the general case is more complicated and requires more than two copies of the system. I refer you to the paper for details where it is explained very clearly.

Conclusions

The PBR theorem rules out psi-epistemic models within the standard Bell framework for ontological models. The remaining options are to adopt psi-ontology, remain psi-epistemic and abandon realism, or remain psi-epistemic and abandon the Bell framework. One of the things that a good interpretation of a physical theory should have is explanatory power. For me, the epistemic view of quantum states is so explanatory that it is worth trying to preserve it. Realism too is something that we should not abandon too hastily. Therefore, it seems to me that we should be questioning the assumptions of the Bell framework by allowing more general ontologies, perhaps involving relational or retrocausal degrees of freedom. At the very least, this option is the path less travelled, so we might learn something by exploring it more thoroughly.

  1. There are actually subtleties about whether we should think of phase space points as instantaneous ontic states. For one thing, the momentum depends on the first derivative of position, so maybe we should really think of the state being defined on an infinitesimal time interval. Secondly, the fact that momentum appears is because Newtonian mechanics is defined by second order differential equations. If it were higher order then we would have to include variables depending on higher derivatives in our definition of phase space. This is bad if you believe in a clean separation between basic ontology and physical laws. To avoid this, one could define the ontic state to be the position only, i.e. a point in configuration space, and have the boundary conditions specified by the position of the particle at two different times. Alternatively, one might regard the entire spacetime trajectory of the particle as the ontic state, and regard the Newtonian laws themselves as a mere pattern in the space of possible trajectories. Of course, all these descriptions are mathematically equivalent, but they are conceptually quite different and they lead to different intuitions as to how we should understand the concept of state in quantum theory. For present purposes, I will ignore these subtleties and follow the usual practice of regarding phase space points as the unambiguous ontic states of classical mechanics.
  2. The subtlety is basically a person called Chris Fuchs. He is clearly in the option 2 camp, but claims to be a scientific realist. Whether he is successful at maintaining realism is a matter of debate.
  3. Note, this is distinct from the orthodox interpretation as represented by the textbooks of Dirac and von Neumann, which is also sometimes called the Copenhagen interpretation. Orthodoxy accepts the eigenvalue-eigenstate link. Observables can sometimes have definite values, in which case they are objective properties of the system. A system has such a property when it is in an eigenstate of the corresponding observable. Since every wavefunction is an eigenstate of some observable, it follows that this is a psi-ontic view, albeit one in which there are no additional ontic degrees of freedom beyond the quantum state.
  4. Sourced from Wikiquote.
  5. But note that the resulting theory would essentially be the orthodox interpretation, which has a measurement problem.
  6. The terminology “ontic model” is preferred to “hidden variable theory” for two reasons. Firstly, we do not want to exclude the case where the wavefunction is ontic, but there are no extra degrees of freedom (as in the orthodox interpretation). Secondly, it is often the case that the “hidden” variables are the ones that we actually observe rather than the wavefunction, e.g. in Bohmian mechanics the particle positions are not “hidden”.
  7. Generally, we would need to represent dynamics as well, but the PBR theorem does not depend on this.

The Choi-Jamiolkowski Isomorphism: You’re Doing It Wrong!

As the dear departed Quantum Pontiff used to say: New Paper Dance! I am pretty happy that this one has finally been posted because it is my first arXiv paper since I returned to work, and also because it has gone through more rewrites than Spiderman: The Musical.

What is the paper about, I hear you ask? Well, mathematically, it is about an extremely simple linear algebra trick called the Choi-Jamiolkowski isomorphism. This is actually two different results: the Choi isomorphism and the Jamiolkowski isomorphism, but people have a habit of lumping them together. This trick is so extremely well-known to quantum information theorists that it is not even funny. One of the main points of the paper is that you should think about what the isomorphism means physically in a new way. Hence the “you’re doing it wrong” in the post title.

First Level Isomorphisms

For the uninitiated, here is the simplest way of describing the Choi isomorphism in a single equation:
\[\Ket{j}\Bra{k} \qquad \qquad \equiv \qquad \qquad \Ket{j} \otimes \Ket{k},\]
i.e. the isomorphism works by turning a bra into a ket. The thing on the left is an operator on a Hilbert space \(\mathcal{H}\) and the thing on the right is a vector in \(\mathcal{H} \otimes \mathcal{H}\), so the isomorphism says that \(\mathcal{L}(\mathcal{H}) \equiv \mathcal{H} \otimes \mathcal{H}\), where \(\mathcal{L}(\mathcal{H})\) is the space of linear operators on \(\mathcal{H}\).

Here is how it works in general. If you have an operator \(U\) then you can pick a basis for \(\mathcal{H}\) and write \(U\) in this basis as
\[U = \sum_{j,k} U_{j,k} \Ket{j}\Bra{k},\]
where \(U_{j,k} = \Bra{j}U\Ket{k}\). Then you just extend the above construction by linearity and write down a vector
\[\Ket{\Phi_U} = \sum_{j,k} U_{j,k} \Ket{j} \otimes \Ket{k}.\]
It is pretty obvious that we can go in the other direction as well: starting with a vector in \(\mathcal{H}\otimes\mathcal{H}\), we can write it out in a product basis, turn the second ket into a bra, and then we have an operator.
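
For readers who like to see this numerically, here is a minimal NumPy sketch of both directions. The function names are my own, and the sketch assumes the standard basis is the preferred one, so that row-major flattening of the matrix is exactly the map \(\Ket{j}\Bra{k} \mapsto \Ket{j}\otimes\Ket{k}\).

```python
import numpy as np

def operator_to_vector(U):
    """Choi map in a fixed basis: sum_jk U[j,k] |j><k|  ->  sum_jk U[j,k] |j> (x) |k>."""
    # Row-major flattening puts the coefficient U[j, k] at index j*d + k.
    return U.reshape(-1)

def vector_to_operator(phi, d):
    """Reverse direction: turn the second ket back into a bra."""
    return phi.reshape(d, d)

d = 3
rng = np.random.default_rng(0)
U = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
phi_U = operator_to_vector(U)

# The same vector, built term by term from the defining sum.
basis = np.eye(d)
phi_explicit = sum(U[j, k] * np.kron(basis[j], basis[k])
                   for j in range(d) for k in range(d))

print(np.allclose(phi_U, phi_explicit))
print(np.allclose(vector_to_operator(phi_U, d), U))
```

Nothing here requires \(U\) to be unitary; the map is defined for any operator.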

So far, this is all pretty trivial linear algebra, but when we think about what this means physically it is pretty weird. One of the things that is represented by an operator in quantum theory is dynamics, in particular a unitary operator represents the dynamics of a closed system for a discrete time-step. One of the things that is represented by a vector on a tensor product Hilbert space is a pure state of a bipartite system. It is fairly easy to see that (up to normalization) unitary operators get mapped to maximally entangled states under the isomorphism, so, in some sense, a maximally entangled state is “the same thing” as a unitary operator. This is weird because there are some things that make sense for dynamical operators that don’t seem to make sense for states and vice-versa. For example, dynamics can be composed. If \(U\) represents the dynamics from \(t_0\) to \(t_1\) and \(V\) represents the dynamics from \(t_1\) to \(t_2\), then the dynamics from \(t_0\) to \(t_2\) is represented by the product \(VU\). Using the isomorphism, we can define a composition for states, but what on earth does this mean?
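
The claim that unitaries map to (unnormalized) maximally entangled states can be checked directly: both reduced states of \(\Ket{\Phi_U}\) should be the identity. The following is a sketch under the assumption that the standard basis is used and that a random unitary drawn via a QR decomposition is representative.

```python
import numpy as np

d = 3
rng = np.random.default_rng(1)
# Random unitary from the QR decomposition of a complex Gaussian matrix.
U, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))

phi_U = U.reshape(-1)  # |Phi_U> = sum_jk U[j,k] |j>|k>
rho = np.outer(phi_U, phi_U.conj()).reshape(d, d, d, d)

rho_first = np.einsum('ibjb->ij', rho)   # trace out the second factor
rho_second = np.einsum('aiaj->ij', rho)  # trace out the first factor

print(np.allclose(rho_first, np.eye(d)), np.allclose(rho_second, np.eye(d)))
```

Dividing \(\Ket{\Phi_U}\) by \(\sqrt{d}\) would turn both reduced states into the maximally mixed state \(I/d\).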

Before getting on to that, let us briefly pause to consider the Jamiolkowski version of the isomorphism. The Choi isomorphism is basis dependent. You get a slightly different state if you write down the operator in a different basis. To make things basis independent, we replace \(\mathcal{H}\otimes\mathcal{H}\) by \(\mathcal{H}\otimes\mathcal{H}^*\). \(\mathcal{H}^*\) denotes the dual space to \(\mathcal{H}\), i.e. it is the space of bras instead of the space of kets. In Dirac notation, the Jamiolkowski isomorphism looks pretty trivial. It says
\[\Ket{j}\Bra{k} \qquad \qquad \equiv \qquad \qquad \Ket{j} \otimes \Bra{k}.\]
This is axiomatic in Dirac notation, because we always assume that tensor product symbols can be omitted without changing anything. However, this version of the isomorphism is going to become important later.

Conventional Interpretation: Gate Teleportation

In quantum information, the Choi isomorphism is usually interpreted in terms of “gate teleportation”. To understand this, we first reformulate the isomorphism slightly. Let \(\Ket{\Phi^+}_{AA'} = \sum_j \Ket{jj}_{AA'}\), where \(A\) and \(A'\) are quantum systems with Hilbert spaces of the same dimension. The vectors \(\Ket{j}\) form a preferred basis, and this is the basis in which the Choi isomorphism is going to be defined. Note that \(\Ket{\Phi^+}_{AA'}\) is an (unnormalized) maximally entangled state. It is easy to check that the isomorphism can now be reformulated as
\[\Ket{\Phi_U}_{AA'} = I_A \otimes U_{A'} \Ket{\Phi^+}_{AA'},\]
where \(I_A\) is the identity operator on system \(A\). The reverse direction of the isomorphism is given by
\[U_A \Ket{\psi}\Bra{\psi}_A U_A^{\dagger} = \Bra{\Phi^+}_{A'A''} \left ( \Ket{\psi}\Bra{\psi}_{A''} \otimes \Ket{\Phi_U}\Bra{\Phi_U}_{A'A} \right )\Ket{\Phi^+}_{A'A''},\]
where \(A^{\prime\prime}\) is yet another quantum system with the same Hilbert space as \(A\).

Now let’s think about the physical interpretation of the reverse direction of the isomorphism. Suppose that \(U\) is the identity. In that case, \(\Ket{\Phi_U} = \Ket{\Phi^+}\) and the reverse direction of the isomorphism is easily recognized as the expression for the output of the teleportation protocol when the \(\Ket{\Phi^+}\) outcome is obtained in the Bell measurement. It says that \(\Ket{\psi}\) gets teleported from \(A^{\prime\prime}\) to \(A\). Of course, this outcome only occurs some of the time, with probability \(1/d^2\), where \(d\) is the dimension of the Hilbert space of \(A\) (the Bell measurement has \(d^2\) equally likely outcomes), a fact that is obscured by our decision to use an unnormalized version of \(\Ket{\Phi^+}\).

Now, if we let \(U\) be a nontrivial unitary operator then the reverse direction of the isomorphism says something more interesting. If we use the state \(\Ket{\Phi_U}\) rather than \(\Ket{\Phi^+}\) as our resource state in the teleportation protocol, then, upon obtaining the \(\Ket{\Phi^+}\) outcome in the Bell measurement, the output of the protocol will not simply be the input state \(\Ket{\psi}\), but it will be that state with the unitary \(U\) applied to it. This is called “gate teleportation”. It has many uses in quantum computing. For example, in linear optics implementations, it is impossible to perform every gate in a universal set with 100% probability. To avoid damaging your precious computational state, you can apply the indeterministic gates to half of a maximally entangled state and keep doing so until you get one that succeeds. Then you can teleport your computational state using the resulting state as a resource and end up applying the gate that you wanted. This allows you to use indeterministic gates without having to restart the computation from the beginning every time one of these gates fails.
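
Here is a small numerical sketch of gate teleportation for a single qubit. The index bookkeeping is my own assumption: the three systems are ordered \(A^{\prime\prime}, A', A\), the resource state is \(\Ket{\Phi_U}_{A'A} = \sum_j \Ket{j}_{A'} \otimes U\Ket{j}_A\), and only the \(\Ket{\Phi^+}\) outcome of the Bell measurement is considered, using unnormalized vectors throughout.

```python
import numpy as np

d = 2
rng = np.random.default_rng(2)
# Random unitary gate and random normalized input state.
U, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))
psi = rng.normal(size=d) + 1j * rng.normal(size=d)
psi /= np.linalg.norm(psi)

# Resource state |Phi_U>_{A'A} = sum_j |j>_{A'} (U|j>)_A, stored as phi_U[a', a].
phi_U = U.T.copy()

# Joint unnormalized state on A'' A' A, stored with indices [a'', a', a].
joint = np.einsum('s,pa->spa', psi, phi_U)

# Project A'' A' onto the unnormalized <Phi^+| = sum_m <m|<m|.
out = np.einsum('mma->a', joint)

print(np.allclose(out, U @ psi))
```

The output register \(A\) ends up holding \(U\Ket{\psi}\), even though \(U\) was applied to the resource state before \(\Ket{\psi}\) was ever brought in.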

Using this interpretation of the isomorphism, we can also come up with a physical interpretation of the composition of two states. It is basically a generalization of entanglement swapping. If you take \(\Ket{\Phi_U}\) and \(\Ket{\Phi_{V}}\) and perform a Bell measurement across the output system of the first and the input system of the second then, upon obtaining the \(\Ket{\Phi^+}\) outcome, you will have the state \(\Ket{\Phi_{VU}}\), i.e. the state corresponding to \(U\) followed by \(V\). In this way, you can perform your entire computational circuit in advance, before you have access to the input state, and then just teleport your input state into the output register as the final step.
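
The composition rule can be sketched in the same way. I am assuming the convention \(\Ket{\Phi_U} = \sum_j \Ket{j}_{\text{in}} \otimes U\Ket{j}_{\text{out}}\) for both resource states, with the unnormalized \(\Bra{\Phi^+}\) projection applied to the output of the first and the input of the second. Under that assumption the surviving pair of systems carries the state for \(U\) followed by \(V\), i.e. the matrix product \(VU\).

```python
import numpy as np

d = 2
rng = np.random.default_rng(4)

def random_unitary(d, rng):
    q, _ = np.linalg.qr(rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)))
    return q

def choi_vector(W):
    """|Phi_W> = sum_j |j>_in (W|j>)_out, stored as an array indexed [in, out]."""
    return W.T.copy()

U = random_unitary(d, rng)
V = random_unitary(d, rng)
phi_U = choi_vector(U)
phi_V = choi_vector(V)

# Bell projection sum_m <m|<m| on (output of U, input of V) contracts those indices.
swapped = np.einsum('am,mb->ab', phi_U, phi_V)

print(np.allclose(swapped, choi_vector(V @ U)))
```

In this bookkeeping the Bell projection is nothing more than matrix multiplication of the two coefficient arrays, which is why composition of dynamics carries over to states.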

In this way, the Choi isomorphism leads to a correspondence between a whole host of protocols involving gates and protocols involving entangled states. We can also define interesting properties of operations, such as the entanglement of an operation, in terms of the states that they correspond to. We then use the isomorphism to give a physical meaning to these properties in terms of gate teleportation. However, one weak point of the correspondence is that it transforms something deterministic (the application of a unitary operation) into something indeterministic (getting the \(\Ket{\Phi^+}\) outcome in a Bell measurement). Unlike the teleportation protocol, gate teleportation cannot be made deterministic by applying correction operations for the other outcomes, at least not if we want these corrections to be independent of \(U\). The states you get for the other outcomes involve nasty things like \(U^*, U^T, U^\dagger\) applied to \(\Ket{\psi}\), depending on exactly how you construct the Bell basis, e.g. choice of phases. These typically cannot be corrected without applying \(U\). In particular, that would screw things up in the linear optics application wherein \(U\) can only be implemented non-deterministically.

Before turning to our alternative interpretation of Choi-Jamiolkowski, let’s generalize things a bit.

Second Level Isomorphisms

In quantum theory we don’t just have pure states, but also mixed states that arise if you have uncertainty about which state was prepared, or if you ignore a subsystem of a larger system that is in a pure state. These are described by positive, trace-one, operators, denoted \(\rho\), called density operators. Similarly, dynamics does not have to be unitary. For example, we might bring in an extra system, interact them unitarily, and then trace out the extra system. These are described by Completely-Positive, Trace-Preserving (CPT) maps, denoted \(\mathcal{E}\). These are linear maps that act on the space of operators, i.e. they are operators on the space of operators, and are often called superoperators.

Now, the set of operators on a Hilbert space is itself a Hilbert space with inner product \(\left \langle N, M \right \rangle = \Tr{N^{\dagger}M}\). Thus, we can apply Choi-Jamiolkowski on this space to define a correspondence between superoperators and operators on the tensor product. We can do this in terms of an orthonormal operator basis with respect to the trace inner product, but it is easier to just give the teleportation version of the isomorphism. We will also generalize slightly to allow for the possibility that the input and output spaces of our CPT map may be different, i.e. it may involve discarding a subsystem of the system we started with, or bringing in extra ancillary systems.

Starting with a CPT map \(\mathcal{E}_{B|A}: \mathcal{L}(\mathcal{H}_A) \rightarrow \mathcal{L}(\mathcal{H}_B)\) from system \(A\) to system \(B\), we can define an operator on \(\mathcal{H}_A \otimes \mathcal{H}_B\) via
\[\rho_{AB} = \mathcal{E}_{B|A'} \otimes \mathcal{I}_{A} \left ( \Ket{\Phi^+}\Bra{\Phi^+}_{AA'}\right ),\]
where \(\mathcal{I}_A\) is the identity superoperator. This is a positive operator, but it is not quite a density operator as it satisfies \(\PTr{B}{\rho_{AB}} = I_A\), which implies that \(\PTr{AB}{\rho_{AB}} = d\) rather than \(\PTr{AB}{\rho_{AB}} = 1\). This is analogous to using unnormalized states in the pure-state case. The reverse direction of the isomorphism is then given by
\[\mathcal{E}_{B|A} \left ( \sigma_A \right ) = \Bra{\Phi^+}_{A'A}\sigma_{A'} \otimes \rho_{AB}\Ket{\Phi^+}_{A'A}.\]
This has the same interpretation in terms of gate teleportation (or rather CPT-map teleportation) as before.
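
To make the second level isomorphism concrete, here is a sketch for one particular CPT map. The choice of a qubit amplitude damping channel, the value of `gamma`, and the helper names are all my own assumptions for illustration. The forward direction is computed as \(\rho_{AB} = \sum_{j,k} \Ket{j}\Bra{k}_A \otimes \mathcal{E}(\Ket{j}\Bra{k})\), which is what you get by expanding \(\Ket{\Phi^+}\Bra{\Phi^+}\) in the formula above.

```python
import numpy as np

gamma = 0.3  # damping probability, an arbitrary illustrative value
kraus = [np.array([[1.0, 0.0], [0.0, np.sqrt(1 - gamma)]]),
         np.array([[0.0, np.sqrt(gamma)], [0.0, 0.0]])]

def channel(X):
    """Apply the CPT map in Kraus form."""
    return sum(K @ X @ K.conj().T for K in kraus)

d = 2
rho_AB = np.zeros((d * d, d * d), dtype=complex)
for j in range(d):
    for k in range(d):
        E_jk = np.zeros((d, d))
        E_jk[j, k] = 1.0
        rho_AB += np.kron(E_jk, channel(E_jk))

R = rho_AB.reshape(d, d, d, d)       # indices [a, b, a', b']
tr_B = np.einsum('abcb->ac', R)      # partial trace over B

def apply_via_choi(sigma):
    """Reverse direction: sum_mn sigma[m,n] <m|_A rho_AB |n>_A."""
    return np.einsum('mn,mbnc->bc', sigma, R)

sigma = np.array([[0.6, 0.2 - 0.1j], [0.2 + 0.1j, 0.4]])
print(np.allclose(tr_B, np.eye(d)))
print(np.allclose(apply_via_choi(sigma), channel(sigma)))
```

The first check is the normalization \(\PTr{B}{\rho_{AB}} = I_A\) and the second is the reverse direction of the isomorphism.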

The Jamiolkowski version of this isomorphism is given by
\[\varrho_{AB} = \mathcal{E}_{B|A'} \otimes \mathcal{I}_{A} \left ( \Ket{\Phi^+}\Bra{\Phi^+}_{AA'}^{T_A}\right ),\]
where \(^{T_A}\) denotes the partial transpose in the basis used to define \(\Ket{\Phi^+}\). Although it is not obvious from this formula, this operator is independent of the choice of basis, as \(\Ket{\Phi^+}\Bra{\Phi^+}_{AA'}^{T_A}\) is actually the same operator for any choice of basis. I’ll keep the reverse direction of the isomorphism a secret for now, as it would give a strong hint towards the punchline of this blog post.
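
The basis independence is easy to test numerically. In the sketch below (function names are mine) the partially transposed operator is written as \(\sum_{j,k} \Ket{e_k}\Bra{e_j} \otimes \Ket{e_j}\Bra{e_k}\) for an orthonormal basis \(\{\Ket{e_j}\}\), and it is compared with the swap operator for two different bases. The second basis, \(\Ket{0}, i\Ket{1}\), is just an example chosen so that \(\Ket{\Phi^+}\) itself visibly changes.

```python
import numpy as np

def phi_plus_projector(basis):
    """|Phi+><Phi+| with |Phi+> = sum_j |e_j>|e_j>, e_j being the columns of `basis`."""
    d = basis.shape[0]
    phi = sum(np.kron(basis[:, j], basis[:, j]) for j in range(d))
    return np.outer(phi, phi.conj())

def transposed_projector(basis):
    """Partial transpose on A in the basis e: sum_jk |e_k><e_j| (x) |e_j><e_k|."""
    d = basis.shape[0]
    e = [basis[:, j] for j in range(d)]
    return sum(np.kron(np.outer(e[k], e[j].conj()), np.outer(e[j], e[k].conj()))
               for j in range(d) for k in range(d))

d = 2
standard = np.eye(d, dtype=complex)
rotated = np.diag([1, 1j])  # a different orthonormal basis: |0>, i|1>

# Swap operator: S |c>|d> = |d>|c>.
swap = np.eye(d * d).reshape(d, d, d, d).transpose(1, 0, 2, 3).reshape(d * d, d * d)

print(np.allclose(transposed_projector(standard), swap))
print(np.allclose(transposed_projector(rotated), swap))
print(np.allclose(phi_plus_projector(standard), phi_plus_projector(rotated)))
```

The first two comparisons agree with the swap operator, while the last shows that the untransposed \(\Ket{\Phi^+}\Bra{\Phi^+}\) does depend on the basis.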

Probability Theory

I now want to give an alternative way of thinking about the isomorphism, in particular the Jamiolkowski version, that is in many ways conceptually clearer than the gate teleportation interpretation. The starting point is the idea that quantum theory can be viewed as a noncommutative generalization of classical probability theory. This idea goes back at least to von Neumann, and is at the root of our thinking in quantum information theory, particularly in quantum Shannon theory. The basic idea of the generalization is that probability distributions \(P(X)\) get mapped to density operators \(\rho_A\) and sums over variables become partial traces. Therefore, let’s start by thinking about whether there is a classical analog of the isomorphism, and, if so, what its interpretation is.

Suppose we have two random variables, \(X\) and \(Y\). We can define a conditional probability distribution of \(Y\) given \(X\), \(P(Y|X)\), as a positive function of the two variables that satisfies \(\sum_Y P(Y|X) = 1\) independently of the value of \(X\). Given a conditional probability distribution and a marginal distribution, \(P(X)\), for \(X\), we can define a joint distribution via
\[P(X,Y) = P(Y|X)P(X).\]
Conversely, given a joint distribution \(P(X,Y)\), we can find the marginal \(P(X) = \sum_Y P(X,Y)\) and then define a conditional distribution
\[P(Y|X) = \frac{P(X,Y)}{P(X)}.\]
Note, I’m going to ignore the ambiguities in this formula that occur when \(P(X)\) is zero for some values of \(X\).

Now, suppose that \(X\) and \(Y\) are the input and output of a classical channel. I now want to think of the probability distribution of \(Y\) as being determined by a stochastic map \(\Gamma_{Y|X}\) from the space of probability distributions over \(X\) to the space of probability distributions over \(Y\). Since \(P(Y) = \sum_{X} P(X,Y)\), this has to be given by
\[P(Y) = \Gamma_{Y|X} \left ( P(X)\right ) = \sum_X P(Y|X) P(X),\]
or
\[\Gamma_{Y|X} \left ( \cdot \right ) = \sum_{X} P(Y|X) \left ( \cdot \right ).\]
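
In matrix language the stochastic map is just multiplication by the array of conditional probabilities. A small sketch, with made-up numbers for a three-valued \(X\) and a two-valued \(Y\):

```python
import numpy as np

P_X = np.array([0.2, 0.5, 0.3])
# cond[y, x] = P(Y=y | X=x); each column sums to one.
cond = np.array([[0.9, 0.4, 0.1],
                 [0.1, 0.6, 0.9]])

joint = cond.T * P_X[:, None]   # joint[x, y] = P(Y=y | X=x) P(X=x)
P_Y = cond @ P_X                # the stochastic map Gamma_{Y|X} applied to P(X)

# Going back: divide the joint distribution by its marginal on X.
cond_back = (joint / joint.sum(axis=1, keepdims=True)).T

print(P_Y)
print(np.allclose(cond_back, cond))
```

The same array `cond` plays both roles: read as a function of two variables it is \(P(Y|X)\), and read as a matrix acting on `P_X` it is \(\Gamma_{Y|X}\).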

What we have here is a correspondence between a positive function of two variables (the conditional probability distribution) and a linear map that acts on the space of probability distributions (the stochastic map). This looks analogous to the Choi-Jamiolkowski isomorphism, except that, instead of a joint probability distribution, which would be analogous to a quantum state, we have a conditional probability distribution. This suggests that we made a mistake in thinking of the operator in the Choi isomorphism as a state. Maybe it is something more like a conditional state.

Conditional States

Let’s just plunge in and make a definition of a conditional state, and then see how it makes sense of the Jamiolkowski isomorphism. For two quantum systems, \(A\) and \(B\), a conditional state of \(B\) given \(A\) is defined to be a positive operator \(\rho_{B|A}\) on \(\mathcal{H}_A \otimes \mathcal{H}_B\) that satisfies
\[\PTr{B}{\rho_{B|A}} = I_A.\]
This is supposed to be analogous to the condition \(\sum_Y P(Y|X) = 1\). Notice that this is exactly how the operators that are Choi-isomorphic to CPT maps are normalized.

Given a conditional state, \(\rho_{B|A}\), and a reduced state \(\rho_A\), I can define a joint state via
\[\rho_{AB} = \sqrt{\rho_A} \rho_{B|A} \sqrt{\rho_A},\]
where I have suppressed the implicit \(\otimes I_B\) required to make the products well defined. The conjugation by the square root ensures that \(\rho_{AB}\) is positive, and it is easy to check that \(\PTr{AB}{\rho_{AB}} = 1\).

Conversely, given a joint state, I can find its reduced state \(\rho_A = \PTr{B}{\rho_{AB}}\) and then define the conditional state
\[\rho_{B|A} = \sqrt{\rho_A^{-1}} \rho_{AB} \sqrt{\rho_A^{-1}},\]
where I am going to ignore cases in which \(\rho_A\) has any zero eigenvalues so that the inverse is well-defined (this is no different from ignoring the division by zero in the classical case).

Now, suppose you are given \(\rho_A\) and you want to know what \(\rho_B\) should be. Is there a linear map that tells you how to do this, analogous to the stochastic map \(\Gamma_{Y|X}\) in the classical case? The answer is obviously yes. We can define a map \(\mathfrak{E}_{B|A}: \mathcal{L} \left ( \mathcal{H}_A\right ) \rightarrow \mathcal{L} \left ( \mathcal{H}_B\right )\) via
\[\mathfrak{E}_{B|A} \left ( \rho_A \right ) = \PTr{A}{\rho_{B|A} \rho_A},\]
where we have used the cyclic property of the trace to combine the \(\sqrt{\rho_A}\) terms, or
\[\mathfrak{E}_{B|A} \left ( \cdot \right ) = \PTr{A}{\rho_{B|A} (\cdot)}.\]
The map \(\mathfrak{E}_{B|A}\) so defined is just the Jamiolkowski isomorphic map to \(\rho_{B|A}\) and the above equation gives the reverse direction of the Jamiolkowski isomorphism that I was being secretive about earlier.
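
The definitions in this section can be checked on a random example. The sketch below assumes a randomly generated full-rank two-qubit state (so that \(\rho_A\) is invertible) and uses helper names of my own; the suppressed \(\otimes I_B\) factors are written out explicitly.

```python
import numpy as np

dA, dB = 2, 2
rng = np.random.default_rng(3)
G = rng.normal(size=(dA * dB, dA * dB)) + 1j * rng.normal(size=(dA * dB, dA * dB))
rho_AB = G @ G.conj().T            # positive and, generically, full rank
rho_AB /= np.trace(rho_AB).real

def ptrace_B(X):
    return np.einsum('abcb->ac', X.reshape(dA, dB, dA, dB))

def ptrace_A(X):
    return np.einsum('abad->bd', X.reshape(dA, dB, dA, dB))

def power(H, p):
    """Fractional power of a positive definite Hermitian matrix."""
    w, v = np.linalg.eigh(H)
    return (v * w**p) @ v.conj().T

rho_A = ptrace_B(rho_AB)
I_B = np.eye(dB)

# Conditional state: rho_{B|A} = rho_A^{-1/2} rho_AB rho_A^{-1/2}.
inv_sqrt = np.kron(power(rho_A, -0.5), I_B)
rho_B_given_A = inv_sqrt @ rho_AB @ inv_sqrt

# Joint state recovered from the conditional state and the marginal.
sqrt_A = np.kron(power(rho_A, 0.5), I_B)
rho_AB_back = sqrt_A @ rho_B_given_A @ sqrt_A

# The linear map: rho_A -> Tr_A[rho_{B|A} rho_A].
rho_B_from_map = ptrace_A(rho_B_given_A @ np.kron(rho_A, I_B))

print(np.allclose(ptrace_B(rho_B_given_A), np.eye(dA)))
print(np.allclose(rho_AB_back, rho_AB))
print(np.allclose(rho_B_from_map, ptrace_A(rho_AB)))
```

The three checks correspond to the normalization of the conditional state, the recovery of the joint state, and the fact that the map sends \(\rho_A\) to \(\rho_B\).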

The punchline is that the Choi-Jamiolkowski isomorphism should not be thought of as a mapping between quantum states and quantum operations, but rather as a mapping between conditional quantum states and quantum operations. It is no more surprising than the fact that classical stochastic maps are determined by conditional probability distributions. If you think of it in this way, then your approach to quantum information will become conceptually simpler in a lot of ways. These ways are discussed in detail in the paper.

Causal Conditional States

There is a subtlety that I have glossed over so far that I’d like to end with. The map \(\mathfrak{E}_{B|A}\) is not actually completely positive, which is why I did not denote it \(\mathcal{E}_{B|A}\), but when preceded by a transpose on \(A\) it defines a completely positive map. This is because the Jamiolkowski isomorphism is defined in terms of the partial transpose of the maximally entangled state. Also, so far I have been talking about two distinct quantum systems that exist at the same time, whereas in the classical case, I talked about the input and output of a classical channel. A quantum channel is given by a CPT map \(\mathcal{E}_{B|A}\) and its Jamiolkowski representation would be
\[\mathcal{E}_{B|A} \left (\rho_A \right ) = \PTr{A}{\varrho_{B|A}\rho_A},\]
where \(\varrho_{B|A}\) is the partial transpose over \(A\) of a positive operator and it satisfies \(\PTr{B}{\varrho_{B|A}} = I_A\). This is the appropriate notion of a conditional state in the causal scenario, where you are talking about the input and output of a quantum channel rather than two systems at the same time. The two types of conditional state are related by a partial transpose.
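
As a sketch of the causal case, the code below builds \(\varrho_{B|A}\) for a channel given in Kraus form and checks the formula above. The Hadamard gate is my own choice of example, picked because its causal conditional state is unitarily equivalent to the swap operator and so has a negative eigenvalue, which shows that \(\varrho_{B|A}\) need not be positive.

```python
import numpy as np

def causal_conditional_state(kraus):
    """varrho_{B|A} = sum_jk |k><j|_A (x) E(|j><k|): the Choi operator partially transposed on A."""
    dB, dA = kraus[0].shape
    varrho = np.zeros((dA * dB, dA * dB), dtype=complex)
    for j in range(dA):
        for k in range(dA):
            E_jk = np.zeros((dA, dA))
            E_jk[j, k] = 1.0
            image = sum(K @ E_jk @ K.conj().T for K in kraus)
            varrho += np.kron(E_jk.T, image)
    return varrho

def apply_channel(varrho, rho_A, dB):
    """E(rho_A) = Tr_A[ varrho_{B|A} (rho_A (x) I_B) ]."""
    dA = rho_A.shape[0]
    X = (varrho @ np.kron(rho_A, np.eye(dB))).reshape(dA, dB, dA, dB)
    return np.einsum('abad->bd', X)

H = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2)  # Hadamard gate as the channel
varrho = causal_conditional_state([H])
rho_A = np.array([[0.7, 0.2], [0.2, 0.3]])

output = apply_channel(varrho, rho_A, 2)
tr_B = np.einsum('abcb->ac', varrho.reshape(2, 2, 2, 2))

print(np.allclose(output, H @ rho_A @ H.conj().T))
print(np.allclose(tr_B, np.eye(2)))
print(np.linalg.eigvalsh(varrho).min() < 0)
```

Undoing the partial transpose on \(A\) gives back the ordinary Choi operator of the channel, which is positive.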

Despite this difference, a good deal of unification is achieved between the way in which acausally related (two subsystems) and causally related (input and output of channels) degrees of freedom are described in this framework. For example, we can define a “causal joint state” as
\[\varrho_{AB} = \sqrt{\rho_A} \varrho_{B|A} \sqrt{\rho_A},\]
where \(\rho_A\) is the input state to the channel and \(\varrho_{B|A}\) is the causal conditional state that is Jamiolkowski isomorphic to the CPT map. This unification is another main theme of the paper, and allows a quantum version of Bayes’ theorem to be defined that is independent of the causal scenario.

The Wonderful World of Conditional States

To end with, here is a list of some things that become conceptually simpler in the conditional states formalism developed in the paper:

  • The Born rule, ensemble averaging, and quantum dynamics are all just instances of a quantum analog of the formula \(P(Y) = \sum_X P(Y|X)P(X)\).
  • The Heisenberg picture is just a quantum analog of \(P(Z|X) = \sum_Y P(Z|Y)P(Y|X)\).
  • The relationship between prediction and retrodiction (inferences about the past) in quantum theory is given by the quantum Bayes’ theorem.
  • The formula for the set of states that a system can be ‘steered’ to by making measurements on a remote system, as in EPR-type experiments, is just an application of the quantum Bayes’ theorem.

If this has whetted your appetite, then this and much more can be found in the paper.

Foundations Mailing Lists

Bob Coecke has recently set up an email mailing list for announcements in the foundations of quantum theory (conference announcements, job postings and the like). You can subscribe by sending a blank email to quantum-foundations-subscribe@maillist.ox.ac.uk. The mailing list is moderated so you will not get inundated by messages from cranks.

On a similar note, I thought I would mention the philosophy of physics mailing list, which has been going for about seven years and also often features announcements that are relevant to the foundations of quantum theory. Obviously, the focus is more on the philosophy side, but I have often heard about interesting conferences and workshops via this list.