A Reading List on the Foundations of Probability and Statistics

The continuing saga of time-travel in the quantum universe has been delayed because I have been working hard to finish writing a paper. Rest assured, it is coming in the next week or two. For now, I have been getting more interested in the foundations of probability and statistics. More accurately, I have always been interested (and opinionated) on the subject, but I have recently become interested in reading more widely around the subject, in the hope that I will actually come to know what I am talking about. The literature on this subject is vast, so I have decided to concentrate on the arguments for different conceptions of probability and how they are used to justify statistical methodology. I have also decided to concentrate on books and collections rather than listing references to original papers, except in a few instances where I could not find a collection containing an important paper. The references are generally to the most recent edition of the texts rather than the originals. I have added comments to the references that I know something about, and will add more as I read through them. If anyone thinks I have missed something vital then please mention it in the comments.

Disclosure: All links to Amazon are affiliate links.

General Introductions

T. L. Fine, Theories of Probability (Academic Press 1973)
READ This is an excellent book, but it is not for the faint of heart. Fine does not view any of the major approaches to probability as adequate, so some parts of the book are a bit ranty, but personally I do love a good rant. It covers most of the major approaches to probability theory in full gory mathematical detail. This includes axiomatic, relative frequency, algorithmic complexity, classical, logical and subjective approaches. Fairly unique to this text is a comprehensive treatment of comparative probability, where you just have a relation of “more probable than” rather than a quantitative measure of probability. This occurs right at the beginning of the book, and may put some readers off as it is extremely technical and unfamiliar. However, once Fine gets to the more familiar territory of quantitative probability, the book becomes a lot more readable. If you are interested in the mathematical foundations of probability then you will not find a better book. A final warning is that some sections of the book are a bit out of date, since it was written in the 1970s and there has been a lot of progress in some areas since then, e.g. in maximum entropy methods and algorithmic complexity. Nevertheless, nobody has done such a comprehensive job of covering the mathematics since this book was published.
Maria Galavotti, A Philosophical Introduction to Probability (Stanford: Center for the Study of Language and Information Publications 2005)
READ A better title for this book would be “A Historico-Philosophical Introduction to Probability”. Galavotti covers all the standard interpretations of probability: classical, frequentist, propensity, logical, subjective; but she does so by focussing on the people that developed these views. Each chapter consists of sections devoted to individual researchers in the foundations of probability, beginning with a potted biography followed by a description of their view. This contrasts with other introductory texts, which tend to focus on a specific version of each viewpoint, e.g. von Mises’ theory of frequentism is usually discussed in detail, with only passing mentions of the other proponents like Venn and Reichenbach. This historical approach is useful as an entry point to the historical literature and has the advantage that it covers a wider variety of opinions than other introductory texts. There are several people who are frequently mentioned in the modern literature, but usually without a detailed description of their viewpoint. From this point of view, I found the accounts of Reichenbach, Jeffreys and Ramsey very useful. Reichenbach was a frequentist, but he took a Bayesian approach to statistical inference. Given the close association between frequentism and classical statistics on the one hand, and subjectivism and Bayesian statistics on the other, it is easy to overlook the possibility of Reichenbach’s position and to view criticisms of classical statistics as criticisms of frequentism in general. The treatment of Ramsey is especially good, as this is an area in which Galavotti has done significant scholarship. Ramsey is one of the originators of the subjective view of probability, but he is usually viewed as a pluralist about probability because, in his published essay, he made comments to the effect that a different account of probability is required for science. Unfortunately, Ramsey died before he completed his account of probability in the sciences. Using unpublished notebooks as sources, Galavotti argues that Ramsey was not a pluralist, and that his account of scientific probability would have been in terms of the stability of subjective probabilities. This is not completely conclusive, but represents an interesting alternative to the usual account of Ramsey’s viewpoint.

However, there are three negatives about Galavotti’s approach in this book. Firstly, given the number of viewpoints she discusses, many of the discussions are too brief to give a real understanding of the subtleties involved. Secondly, her treatment of the basic features of probability theory at the beginning of the book is rather awkward, and would be confusing for someone who had never encountered probability before (part of the awkwardness may have to do with the fact that this is a translation of the Italian original). Thirdly, mathematics is eschewed in this book, even where it would be extremely helpful. In some cases, the main objections to a viewpoint are that the mathematics does not say what the proponents would like it to say, and it is not possible to do justice to these arguments without writing down an equation or two. Therefore, even though this book has “Introduction” in the title, I cannot recommend it as a first textbook on the subject. It would be better to read something like Hacking or Gillies first, and then to use this as supplementary reading to get some historical context. Overall, this is a distinctive and original work, and provides a useful complement to more conventional textbooks on the subject.

Donald Gillies, Philosophical Theories of Probability (Routledge 2000)
READ This is the best introductory textbook on the foundations of probability from a philosophy point of view that I have read. The first part of the book treats most of the best known theories of probability: classical, logical, frequentist, subjective Bayesian and propensities. The only mainstream interpretation that is missing is Lewis’ conception of objective chances and the principal principle. This is a shame as it is currently one of the most fashionable, particularly amongst the philosophers of quantum theory that I associate with. Gillies does a good job of explaining the distinction between objective and subjective approaches to probability and the discussion of the merits and criticisms of each view is largely balanced and measured. Where mathematical technicalities come up, such as infinite sample spaces and limit theorems, the details are omitted, as is appropriate in an introductory book, but what he says about them is conceptually accurate. The second part of the book lays out the author’s own views on probability, which include a defence of a pluralist approach where different interpretations of probability are appropriate for different subject areas. He comes down in favour of a propensity-long run frequency view of objective chances and a subjectivist view of other probabilities, with a spectrum of other possibilities in between. There is much that I disagree with in this part of the book, but this is not a big criticism because almost all philosophical textbooks become controversial when the author discusses their own views. For completeness, here are the main points that I disagree with:

  1. I think the argument that exchangeability cannot justify statistical methodology is based on a double standard about the degree to which interpretations of probability are allowed to be approximate. Frequentist theories are given far more lenience in the degree to which they only have to approximate reality.
  2. I do not think that the “intersubjective” interpretation of probability is distinct from the usual subjective one. The distinction is based on a misunderstanding about what an “agent” is in the subjective theory. It is not necessarily an individual human being, but could be a well-programmed computer or a community that has approximately shared values. Thus, the intersubjective theory is just a special case of the usual subjective one.
  3. I do not agree with the pluralist view of probability. For example, the argument that probabilities in economics are fundamentally different from those in natural sciences is based on our ability to conduct repeatable experiments. This is a feature of our epistemic situation and not a feature of reality. For example, we could imagine a race of aliens that are able to create multiple copies of the planet Earth that are identical in all factors that are relevant for economics. They could then perform experiments about economics that have the same status as the experiments that we perform in physics. I also think that Gillies’ distinction fails to take account of the way probabilities are used in modern subjects such as quantum information theory, where you surely have subjective probabilities infecting our description of natural physical systems.

Despite these criticisms, which would take a whole article to explain fully, this is still an extremely good introductory text.

Ian Hacking, An Introduction to Probability and Inductive Logic (CUP 2001)
READ A general introduction geared towards philosophy students. Would be suitable for undergrads with no prior exposure to probability, but perhaps a little logic and/or naive set theory. A bit simplistic for anyone with a stronger background than that, but the later chapters may be useful to those unfamiliar with the different philosophical approaches to probability theory.
Alan Hájek, Interpretations of Probability, The Stanford Encyclopedia of Philosophy (Spring 2010 Edition), Edward N. Zalta (ed.), http://plato.stanford.edu/archives/spr2010/entries/probability-interpret/
READ As usual for the Stanford Encyclopedia of Philosophy, this is a good summary and starting point for references.
D. H. Mellor, Probability: A Philosophical Introduction (Routledge 2005)
READ I have to admit that this textbook on the philosophy of probability left me feeling more confused than I was when I started reading it. Perhaps this is because this is definitely a philosophy textbook and Mellor does not shy away from philosophical jargon, e.g. “Humean theory of causality”. One thing that I liked about the approach taken in this book is that Mellor introduces three kinds of probability (objective chances, epistemic probability and credences) right at the beginning and then goes on to discuss how each interpretation of probability reads them. This is in contrast to most other treatments, which make a broad distinction between objective and subjective interpretations and then go on to discuss each interpretation on its own without any common thread. Mellor’s approach is better because each of the interpretations of probability has its own scope, e.g. von Mises denies the relevance of anything but chances and subjectivists deny everything except credences, so this makes it clear when the different interpretations are discussing the same thing, and when they are attempting to reduce one concept to another. The only problem with this approach is that it suggests that there definitely are three types of probability and hence it effectively presupposes that a pluralist approach is going to be needed. I would prefer to say that there appear to be three types of probability statement in the language we use to discuss the theory, and that an interpretation of probability has to supply a meaning for each kind of statement, without assuming at the outset that the different types of statement actually do correspond to distinct concepts. Unlike Gillies, this book does include extensive discussion of the principal principle and its relatives, which is a good thing. However, I found Mellor’s discussion of things like limits and infinite sample spaces to be much more misleading than the discussion in Gillies. For example, when he introduces the concept of limit he uses an example of a function that tends to its limit uniformly and from one side. This is unlike probabilistic limits, which can be subject to large fluctuations. He also suggests at one point that the only probabilities that make sense on an infinite sample space are zero and one, before going on to correct himself. Whilst he eventually does get the concepts right, these sorts of statements are apt to confuse. Now, in an introductory philosophical text, I do not expect every mathematical concept to be treated in full rigour, but Gillies shows that it is possible to discuss these concepts at a heuristic level without saying anything inaccurate. Finally, Mellor has a tendency to assume that the mathematics does say what the advocates of each interpretation want it to say and then goes on to criticize at a more conceptual level, whereas I think that some of the most effective arguments against interpretations of probability are just that the mathematics does not say what they need it to say. For all these reasons, I would recommend this as a supplementary text, particularly for philosophers, but not as a first introductory text on the subject.
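
To make the point about limits concrete, here is a minimal Python sketch. It is my own illustration and is not taken from Mellor or Gillies. It compares a sequence that creeps up to its limit from one side with the running relative frequency of heads in simulated coin tosses, which wanders above and below its limiting value. The function names, the fair-coin probability of 0.5, the fixed seed and the particular one-sided sequence are all assumptions chosen for illustration.

```python
import random


def running_frequency(n_tosses, p=0.5, seed=0):
    """Relative frequency of heads after each of n_tosses simulated tosses."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    heads = 0
    freqs = []
    for n in range(1, n_tosses + 1):
        if rng.random() < p:
            heads += 1
        freqs.append(heads / n)
    return freqs


def one_sided_sequence(n_terms, limit=0.5):
    """A textbook-style sequence that increases steadily towards its limit."""
    return [limit * (1 - 1 / (n + 1)) for n in range(1, n_terms + 1)]


def direction_changes(seq):
    """Count how often the sequence switches between rising and falling."""
    changes = 0
    last_sign = 0
    for prev, curr in zip(seq, seq[1:]):
        diff = curr - prev
        if diff == 0:
            continue  # flat steps carry no direction
        sign = 1 if diff > 0 else -1
        if last_sign != 0 and sign != last_sign:
            changes += 1
        last_sign = sign
    return changes


if __name__ == "__main__":
    n = 1000
    print("one-sided sequence:", direction_changes(one_sided_sequence(n)))
    print("coin-toss frequency:", direction_changes(running_frequency(n)))
```

The one-sided sequence never changes direction, whereas the relative frequency changes direction roughly every time a head follows a tail or vice versa. Nothing in the simulation guarantees that the frequency is within any given distance of 0.5 after a fixed number of tosses, which is the feature that a uniformly converging example hides.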

General Collections

These are collections of papers that are not specific to a single approach. Collections on specific approaches are listed in the appropriate sections.

Antony Eagle (ed.), Philosophy of Probability: Contemporary Readings (Routledge 2010). Due to be published Nov. 19th.
UNREAD Contains many of the classic papers including de Finetti, Popper and Lewis, as well as modern commentary.

History

Ian Hacking, The Emergence of Probability: A Philosophical Study of Early Ideas about Probability, Induction and Statistical Inference, 2nd edition (CUP 2006)
READ In this book, Hacking takes a look at the emergence of the concept of probability during the enlightenment. The history goes up to Bernoulli’s proof of the first limit theorem. Hacking’s goal in looking at the history is to defend a philosophical thesis. Modern debates on the foundations of probability and statistics are focussed on whether probability should be regarded as an objective physical concept (what Hacking calls aleatory probability), usually framed in terms of frequencies or propensities, or as an epistemic concept concerning our knowledge and belief. Hacking argues that it was essential to the development of the probability concept that both of these ideas arose in tandem. The history is fascinating and Hacking’s argument provides a lot of context for modern debates.
Ian Hacking, The Taming of Chance (CUP 1990).
UNREAD My impression is that it covers a later period of history than the previous text.
David Salsburg, The Lady Tasting Tea: How Statistics Revolutionized Science in the Twentieth Century (Holt McDougal 2002)
READ This is a popular-level book about mathematical statistics. It is a very difficult subject to write a popular book about. Most popular science books are about the weird and wonderful things that we have discovered about reality, but this book is about the evolution of the methods that we use to justify such discoveries, and hence it is one level more abstract. It is really difficult to convey the content of the different approaches without using much mathematics. Salsburg’s approach is historical and he largely tackles developments in chronological order. The first half of the book covers Pearson, Fisher and Neyman-Pearson(II) and the invention of mathematical statistics in the twentieth century. He paints a vivid picture of the personalities involved and conveys how their ideas revolutionized the scientific method. I learnt a lot from this part of the book. In particular, I did not realize the extent to which the attempt to prove Darwinian evolution was a driving force in the development of mathematical statistics and I also did not realize how important UCL was in the early development of the subject. The second part of the book covers more modern developments and is less successful. The reason for this is that research in statistics has expanded into a vast number of different directions and is still a work in progress. Therefore, it is not clear what future generations will think the most important developments are. Salsburg deals with this by basing most of the remaining chapters on the life and theory of individual statisticians. However, these sketches are too brief to leave much of a lasting impression on the reader. There is also a brief discussion of classical vs. Bayesian statistics. Whilst the Bayesian methodology is given some credit, Salsburg is pretty firmly in the classical camp and the dismissal of Bayesianism essentially boils down to the fact that he thinks it is too subjective. 
I would have liked to have seen a more balanced discussion of this. Overall, this is an interesting book and I recommend it to anyone interested in the foundations of probability and statistics as supplementary reading. Salsburg is to be admired for attempting a popularization of such an important topic, but I have my doubts about how much readers with no prior background in statistics will get out of the book.
Stephen M. Stigler, The History of Statistics: The Measurement of Uncertainty before 1900 (Harvard University Press 1986)
UNREAD A very well regarded treatment of the early history of statistics.
Jan von Plato, Creating Modern Probability: Its Mathematics, Physics and Philosophy in Historical Perspective (CUP 1994)
UNREAD Said to be selective in its treatment of the history. Attempts to unify von Mises with de Finetti at the end of the book.

@scidata pointed me to this correspondence (pdf) between Fermat and Pascal, which provides a record of early ideas about probability.

Classical (Laplacian) Approach To Probability

Pierre Simon Marquis de Laplace, A Philosophical Essay on Probabilities (Dover 1996)
UNREAD One of the earliest attempts to outline the theory of probability. Origin of the principle of indifference.

Frequentism

Richard von Mises, Probability, Statistics and Truth (Dover 1981)
UNREAD The canonical work on the ensemble-based, frequentist approach to probability.

Subjective/Personalist Bayesianism

As you may be able to detect from the structure of the list, this is my current favored approach, and I have a particular fondness for the works of de Finetti and Jeffrey. This may change as I read further into the subject.

José M. Bernardo and Adrian F. M. Smith, Bayesian Theory (Wiley 2000)
READ (well, at least the first few chapters). The modern technical “bible” of subjective Bayesianism. Contains a very intricate decision theoretic derivation of probability theory that is much more complicated than Savage’s, as well as virtually every theorem that crops up in subjective foundations.
Bruno de Finetti, Theory of Probability: A Critical Introductory Treatment, 2 volumes (Wiley 1990)
READ. Despite the title, this is not really suitable for those without a background in probability theory or the foundational debate. Contains the loss-function approach where one takes previsions (the subjective correlate of expectation values) as fundamental rather than probabilities. Also contains extensive discussion of the finite vs. countable additivity debate and the de Finetti representation theorem.
Bruno de Finetti, Probabilism (1989), Erkenntnis, 31:169-223.
PARTLY READ. English translation of Probabilismo, which was de Finetti’s first work on subjective probability, from 1931. Needs to be read in conjunction with Richard Jeffrey, Reading Probabilismo (1989), Erkenntnis, 31:225-237.
Bruno de Finetti, Philosophical Lectures on Probability, edited by Alberto Mura (Springer 2008)
READ Based on transcripts of a graduate course given by de Finetti in 1979. This is for die-hard de Finetti fans only. He was obviously pretty senior when he gave this course and there is a lot of repetition. It is useful if you are a scholar who wants to pin down precisely what the later de Finetti’s ideas on fundamental topics were. Everyone else should read de Finetti’s textbook instead.
Richard Jeffrey, Subjective Probability: The Real Thing (CUP 2004). Free pdf version
READ A very readable introduction to the basics of the subjective approach. Also discusses Jeffrey conditioning (a generalization of Bayes’ rule) and applications to confirmation theory.
Richard Jeffrey, The Logic of Decision 2nd edition (University of Chicago Press 1990)
READ A philosopher’s account of the decision theoretic foundations of subjective probability. Jeffrey’s approach to decision theoretic foundations differs from the more commonly used approach of Savage in that he ascribes both probabilities and utilities to propositions, whereas Savage assigns probabilities to “states of the world” and utilities to “acts”. In general, Jeffrey also allows utilities to change as the state of belief changes, which helps to solve problems with the Bayesian treatment of things like the prisoner’s dilemma and Newcomb’s paradox. The representation theorems in Jeffrey’s approach are not as strong as in Savage’s, i.e. the probability function is not quite unique unless utilities are unbounded. Nevertheless, this is an interesting and arguably more realistic approach to decision theory as it should be applied in the real world. Finally, this book contains a comprehensive treatment of Jeffrey conditioning, which is a generalization of Bayesian conditioning to the case where an observation does not make any event in the sample space certain.
H. E. Kyburg and Howard E. Smokler (eds.), Studies in Subjective Probability (Wiley 1964)
READ This collection is mainly of historical interest in my view. The most relevant paper for contemporary Bayesianism is de Finetti’s, which is available from a variety of other sources. The collection starts with an excerpt from Venn’s book, which sets the stage by outlining common objections to subjective approaches to probability (Venn was one of the first to present a detailed relative frequency theory). The other paper that I found interesting is Ramsey’s, since this was the first paper to present the modern subjective approach to probability based on Dutch book and decision theoretic arguments.
Leonard J. Savage, The Foundations of Statistics (Dover 1972)
READ The canonical work on decision theoretic foundations of the subjective approach.
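
Since Jeffrey conditioning comes up in the entries for both of Jeffrey’s books above, here is a minimal Python sketch of the rule as it is usually stated: an observation shifts the probabilities of the cells of a partition to new values, and the probabilities conditional on each cell are left unchanged. The function name, the weather example and all of the numbers are my own assumptions for illustration and are not taken from Jeffrey’s texts.

```python
def jeffrey_update(prior, partition, new_cell_probs):
    """Jeffrey conditioning on a partition of the sample space.

    prior          -- dict mapping each outcome to its prior probability
    partition      -- list of tuples of outcomes, one tuple per cell
    new_cell_probs -- new probability of each cell after the observation
    """
    if abs(sum(new_cell_probs) - 1.0) > 1e-9:
        raise ValueError("new cell probabilities must sum to one")
    posterior = {}
    for cell, q in zip(partition, new_cell_probs):
        cell_mass = sum(prior[w] for w in cell)
        for w in cell:
            if cell_mass == 0:
                if q > 0:
                    raise ValueError("cannot shift mass onto a null cell")
                posterior[w] = 0.0
            else:
                # Keep the conditional probability prior[w] / cell_mass
                # fixed and rescale the cell to its new total q.
                posterior[w] = q * prior[w] / cell_mass
    return posterior


if __name__ == "__main__":
    # Hypothetical prior over weather outcomes.
    prior = {"rain_cold": 0.3, "rain_warm": 0.1,
             "dry_cold": 0.2, "dry_warm": 0.4}
    partition = [("rain_cold", "rain_warm"), ("dry_cold", "dry_warm")]

    # A glance at a murky sky raises the probability of rain to 0.7
    # without making either cell certain.
    print(jeffrey_update(prior, partition, [0.7, 0.3]))

    # Ordinary Bayesian conditioning on "rain" is the special case in
    # which that cell receives probability one.
    print(jeffrey_update(prior, partition, [1.0, 0.0]))
```

In the first update the probability of rain_cold becomes 0.7 × 0.3 / 0.4 = 0.525. In the second it becomes 0.3 / 0.4 = 0.75, which is just the usual conditional probability of cold given rain.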

Logical Probabilities

Rudolf Carnap, Logical Foundations of Probability (University of Chicago Press 1950)
UNREAD Supposedly one of the best worked out treatments of logical probability.
John Maynard Keynes, A Treatise On Probability (Macmillan 1921) – free ebook available from Project Gutenberg
UNREAD Supposedly more readable than Carnap. WARNING – Because this book is out of copyright there are numerous editions available from online bookstores that are of dubious quality. This is why I am not linking to any of the dozens of versions on Amazon. The best advice is to use the Gutenberg ebook or look for an edition from a reputable publisher in a bricks and mortar bookshop. (Irrelevant fact: according to my mother, my maternal grandmother worked as a maid for Keynes.)

Objective Bayesianism and MaxEnt

Arguably, objective Bayesianism is the same thing as logical probabilities, but since I rarely hear people mention Jaynes and Cox in the same breath as Carnap and Keynes, I have decided to give the former their own section. Jaynes, in particular, is far more focussed on methodology and applications than the earlier authors.

Richard T. Cox, The Algebra of Probable Inference (The Johns Hopkins Press 1961)
UNREAD Contains the Cox axioms that characterize probability theory as an extension of logic.
Solomon Kullback, Information Theory and Statistics (Dover 1968)
UNREAD Origin of the minimization of relative entropy as an update rule in statistics. Closely related to MaxEnt.
Edwin T. Jaynes, Probability Theory: The Logic of Science (CUP 2003)
UNREAD The doyen of MaxEnt, in his own words.

Propensities and Objective Chances

Charles Sanders Peirce, Philosophical Writings of Peirce, edited by Justus Buchler (Dover 1955)
UNREAD Peirce foreshadows the propensity concept of Popper in papers 11-14.
Karl R. Popper, The Propensity Interpretation of the Calculus of Probability and the Quantum Theory (1957) in S. Körner (ed.), The Colston Papers, 9: 65–70 and The Propensity Interpretation of Probability (1959) British Journal for the Philosophy of Science, 10: 25–42
UNREAD Introduces the idea of propensities. Interestingly, for Popper, quantum mechanics provides a strong motivation for the need for single-case probabilities.
Karl R. Popper, The Logic of Scientific Discovery (Routledge Classics 2002)
UNREAD Chapter 8 explains his views on probability in comparison to other approaches.

David Lewis, Philosophical Papers: Volume II (OUP 1986). Also available online if you have a subscription to Oxford Scholarship Online.
UNREAD Contains a reprint of the 1980 paper that introduced the principal principle, as well as a paper on conditional probabilities.

Application to the Problem of Induction and Philosophy of Science

Most of the philosophy-based introductory texts cover this subject, but these texts are specifically focussed on understanding the scientific method.

John Earman, Bayes or Bust? A Critical Examination of Bayesian Confirmation Theory (MIT Press 1992)
READ This is a landmark book in the philosophy of Bayesian inference as an approach to the confirmation of scientific theories. Earman is a Bayesian (at least sometimes) and he gives an honest appraisal of the successes and failures of Bayesianism in this context. The book starts with an analysis of Bayes’ original paper, which I did not personally find very interesting because I am more interested in contemporary theory than historical analysis. The second chapter gives a good overview of Bayesian methodology. The remainder of the book discusses the successes and failures of the Bayesian approach. I found the discussion of “convergence to the truth” results, done in terms of martingale theory, particularly insightful, since I have only previously seen this discussed in terms of exchangeability, and the assumptions in the martingale approach seem more reasonable (at least for non-statistical hypotheses, which is all that Earman discusses). Earman argues that the “problem of old evidence” is subsumed into the more general problem of how to incorporate new theories into a Bayesian analysis. It is the latter that is the real obstacle to obtaining scientific objectivity in the Bayesian approach. Also interesting is the discussion of the role of a more sophisticated version of Sherlock Holmes style eliminative induction in science, illustrated with a case study of experimental tests of General Relativity. At the end of the book, the Bayesian approach is compared to formal learning theory, with neither really winning the battle. Subject to a mathematical conjecture, Earman shows that formal learning theory does not really have an edge over Bayesianism. Learning theory has developed significantly since the publication of this book, so it would be interesting to see where this debate stands today. The conclusion of the book is rather pessimistic. Bayesianism seems to provide a better account of scientific inference than its rivals, but it does not really license scientific objectivity.
Colin Howson and Peter Urbach, Scientific Reasoning: The Bayesian Approach, 2nd edition (Open Court 1993)
READ This is a great book that argues for a Bayesian approach to scientific methodology. Most of the other major approaches to probability are well-criticized and it reads well as an introduction to the whole area. I would have liked to see more mathematical detail in some sections, but this is a book for philosophy students and it does have good pointers to the literature where you can follow up on details. Particularly insightful are the chapters that criticize classical statistical methodology, e.g. estimators, confidence intervals, least-squares regression, etc. This goes far beyond the usual myopic focus on idealized coin-flips and covers many topics that are relevant to the design of real scientific experiments, e.g. randomization, sampling, etc. My only complaint is that they kind of wuss out at the end of the book by arguing for a von-Mises style relative-frequency interpretation of objective chances, connecting it to Bayesian probabilities by an asymptotic Dutch book argument that I found unconvincing because it does not refer to a bet whose outcome can be decided (similar remarks apply to the argument for countable additivity). Despite this reservation, this book is valuable ammunition for researchers who want to be Bayesian about everything.
Brian Skyrms, Choice and Chance: An Introduction to Inductive Logic, 4th edition (Wadsworth 1999)
READ This is not a text about probability per se, but about how to go about formulating a calculus of inductive inference, in close parallel to the calculus of deductive logic. The usual problems of induction are extensively discussed, so this would be a great companion to a first course in the philosophy of science. Probability is introduced towards the end of the book and, of course, the whole approach taken by this book biases the discussion towards logical (Keynes/Carnap) approaches to probability. Ultimately, I think that the problems addressed by this book are best treated by a subjective Bayesian approach, and that the construction of an objective calculus of induction is doomed to failure. However, a lot is learnt from the attempt so I would heartily recommend this book to new students of the foundations of scientific methodology.

Criticism

Krzysztof Burdzy, The Search for Certainty: On the Clash of Science and Philosophy of Probability (World Scientific 2009)
UNREAD I do love a good rant and Burdzy certainly seems to have a lot of them stored up when it comes to the foundations of probability. He argues that neither the frequentist nor the subjectivist foundation can account for the practice of actual probabilists and statisticians. He also offers his own account, but the criticism seems to be the main point.

Mathematical Foundations

Eventually, a bit of rigorous measure-theoretic probability is needed, so…

A. N. Kolmogorov, Foundations of the Theory of Probability, Second English Edition (Chelsea 1956)
UNREAD Classic text from the originator of measure-theoretic probability.
David Williams, Probability with Martingales (CUP 1991)
UNREAD A lively modern treatment of rigorous probability theory that has been recommended to me many times.
