
INFERENCE TO THE BEST EXPLANATION AND EMPIRICALLY EQUIVALENT THEORIES  

Richard Bohms           

 

Paul Thagard, Peter Lipton, and others have claimed that Inference to the Best Explanation (hereafter IBE) has a number of distinctly philosophical applications apart from its role in scientific explanation. [1] I intend to show that the criteria presented by Thagard and Lipton are insufficient to motivate a choice between at least some empirically equivalent theories and that IBE therefore fails to resolve essentially metaphysical debates. I will do so by applying the criteria of IBE to the question of the interpretation of quantum mechanics, focusing on the ontological assumptions involved in the formulations of the three leading versions of the theory.

 

 

The Uncertainty Principle

 

The starting point, for the illustrative purposes of this essay, will be the Heisenberg Uncertainty Principle, which states that the product of the uncertainties in the measurement of a particle’s position and momentum cannot be less than Planck’s constant divided by 2π. [2] (A similar relation exists between time and energy.) In other words, the relative degrees of precision to which each of these properties can be measured are inversely related, as a very precise measurement of a particle’s position results in a correspondingly imprecise measurement of its momentum, and vice versa. This is an empirically confirmed fact, accepted by all versions of quantum mechanics. But how are we to understand the underlying reality that produces these observations? What, in fact, is going on at the quantum level of the world?

 

 

The Copenhagen Interpretation        

 

The Copenhagen interpretation of quantum theory, named for the city in which its founder, Niels Bohr, did most of his work, is the most widely accepted interpretation among physicists, and is usually the only version taught in physics textbooks. According to the Copenhagen interpretation, a particle is only real while it is being observed. When the particle is not being observed, what actually exists is a probability function wave containing the possible positions and momenta of the particle. [3] The act of observation, or measurement, causes this wave function to “collapse” to a particular value (within a range of accuracy dependent upon the capability of the measurement apparatus), which causes the probability wave of the other property in the uncertainty relation to become more imprecise to the degree required to maintain the inequality. So a particle does not, and cannot, possess both a precise position and a precise momentum at the same time.

           

Suppose the position of a particle were measured with absolute precision, so that the position uncertainty Δq was zero. By Heisenberg’s relation, the smallest permitted momentum uncertainty, h/(2πΔq), would be infinite, since it involves division by zero. This would mean that the particle could have any momentum whatsoever. [4] Similarly, suppose that we could determine a particle’s momentum with absolute precision. By the uncertainty relation, we would have to conclude that the particle could be anywhere. It is possible to obtain approximate measurements such that the total uncertainties are minimized, but in this case the particle has neither a position nor a momentum, but only a probability function, much the same as in its unobserved state.
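The limiting case just described can be illustrated numerically. The following is only a sketch, assuming the form of the bound used in this essay (the product of the two uncertainties is at least h/2π) and SI units; the function name and the sample position uncertainties are hypothetical choices made for illustration.

```python
import math

PLANCK_H = 6.62607015e-34  # Planck's constant in joule-seconds (SI defined value)


def min_momentum_uncertainty(delta_q: float) -> float:
    """Smallest momentum uncertainty permitted by dp * dq >= h / (2 * pi).

    delta_q is the position uncertainty in metres (illustrative input).
    """
    if delta_q < 0:
        raise ValueError("position uncertainty must be non-negative")
    if delta_q == 0:
        # Division by zero: the bound diverges, so any momentum is possible.
        return math.inf
    return PLANCK_H / (2 * math.pi * delta_q)


# As the position is pinned down more tightly, the floor on the
# momentum uncertainty rises without limit.
for delta_q in (1e-6, 1e-10, 1e-14, 0.0):
    bound = min_momentum_uncertainty(delta_q)
    print(f"dq = {delta_q:g} m  ->  dp >= {bound:.3e} kg*m/s")
```

The sketch shows only the arithmetic of the inequality; it takes no stand on whether the uncertainty is ontological or epistemological, which is the question at issue between the interpretations.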

One of the consequences of this theory is that consciousness and observation play an integral role in determining reality. This is reminiscent of Berkeley’s famous maxim, esse est percipi; [5] but in a more contemporary context, it has much in common with the logical positivist doctrine that was prevalent in analytic philosophy at the time of its development. According to logical positivism, a proposition is only meaningful if it can either be known a priori through the meaning of its terms, or else be rendered probable through empirical experience (or at least as a logical deduction from such experience). Any non-tautologous proposition that is not verifiable through experience is not even to be judged “false,” but rather meaningless. [6] To a logical positivist, it makes no sense to speak of what a quantum particle is “really” doing; our experience only gives us observational results to the extent allowed by the uncertainty principle, and any attempt to probe beyond the limit of our experience is metaphysical speculation.

           

Some philosophers of science and philosophically inclined physicists, however, reject this as an inadequate portrayal of reality. The true state of the universe, it may be argued, cannot be dependent on the presence of conscious observers, as this would be inconsistent with our understanding of the origin of the universe and the development of life. While the universe is roughly 14 billion years old, life on Earth is only about 4 billion years old, and multicellular organisms have only been present for about 600 million years. It seems absurd that some 96 percent of the history of the universe could be determined only upon the development of conscious life in the final 4 percent of that history. Another objection is that even the conscious observers are composed of quantum particles, and therefore subject to the laws of quantum mechanics. It seems as though reality could then be defined only by the observations of a non-quantum observer, external to the universe, whose observations cause its wave function to collapse.

 

 

The Many-Worlds Interpretation      

 

The difficulties of the Copenhagen interpretation led Hugh Everett to formulate an alternative version of quantum mechanics, known as the “many-worlds” interpretation. In this interpretation, at any time at which N quantum states are possible, the universe divides into N otherwise identical copies, each of which contains one of the possible quantum states. This means that every possible quantum state is actualized in some universe. More recently, to avoid the idea of a mechanical dividing process, David Deutsch has proposed that the number of universes remains constant. Most universes begin in an identical state, and measurement causes differentiation, rather than division. [7]

           

The most prominent criticism of this view is that it involves too much metaphysical baggage. To posit the existence of an unimaginable number of universes seems to be an extreme solution to the problem of understanding the nature of quantum measurement. Its proponents, however, reply that this theory is epistemologically simpler than competing theories, and therefore, that the trade-off in metaphysical complexity is justified. Rather than stipulate an assumption that conscious observation causes a wave function to collapse to a particle state, many-worlds allows every mathematically possible quantum state to be actualized.

           

Another perceived weakness of this view is that the existence of the other worlds is beyond empirical verifiability. Since the observer himself is divided into identical copies along with the universe, the observer can be aware only of his own universe, not of those in which his duplicates exist. Within the past twenty years, however, an experiment has been conceived that could detect other universes by using an intelligent computer, though this experiment is clearly beyond our current technological capability. [8] It depends, for its success, on recombining universes that have divided. Branching is not a one-way process, which is not all that strange considering that the laws of quantum physics are time symmetric.

 

 

The Hidden Variables Interpretation

 

The third main view of quantum mechanics is the “hidden variables” theory, which, of the various formulations, most closely resembles reality as we perceive it on a macroscopic level. Particles exist in a definite position, with a definite momentum, at all times. The uncertainty principle is an epistemological barrier, rather than an ontological barrier. In order to measure the position of a particle as precisely as possible, it is necessary to use a photon with a short wavelength, since the position cannot be determined to any greater accuracy than the wavelength of the measuring photon. However, a photon with a short wavelength carries a high energy, some of which will be transferred to the particle during the measurement. We cannot predict the effect of this energy transfer (in theory, it would require complete knowledge of the quantum states of both the particle and the photon, but then we would not need to perform a measurement), but it is certain that the interaction of the measurement process will affect the momentum of the particle. Conversely, in order to measure the momentum precisely, it is necessary to use a photon with a low energy; that is, one with a longer wavelength. As we have seen, though, this limits the precision to which the particle’s position can be determined. Uncertainty, then, can be seen as fundamental to the process of measurement, rather than as fundamental to the intrinsic nature of the particle.
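The trade-off between wavelength and disturbance in this account can be made concrete with the standard photon relations E = hc/λ and p = h/λ. These formulas are not stated in the essay itself and are brought in here as an assumption; the function names and sample wavelengths are likewise hypothetical, chosen only to illustrate the reasoning.

```python
PLANCK_H = 6.62607015e-34    # Planck's constant in joule-seconds
LIGHT_SPEED = 299_792_458.0  # speed of light in metres per second


def photon_energy(wavelength_m: float) -> float:
    """Energy in joules of a photon with the given wavelength (E = h*c / wavelength)."""
    if wavelength_m <= 0:
        raise ValueError("wavelength must be positive")
    return PLANCK_H * LIGHT_SPEED / wavelength_m


def photon_momentum(wavelength_m: float) -> float:
    """Momentum in kg*m/s of a photon with the given wavelength (p = h / wavelength)."""
    if wavelength_m <= 0:
        raise ValueError("wavelength must be positive")
    return PLANCK_H / wavelength_m


# A shorter wavelength allows a finer position measurement, but the probing
# photon carries more energy and momentum that it can transfer to the particle.
for wavelength in (5e-7, 1e-9, 1e-12):  # visible light, X-ray, gamma ray (illustrative)
    print(
        f"wavelength = {wavelength:g} m: "
        f"E = {photon_energy(wavelength):.3e} J, "
        f"p = {photon_momentum(wavelength):.3e} kg*m/s"
    )
```

On this picture the wavelength sets the best available position resolution, while the photon's momentum sets the scale of the possible disturbance, so improving one necessarily worsens the other.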

           

The trade-off for the hidden-variable theory is that “locality” must be sacrificed. For the theory to work, information must be exchanged faster than the speed of light (which is not supposed to be allowed by Einstein’s theory of relativity). [9] David Bohm’s version of this theory accomplishes this information transfer by means of the “quantum potential,” which instantaneously measures the state of the entire universe and communicates it to every particle. [10]

 

 

IBE as a Potential Solution   

 

All three of the theories considered here are empirically equivalent. They are based on the same observational evidence, and they make identical observational predictions. Therefore, the question of the interpretation of quantum mechanics cannot be decided by empirical science, but must be approached philosophically. It is a question of metaphysical, rather than physical, explanation. Given that the observational data is universally agreed on, which of the explanations we have considered best explains that data? Intuitively, we take this to mean the theory that is most likely to be true, but short of resorting to the formal probability calculus (with the associated difficulties in determining the appropriate values for the components of the relevant equations), what features might guide us to likeliness? What should we look for in the “best” explanation?

 

 

Consilience

 

In choosing a theory, we intuitively desire a theory that is capable of a great range of explanation. The more facts that a theory is able to explain, we believe, the more likely it is to be true. (We only hold this belief to a certain extent. Most of us reject conspiracy theories precisely because they manage to explain everything as part of the conspiracy. Such a theory is perhaps “too good to be true.”) A classic example of explanatory power is the Newtonian theory of mechanics, which, within the confines of a single theory, was able to account for planetary orbits, the motions of comets, and ocean tides, all of which had previously required separate theories for their explanation. [11] Thagard calls this notion of explanatory power the criterion of consilience, which he explains as follows:

Roughly, a theory is said to be consilient if it explains at least two classes of facts. Then one theory is more consilient than another if it explains more classes of facts than the other does. Intuitively, we show one theory to be more consilient than another by pointing to a class or classes of facts which it explains but which the other does not. [12]

 

A recent example from the history of physics can be seen in string theory, which was originally proposed as an explanation of the strong nuclear force. Its early proponents tried for years to eliminate a massless, spin-two particle that kept appearing in their calculations. Finally, it was realized that the properties of this “unwanted” particle matched those of the graviton. So it turned out that the theory could in principle account for a quantum description of gravity, though that was not a factor in the development of the theory. [13]

 

It is difficult to see how the criterion of consilience can help make a decision regarding quantum theory, though. Because we are considering empirically equivalent theories, which predict identical observational results, the theories necessarily explain the same classes of facts. Thus, the three theories rank equally on this measure, and other factors must be considered if a decision is to be possible.

 

 

Simplicity

 

Another consideration in accepting a theory is that we prefer theories that do not require a host of background assumptions, particularly those that appear to serve only to explain a particular issue within the larger theory. (If the background assumption too obviously serves this purpose, we are inclined to dismiss it as ad hoc.) Fresnel argued that the wave theory of light was superior to Newton’s corpuscularian theory because it did not require Newton’s assumption of “fits of easy transmission and easy reflection.” [14] To the extent that a theory avoids these types of assumptions, Thagard considers it a simple theory. He defines simplicity as “a function of the size and nature of the set A [of auxiliary hypotheses] needed by a theory T to explain facts F.” [15] Combining the notions of consilience and simplicity, Thagard states, “A simple consilient theory not only must explain a range of facts; it must explain these facts without making a host of assumptions with narrow application.” [16] Thagard emphasizes that his use of simplicity must not be confused with ontological economy. Ontological economy [17] is a relative feature of the competing theories themselves, whereas simplicity in the sense intended by Thagard is a relative feature of the corresponding sets of auxiliary hypotheses, which may or may not involve ontological claims. Thagard writes, “Ontological complexity does not detract from the explanatory value or acceptability of a theory, so long as the complexity contributes toward consilience and simplicity...Ontological economy is not an important criterion of the best explanation.” [18]

 

Thagard’s notion of simplicity does not help us to resolve our question, as each theory depends on auxiliary hypotheses that serve similar functions within the confines of their respective theories. The Copenhagen interpretation depends on the assumption that conscious observation causes a wave function to instantaneously collapse from a probability state to a definite value. The Everett many-worlds interpretation relies on a process in which the universe instantaneously divides every time a quantum choice is to be made in order to actualize every possibility. (It also allows universes to recombine to allow wave interference phenomena to occur.) The Bohmian hidden variable system requires us to posit a “quantum potential” wave, which is able to instantaneously convey the entire state of the universe to a particle. (The careful reader will note that all three forms of quantum mechanics require non-locality, or action at a distance, as previously noted.) [19] Because these auxiliary hypotheses serve a similar purpose, we cannot reject any of them as ad hoc in order to prefer another. If ad hocness is a legitimate charge, it should weigh equally against all three theories.

 

While Thagard cautions that simplicity does not mean ontological economy, in this case the auxiliary hypotheses are of an ontological nature, so it is reasonable to claim that ontological economy should play a major role in the evaluation of relative simplicity. In this light, perhaps many-worlds may be judged the most ontologically complex of the three, but this must be balanced against the perceived epistemological benefits of the theory, and there is no a priori requirement to prefer one sort of simplicity over another. Likewise, it may be argued that the Copenhagen interpretation is simpler because it does not require a quantum potential wave, but a hidden variable theorist may reply that his theory is simpler because it removes uncertainty from its ontological status and makes it instead an epistemological issue. Simplicity, then, can be a very complex issue, and it does not help us to resolve our question.

 

 

Analogy          

 

We do seem to grasp unfamiliar concepts more quickly if we can relate them somehow to concepts that are more familiar. For example, many people are taught to understand molecular structure in high school chemistry by being shown balls of different sizes and colors that are connected by wooden dowels. Sometimes the structure of the atom is explained as being somewhat like the solar system, with the sun representing the nucleus and the planets representing the electrons. While these illustrations can be useful, we do recognize the danger of taking similarities too far. For example, in the solar system each planet has its own orbit, whereas several electrons may share the same orbit within the atom. But generally we will wind up with a better understanding of the new, unfamiliar concept because we are able to draw an analogy to something more familiar to us. Thagard defines analogy by way of the following explanation.

 

Suppose A and B are similar in respect to P, Q, and R, and suppose we know that A’s having S explains why it has P, Q, and R. Then we may conclude that B has S is a promising explanation of why B has P, Q, and R. We are not actually able to conclude that B has S; the evidence is not sufficient and the disanalogies are too threatening. But, the analogies between A and B increase the value of the explanation of P, Q, and R in A by S. [20]

 

At first glance, analogy may seem to favor a hidden variable theory, but it is simple enough to draw a disanalogy with macroscopic mechanics by considering the two-slit experiment. Suppose we perform our classical experiment by throwing baseballs at a wall with two vertical slits in it, each one large enough for a baseball to pass through. Furthermore, to introduce an element of randomness, we will blindfold the pitcher and point him in the direction of the wall, so that he cannot intentionally hit one slit as opposed to the other. This will result in most of the pitches bouncing off the wall, rather than penetrating it. After enough pitches are thrown, we will observe that two piles of baseballs have formed, one behind each slit. This will happen whether or not anyone happens to watch the baseballs pass through the wall.

           

If, however, we repeat the experiment on the quantum level in its more familiar form, using photons or electrons, we will get different results depending on the observations that are made. If we try to detect or observe the particles as they pass through the slits, we will get a similar result to our baseball experiment. However, if we choose not to detect the particles as they pass through the slits, we will find a wave interference pattern behind the barrier, which does not happen in the baseball experiment. So we cannot draw an analogy between classical and quantum mechanics, because the observational data yields many results that are not predicted by classical mechanics, and, in fact, it is on account of these disanalogous results that we require an explanation. If the analogy were sustainable, then the principles of classical mechanics would suffice and we would not have needed to develop a separate theory of quantum mechanics (not to mention proceeding to debate its proper interpretation).

 

 

Other Notions of IBE

 

Thus far, I have focused on Thagard’s criteria, rather than Lipton’s, because Lipton’s main criterion, that we should prefer the “loveliest possible explanation,” mainly falls within the confines of Thagard’s criterion of consilience. [21] One aspect of Lipton’s formulation of IBE that deserves closer attention, though, is his claim that we prefer explanations that fit with our background beliefs. [22] If interpreted as a descriptive statement, it may suffice as a general description of how philosophers and scientists with philosophical interests go about making their choice of theories. But it clearly does not apply as an account of the intellectual development of most theoretical physicists, who are taught the Copenhagen interpretation in their textbooks and coursework, and instead revise their background beliefs to conform to the theory.

           

However, if advocated as a prescriptive principle, it makes Lipton’s claim that loveliness is a guide to likeliness dubious. If we take “background beliefs” to mean metaphysical principles, inference may be impossible. In making an inference, we are inferring from the observations to the hypothesis. If, however, we are holding to prior metaphysical commitments, it seems as though we are accommodating the evidence to the hypothesis, with no inferential process involved. So it may be more appropriate in this context to restrict the use of “belief” to refer to background knowledge, [23] but if this is taken to mean shared empirical observations, we are left facing the very problem of underdetermination that we are trying to solve. 

 

It may be suggested, though, that perhaps Lipton is misguided in pursuing loveliness instead of likeliness. Perhaps a probabilistic conception of IBE could successfully address this question. But it cannot be done in a Bayesian manner because the calculation using Bayes' theorem reduces to the relative probability of the priors. [24] P(E) is 1, because the evidence is established by observation. Because all three theories predict the same observational data, the P(E|H) terms also equal 1, so the probability of any of the theories, conditioned on the evidence, is identical to the prior probability of the theory. However, it is far from obvious how these probabilities may be determined.
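The reduction to the priors can be checked with a short calculation. The sketch below is offered under stated assumptions: the three interpretations are treated as mutually exclusive and jointly exhaustive hypotheses, each likelihood P(E|H) is set to 1 as in the argument above, and the prior values are arbitrary placeholders, not estimates anyone has defended.

```python
def posterior(priors: dict, likelihoods: dict) -> dict:
    """Bayes' theorem over a set of exclusive, exhaustive hypotheses.

    P(H|E) = P(H) * P(E|H) / P(E), with P(E) computed by total probability.
    """
    evidence = sum(priors[h] * likelihoods[h] for h in priors)
    return {h: priors[h] * likelihoods[h] / evidence for h in priors}


# Placeholder priors: purely illustrative numbers that sum to 1.
priors = {"Copenhagen": 0.5, "many-worlds": 0.2, "hidden variables": 0.3}

# Empirical equivalence: every theory predicts the evidence with certainty.
likelihoods = {h: 1.0 for h in priors}

post = posterior(priors, likelihoods)
for h in priors:
    print(f"{h}: prior = {priors[h]:.2f}, posterior = {post[h]:.2f}")
```

Whatever priors are supplied, the posteriors simply reproduce them, so the calculation does no work beyond restating the initial assignment. The difficulty is thereby relocated to the question of how the priors are to be fixed.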

 

 

Conclusion     

 

IBE does not succeed in assessing the philosophical merits of competing empirically equivalent theories. When attempting to apply the criteria, we find ourselves pulled in different directions by them. At times, we are even pulled in different directions by emphasizing different aspects of the same criterion. Without the possibility of empirically distinguishing the theories, and without any a priori metaphysical principles to guide our choice, our selection of any particular theory as the best explanation seems arbitrary. This may be attributed to a fundamental difference between philosophical and scientific explanations. Scientific disputes can be decided on the basis of empirically observable phenomena, while philosophical disputes cannot.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 



[1] Paul Thagard, “The Best Explanation: Criteria for Theory Choice,” Journal of Philosophy 75 (1978): 92; Peter Lipton, Inference to the Best Explanation (New York: Routledge, 1991), pp. 70-73.

[2] In mathematical symbolism, ΔpΔq ≥ h/2π.

[3] I am dismissing, perhaps unfairly, an instrumentalist understanding of quantum mechanics, and assuming an ontological commitment on the part of the Copenhagen theorist.

[4] Momentum equals mass times velocity. Due to the equivalence of mass and energy in relativity theory, the particle’s velocity will still be limited by the speed of light, as the particle’s mass approaches infinity as the velocity approaches c.

 

[5] Principles of Human Knowledge, Part I, sect. 3.

[6] E.D. Klemke (ed.), Contemporary Analytic and Linguistic Philosophies (Buffalo: Prometheus, 2000), p. 31.

[7] P.C.W. Davies and J.R. Brown (eds.) The Ghost in the Atom (Cambridge: Cambridge, 1986), pp. 34-38. Much of the critical analysis of this position is also taken from this section.

[8] David Deutsch describes this experiment in P.C.W. Davies and J.R. Brown (eds.) The Ghost in the Atom (Cambridge: Cambridge, 1986), pp. 95-101.

[9] Although, as I will show later, all three versions of quantum theory require us to abandon locality, this is usually only perceived as a weakness for the hidden variable theory. My own suspicion is that this is a historical product of Einstein’s own advocacy of a (local) hidden variable theory in his long-running debate with Niels Bohr.

[10] The details of Bohm’s theory are presented in D. Bohm and B.J. Hiley, The Undivided Universe (London: Routledge, 1993).

[11] Thagard, p. 81.

[12] Ibid., p. 79.

[13] For the most thorough non-technical treatment of string theory, see Brian Greene, The Elegant Universe (New York: Vintage, 1999).

[14] Thagard, p. 86.

[15] Ibid.

[16] Ibid., p. 87.

[17] Also referred to as metaphysical simplicity; hence the potential confusion.

[18] Ibid., p. 89.

[19] Victor Stenger notes that an instrumentalist application of the Copenhagen interpretation denies ontological status to the wave function. In this case, we do not have to accept non-locality, but at the cost of saying nothing about the world. Timeless Reality (Buffalo: Prometheus, 2000), pp. 124, 135.

[20] Thagard, p. 90.

 

[21] “...we may characterize the best explanation as the one which would, if correct, be the most explanatory or provide the most understanding: the ‘loveliest’ explanation.” Lipton, p. 61.

[22] Ibid., p. 119.

[23] Lipton does not clearly distinguish between the two uses on pp. 118-119.

[24] The standard form of Bayes’ Theorem is P(H|E) = P(H) * P(E|H) / P(E). P(H|E) is the probability of H conditioned on E, P(H) is the prior probability of H, P(E|H) is the likelihood of E given H, and P(E) is the expectedness of E.

 


 


Copyright © 2003
philosophy@wmich.edu
Last Updated: June 23, 2004