PREDICTION AND THE STRONG ADVANTAGE THESIS
John Shoemaker
Abstract:
Philosophers of science frequently ascribe special epistemic significance to successful
predictions. In his Inference to the Best Explanation, Peter Lipton makes
a case for what he calls the "Strong Advantage" thesis (SA): "[A] successful prediction tends to provide more
reason to believe a theory than the same datum would have provided for the same
theory, if that datum had been accommodated instead" (Lipton, 1991, 134). I argue that SA is inconsistent with Lipton's own example of "twin
scientists." An examination of Lipton's argument in the light of Bayesian
considerations regarding the relation of theory and evidence clarifies the intuitions
that lead us, mistakenly, to endorse SA.
In the mid-19th century, scientists confronted an unruly zoo of basic chemical elements which seemed to elude any theoretical ordering or pattern (Maher, 274-5). [1] In 1871, Dmitri Mendeleyev proposed that the 60 known chemical elements be arranged according to their atomic weights such that certain family resemblances emerge. When Mendeleyev placed the elements within columns, drawn according to their manifest properties and their respective oxidization and reduction states, he saw that these also formed rows so that the weight of the element in each successive column increased by roughly whole-valued increments. But arranging the elements in this fashion left “holes” in the table, prompting Mendeleyev to predict the weight and characteristics of several then undescribed elements: he named them “eka-aluminum,” “eka-boron,” and “eka-silicon.” His proposal was received with some skepticism by the scientific community, but the subsequent discovery of gallium, scandium, and germanium – the properties of which corresponded to Mendeleyev’s hypothesized elements – quickly convinced the Royal Society to award him the Davy Medal (Maher, 274; Lipton, 134; Ihde, 243-9). [2]
As some interpret this historical case, it was not so much Mendeleyev’s systematization of the elements as his prediction of the existence and properties of several then unknown elements that won him the Davy Medal. From this and similar anecdotes from the history of science, many philosophers of science conclude that it is in general better to predict an observation than to first observe an event and then accommodate it into a theory. For Popperians, including Lakatos, the prediction of novel observations is characteristic of progressive scientific programs (Lakatos, 1970). [3] However, the impression that predictive success makes on the imagination is hardly limited to those minds given to a Popperian persuasion, as William Whewell’s description of the phenomenon demonstrates:
"Men cannot help believing that the laws laid down by discoverers must be in a great measure identical with the real laws of nature, when the discoverers thus determine effects beforehand in the same manner in which nature herself determines them when the occasion occurs. Those who can do this, must, to a considerable extent, have detected nature's secret; – must have fixed upon the conditions to which she attends, and must have seized the rules by which she applies them. Such a coincidence of untried facts with speculative assertions cannot be the work of chance, but implies some large portion of truth in the principles on which the reasoning is founded” (Whewell, 1858, ch. 5, sec. 3, pt. 10).
In other words, predictive success seems to warrant the inference that “nature’s secret” has been discovered in some respect – indeed, arguments for scientific realism are often founded on this very inference. [4] Whewell’s observation also gives us a more precise way of understanding the abovementioned advantage prediction is thought to have over and against accommodation. If predictive success gives us more reason to increase our confidence in a theory’s truth than accommodative success, clearly an epistemic advantage is being described. Thus some philosophers conclude that a successful prediction – a prediction about an as yet unknown experimental outcome or observation that is subsequently found to obtain – tends to confirm a theory (i.e. the theory from which the prediction was derived) more than a successful accommodation of the same events into a theory. [5] This claim is typically justified by an appeal to an inference to the best explanation: the best explanation for a theory’s predictive success is that the theory is approximately true, whereas the best explanation of a theory’s accommodative success is that the theory was designed to fit the data. The latter explanation is then supposed to compete with, preclude, or at least partially “screen off,” the so-called “truth” explanation.
Consideration of Mendeleyev’s achievement, or of similarly dramatic predictive successes, does not warrant the conclusion that prediction itself carries a truly special epistemic value over and against “mere” accommodation, despite arguments which militate on behalf of this claim. A number of thinkers who have written in opposition to the predictive advantage claim have argued convincingly that a rational assessment of evidence regards predicted and accommodated evidence with indifference – most notably, J. M. Keynes (1921), and more recently, Paul Horwich (1982), George Schlesinger (1987), and Robin Collins (1994). In fact, Peter Lipton, whose rationale for a special predictive advantage thesis will shortly be examined, has conceded that a number of the standard arguments given in support of some kind of “predictive advantage” thesis ought to be abandoned as inadequate, or have been satisfactorily refuted (Lipton, 136-8); accordingly, I will concentrate my thoughts on Lipton’s original argument.
The “Strong Advantage” Thesis
In the eighth chapter of his Inference to the Best Explanation, Lipton offers arguments in support of what he dubs the “strong advantage” thesis – “According to the ‘strong advantage’ thesis… a successful prediction tends to provide more reason to believe a theory than the same datum would have provided for the same theory, if that datum had been accommodated instead” (Lipton, 134). [6] After reviewing three common arguments for such a preference, and after one-by-one refuting them, Lipton proceeds to his own argument. It will be instructive as my own exposition and argument unfolds to return to some of the details of these “common” arguments; however, I will attend to Lipton’s own argument primarily. I will contend that Lipton’s arguments fail to demonstrate the truth of the “strong advantage” thesis. However, Lipton’s actual conclusion regarding the advantage of prediction for fallible rational agents remains and deserves attention as well. I will argue that this latter conclusion is limited in scope – it does not recommend a preference for predictive (as opposed to accommodative) success to scientists considering both theory and evidence.
Some clarification of the “strong advantage” (hereafter, the SA) thesis is required. What is at issue is whether the fact that a given observation was predicted by a certain theory ought to give us more reason for believing the theory than if the very same observation were known to have been accommodated by the very same theory. [7] The stipulation that the theory and observational data be the same in both cases serves to isolate the effect prediction has in providing additional evidential value, over and against accommodation. We ought to concern ourselves with this because, as Robin Collins explains,
“[A]ll sides to the debate agree that knowing that a theory predicted, instead of accommodated, a set of data can give us an additional reason for believing it is true by telling us something about the structural/relational features of the theory…. [T]he issue is whether or not the information that a theory predicted, instead of accommodated, a set of data should increase our confidence in the theory given that we already know the relevant structural/relational features of the theory” (Collins, 213).
Those who have championed inference to the best explanation (IBE) – Paul Thagard and Peter Lipton, especially – have offered plausible criteria for discerning better explanations. However, it should be clear that acceptance of these criteria is consistent with a rejection of the SA thesis. [8] Lipton is evidently aware of this, and he takes care to contrast the SA thesis with the “weak advantage” (hereafter, the WA) thesis, according to which “predictions tend to provide more support than accommodations, because either the theory or the data tend to be different in cases of prediction than they are in cases of accommodation” (Lipton, 134). The WA thesis violates what I will call the ceteris paribus conditions explicit in the SA thesis – that is, the relevant theory and observations under consideration in the WA thesis are not assumed to be the very same. Lipton seems sure that the WA thesis is acceptable and for the sake of the present argument, I will concede this. Nevertheless, it remains to be seen whether, given the SA thesis’ ceteris paribus conditions, a theory is better confirmed by predictive success.
Imagine twin scientists, Al and Bert, who “independently and coincidentally” construct one and the same theory, say, the general theory of relativity (hereafter, GTR). [9] Al, on the basis of some background knowledge K, [10] posed GTR, and from GTR predicted E, which, it turns out, was subsequently observed. Bert, on the other hand, though he shares K, posed GTR to accommodate E. In both Al and Bert’s GTR, we may assume, the same auxiliary hypotheses are employed to explain observations. We may further assume that GTR does indeed entail E, and moreover that whatever Al’s GTR entails/explains, Bert’s GTR entails as well. Moreover, because both Al and Bert’s theories are the same, we may infer that both of their theories have what Robin Collins calls identical “structural/relational features” (Collins, 212). These features are essential to a theory since they are the logical/semantic and evidential relations embodied in the theory. It will prove important to remember that “loveliness” (as Lipton describes it) is simply an epistemic evaluation of these structural/relational features (as Collins describes them).
According to the SA thesis, Al has more reason to believe GTR than does Bert, even though both know, independently of one another, that GTR explains the observed E, and this is because E was predicted, rather than accommodated, in Al’s story. In Lipton’s words, we should conclude that
“… the predictor [Al] has more reason to believe the theory [GTR] than the accommodator [Bert], though they share theory [GTR], data [E], and background beliefs [K]” (Lipton, 135). [I have inserted the bracketed names for clarity.]
Lipton confesses that this is not intuitively obvious. However, adopting Whewell’s terms, we might say that Al’s successful prediction of E tempts us to infer that Al had some insight into “nature’s secret,” since his predictive success could not be plausibly explained by either chance or by doctoring the theory to fit E, since E was unavailable to him. On the other hand, Bert’s success can apparently be given a more mundane interpretation – like a typical student of mathematics, he knew the answer (E) was in the back of the book and could simply make sure his theory entailed E. Might not knowing that Bert designed GTR to fit E make the truth of GTR irrelevant to Bert’s constructing it? Isn’t Bert’s knowing that his theory had to fit E a sufficient explanation for GTR’s fitting E – after all, we require no other appeal to some insight into “nature’s secrets” (as with Al)?
An Argument for Indifference
How is it that one explanation of an observation should come to compete with, or undermine our rational degree of confidence in, another explanation of the same observation? [11] Horwich offers a case in which an observation is made – “The car won’t start” – call this F, and two explanations are offered. First, it might be explained, “The car is out of gas” (HG); second, “The starter is broken” (HS) is offered to explain the car’s not starting. We can represent the following explanatory relationship in Bayesian terms.
1: P(HG/F) = P(HG)P(F/HG)/P(F)
2: P(HG/F&HS) = P(HG/HS)P(F/HG&HS)/P(F/HS)
In step 1, upon learning that the car won’t start, we have more reason to believe that the car is out of gas than we did before learning F (P(HG/F) > P(HG)). Prior to learning F, we were not certain that the car would fail to start (P(F) < 1), and if HG obtains, that would satisfactorily explain F, so P(F/HG) = 1. However, in step 2, upon learning not only that the car will not start but also that the starter is broken, my rational degree of confidence in HG should fall back to its original value, P(HG). Horwich’s rationale is that P(F/HS) = 1, since HS explains F; P(F/HG&HS) = 1, since, if the starter is broken and the gas tank is empty, I am certain that F; and, since HG and HS obtain independently of one another – presumably there is no connection, causal or otherwise, between the starter’s being broken and the gas tank’s being empty – P(HG/HS) = P(HG). This result reflects common sense: that the starter is broken sufficiently explains the car’s not starting, and to look for a further reason seems superfluous. [12] In this way, HS’s obtaining competes with, or undermines our confidence in, HG.
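Horwich’s two steps can be checked numerically. The following is only a minimal sketch and is not drawn from Horwich’s text: the priors 1/10 and 1/20 are arbitrary illustrative assumptions, HG and HS are taken to be independent, and the car is assumed to fail to start exactly when at least one of them holds (so that the likelihoods equal 1, as stipulated above).

```python
from fractions import Fraction
from itertools import product

# Illustrative priors (assumptions, not taken from the text).
P_HG = Fraction(1, 10)  # prior that the car is out of gas
P_HS = Fraction(1, 20)  # prior that the starter is broken

def joint():
    """Yield (hg, hs, f, probability) for each possible world.

    HG and HS are independent; F (the car won't start) holds exactly
    when HG or HS holds, so P(F/HG) = P(F/HS) = P(F/HG&HS) = 1.
    """
    for hg, hs in product([True, False], repeat=2):
        p = (P_HG if hg else 1 - P_HG) * (P_HS if hs else 1 - P_HS)
        yield hg, hs, (hg or hs), p

def prob(event, given=lambda hg, hs, f: True):
    """Conditional probability P(event / given) over the joint distribution."""
    num = sum(p for hg, hs, f, p in joint() if given(hg, hs, f) and event(hg, hs, f))
    den = sum(p for hg, hs, f, p in joint() if given(hg, hs, f))
    return num / den

def is_hg(hg, hs, f):
    return hg

step1 = prob(is_hg, given=lambda hg, hs, f: f)         # P(HG/F)
step2 = prob(is_hg, given=lambda hg, hs, f: f and hs)  # P(HG/F&HS)

print(step1 > P_HG)   # learning F raises confidence in HG
print(step2 == P_HG)  # also learning HS returns it to the prior
```

Under these assumptions the second comparison holds exactly, whatever priors are chosen, because the independence of HG and HS is built into the joint distribution.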
By contrast, suppose a neighbor of mine does not know that my car is green (F is “My car is green”). I may offer either of the following to him as information: first, that I insisted on purchasing a green car (HI); second, that the previous owner of my car painted it green (HP). [13] As before, we have the relationship expressed in a Bayesian formalism:
1’: P(HP/F) = P(HP)P(F/HP)/P(F)
2’: P(HP/F&HI) = P(HP/HI)P(F/HP&HI)/P(F/HI)
As in step 1 above, we see that in step 1’, upon learning that F, my neighbor has greater reason to believe that the previous owner had the car painted green, since P(F) < 1. In this case, however, my neighbor’s learning that I would only have bought a green car does not at all undermine his confidence in HP, since there is a positive dependence between the previous owner of this car – my car – painting the car green and my owning this particular green car (i.e. P(HP/HI) >> P(HP)). If the previous owner of this car had not painted it green, I would not have bought it.
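The contrast with the car that won’t start can be illustrated in the same way. This is a hedged sketch: the joint probabilities over HP and HI are hypothetical numbers chosen only to make the two hypotheses positively dependent, and the car is assumed to be green whenever either hypothesis holds, so that every likelihood in step 2’ equals 1.

```python
from fractions import Fraction

# Hypothetical joint probabilities over (HP, HI); the numbers are
# assumptions chosen only to make HP and HI positively dependent.
JOINT = {
    (True, True): Fraction(15, 100),
    (True, False): Fraction(5, 100),
    (False, True): Fraction(5, 100),
    (False, False): Fraction(75, 100),
}

def car_is_green(hp, hi):
    # Simplifying assumption: F holds whenever HP or HI holds,
    # so P(F/HP) = P(F/HI) = P(F/HP&HI) = 1.
    return hp or hi

def prob(event, given=lambda hp, hi: True):
    """P(event / given) computed from the joint table."""
    den = sum(p for (hp, hi), p in JOINT.items() if given(hp, hi))
    num = sum(p for (hp, hi), p in JOINT.items() if given(hp, hi) and event(hp, hi))
    return num / den

def is_hp(hp, hi):
    return hp

prior = prob(is_hp)                                # P(HP)
given_hi = prob(is_hp, given=lambda hp, hi: hi)    # P(HP/HI)
step2 = prob(is_hp, given=lambda hp, hi: car_is_green(hp, hi) and hi)  # P(HP/F&HI)

print(step2 == given_hi)  # step 2' reduces to P(HP/HI)
print(step2 > prior)      # which stays well above the prior P(HP)
```

Here, unlike in the car-starting case, conditioning on the second hypothesis does not return the probability of the first to its prior, because the joint table encodes the dependence between them.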
Horwich asks us to consider these two cases: first, the case where my car will not start, and second, the case where my car is painted green. Which is Bert’s case like, the car’s not starting, or my car’s being green? Insofar as Bert’s accommodating GTR to fit E is dependent on E’s obtaining, his case resembles the latter. Adopting Horwich’s conventions, we have:
F: GTR fits E
HT: GTR is true
HR: Bert required a theory which would fit E
We may apply Bayes’ theorem as before:
1”: P(HT/F) = P(HT)P(F/HT)/P(F)
2”: P(HT/F&HR) = P(HT/HR)P(F/HT&HR)/P(F/HR)
Just as P(HP/HI) > P(HP), since my buying this particular car is not independent of the car’s being green, so P(HT/HR) > P(HT) because the information that GTR successfully accommodated E does increase our confidence that GTR is true. Once again employing a counterfactual, if E did not obtain, Bert would not have posed GTR.
Maher has urged that this conclusion simply begs the question against the SA thesis (Maher, 274). Accommodative success has some evidential value, no doubt, but the SA thesis allows this. What the SA thesis denies is that P(HT/F&HR) = P(HT/F). If accommodation is a liability, the burden is on Maher to specify it. Perhaps, as the quote from Whewell suggests, Al has an insight into “nature’s secret” that Bert does not. But then we must inquire into the nature of such an intuition. In our twin paradox, we are assuming that both Al and Bert share background knowledge K. [14] Is Al better at making inferences from his background knowledge than Bert? Perhaps, but then we allow that Bert has inconsistent degrees of belief, and nothing about accommodation implies inconsistency. [15] Maher must either awkwardly posit a mysterious intuition into “nature’s secret” that Al has and Bert doesn’t (this strikes me as an unnecessary stretch) or deny the conceptual possibility of the twin-experiment. The latter cannot be maintained, though, since the stipulation that the twins have K and their inferential capabilities in common is merely an indirect way of describing a single individual in two hypothetical situations.
In Lipton’s own argument, the competing explanations are the “fudging” explanation (rather than the accommodation explanation) and the “truth” explanation. Lipton hopes to avoid the above objection (that the information that a theory accommodated, rather than predicted, E is irrelevant to the truth of the theory) by arguing that it is the fudging, and not the accommodation, that is suspect and that preempts the “truth” explanation.
The Risk of Fudging
What is it about prediction that implies that Al’s GTR should enjoy more support from E than does Bert’s? For Lipton, it is better to ask what it is about accommodation that diminishes the support Bert’s GTR gets from E. Lipton makes his argument for the SA thesis explicit in the three following claims (Lipton, 140). The first, I will call M:
M: “When data need to be accommodated, there is a motive to force a theory and auxiliaries to make the accommodation.”
Lipton calls the forcing of a theory or auxiliaries to fit an observation “fudging.” By “fudging,” I take him to mean the ad hoc (in a non-technical sense) creation or modification of a theory or auxiliary hypothesis, which produces theories that are unlovely – and, Lipton suggests, less plausible and/or likely (Lipton, 142). [16] We can imagine Bert’s situation: he wants a theory that explains E. And despite all the attention given to the underdetermination of theory by evidence, he struggles to come up with even one that adequately explains E. In desperation, Bert might annex E into his background knowledge as brute fact – “That’s Mercury for you.” This appears to be a paradigm case of fudging: the modification complicates K without bringing much explanatory power, if any. This does not rule out the proposed explanation; however, its initial plausibility and explanatory power are much less than, say, the hypothesis that a nearby planetoid (call it Vulcan) perturbs Mercury’s orbit in the observed way. [17] M does not imply that accommodating theories or auxiliaries are usually also fudged; it merely indicates the tendency of a human psychological state of affairs to obtain (e.g. in Bert’s mind) when accommodation is required. M is intended to capture the real risk that accommodation presents – fudging. [18]
Lipton combines M with P:
P: “In the case of prediction, by contrast, there is no motive for fudging, since the scientist does not know the right answer in advance.”
to give S:
S: “… [T]here is reason to suspect accommodations that does not apply to predictions, and this makes predictions better.”
If I know that a theory has accommodated observations then I have defeasible evidence of fudging (Lipton, 148). Since we are supposed to prefer lovely explanations and theories, it follows that we should want to avoid fudged explanations and theories. From this, in turn, it is supposed to follow that we should assign less confidence to accommodations and prefer predictions. That a theory successfully accommodates some data has (at least) two possible explanations: one, the “truth” explanation – the theory is approximately true, and two, the fudging explanation – the theory or some of its auxiliaries have been fudged to fit the data. More importantly, the fudging explanation preempts the “truth” explanation; if the fudging explanation satisfactorily explains the fact of an accommodation’s fit with the data, then no explanatory work (or very little) is left to the “truth” explanation.
We must take care not to identify the so-called “fudging explanation” with the “accommodation explanation.” Lipton, earlier in his eighth chapter, rejects one argument for the SA thesis, conceding to Horwich (1982) that the accommodation explanation itself may not preempt the “truth” explanation (Lipton, 138). But whereas the accommodation explanation fails to compete with the “truth” explanation, the fudging explanation, Lipton contends, does compete with it.
Two Flavors of Fudge
Does SA follow from S, as it applies to our “twin paradox”? In the case of Al and Bert, S indicates that Bert has defeasible reason to suspect that he fudged GTR to fit E, whereas Al has no such reason, since he predicted E. Surely there is something odd about concluding this in Bert’s case, though. Al’s GTR is identical to Bert’s GTR. If Bert’s GTR might be fudged, why not Al’s? We might also ask why Bert should suspect his accommodation of fudging. Since he has constructed GTR to accommodate E, it would seem to follow that he knows the relevant structural/relational features of GTR. If Bert does in fact know the relevant structural/relational features of GTR, then he also knows whether GTR is fudged or not. But here we come face to face with a conceptual ambiguity that Lipton ignores. For, on the one hand, we may speak of “fudging,” whereby we refer to a kind of inferential process, which is more or less unreliable at producing good explanations. This is the sense in which Lipton makes his rebuttal against Horwich. On the other hand, we may describe a theory as “fudged” when it has, or lacks, certain characteristics – this, Lipton explains, is why “fudged” explanations are less confirmable than more “lovely” ones. [19]
If we interpret Lipton as making the latter claim, the risk of “fudging” is no problem at all and SA collapses. Horwich, as we have seen, shows that if we confine our assessment of the plausibility of theories to considerations of the probabilistic relations between explanations and observations – captured in the Bayesian reconstruction of inference as depending upon the prior probability of the explanation, the likelihood of the observation given the explanation’s truth, and the expectedness of the observation – both Al and Bert, given our ceteris paribus conditions, ought to have the same degree of confidence in GTR at the end of the day, without consulting each other. Each has merely to look and see whether the theory is fudged or not.
At this point, Lipton’s reply might simply be, “If the risk of fudging isn’t completely captured by the constraints on a Bayesian model, then so much the worse for the good Reverend Bayes!” In fact, Lipton is hardly so hostile to Bayesian models of inference, as his reference to Patrick Maher’s article – a Bayesian attempt at what amounts to the SA thesis – shows. Moreover, he would surely be loath to abandon what amounts to a central notion in IBE (“fudged” is an epistemic evaluation founded on the presence or lack of explanatory virtues, without which IBE is a failed project), in order to persuade us of the SA thesis. [20] So what remains of Lipton’s argument?
“Actual” Versus “Assessed” Support
The “twin paradox,” as I’ve called it, is misleading. If some third person (or the twins themselves when they meet to compare notes) actually knows that Al and Bert have each constructed the same theory, then it must be possible to know this. If it is possible to know this, given that “fudged” is a description of a theory’s structural/relational features, it is also possible to know whether GTR is “fudged.” Should Al and Bert meet and compare notes, each must be able to recognize the other’s evidence (E) and theory (GTR) as identical to his own. Otherwise, Lipton’s exposition and solution to the paradox are absurd. If all Lipton wants to claim is that each twin’s theory and evidence seem the same, as far as each can tell, then we must be allowing that Al’s GTR is not the same as Bert’s GTR.
Clearly, Lipton has not offered any argument for the SA thesis. SA makes itself out to be a claim regarding the actual additional support observational evidence lends when it is predicted, as opposed to when it is accommodated. Both Collins and Horwich have argued cogently that SA is false; in fact, Lipton himself seems persuaded (Lipton, 136-8). Despite this, Lipton persists in making a weaker claim about the assessed (as opposed to actual) additional support observational evidence gives when predicted, as opposed to accommodated. This explains why Lipton implicitly ignores the ceteris paribus clause of the SA thesis while arguing for SA; Lipton denies that persons are typically in possession of enough relevant information concerning the structural/relational features of a theory and the certainty of the data to properly assess the actual confirmational relations between them. Perhaps a more modest view of our human rationality should take account of such ignorance. Where the relevant confirmational relations are opaque, we may see room for the “fudging explanation” to contribute to our evaluation.
“The actual support that the evidence gives to theory does not depend on the information that the evidence was accommodated or predicted but, since we can only have imperfect knowledge of what this support is, the information is epistemically relevant” (Lipton, 152).
For most of us, so the reasoning might go, complicated scientific theories are mysterious and we believe them because we are assured that scientists are competently carrying on (testing their theories, rejecting those that don’t fit, attempting to explain things in better ways, and the like). Although the inferences some scientists make are not “black boxes” to them, they are so to us, so any information we can get concerning the reliability of some scientific practices would surely help the layperson in making an independent rational assessment of theories and evidence. [21]
However, Lipton also maintains that his claim applies to scientists who are aware of the relevant facts:
“The assumption that support is transparent to the investigator, that no distinction need be drawn between actual and assessed support, is an idealization, very similar to the idealization epistemologists sometimes make that people are deductively omniscient, so that they know all the deductive consequences of their beliefs” (Lipton, 152).
I happily concede the point that humans are fallible and frequently unaware of the logical consequences of their background beliefs. [22] This amounts to little for Al and Bert, though. However poor either is at deduction, as long as each is aware that his GTR is identical to the other’s, they know that whatever Al’s GTR renders probable, Bert’s does equally.
Though the considered arguments in support of a special predictive advantage (like SA) are misguided, it should be clear that what all parties to the discussion may still agree on is that theories ought to have the power to specify observations (given certain K). This power can be defined in terms of the ordered pairs of the form <Ei, r> it specifies. The first member of the pair (Ei) represents some bit of observation or evidence, whereas the second (r) is the likelihood that bit of observation or evidence has conditional on H and K, i.e. P(Ei/H&K) = r. [23]
Given that Al and Bert have K in common, it follows that the power of GTR is identical for both, whether or not either knows it. This is simply a formalization of the easily passed-over truth that any theory predicts, or accommodates, what it does just in virtue of the theory itself.
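The point that a theory’s power depends only on H and K, and not on the route by which a scientist arrived at the theory, can be put as a small sketch. Everything here is a hypothetical illustration: the `Scientist` record, the placeholder content of K, and the likelihood table are assumptions introduced for the example; the value r = 1 for E reflects only the earlier stipulation that GTR entails E.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Scientist:
    name: str
    route: str             # "prediction" or "accommodation"
    theory: str            # label for the hypothesis H
    background: frozenset  # K, as a set of accepted statements

# Placeholder background knowledge shared by both twins (an assumption).
K = frozenset({"shared background knowledge"})

# Hypothetical likelihood table: (H, K) -> {E_i: r}, with r = P(E_i/H&K).
# GTR is stipulated to entail E, so r = 1.0 for E.
LIKELIHOODS = {
    ("GTR", K): {"E": 1.0},
}

def power(scientist):
    """Return the set of pairs <E_i, r> fixed by the scientist's H and K.

    The route by which the theory was reached is deliberately not consulted.
    """
    table = LIKELIHOODS[(scientist.theory, scientist.background)]
    return frozenset(table.items())

al = Scientist("Al", "prediction", "GTR", K)
bert = Scientist("Bert", "accommodation", "GTR", K)

print(power(al) == power(bert))
```

Because `power` takes only the theory and the background knowledge into account, the comparison cannot come out differently for a predictor and an accommodator who share both.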
A Table For Sixty-Three
We can now see more clearly that Mendeleyev’s achievement was not a matter of producing a prediction from “thin air,” as it were, only later confirmed by experiment. Rather, his prediction was an extension of the systematic organization he described in the chemical zoo. Indeed, without this background knowledge (imagine a possible case in which he was aware of only three elements), Mendeleyev’s table would have been, initially, even more implausible. The predictive success that his periodic table theory would have enjoyed would be perhaps more startling for those who found Mendeleyev’s periodic table improbable. However, this is consistent with the indifference claim I have argued for, and follows from Bayesian considerations. [24] In the actual case, what had once appeared to the community of chemical scientists of the 1860s as a brute fact resistant to explanation, namely the various features of the chemical elements, now had a unifying, even lovely, explanation. Had Mendeleyev instead accommodated the three elements (gallium, scandium, and germanium) into the very same system, each would have found the same place at the same table reserved and waiting open, just as in Mendeleyev’s actual prediction. Finally, as I have urged, neither our counterfactual Mendeleyev (analogous to Bert), nor the actual Mendeleyev (analogous to Al), should worry that he had mysteriously fudged his table, and our “twin Mendeleyevs” should have the same degree of confidence in their table. [25]
[1] Maher’s history is itself derived from the work of historian of chemistry Aaron Ihde’s The Development of Modern Chemistry. Maher does not mention Mendeleyev’s preliminary table published in 1869, which shares the basic organizing principles with its later incarnation. However, Maher does mention some other chemists’ previous attempts at discerning a pattern (Ihde, 236-43). [2] Interestingly, physics underwent a very similar crisis in the 1970s – a “particle zoo” of subatomic particles, each with its own spin, charge, strangeness, charm, etc. Quark theory held out the best hope of simplifying the mess, and has since enjoyed some experimental confirmation. [3] Though, for my purposes, the so-called “critical rationalist” position – the position of Popper and Lakatos – will not be addressed (not directly, at least), a conclusive answer to the question “Is there a truly special epistemic value attached to prediction, as opposed to ‘mere’ accommodation?” would surely contribute to an assessment of the merits of Popperian philosophy of science. [5] Patrick Maher cites a distinguished company of scientists and philosophers who claim for prediction a special epistemic value (which value an accommodation lacks): “Leibniz (1678), Huygens (1690, preface), Whewell (1847 vol. 2, p. 64f.), Peirce (1883), Duhem (1914, ch. II, §5), Popper (1965, p. 241f.), Lakatos (1970, p. 123), and Kuhn (1977, p. 322)” (Maher, 273). [6] I will move somewhat loosely between the language of inference to the best explanation, which lends itself to talk of explanations, and the language of theories and hypotheses. Nothing in my arguments hangs on a distinction between them. [7] Robin Collins offers his own version of the SA thesis, which he calls the SEP – the “special epistemic value of prediction” – thesis (Collins, 212). 
According to SEP, “[T]he information that a theory predicted instead of accommodated a set of data should increase our confidence in its truth given that we already know the relevant structural/relational features of the theory” (Collins, 215). Collins considers the case of an appraiser who is evaluating the objective degree of support a theory (like GTR) has: after deliberating on the matter, he learns that the theory was designed to accommodate one or more of its observational consequences. The question, as Collins sees it, is whether the occurrence of “certain psychological events in those who developed [a theory]” has any relevance to the truth of the theory (Collins, 213). Although Collins does not mention them, his argument is easily applied to the case of the twin scientists. The only difference between Al and Bert, under this analysis, is that a certain psychological process occurred in Bert’s case, whereas it did not in Al’s. Collins then argues that such psychological processes can have no non-mysterious connection to the truth of GTR (Collins, 216-9). [8] As we shall see, the rejection of the SA thesis may follow from accepting these criteria as objective (if they are indeed criteria at all). [10] It is easy to forget that in Al’s case, unless he is a Popperian scientist bent on bold or implausible conjecturing, he is responding to some part of K which seems to require explanation in the first place. [12] For simplicity’s sake, I follow Horwich’s convention of assigning unity to likelihoods that effectively explain the observational evidence in question. A more accurate assessment would set these values to something a bit less than 1. However, it is plausible that, nevertheless, the “abductive ratio” (given by P(O/HP)/P(O)) equals 1. [13] Maher, wondering aloud, writes “Now the fact that my car is green does not imply that it was green when I bought it, and so it is reasonable to suppose that… P(HP/HI&F) > P(HP/HI), contrary to Horwich’s claim. 
So if there is any analogy here at all, it is one that supports the predictivist thesis” (Maher, 283). There is no analogy. If GTR is true, it was true before Einstein conceived of it – at least, so says the realist. Perhaps to clear away any confusion, Horwich should have made HP say, “The previous owner of this car painted it an ineradicable green”.
[14] Maher gives the example of a coin-flipping experiment: A predicts the outcome of all 100 tosses before the experiment, while B waits for 99 tosses to be observed, “accommodates” them, and predicts the outcome of the 100th toss. After 99 tosses, suppose all of A’s outcome predictions were correct. Both A and B predict that the 100th toss will yield “heads.” While, on the one hand, we are inclined to say that the probability that B is correct in his prediction is no better than chance, we believe that A’s chances are much better – even though both A and B have specified the same outcomes. From a Bayesian formalization of this kind of experiment, Maher concludes something like the SA/SEP thesis (Maher, 274-281, esp. 275-6). This is surely a mistake, though. It is one thing to claim that “Whatever A predicts about the outcomes of the coin toss is probably right,” but quite another to say that “A’s theory is probably right.” As with Whewell, we think that A has some insight into coin-flipping that B (as well as the rest of us!) does not. However, this insight remains a “black box” mechanism. Maher avoids describing what A’s theory is, the point being, presumably, that it does not matter. This is false. Theories are not, like A’s prediction or even B’s accommodation, sets of ordered pairs {experiment, outcome} produced via some mysterious insight. Instead, theories are supposed to be explicit characterizations of the world, which explain, or have as their entailments (conjoined with auxiliary hypotheses), these ordered pairs.
In Maher’s thought-experiment, we think that A and B should lay their cards on the table, making their “theories” of coin-flipping explicit, if possible. Therefore, Maher’s thought-experiment is not a case where two identical theories with the same entailments (99 of which are known to obtain) are more or less plausible depending on the background knowledge of A or B. Instead, we have a competition of two “black box” mechanisms, one (A) which seems to have proven itself reliable in 99 trials, the other (B) which has no track record at all to give us any confidence in its reliability. While some inference may have this “black box” character, scientific inference does not. For example, Newton’s laws of motion and gravity, given certain boundary conditions and auxiliary hypotheses, specify orbits for the planets. It is not the case that Newtonian physics just is a naked assertion of the case-by-case ordered pairs that Newton, or anybody else, had in mind when contemplating the motion of planets, or apples, or any other projectile. On the contrary, Newton’s laws of motion and gravity are made explicit, and from these and other background beliefs or auxiliary hypotheses we infer an observational consequence (e.g. any two massive bodies will, given there are no other forces acting on them, have a motion described by certain equations).
[15] There is a fair amount of literature devoted to the problem of conditionalizing on “old evidence.” Lyle Zynda (1995) describes the efforts of Richard Jeffrey (1991), Bas van Fraassen (1988), and Daniel Garber (1983), in particular. Generally, attempted solutions involve either partial conditionalization (as with Jeffrey) and/or models of “logical learning”, where an individual discovers that a certain logical and/or probabilistic relation holds between propositions she believes.
[16] As we shall see, Lipton must maintain a distinction between “fudging” and “accommodation,” and so we should avoid conflating fudging with accommodation. Assuming Horwich’s (1982) arguments regarding competing explanations of truth versus “tailor-made” are adequate to answer arguments endorsing the SA thesis (aside from Lipton’s), it appears that if Lipton fails to make a clear and real distinction between fudging and accommodation, his argument will collapse into a kind already addressed (by Horwich, Collins, etc.).
[17] I realize that to make even this relatively meager claim is to invite a storm of questions concerning the relationship of the explanatory virtues, embodied in Lipton’s “loveliness” criteria, or perhaps Thagard’s (1978, 79). Setting these to one side, the Vulcan-hypothesis certainly bears a resemblance to the Adams-Leverrier Neptune-hypothesis, which proved itself in 1846 (Snyder, 461-3), so it bears analogy to the kinds of explanations offered in Newtonian physics. However, nothing in my argument turns on successfully establishing a connection, logical or otherwise, between loveliness and plausibility or likeliness.
[18] Lipton confides in us that such fudging may be innocent, and in some cases, even salutary. By and large, however, fudging produces poor explanations of observations, in that fudged explanations themselves tend to be “unlovely” (Lipton, 142).
[19] Or, to put a more Bayesian spin on it, fudged theories tend to have a lower prior probability than non-fudged ones do.
[20] It is strange that, despite allowing Horwich’s argument against a predictive advantage claim to stand – rejection of such claims is compatible with acceptance of the IBE criteria – Lipton yet chooses to defend the claim by complicating his own model of inference to the best explanation.
[21] Premise P states that accommodation is riskier than prediction, since there is the threat, defeasible as it might be, of fudging. However, in the limit of ignorance concerning a scientific theory and/or its relevant evidence, all kinds of factors rear their heads to be considered in attributing authority. Clearly, these factors must be considered case-by-case. Whether X predicted Y will prove a dubious criterion by which to judge X’s theory.
[22] It is, in fact, critical to understanding the role of “old evidence” in our reasoning that we recognize our inferential limitations. See note 15.
[23] This articulation of “power” is derived from a comment from Tim McGrew.
Also note that in the special case of entailment ([H & K] ⊃ Ei), the pairs take the form <Ei, 1>.
[24] If S’s degree of confidence in Mendeleyev’s counterfactual prediction of 60 elements, given three elements accommodated, can be represented as P(H/K&E4&E5&E6&…E63), where En is the observation and description of chemical element n, and K = E1&E2&E3, we can rewrite this as P(H/E1&E2&E3&…E63). This is identical to the representation of S’s degree of confidence in H (after 63 observations) regardless of at what stage our hypothetical Mendeleyev had stopped accommodating and started predicting. For a slightly different version of this argument see (Horwich, 110).
[25] Of course, we need not stipulate that our counterfactual Mendeleyev has not consulted the actual Mendeleyev, since this is an impossibility.
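The equivalence claimed in note 24 can be written out for an arbitrary split between accommodated and predicted observations. What follows is only a sketch: the index k and the symbol K_k are notation introduced here for illustration and do not appear in the original notes, and the slash for conditional probability and the ampersand for conjunction follow the usage of the notes.

```latex
% K_k: the first k observations treated as accommodated background (hypothetical notation)
\[
  K_k = E_1 \,\&\, E_2 \,\&\, \dots \,\&\, E_k , \qquad 1 \le k \le 62
\]
% Substituting K_k into the conditional gives the same conjunction for every k
\[
  P(H / K_k \,\&\, E_{k+1} \,\&\, \dots \,\&\, E_{63})
  \;=\;
  P(H / E_1 \,\&\, E_2 \,\&\, \dots \,\&\, E_{63})
\]
```

The right-hand side does not depend on k, so on this representation S’s final degree of confidence in H is the same wherever the line between accommodation and prediction is drawn. Note 24 is the case k = 3.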
REFERENCES
Collins, Robin. “Against the Epistemic Value of Prediction Over Accommodation.” Nous 28 (1994): 210-224.
Horwich, Paul. Probability and Evidence. New York: Cambridge University Press, 1982.
Ihde, Aaron. The Development of Modern Chemistry. New York: Harper and Row, 1964.
Lakatos, Imre. "Falsification and the Methodology of Scientific Research Programmes." Criticism and the Growth of Knowledge. eds. Lakatos, Imre and A. Musgrave. Cambridge: Cambridge University Press, 1970.
Lipton, Peter. Inference to the Best Explanation. New York: Routledge, 1991.
Maher, Patrick. “Prediction, Accommodation, and the Logic of Discovery.” Philosophy of Science 1 (1988): 273-285.
Popper, Karl. Conjectures and Refutations: The Growth of Scientific Knowledge. New York: Harper and Row, 1963.
Schlesinger, George. “Accommodation and Prediction.” Australasian Journal of Philosophy 65 (1987): 33-42.
Snyder, Laura. “Is Evidence Historical?” Scientific Methods: Conceptual and Historical
Thagard, Paul. "The Best Explanation: Criteria for Theory Choice." Journal of Philosophy 75 (1978): 76-92.
Whewell, William. Novum Organon Renovatum. [3rd ed. (London, 1858)]. rpt. in William Whewell's Theory of Scientific Method. ed. R. Butts. Pittsburgh: University of Pittsburgh Press, 1968.
Zynda, Lyle. “Old Evidence and New Theories.” Philosophical Studies 77 (1995): 67-95.