# presumptuous philosophers

Here’s a ubiquitous gaffe in modern philosophy of probability, exemplified by a thought experiment of Nick Bostrom:

“It is the year 2100 and physicists have narrowed down the search for a theory of everything to only two remaining plausible candidate theories, T1 and T2 (using considerations from super-duper symmetry). According to T1 the world is very, very big but finite, and there are a total of a trillion trillion observers in the cosmos. According to T2, the world is very, very, very big but finite, and there are a trillion trillion trillion observers. The super-duper symmetry considerations seem to be roughly indifferent between these two theories. The physicists are planning on carrying out a simple experiment that will falsify one of the theories. Enter the presumptuous philosopher: ‘Hey guys, it is completely unnecessary for you to do the experiment, because I can already show to you that T2 is about a trillion times more likely to be true than T1.’”

The presumptuous philosopher is attempting to apply a “self indicating assumption”. The self indicating assumption itself is quite sound. According to it, you take yourself to have been sampled uniformly at random from a pool of observers in which worlds are represented in proportion to their objective chance. (Meaning that when an observer is selected, the probability it came from world w is proportional to the product of the objective chance of w and the number of observers w has.) That’s maybe a bit vague, but it’s clear enough for most purposes. Credences, then, are expected long run frequencies. (But not quite literally! As we shall see.)
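The parenthetical rule above can be put as a small calculation. This is only a sketch: the function name `sia_weights` and the two-world coin example are illustrative assumptions, not anything from Bostrom or from the literature.

```python
from fractions import Fraction

def sia_weights(worlds):
    """Map each world to the probability that a sampled observer came from it.

    `worlds` maps a world name to (objective_chance, observer_count).
    The weight of a world is chance * observers, normalized over all worlds.
    """
    raw = {name: Fraction(chance) * count for name, (chance, count) in worlds.items()}
    total = sum(raw.values())
    return {name: weight / total for name, weight in raw.items()}

# Hypothetical example: a fair coin (objective chance 1/2 each way) decides
# whether a world contains one observer or two.
example = sia_weights({
    "heads": (Fraction(1, 2), 1),
    "tails": (Fraction(1, 2), 2),
})
print(example)
```

Note that the inputs here are objective chances. Nothing in the function checks that, which is exactly the gap the presumptuous philosopher exploits.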

Bostrom’s presumptuous philosopher isn’t doing that, though. Note: T1 or T2 is the “theory of everything”. So either T1 is a necessary truth or T2 is a necessary truth. That’s what a “theory of everything” is. It’s not the case that half of the worlds are T1 worlds and half of the worlds are T2 worlds. If that were the case, the presumptuous philosopher wouldn’t be presumptuous at all: intuition would square with his recommendation, as his counterparts would be vindicated in proportion to their numbers. No…what our intuitions rail against is the fact that T2 may, for all we know, be a total fiction: not merely un-actual, but impossible. Indeed, we think there is a 50% chance of that. The presumptuous philosopher is screwing up. But does that mean that self indication doesn’t work in general?

Not at all. It just doesn’t apply in this case. Why? Because the choice between T1 and T2 is a matter of epistemic probability, not objective chance. Self indicating assumptions only apply to cases of objective chance. Recall: in self indication, we take ourselves to have been sampled uniformly at random from a pool of observers in which the proportion of world w members is in proportion to the product of world w’s objective chance and the number of observers w has. Bostrom’s presumptuous philosopher attempts to substitute epistemic probability for objective chance. That’s a mistake, but you don’t throw the baby out with the bathwater.

So: if T1 is the correct theory of everything then the frequency of T1 observers is 1. If T2 is the correct theory of everything then the frequency of T1 observers is 0. Now, when computing an expected frequency you do use epistemic probabilities (obviously), so the expected frequency of T1 observers in the pool of all observers is 1/2. It’s not one over a trillion. If T1 is true then there aren’t any T2 observers at any other worlds. Because if it’s true it’s not just true…it’s a necessary truth. If it weren’t necessary, our intuitions wouldn’t rail against the presumptuous philosopher’s solution.
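To make the contrast concrete, here is a sketch that runs both calculations on Bostrom’s numbers. The variable names are mine, and the 1/2 credences are read off the “roughly indifferent” wording of the thought experiment.

```python
from fractions import Fraction

N_T1 = 10**24  # a trillion trillion observers if T1 is true
N_T2 = 10**36  # a trillion trillion trillion observers if T2 is true

# Epistemic probabilities: the physicists are roughly indifferent.
credence = {"T1": Fraction(1, 2), "T2": Fraction(1, 2)}

# The presumptuous philosopher plugs epistemic probabilities into the slot
# meant for objective chances and weights by observer count.
philosopher_T1 = (credence["T1"] * N_T1) / (
    credence["T1"] * N_T1 + credence["T2"] * N_T2
)

# The argument in the text: the frequency of T1 observers is 1 if T1 is the
# true theory and 0 if T2 is, and the expectation is taken over credences.
freq_T1_if_true = {"T1": 1, "T2": 0}
expected_freq_T1 = sum(credence[t] * freq_T1_if_true[t] for t in credence)

print(philosopher_T1)    # a tiny fraction, roughly one in a trillion
print(expected_freq_T1)  # one half
```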

Would they? Some people’s might, I will grant. But then, bad probabilistic intuitions are not to be heeded. That they are so common is a rhetorical problem here, but I don’t really see that anything can be done about it. There’s a possible objection I can see. No one would think of it, but it’s clever. And wrong. It’s both clever and wrong. “Hi, I’m Cleverus Wrongly, and I’m here to attack your thesis.”

CW: But imagine a Sleeping Beauty experiment where Beauty gets 2 awakenings if sin e^(A(googol)) is positive, where A is the Ackermann function, and 1 awakening otherwise. If sin e^(A(googol)) is positive then it is necessarily positive. And if it is negative it is necessarily negative. So the frequency of positive awakenings is 1 or 0 with equal epistemic probabilities. And that means, by your argument, your credence in sin e^(A(googol)) being negative must be 1/2. But wait…imagine now that it’s the other way around: Beauty gets 2 awakenings if sin e^(A(googol)) is negative and 1 awakening otherwise. Still 1/2! But now imagine that we toss a coin to determine which of the two experiments is run! What’s Beauty’s credence, at wakeup, in “there is just one awakening”? Well, it would be 1/2 upon learning the result of the toss, either way. So it’s 1/2. But the toss of the coin is contingent! So it has to be 1/3! Ha ha! There’s no difference between contingent and necessary after all! Ha ha ha!

Answer: Not really. There is a difference. In the first experiment you describe, where Beauty gets 2 awakenings if sin e^(A(googol)) is positive and 1 awakening otherwise, she should have credence 1/3 in sin e^(A(googol)) being negative. The (correct) reason why doesn’t commit one to endorsing Bostrom’s “presumptuous philosopher”:

“…the answer 1/3 is exactly correct only when the expected value of the total quantity of conscious, minimally rational life in the universe is independent of the coin toss. In ordinary cases the effect of this factor is negligible: but in the limiting case where Beauty knows that she is the only rational being in the universe, and that her conscious life will be twice as long if the coin lands Tails, her credence in Heads when she wakes up should be 1/2. Generating the thirder result even in cases like this would, as Bostrom (2002) points out, require an implausible skewing of prior credences in favour of more populous worlds.”

This is from footnote #1 of Cian Dorr’s awesome, marginalized paper “A Challenge for Halfers”. Dorr is a thirder, but subscribes to the one-half solution in a case in which not only awakenings but “total quantity of consciousness” is doubled if tails. I almost agree. Which is to say that I don’t agree, but only because the result of the toss is contingent. If it were necessary, I would endorse Dorr’s analysis.

Here is the right way to look at it. Imagine, in parallel, two hypothetical streams of consciousness. The first is from the “sin e^(A(googol)) is positive” multiverse, the second from the “sin e^(A(googol)) is negative” multiverse. The streams are hopping from world to world at regular intervals (say one minute intervals) under the auspices of self indication…that is, worlds are chosen in proportion to the product of their objective chance and quantity of consciousness, then slices of consciousness are chosen uniformly at random from that world’s pool of consciousness slices. As we pan right, we see two experimental awakenings in the first stream for every one in the second (because awakenings in the first are doubled, but not quantity of consciousness). It’s in this sense that the “long run frequency” of “sin e^(A(googol)) is negative” awakenings is 1/3.
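One way to put numbers on the two-stream picture is the sketch below. It is my reading, not a definitive statement of the method: the function name `parallel_stream_credence` is hypothetical, and weighting each stream by its epistemic probability is an assumption that generalizes the equal-weight case described above. The `rate` values are the relative frequencies with which the experience in question turns up per slice of each stream.

```python
from fractions import Fraction

def parallel_stream_credence(epistemic, rate):
    """Weight each hypothesis by its epistemic probability times the rate at
    which the relevant experience occurs in that hypothesis's stream."""
    raw = {h: Fraction(epistemic[h]) * rate[h] for h in epistemic}
    total = sum(raw.values())
    return {h: weight / total for h, weight in raw.items()}

half = Fraction(1, 2)

# Sleeping Beauty with the necessary-truth trigger: two awakenings turn up in
# the "positive" stream for every one in the "negative" stream.
beauty = parallel_stream_credence(
    {"positive": half, "negative": half},
    {"positive": 2, "negative": 1},
)

# Bostrom's case: every slice of either stream is an observer slice, so the
# rates are equal and the observer counts never enter.
bostrom = parallel_stream_credence(
    {"T1": half, "T2": half},
    {"T1": 1, "T2": 1},
)

print(beauty["negative"], bostrom["T1"])
```

On this reading the same rule yields 1/3 for Beauty and 1/2 for the physicists, because what differs between the streams is the rate of awakenings, not the size of the world.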

This method of computing frequencies…treating epistemic probabilities in parallel, objective chances in series…I will call The Dorr Method. (Update: I guess it’s not really correct to call it that, because it’s not what Dorr does. Even so, Dorr does something similar. But perhaps I should not read too much into a single footnote of his written over a decade ago, and stop speculating as to what he intended at that time.) I’m wholly convinced that it’s correct. Notice that it’s not quite what I described earlier (expected long-run frequency), though the latter gives the right result much of the time, including the Bostrom example, in which one stream consists entirely of T1 slices and the other consists entirely of T2 slices. “But wait,” one might say, “the T2 stream is longer.” Well, no. Not really. They’re both infinite. A multiverse encompasses infinitely many worlds. I’m viewing a world as something finite in time (a single expansion/contraction cycle, on one cosmological view), and I take time to be infinite. “But in that case the time scales of the streams don’t match.” Actually I’m not sure what this objection means, but I will respond by saying “no attempt is made to match time scales” (if that means anything to anybody). One final objection: “if there are infinitely many worlds how can you speak of the objective chance of a world?” Good point, but although I take the set of worlds to be infinite I hazard that the set of equivalence classes of worlds under the indiscriminability relation is finite. Objective chance is a function over those equivalence classes.

The Dorr Method should really be seen as a rational constraint on credences. If you don’t adopt it, you’re going to get in trouble somewhere. Either you’re going to wind up being a double halfer in Sleeping Beauty, which puts you at odds with diachronic constraints such as conditionalization and reflection, or you will be a Lewisian halfer vulnerable to the Doomsday argument (cf. Meacham’s Sadistic Scientist argument), or you will be a thirder vulnerable to the Presumptuous Philosopher argument.

None of these concessions are especially tenable.