
Friday, October 31, 2025

Quantifying saving infinitely many lives

Suppose there is an infinite set of people, all of them worth saving, and you can save some subset of them from drowning. Can you assign a utility U(A) to each subset A of the people that represents the utility of saving the people in A subject to the following pair of reasonable conditions:

  1. If A is a proper subset of B, then U(A) < U(B)

  2. If A is a subset of the people, and x is one of the people not in A while I is an infinite set of people not in A, then U(A∪{x}) ≤ U(A∪I)?

The first condition says that it’s always better to add extra people to the set of people you save. The second condition says it’s always at least as good to add infinitely many people to the set of people you save as to add just one. (It would make sense to say: it’s always better to add infinitely many, but I don’t need that stronger condition.)

Theorem. For any infinite set of people, there is no real-valued utility function satisfying conditions (1) and (2), but there is a hyperreal-valued one.

It’s obvious we can’t do this with real numbers if we think of the value of saving n lives as proportional to n, since then the value of infinitely many lives will be infinite, which is not a real number. What’s mildly interesting in the result is that there is no way to scale the values of lives saved in some unequal way that preserves (1) and (2).

Proof: The hyperreal case follows from Theorem 2 here, where we let Ω be the set of people, G be the group of permutations of the set of people that shuffle around only finitely many people, and let U be the hyperreal probability (!) generated by the theorem. For this group is clearly locally finite, and any utility satisfying condition (1) and invariant under G will satisfy (2) (apply invariance to a permutation π that swaps x and a member of I and does nothing else to conclude that U(A∪{x}) = U(A∪{πx}), which must be less than U(A∪I) by (1)).

The real case took me a fair amount of thought. Suppose we have a real U satisfying (1) and (2). Without loss of generality, the set of people is countably infinite, and hence can be represented by rational numbers Q. For a real number x, let D(x) be the Dedekind cut {q ∈ Q : q < x}. Fix a real number x. Choose any rational q bigger than x. Then for any real y > x we will have D(y) ∖ D(x) infinite, and by (1) and (2) we will have:

  • U(D(x)) < U(D(x)∪{q}) ≤ U(D(y)).

Let b = inf{U(D(y)) : y > x}. It follows that U(D(x)) < b ≤ U(D(y)) for all y > x. Let f(x) be the open interval (U(D(x)), b). Then f(x) and f(y) are disjoint and non-empty for x < y. But any collection of disjoint non-empty open intervals of the reals is countable. (The quick argument is that we can choose a different rational in each such interval.) So f is a one-to-one function on the reals with countable range, a contradiction.

Notes: The positive part of the Theorem uses the Axiom of Choice (I think in the form of the Boolean Prime Ideal Theorem). The negative part doesn’t need the Axiom of Choice if the set of people is countable (the final parenthetical argument about intervals and rationals ostensibly uses Choice but doesn’t need it as the rationals are well-ordered); in general, the argument of the negative part uses the weak version of the Countable Axiom of Choice that says that every infinite set has a countably infinite subset.

Wednesday, April 17, 2024

Desire-fulfillment theories of wellbeing

On desire-fulfillment (DF) theories of wellbeing, cases of fulfilled desire are an increment to utility. What about cases of unfulfilled desire? On DF theories, we have a choice point. We could say that unfulfilled desires don’t count at all—it’s just that one doesn’t get the increment from the desire being fulfilled—or that they are a decrement.

Saying that unfulfilled desires don’t count at all would be mistaken. It would imply, for instance, that it’s worthwhile to gain all the possible desires, since then one maximizes the amount of fulfilled desire, and there is no loss from unfulfilled desire.

So the DF theorist should count unfulfilled desire as a decrement to utility.

But now here is an interesting question. If I desire that p, and then get an increment x > 0 to my utility if p, is my decrement to utility if not p just −x or something different?

It seems that in different cases we feel differently. There seem to be cases where the increment from fulfillment is greater than the decrement from non-fulfillment. These may be cases of wanting something as a bonus or an adjunct to one’s other desires. For instance, a philosopher might want to win a pickleball tournament, and intuitively the increment to utility from winning is greater than the decrement from not winning. But there are cases where the decrement is at least as large as the increment. Cases of really important desires, like the desire to have friends, may be like that.

What should the DF theorist do about this? The observation above seems to do serious damage to the elegant “add up fulfillments and subtract non-fulfillments” picture of DF theories.

I think there is actually a neat move that can be made. We normally think of desires as coming with strengths or importances, and of course every DF theorist will want to weight the increments and decrements to utility with the importance of the desire involved. But perhaps what we should do is to attach two importances to any given desire: an importance that is a weight for the increment if the desire is fulfilled and an importance that is a weight for the decrement if the desire is not fulfilled.

So now it is just a psychological fact that each desire comes along with a pair of weights, and we can decide how much to add and how much to subtract based on the fulfillment or non-fulfillment of the desire.
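For concreteness, here is a minimal sketch (in Python; the desires and the numerical weights are made up for illustration) of the bookkeeping this picture suggests: each desire carries its own fulfillment weight and non-fulfillment weight, and wellbeing is the sum of the corresponding increments and decrements.

```python
from dataclasses import dataclass

@dataclass
class Desire:
    name: str
    fulfillment_weight: float      # increment to utility if the desire is fulfilled
    nonfulfillment_weight: float   # decrement to utility if it is not fulfilled

def wellbeing(desires, fulfilled):
    """Add weighted increments for fulfilled desires, subtract weighted decrements for the rest."""
    total = 0.0
    for d in desires:
        if d.name in fulfilled:
            total += d.fulfillment_weight
        else:
            total -= d.nonfulfillment_weight
    return total

# Hypothetical example: a "bonus" desire versus an important, vulnerable one.
desires = [
    Desire("win pickleball tournament", fulfillment_weight=2.0, nonfulfillment_weight=0.5),
    Desire("have friends", fulfillment_weight=5.0, nonfulfillment_weight=5.0),
]
print(wellbeing(desires, fulfilled={"have friends"}))                                 # 4.5
print(wellbeing(desires, fulfilled={"have friends", "win pickleball tournament"}))    # 7.0
```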

If this is right, then we have an algorithm for a good life: work on your psychology to gain lots and lots of new desires with large fulfillment weights and small non-fulfillment weights, and to transform your existing desires to have large fulfillment weights and small non-fulfillment weights. Then you will have more wellbeing, since the fulfillments of desires will add significantly to your utility but the non-fulfillments will make little difference.

This algorithm results in an inhuman person, one who gains much if their friends live and are loyal, but loses nothing if their friends die or are disloyal. That’s not the best kind of friendship. The best kind of friendship requires vulnerability, and the algorithm takes that away.

Monday, January 22, 2024

The hyperreals and the von Neumann–Morgenstern representation theorem

This is all largely well-known, but I wanted to write it down explicitly. The von Neumann–Morgenstern utility theorem says that if we have a total preorder ≾ (a complete and transitive relation) on outcomes in a mixture space (i.e., a space such that given members a and b and any t ∈ [0,1], there is a member (1−t)a + tb satisfying some obvious axioms) and the preorder satisfies:

  • Independence: For any outcomes a, b and c and any t ∈ (0, 1], we have a ≾ b iff ta + (1−t)c ≾ tb + (1−t)c, and

  • Continuity: If a ≾ b ≾ c then there is a t ∈ [0,1] such that b ≈ (1−t)a + tc (where x ≈ y iff x ≾ y and y ≾ x)

then the preorder can be represented by a real-valued utility function U that is a mixture space homomorphism (i.e., U((1−t)a+tb) = (1−t)U(a) + tU(b)) and such that U(a) ≤ U(b) if and only if a ≾ b.

Clearly continuity is a necessary condition for this to hold. But what if we are interested in hyperreal-valued utility functions and drop continuity?

Quick summary:

  • Without continuity, we have a hyperreal-valued representation, and

  • We can extend our preferences to recover continuity with respect to the hyperreal field.

More precisely, Hausner in 1971 showed that in a finite dimensional case (essentially the mixture space being generated by the mixing operation from a finite number of outcomes we can call “sure outcomes”) with independence but without continuity we can represent the total preorder by a finite-dimensional lexicographically-ordered vector-valued utility. In other words, the utilities are vectors (u0, ..., un−1) of real numbers where earlier entries trump later ones in comparison. Now, given an infinitesimal ϵ, any such vector can be represented as u0 + u1ϵ + ... + un−1ϵ^(n−1). So in the finite dimensional case, we can have a hyperreal-valued utility representation.
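A small sketch of the lexicographic comparison just described (Python; the sample vectors are made up). Earlier coordinates trump later ones, which is the same ordering one gets by reading (u0, ..., un−1) as u0 + u1ϵ + ... + un−1ϵ^(n−1) for a fixed positive infinitesimal ϵ.

```python
def lex_leq(u, v):
    """Lexicographic comparison of equal-length utility vectors:
    earlier entries trump later ones."""
    for a, b in zip(u, v):
        if a != b:
            return a < b
    return True  # all entries equal

# (1, 0, 0) beats (0, 1000, 1000): the first coordinate trumps everything after it,
# just as u0 trumps u1*eps + u2*eps^2 when eps is a positive infinitesimal.
print(lex_leq((0, 1000, 1000), (1, 0, 0)))   # True
print(lex_leq((1, 0, 0), (0, 1000, 1000)))   # False
```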

What if we drop the finite-dimensionality requirement? Easy. Take an ultrafilter on the space of finitely generated mixture subspaces of our mixture space ordered by inclusion, and take an ultraproduct of the hyperreal-valued representations on each of these, and the result will be a hyperreal-valued utility representing our preorder on the full space.

(All this stuff may have been explicitly proved by Richter, but I don’t have easy access to his paper.)

Now, on to the claim that we can sort of recover continuity. More precisely, if we allow for probabilistic mixtures of our outcomes with weights in the hyperreal field F that U takes values in, then we can embed our mixing space M in an F-mixing space MF (which satisfies the axioms of a mixing space with respect to members of the larger field F), and extend our preference ordering ≾ to MF such that we have:

  • F-continuity: If a ≾ b ≾ c then there is a t ∈ F with 0 ≤ t ≤ 1 such that b ≈ (1−t)a + tc (where x ≈ y iff x ≾ y and y ≾ x).

In other words, if we allow for sufficiently fine-grained probabilistic mixtures, with hyperreal probabilities, we get back the intuitive content of continuity.

To see this, embed M as a convex subset of a real vector space V using an embedding theorem of Stone from the middle of the last century. Without loss of generality, suppose 0 ∈ M and U(0) = 0. Extend U to the cone CM = {ta : t ∈ [0, ∞), a ∈ M} generated by M by letting U(ta) = tU(a). Note that this is well-defined since U(0) = 0 and if ta = ub with 0 ≤ t < u, then b = (1−s) ⋅ 0 + s ⋅ a, where s = t/u, and so U(b) = sU(a). It is easy to see that the extension will be additive. Next extend U to the linear subspace VM generated by CM (and hence by M) by letting U(a − b) = U(a) − U(b) for a and b in CM. This is well-defined because if a − b = c − d, then a + d = b + c and so U(a) + U(d) = U(b) + U(c) and hence U(a) − U(b) = U(c) − U(d). Moreover, U is now a linear functional on VM. If B is a basis of VM, then let VMF be an F-vector space with basis B, and extend U to an F-linear functional from VMF to F by letting U(t1a1+...+tnan) = t1U(a1) + ... + tnU(an), where the ai are in B and the ti are in F. Now let MF be the F-convex subset of VMF generated by M. This will be an F-mixing space (i.e., it will satisfy the axioms of a mixing space with the field F in place of the reals). Let a ≾ b iff U(a) ≤ U(b) for a and b in MF. Then if a ≾ b ≾ c, we have U(a) ≤ U(b) ≤ U(c). Choose t in F with 0 ≤ t ≤ 1 such that (1−t)U(a) + tU(c) = U(b). By F-linearity of U, we will then have U((1−t)a+tc) = U(b).

Thursday, December 16, 2021

When truth makes you do less well

One might think that being closer to the truth is guaranteed to get one to make better decisions. Not so. Say that a probability assignment p2 is at least as true as a probability assignment p1 at a world or situation ω provided that for every event E holding at ω we have p2(E)≥p1(E) and for every event E not holding at ω we have p2(E)≤p1(E). And say that p2 is truer than p1 provided that strict inequality holds in at least one case.

Suppose that a secret integer has been picked among 1, 2 and 3, and p1 assigns the respective probabilities 0.5, 0.3, 0.2 to the three possibilities while p2 assigns them 0.7, 0.1, 0.2. Then if the true situation is 1, it is easy to check that p2 is truer than p1. But now suppose that you are offered a choice between the following games:

  • W1: on 1 win $2, on 2 win $1100, and on 3 win $1000.

  • W2: on 1 win $1, on 2 win $1000, and on 3 win $1100.

If you are going by p1, you will choose W1 and if you are going by p2, you will choose W2. But if the true number is 1, you would be better off picking W1 (getting $2 instead of $1), so the truer probabilities will lead to a worse payoff. C’est la vie.

Say that a scoring rule for probabilities is truth-directed if it never assigns a poorer score for a truer set of probabilities. The above example shows that a proper scoring rule need not be truth-directed. For let s(p)(n) be the payoff you will get if the secret number is n and you make your decision between W1 and W2 rationally on the basis of probability assignment p (with ties broken in favor of W1, say). Then s is a proper (accuracy) scoring rule but the above considerations show that s(p2)(1)<s(p1)(1), even though p2 is truer at 1. In fact, we can get a strictly proper scoring rule that isn’t truth-directed if we want: just add a tiny multiple of a Brier accuracy score to s.
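A quick numerical check of the example (a Python sketch using only the payoffs and probabilities given above): the induced score s(p)(n) is the payoff of the wager one rationally chooses under p, and at n = 1 the truer assignment p2 scores worse.

```python
# Payoffs of the two wagers on the secret numbers 1, 2, 3.
W1 = {1: 2, 2: 1100, 3: 1000}
W2 = {1: 1, 2: 1000, 3: 1100}

p1 = {1: 0.5, 2: 0.3, 3: 0.2}
p2 = {1: 0.7, 2: 0.1, 3: 0.2}

def expected(p, W):
    return sum(p[n] * W[n] for n in W)

def s(p, n):
    """Payoff at n if one chooses rationally between W1 and W2 given p (ties broken in favor of W1)."""
    chosen = W1 if expected(p, W1) >= expected(p, W2) else W2
    return chosen[n]

print(expected(p1, W1), expected(p1, W2))  # 531.0 > 520.5, so p1 picks W1
print(expected(p2, W1), expected(p2, W2))  # 311.4 < 320.7, so p2 picks W2
print(s(p1, 1), s(p2, 1))                  # 2 vs 1: the truer p2 does worse at 1
```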

Intuitively we would want our scoring rules to be both proper and truth-directed. But given that sometimes we are pragmatically better off for having less true probabilities, it is not clear that scoring rules should be truth-directed. I find myself of divided mind in this regard.

How common is this phenomenon? Roughly it happens whenever the truer and less-true probabilities disagree on ratios of probabilities of non-actual events.

Proposition: Suppose two probability assignments p1 and p2 are such that there are events E1 and E2 with probabilities strictly between 0 and 1, a world ω1 lying in neither event, and such that the ratio p1(E1)/p1(E2) is different from the ratio p2(E1)/p2(E2). Then there are wagers W1 and W2 such that p1 prefers W1 and p2 prefers W2, but W1 pays better than W2 at ω1.

Friday, November 19, 2021

Valuing and behavioral tendencies

It is tempting to say that I value a wager W at x provided that I would be willing to pay any amount up to x for W and unwilling to pay an amount larger than x. But that’s not quite right. For often the fact that a wager is being offered to me would itself be relevant information that would affect how I value the wager.

Let’s say that you tossed a fair coin. Then I value a wager that pays ten dollars on heads at five dollars. But if you were to try to sell me that wager for a dollar, I wouldn’t buy it, because your offering it to me at that price would be strong evidence that you saw the coin landing tails.

Thus, if we want to define how much I value a wager at in terms of what I would be willing to pay for it, we have to talk about what I would be willing to pay for it were the fact that the wager is being offered statistically independent of the events in the wager.

But sometimes this conditional does not help. Imagine a wager W that pays $123.45 if p is true, where p is the proposition that at some point in my life I get offered a wager that pays $123.45 on some eventuality. My probability of p is quite low: it is unlikely anybody will offer me such a wager. Consequently, it is right to say that I value the wager at some small amount, maybe a few dollars.

Now consider the question of what I would be willing to pay for W were the fact that the wager is being offered statistically independent of the events in the wager, i.e., independent of p. Since my being offered W entails p, the only way we can have the statistical independence is if my being offered W has credence zero or p has credence one. It is reasonable to say that the closest possible world where one of these two scenarios holds is a world where p has credence one because some wager involving a $123.45 payoff has already been offered to me. In that world, however, I am willing to pay up to $123.45 for W. Yet that is not what I value W at.

Maybe when we ask what we would be willing to pay for a wager, we mean: what we would be willing to pay provided that our credences stayed unchanged despite the offer. But a scenario where our credences stay unchanged despite the offer is a very weird one. Obviously, when an offer is made, your credence that the offer is made goes up, unless you’re seriously irrational. So this new counterfactual question asks us what we would decide in worlds where we are seriously irrational. And that’s not relevant to the question of how we value the wager.

Maybe instead of asking about the prices at which I would accept an offer, I should instead ask about the prices at which I would make an offer. But that doesn't help either. Go back to the fair coin case. I value a wager that pays you ten dollars on heads at negative five dollars. But I might not offer it to you for eight dollars, because it is likely that you would pay eight dollars for this wager only if you actually saw that the coin turned out heads, in which case this would be a losing proposition for me.

The upshot is, I think, that the question of what one values a wager at is not to be defined in terms of simple behavioral tendencies or even simple counterfactualized behavioral tendencies. Perhaps we can do better with a holistic best-fit analysis.

Friday, March 26, 2021

Credences and decision-theoretic behavior

Let p be the proposition that among the last six coin tosses worldwide that preceded my typing the period at the end of this sentence, there were exactly two heads tosses. The probability of p is 6!/(2^6⋅2!⋅4!) = 15/64.

Now that I know that, what is my credence in p? Is it 15/64? I don’t think so. I don’t think my credences are that precise. But if I were engaging in gambling behavior with amounts small enough that risk aversion wouldn’t come into play, now that I’ve done the calculation, I would carefully and precisely gamble according to 15/64. Thus, I do not think my decision-theoretic behavior reflects my credence—and not through any irrationality in my decision-theoretic behavior.

Here’s a case that makes the point perhaps even more strongly. Suppose I didn’t bother to calculate what fraction 6!/(2^6⋅2!⋅4!) was, but given any decision concerning p, I calculate the expected utilities by using 6!/(2^6⋅2!⋅4!) as the probability. Thus, if you offer to sell me a gamble where I get $19 if p is true, I would value the gamble at $19 ⋅ 6!/(2^6⋅2!⋅4!), and I would calculate that quantity as $4.45 without actually calculating 6!/(2^6⋅2!⋅4!). (E.g., I might multiply 19 by 6! first, then divide by 2^6⋅2!⋅4!.) I could do this kind of thing fairly mechanically, without noticing that $4.45 is about a quarter of $19, and hence without having much of an idea as to where 6!/(2^6⋅2!⋅4!) lies in the 0 to 1 probability range. If I did that, then my decision-theoretic behavior would be quite rational, and would indicate a credence of 15/64 in p, but in fact it would be pretty clearly incorrect to say that my credence in p is 15/64. In fact, it might not even be correct to say that I assigned a credence less than a half to p.
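The arithmetic in these examples is easy to verify with a few lines of Python:

```python
from math import factorial
from fractions import Fraction

# Probability of exactly two heads in six fair tosses.
p = Fraction(factorial(6), 2**6 * factorial(2) * factorial(4))
print(p)              # 15/64
print(float(19 * p))  # 4.453125, i.e. about $4.45
```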

I could even imagine a case like this. I make an initial mental estimate of what 6!/(26⋅2!⋅4!) is, and I mistakenly think it’s about three quarters. As a result, I am moderately confident in p. But whenever a gambling situation is offered to me, instead of relying on my moderate confidence, I do an explicit numerical calculation, and then go with the decision recommended to me by expected utility maximization. However, I don’t bother to figure out how the results of these calculations match up with what I think about p. If you were to ask me, I would say that p is likely true. But if you were to offer me a gamble, I would do calculations that better fit with the hypothesis of my having a credence close to a quarter. In this case, I think my real credence is about three quarters, but my rational decision-theoretic behavior is something else altogether.

Furthermore, there seems to me to be a continuum between my decision-theoretic behavior coming from mental calculation, pencil-and-paper calculation, the use of a calculator or the use of a natural language query system that can be asked “What is the expected utility of gambling on exactly two of six coin tosses being heads when the prize for being right is $19?” (a souped up Wolfram Alpha, say). Clearly, the last two need not reflect one’s credences. And by the same token, I think that neither need the first two.

All this suggests to me that decision-theoretic behavior lacks the kind of tight conceptual connection to credences that people enamored of representation theorems would like.

Wednesday, September 9, 2020

Minor inconveniences and numerical asymmetries

As a teacher, I have many opportunities to cause minor inconveniences in the lives of my students. And subjectively it often feels like when it’s a choice between a moderate inconvenience to me and a minor inconvenience to my students, there is nothing morally wrong with the minor inconvenience to the students. Think, for example, of making online information easily accessible to students. But this neglects the asymmetry in numbers: there is one of me and many of them. The inconvenience to them needs to be multiplied by the number of students, and that can make a big difference.

I suspect that we didn’t evolve to be sensitive to such numerical asymmetries. Rather, I expect we evolved to be sensitive to more numerically balanced relationships, which may have led to a tendency to just compare the degree of inconvenience, in ways that are quite unfortunate when the asymmetry in numbers becomes very large. If I make an app that is used just once by each of 100,000 people, and my app takes a second longer than it could, then it should be worth spending about three and a half working days (100,000 seconds is nearly 28 hours) to eliminate that delay. (Or imagine—horrors!—that I deliberately put in that delay, say in the form of a splashscreen!) If I give a talk to a hundred people and I spend a minute on an unnecessary digression, it’s rather like the case of a bore talking my ears off for an hour and a half. In fact, I rather like the idea that at the back of rooms where compulsory meetings are held there should be an electronic display calculating for each speaker the total dollar-time-value of the listeners’ time, counting up continuously. (That said, some pleasantries are necessary, in order to show respect, to relax, etc.)

Sadly, I rarely think this way except when I am the victim of the inconvenience. But it seems to me that in an era where more and more of us have numerically asymmetric relationships, sometimes with massive asymmetries introduced by large-scale electronic content distribution, we should think a lot more about this. We should write and talk in ways that don’t waste others’ time in numerically asymmetric situations. We should make our websites easier to navigate and our apps less frustrating. And so on. The strength of the moral reasons may be fairly small when our contributions are uncompensated and others’ participation is voluntary, but rises quite a bit when we are being paid and/or others are in some way compelled to participate.

One of my happy moments when I actually did think somewhat in this way was some years back when, after multiple speeches, I was asked to say a few words of welcome to our prospective graduate students. I stood up, said “Welcome!”, and sat down. I am not criticizing the other speeches. But as for me, I had nothing to add to them but just a welcome from me, so I added nothing but a welcome from me. I should do this sort of thing more often.

Friday, April 24, 2020

More on presentism and decisions

You have seven friends, isolated from each other for a week. And you have a choice between these three options:

  1. In four days, all of your friends will experience an innocent pleasure P at the same time.

  2. Over the next week, each day a different one of your friends will experience P.

  3. You presently experience an innocent pleasure whose magnitude is twice that of P.

It seems like a good idea to go for options 1 or 2 over option 3. But there is very little reason to prefer 1 over 2 or 2 over 1.

On eternalism, the parity between 1 and 2 makes perfect sense: in both cases, reality will contain seven copies of P, and the only difference is between how the copies are arranged in spacetime. And it also makes perfect sense that 1 or 2 is a better choice than 3: reality on 1 or 2 contains 3.5 times as much innocent pleasure.

But on presentism, I think it is difficult to explain these judgments. First, it’s difficult to explain why the sacrifice of 3 is worth it: a real, because present, pleasure is being sacrificed for a bunch of unreal, because future, pleasures. (Growing block has this problem, too.)

Now, if the choice is between 1 and 3, then at least the presentist can say this:

  • On option 1, there will be an occurrence of 3.5 times the pleasure that would have occurred on option 3.

I am dubious that it makes sense to compare the future pleasure to the present one on presentism, but let’s grant that for the sake of the argument.

But now suppose the choice is between 2 and 3. Then, one cannot say there will be 3.5 times the pleasure. Rather:

  • On option 2, on seven occasions, there will be half of the pleasure of option 3.

But the locution “on seven occasions” is misleading. For it makes it sound like there will be seven of something valuable. But there won’t be seven of something. Rather:

  • There will be one of P to friend 1, and there will be one of P to friend 2, and so on.

But one cannot conjoin these “will be” claims into a single:

  • There will be one of P to friend 1 and one of P to friend 2, and so on.

For that will never happen.

The deep point here is this. Cross-time counting on presentism is logically quite different from synchronic counting. In fact, in a sense it’s not “counting” at all, for there won’t be and has not been that number of items. One way to see the point is to compare the logical analysis of synchronic and cross-time counting claims on presentism:

  • “There are (presently) two unicorns”: There exist x and y such that x is a unicorn and y is a unicorn and x ≠ y and for all z if z is a unicorn, then z = x or z = y.

  • “There are (cross-time) two unicorns”: It was, is or will be the case that: There exists x such that x is a unicorn and it was, is or will be the case that there exists y such that y is a unicorn and x ≠ y, and it was, is and will be the case that for every z if z is a unicorn, then z = x or z = y.

These are logically very different claims.

(I am also a little worried about the technical details of the cross-time identity claims on presentism, by the way.)

Tuesday, October 17, 2017

Hope vs. despair

A well-known problem, noticed by Meirav, is that it is difficult to distinguish hope from despair. Both the hoper and the despairer are unsure about an outcome and they both have a positive attitude towards it. So what's the difference? Meirav has a story involving a special factor, but I want to try something else.

If I predict an outcome, and the outcome happens, there is the pleasure of correct prediction. When I despair and predict a negative outcome, that pleasure takes the distinctive more intense "I told you so" form of vindicated despair. And if the good outcome happens, despite my despair, then I should be glad about the outcome, but there is a perverse kind of sadness at the frustration of the despair.

The opposite happens when I hope. When the better outcome happens, then even though I may not have predicted the better outcome, and hence I may not have the pleasure of correct prediction, I do have the pleasure of hope's vindication. And when the bad outcome happens, I forego the small comfort of the vindication of despair.

The pleasures of correct prediction and the pains of incorrect prediction are doxastic in nature: they are pleasures and pains of right and wrong opinion. But hope and despair can, of course, exist without prediction. But when I hope for a good outcome, then I dispose myself for pleasures and pains of this doxastic sort much as if I were predicting the good outcome. When I despair of the good outcome, then I dispose myself for these pleasures and pains much as if I were predicting the bad outcome.

We can think of hoping and despairing as moves in a game. If you hope for p, then you win if and only if p is true. If you despair of p, then you win if and only if p is false. In this game of hoping and despairing, you are respectively banking on the good and the bad outcomes.

But this banking is restricted. It is in general false that when I hope for a good outcome, I act as if it were to come true. I can hope for the best while preparing for the worst. But nonetheless, by hoping I align myself with the best.

This gives us an interesting emotional utility story about hope and despair. When I hope for a good outcome, I stack a second good outcome--a victory in the hope and despair game, and the pleasure of that victory--on top of the hoped-for good outcome, and I stack a second bad outcome--a sad loss in the game--on top of the hoped-against bad outcome. And when I despair of the good outcome, I moderate my goods and bads: when the bad outcome happens, the badness is moderated by the joy of victory in the game, but when the good outcome happens, the goodness is tempered by the pain of loss. Despair, thus, functions very much like an insurance policy, spreading some utility from worlds where things go well into worlds where things go badly.

If the four goods and bads that the hope/despair game super-adds (goods: vindicated hope and vindicated despair; bads: frustrated hope and needless despair) are equal in magnitude, and if we have additive expected utilities with expected utility maximization, then as far as this super-addition goes, you are better off hoping when the probability of the good outcome is greater than 1/2 and better off despairing when the probability of the bad outcome is greater than 1/2. And I suspect (without doing the calculations) that realistic risk-averseness will shift the rationality cut-off higher up, so that with credences slightly above 1/2, despair will still be reasonable. Hope, on the other hand, intensifies risks: the person who hoped and whose hope was in vain is worse off than the person who despaired and was right. A particularly risk-averse person, by the above considerations, may have reason to despair even when the probability of the good outcome is fairly high. These considerations might give us a nice evolutionary explanation of why we developed the mechanisms of hope and despair as part of our emotional repertoire.
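Here is a back-of-the-envelope version of that comparison (a Python sketch, assuming the four super-added goods and bads all have the same magnitude v and ignoring risk aversion):

```python
def superadded_eu_hope(p, v=1.0):
    """Expected super-added utility of hoping: +v if the good outcome (prob p), -v otherwise."""
    return p * v - (1 - p) * v

def superadded_eu_despair(p, v=1.0):
    """Expected super-added utility of despairing: -v if the good outcome, +v otherwise."""
    return -p * v + (1 - p) * v

for p in (0.3, 0.5, 0.7):
    print(p, superadded_eu_hope(p), superadded_eu_despair(p))
# Hoping beats despairing exactly when p > 1/2; at p = 1/2 they tie.
```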

However, these considerations are crude. For there can be something qualitatively bad about despair: it makes one not be as single-minded. It aligns one's will with the bad outcome in such a way that one rejoices in it, and one is saddened by the good outcome. To engage in despair on the above utility grounds is like taking out life-insurance on someone one loves in order to be comforted should the person die, rather than for the normal reasons of fiscal prudence.

This suggests a reason why the New Testament calls Christians to hope. Hope in Christ is part and parcel of a single-minded betting of everything on Christ, rather than the hedging of despair or holding back from wagering in neither hoping nor despairing. We should not take out insurance policies against Christianity's truth. But when the hope is vindicated, the fact that we hoped will intensify the joy.

I am making no claim that the above is all there is to hope and despair.

Tuesday, February 21, 2017

Total and average epistemic and pragmatic utilities

The demiurge flipped a fair coin. If it landed heads, he created 100 people, of whom 10 had a birthmark on their back. If it landed tails, he created 10 people, of whom 9 had a birthmark on their back. You’re one of the created people and the demiurge has just apprised you of the above facts.

What should your credence be that you have a birthmark on your back?

This seems a plausible answer:

  • Answer A: (1/2)(10/100)+(1/2)(9/10)=1/2

Let’s think a bit about Brier scores, considered as measures of epistemic disutility. If everybody goes for Answer A, then the expected total epistemic disutility will be:

  • TD(A) = (1/2)(100)(1/2)^2 + (1/2)(10)(1/2)^2 = 13.75

That’s not the best one we can do. It turns out that the strategy that minimizes the expected total epistemic disutility will be:

  • Answer B: 19/110

which yields the expected total disutility:

  • TD(B) ≈ 7.9.

The same 19/110 answer will be optimal with any other proper scoring rule. Moreover, what holds for proper scoring rules also holds for betting scenarios, and so the strategy of going for 19/110, if universally adopted, will make for better total utility in betting scenarios. In other words, we have both an epistemic utility and a pragmatic utility argument for the strategy of adopting 19/110.
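Here is a minimal numerical check (Python): compute the expected total Brier disutility as a function of a uniformly adopted credence q and minimize it.

```python
from fractions import Fraction

def expected_total_brier(q):
    # Heads (prob 1/2): 100 people, 10 with the birthmark; tails: 10 people, 9 with it.
    heads = 10 * (1 - q) ** 2 + 90 * q ** 2
    tails = 9 * (1 - q) ** 2 + 1 * q ** 2
    return Fraction(1, 2) * (heads + tails)

print(float(expected_total_brier(Fraction(1, 2))))     # 13.75 (Answer A)
print(float(expected_total_brier(Fraction(19, 110))))  # about 7.86 (Answer B)

# Brute-force check of the minimizer on a grid of credences.
qs = [Fraction(k, 1000) for k in range(1001)]
q_star = min(qs, key=expected_total_brier)
print(q_star, float(q_star))  # 173/1000, the grid point closest to 19/110 = 0.1727...
```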

On the other hand, the 1/2 answer will optimize the expected average epistemic and pragmatic utilities in the population. But do we want to do that? After all, we know from Parfit that optimizing average pragmatic utilities can be a very bad idea (as it leads to killing off of those who are below average in happiness).

Yet the 1/2 answer has an intuitive pull.

Monday, January 9, 2017

Maps from desires and beliefs to actions

On a naive Humean picture of action, we have beliefs and desires and together these yield our actions.

But how do beliefs and desires yield actions? There are many (abstractly speaking, infinitely many, but perhaps only a finite subset is physically possible for us) maps from beliefs and desires to actions. Some of these maps might undercut essential functional characteristics of desires—thus, perhaps, it is impossible to have an agent that minimizes the satisfaction of her desires. But even when we add some reasonable restrictions, such as that agents be more likely to choose actions that are more likely to further the content of their desires, there will still be infinitely many maps available. For instance, an agent might always act on the strongest salient desire while another agent might randomly choose from among the salient desires with weights proportional to the strengths—and in between these two extremes, there are many options (infinitely many, speaking abstractly). Likewise, there are many ways that an agent could approach future change in her desires: allow future desires to override present ones, allow present desires to override future ones, balance the two in a plethora of ways (e.g., weighting a desire by the time-integral of its strength, or perhaps doing so after multiplying by a future-discount function), etc.

One could, I suppose, posit an overridingly strong desire to act according to one particular map from beliefs and desires to actions. But that is psychologically implausible. Most people aren’t reflective enough to have such a desire. And even if one had such a desire, it would be unlikely to in fact have strength sufficient to override all first-order desires—rare (and probably silly!) is the person who wouldn’t be willing to make a slight adjustment to how she chooses between desires in order to avoid great torture.

Nor will it help to move from desires to motivational structures like preferences or utility assignments. For instance, the different approaches towards risk and future change in motivational structure will still provide an infinity of maps from beliefs (or, more generally, representational structures) and motivational structures to actions.

Here’s one move that can be made: Each of us in fact acts according to some “governing mapping” from motivational and representational structures to actions (or, better, probabilities of actions, if we drop Hume’s determinism as we should). We can then extend the concept of motivational structure to include such a highest level mapping. Thus, perhaps, our motivational structure consists of two things: an assignment of utilities and a mapping from motivational and representational structures to actions.

But at this point the bold Humean claim that beliefs are impotent to cause action becomes close to trivial. For of course everybody will agree that we all implement some mapping from motivational and representational structures to actions or action probabilities (maybe not numerical ones), and if this mapping itself counts as part of the motivational structure, then everyone will agree that we all have a motivational structure essential to all of our actions. A naive cognitivist, for instance, can say that the governing mapping is one which assigns to each motivational and representational structure pair the action that is represented as most likely to be right (yes, this mapping doesn’t depend on the specific contents of the motivational structure).

Perhaps, though, a Humean can at least maintain a bold claim that motivational structures are not subject to rational evaluation. But if she does that, then the only way she can evaluate the rationality of action is by the action’s fit to the motivational and representational structures. But if the motivational structures include the actually implemented governing mapping, then every action an agent performs fits the structures. Hence the Humean who accepts the actual governing mapping as part of the motivational structure has to say that all actions are rational. And that’s a bridge too far.

Of course a non-Humean also has to give an account of the plurality of ways in which motivational and representational structures can be mapped onto actions. And if the claim that there is an actually implemented governing mapping is close to trivial, as I argued, then the non-Humean probably has to accept it, too. But she has at least one option not available to the Humean. She can, for instance, hold that motivational structures are subject to rational evaluation, and hence that there are rational constraints—maybe even to the point of determining a unique answer—on what the governing mapping should be like.

Monday, November 23, 2015

Values cannot be accurately modeled by real numbers

Consider a day in a human life that is just barely worth living. Now consider the life of Beethoven. For no finite n would having n of the barely-worth-living days be better than having all of the life of Beethoven. This suggests that values in human life cannot be modeled by real numbers. For if a and b are positive numbers, then there is always a positive integer n such that nb>a. (I am assuming additiveness between the barely-liveable days. Perhaps memory wiping is needed to ensure additiveness, to avoid tedium?)

Friday, April 24, 2015

Blackmail, promises and self-punishment

I was reading this interesting paper which comes up with "blackmail" stories against both evidential and causal decision theory (CDT). I'll focus on the causal case. The paper talks about an Artificial Intelligence context, but we can transpose the stories into something more interpersonal. John blackmails Patrick in such a way that it's guaranteed that if Patrick pays up there will be no more blackmail. As a good CDT agent, Patrick pays up, since it pays. However, Patrick would have been better off if he were the sort of person who refuses to pay off blackmailers. For John is a very good predictor of Patrick's behavior, and if John foresaw that Patrick would be unlikely to pay him off, then John wouldn't have taken the risk of blackmailing Patrick. So CDT agents are subject to blackmail.

One solution is to add to the agent's capabilities the ability to adopt a policy of behavior. Then it would have paid for Patrick to have adopted a policy of refusing to pay off blackmailers and he would have adopted that policy. One problem with this, though, is that the agent could drop the policy afterwards, and in the blackmail situation it would pay to drop the policy. And that makes one subject to blackmail once again. (This is basically the retro-blackmail story in the paper.)

Anyway, thinking about these sorts of cases, I've been playing with a simplistic decision-theoretic model of promises and weak promises—or, more generally, commitments. When one makes a commitment, then on this model one changes one's utility function. The scenarios where one fails to fulfill the commitment get a lower utility, while scenarios where one succeeds in fulfilling the commitment are unchanged in utility. You might think that you get a utility bonus for fulfilling a commitment. That's mistaken. For if we got a utility bonus for fulfilling commitments, then we would have reason to promise to do all sorts of everyday things that we would do anyway, like eat breakfast.

This made me think about agents who have a special normative power: the power to lower their utility function in any way that they like. But they lack the power to raise it. In other words, they have the power to replace their utility function by a lower one. This can be thought of in terms of commitments—lowering the utility value of a scenario by some amount is equivalent to making a commitment of corresponding strength to ensure that scenario isn't actualized—or in terms of mechanisms for self-punishment. Imagine an agent who can make robots that will zap him in various scenarios.

Now, it would be stupid for an agent simply to lower his utility function by a constant amount everywhere. That wouldn't change the agent's behavior at all, but would make sure that the agent is less well off no matter what happens. However, it wouldn't be stupid for the agent to lower his utility function for scenarios where he gives in to blackmail by agents who can make good predictions of his behavior and who wouldn't have blackmailed him if they thought he wouldn't give in. If he lowers that utility enough—say, by making a promise not to negotiate with blackmailers or by generating a robot that zaps him painfully if he gives in—then a blackmailer like John will know that he is unlikely to give in to blackmail, and hence won't risk blackmailing him.
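A toy model of the point (Python; all the payoff numbers are made up for illustration): a self-imposed penalty for giving in makes refusal credible to a good predictor, so the blackmail never happens and the penalty never has to be paid.

```python
# Patrick's baseline utilities (hypothetical numbers): paying the blackmailer costs 100,
# refusing and suffering the threatened harm costs 200, and no blackmail at all costs 0.
PAY, REFUSE = "pay", "refuse"

def patrick_choice(penalty_for_paying):
    """Once blackmailed, Patrick maximizes utility; the self-imposed penalty lowers the utility of paying."""
    u_pay = -100 - penalty_for_paying
    u_refuse = -200
    return PAY if u_pay >= u_refuse else REFUSE

def outcome(penalty_for_paying):
    """John, a good predictor, blackmails only if he predicts that Patrick would pay."""
    if patrick_choice(penalty_for_paying) == REFUSE:
        return 0                          # John doesn't risk blackmailing a refuser
    return -100 - penalty_for_paying      # John blackmails, Patrick pays and incurs the penalty

print(outcome(0))    # -100: the plain CDT agent is blackmailed and pays up
print(outcome(150))  # 0: the penalty makes refusal credible, so no blackmail occurs
```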

The worry about the agent changing policies and thereby opening oneself to blackmail does not apply on this story. For the agent in my model has only been given the power to lower his utility function at will. He doesn't have the power to raise it. If the agent were blackmailed, he could lower his utility function for the scenarios where he doesn't give in, and thereby get himself to give in. But it doesn't pay to do that, as is easy to confirm. It would pay for him to raise his utility function for the scenarios where he gives in, but he can't do that.

An agent like this would likewise give himself a penalty for two-boxing in Newcomb cases.

So it's actually good for agents to be able to lower their utility function. Setting up self-punishments can make perfect rational sense, even in the case of a perfect rational agent, so as to avoid blackmail.

Tuesday, January 21, 2014

Utility and the infinite multiverse

If we live in an infinite universe, then when we look at total values and disvalues, total utilities, we will always run into infinities. There will be infinitely many persons, of whom infinitely many will provide instances of flourishing, after all. Now one might say: "So what? Our individual actions only affect a finite portion of that infinite sea of value and disvalue."

But this may be mistaken. For if there are infinitely many persons, presumably there are infinitely many persons who have a rational and morally upright generally benevolent desire. A generally benevolent desire is a distributive desire for each person to flourish. It is not just a desire that the proposition <Everyone flourishes> be true, but a desire in regard to each person, that that person flourish, though the desire may be put in general terms because of course we can't expect people to know who the existent persons are.

Now, if you have a rational and morally upright desire, then you are better off to the extent that this desire is satisfied (some people will think this is true even with "and morally upright" omitted). Thus, if you have a rational and morally upright general benevolence, then even if some men are islands, you are not. Whenever someone comes to be better off, you come to be better off, and whenever someone comes to be worse off, you come to be worse off. So if infinitely many people have a rational and morally upright general benevolence, whenever I directly do something good or bad to you, I thereby benefit or harm infinitely many people. And no matter how small the benefit or harm to each of these generally benevolent people, it surely adds up to infinity.

St. Anselm thought that our sins were infinitely bad as they were offenses against an infinite God. If we live in a multiverse, those of our sins that harm people also harm infinitely many people.

One might object that the generally benevolent person will only be infinitesimally benefitted or harmed by a finite harm to one person in the infinite sea of persons in the multiverse. That may be true of some very weakly benevolent people. But there will also be infinitely many generally benevolent people whose general benevolence will be sufficiently strong that the benefit or harm will be non-infinitesimal. After all, one can imagine a person who, if faced with a choice whether she should gain a dollar or a stranger she knows nothing about should gain a hundred dollars, would always prefer the latter option. Such a person counts benefits and harms to other people at at least 1/100th of what such benefits and harms to herself would count as. And so if I deprive anybody of a hundred dollars, each such generally benevolent person will, in effect, be harmed to a degree equal to a one dollar deprivation. As long as there are infinitely many generally benevolent people with at least that 1:100 preference ratio, the argument will yield that a non-infinitesimal harm to anybody results in an infinite harm. And plausibly there would in fact be infinitely many people with a 1:1 preference ratio, or maybe even a 2:1 preference ratio (they would rather that others benefit than themselves).

So we cannot avoid dealing with infinite utilities if there are infinitely many persons. For each of our nontrivial actions will affect infinitely many persons, since infinitely many persons will have rational and morally upright desires that bear on the action.

Moreover, even denying the existence of an infinite multiverse, or of an infinite universe, won't get us off the hook. For even if we don't think such an infinitary hypothesis is true, we surely assign non-zero epistemic probability to it. The arguments against the hypothesis may be strong but are not so strong as to make us assign zero or infinitesimal probability to it. And a non-zero non-infinitesimal probability of an infinite good still has infinite expected utility.

Interestingly, too, as long as overall people flourish across an infinite multiverse, each such non-infinitesimally generally benevolent person will seem to be infinitely well off. Such are the blessings of benevolence in an overall good universe.

The above argument will be undercut if we think that one only benefits from the fulfillment of a desire when one is aware of that fulfillment. But that view is mistaken. An author who wrote a good book is well off for being liked even if she does not know that she is liked.

Thursday, October 3, 2013

The structure of the space of utilities

What kind of a structure do the utilities that egoists (on an individual level) and utilitarians (on a wider scale) want to maximize have? A standard approximation is that utilities are like real numbers. They have an order structure, so that we can compare utilities, an additive structure, so we can add utilities, and a multiplicative structure, so we can rescale them with probabilities. But that is insufficiently general. We want to allow for cases such as that any amount of value V2 swamps any amount of value V1. Thus, Socrates thought that any amount of virtue is better to have than any amount of pleasure. The structure of the real numbers won't allow that to happen.

A natural generalization is to note that the multiplicative structure of the space of utilities was overkill. We don't need to be able to multiply utilities by utilities. That operation need not make sense. We simply need to be able to multiply utilities by probabilities. Since probabilities are real numbers, a structure that will allow us to do that is that of a partially ordered vector space. However, we should not impose more structure on the utilities than there really is. It makes sense to multiply a utility by a probability in order to represent the value of such-and-such a chance at the utility. And since we have an additive structure on the utilities, we can make sense of multiplying a utility by a number greater than 1. E.g., 2.5U=U+U+(0.5)U. But it is not clear that it always makes conceptual sense to negate utilities. While it makes sense to think of a certain degree of pain as the negative of a certain degree of pleasure, it is not clear that such a negation operation is available in general.

Getting rid of the spurious structure of multiplying utilities by a negative number, and removing the unnecessary multiplication by numbers greater than 1, we naturally get a structure as follows. Utilities are a partially ordered set with an operation + on them and there is an action of the commutative multiplicative monoid [0,1] on the utilities, with the order, addition and action all compatible.

A further generalization is that [0,1] may not be the best way to represent probabilities in general. So generalize that to a commutative monoid (with multiplicative notation). We now have this. A utility space is a pair (P,U) where P is a commutative monoid with multiplicatively written operation and an action on U, U is a commutative semigroup with an additively written operation + and a partial order ≤, where the operations, action and orders satisfy:

  • (xy)a = x(ya) for x, y ∈ P and a ∈ U
  • x(a+b) = xa + xb for x ∈ P and a, b ∈ U
  • If a ≤ b, then xa ≤ xb for a, b ∈ U and x ∈ P
  • If a ≤ b and c ∈ U, then a + c ≤ b + c.

I keep on going back and forth on whether U really should have an addition operation, though. I do not know if utilities can be sensibly added.

Tuesday, June 25, 2013

Scoring rules and outcomes of betting behavior

The literature has two ways of measuring the fit between one's credence in a proposition and reality. The "epistemic way" uses a scoring rule to measure the distance between one's credence and the truth (if the proposition is true, then a credence of 0.8 is closer to truth than a credence of 0.6). The "pragmatic way" looks at how well one is going to do if one bets in accordance with one's credence.

A standard condition imposed on scoring rules is propriety. A proper scoring rule is one where you don't expect to improve your score by shifting your credence without evidence.

I think the two ways come to the same thing, at least in the case of a single proposition. Any appropriate (yeah, some more precision is needed) betting scenario gives rise to a proper scoring rule, where your score for a credence is minus the expected utility on the assumption that you bet according to your credence in the scenario. And, conversely, any proper scoring rule can be generated in this way from an appropriate betting scenario (or at least a limit of them—this is where the details get a bit sketchy).

Wednesday, May 16, 2012

Propriety and open-mindedness

A scoring rule s(i,r) measures the closeness between one's credence assignment r to a proposition and the truth value, i=0 for false and i=1 for true. I shall assume scoring rules to be continuous. Smaller scores are better.

A scoring rule is proper provided that by your own lights it does not tell you to expect a better (i.e., smaller) score if you just change your credence from r. Given a credence of r in the proposition in question, your expected score from adopting a credence of r' is rs(1,r')+(1−r)s(0,r'). So a proper scoring rule says that this function of r' achieves a minimum at r'=r. A strictly proper scoring rule achieves a minimum only at r'=r.
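As a concrete illustration (a Python sketch using the Brier score, the standard example of a strictly proper rule, which is not discussed in this post): the expected score rs(1,r') + (1−r)s(0,r') is minimized by reporting r' = r.

```python
def brier(i, r):
    """Brier score: squared distance between credence r and truth value i (smaller is better)."""
    return (i - r) ** 2

def expected_score(r, r_prime, s=brier):
    """Expected score, by the lights of credence r, of reporting credence r_prime."""
    return r * s(1, r_prime) + (1 - r) * s(0, r_prime)

r = 0.3
candidates = [k / 100 for k in range(101)]
best = min(candidates, key=lambda rp: expected_score(r, rp))
print(best)  # 0.3: the expectation is minimized by sticking with one's own credence
```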

A scoring rule is open-minded provided that by your own lights it does not tell you to expect a worse (i.e., bigger) score if you learn the truth value of some other proposition (we can think of this as the result of a binary experiment). If a scoring rule is not open-minded, then there will be circumstances where score-optimization with respect to one proposition sometimes requires you to shut your ears to other facts. A scoring rule is strictly open-minded provided that optimizing your score with respect to p requires you to be willing to learn the truth value of any proposition q that by your lights is not independent of p.

As Director of Graduate Studies, I have to attend graduation whenever one of our graduate students gets his Ph.D. On previous occasions, this has been very onerous. But this time I took a notepad and had a lot of fun doing math. In particular, I proved:

  • A scoring-rule is proper if and only if it is open-minded.
  • A scoring-rule is strictly proper if and only if it is strictly open-minded.
I've since learned from Richard Pettigrew that the implication from propriety to open-mindedness was basically already known. I don't know if the other implication was known as well.

My proof used the standard representation of scoring-rules in terms of convex functions, though it turns out that there are simpler proofs at least of the left-to-right implications.

Moreover, the left-to-right implications yield a proof of Good's Theorem. Just use the negative of expected utilities in practical decisions made optimally on the basis of credence r in p, given p and given not p, to define s(1,r) and s(0,r) respectively. It is trivial that this is a proper scoring rule, at least modulo continuity (but the simpler proofs of the left-to-right implications don't use continuity; I think continuity can anyway be proved in this case, but haven't checked details). Hence s is open-minded. But open-mindedness for this rule s is basically what Good's Theorem says.

Sunday, May 6, 2012

Open-mindedness and propriety

If H is true, I am epistemically better off the more confident I am of H, and if H is false, I am epistemically worse off in respect of H the more confident I am of H. Here are three fairly plausible conditions on an epistemic utility assignment (I am not so sure about Symmetry in general, but it should hold in some cases):

  1. Symmetry: The epistemic utility of assigning credence p to H when H is true is equal to the epistemic utility of assigning credence 1−p to H when H is false.
  2. Propriety: For any p, if you've assigned a credence p to H, then it is not the case that by your own lights you expect to increase your epistemic utility in respect of H by changing your credence without further evidence.
  3. Open-mindedness: For any p, if you've assigned a credence p to H, then for every experiment X you do not by your own lights expect to decrease your epistemic utility in respect of H by finding out the outcome of X.
Say that a credence level p is open provided that for every experiment X you do not by your own lights expect to decrease your epistemic utility in respect of H by finding out the outcome of X. If a credence level p is open, then when your credence is at p, you are never required, on pain of expecting to lower your epistemic utility in respect of H, to stop up your ears when the result of an experiment is to be announced. A credence level p is closed provided that for every experiment X you expect by your own lights not to increase your epistemic utility in respect of H by finding out the outcome of X. (So, a credence level could be both open and closed, if you expect no experiment to make a difference.)

So, here is an interesting question: Are all, some or no symmetric and proper epistemic utility functions open-minded?

I've been doing a bit of calculus over the past couple of days. I might have slipped up, but this morning's symbol-fiddling seems to show that assuming that the utility functions are 2nd-order differentiable at most points (e.g., at all but countably many) there is no symmetric, proper and open-minded epistemic utility function, and for every symmetric, proper and 2nd-differentiable utility function, the only open or closed credences are 0 and 1. But I will have to re-do the proofs to be sure.

If correct, this is paradoxical.

Thursday, February 9, 2012

Probabilities, scoring functions, and an argument that it is infinitely worse to be certain that a truth is false than it is good to be certain that that truth is true

One oddity of the normal 0-to-1 probability measure is that it hides epistemically significant differences near the endpoints in a way that may skew intuitions.  You need a ton of evidence to move your probability from 0.99 to 0.9999.  But the absolute difference in probabilities is only 0.0099.

It turns out there is a nice solution to this, apparently due to Alan Turing, which I had fun rediscovering yesterday.  Define
  • φ(H) = −log(1/P(H) − 1) = log(P(H)/P(~H)), and 
  • φ(H|E) = −log(1/P(H|E) − 1) = log(P(H|E)/P(~H|E)).  
This symmetrically transforms probabilities from the 0-to-1 range to the −∞ to +∞ range.  To the right we have the graph of the transformation function.
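Here is a quick illustration of how the transform spreads out the endpoints (base 10 is chosen just for readability; any base works the same way up to rescaling):

```python
import math

def phi(p, base=10):
    # log-odds transform: log(p / (1 - p))
    return math.log(p / (1 - p), base)

for p in (0.5, 0.9, 0.99, 0.9999):
    print(p, round(phi(p), 3))
# The move from 0.99 to 0.9999 is a step of about 2 in phi, as big as the step
# from 0.5 to 0.99, even though the absolute difference in probability is tiny.
```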

But here is something else that's neat about φ.  It lets you rewrite Bayes' theorem so it becomes:
  • φ(H|E) = φ(H) + C(E,H), 
where C(E,H) = log(P(E|H)/P(E|~H)) is the log-Bayes'-ratio measure of confirmation.
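A quick numerical sanity check of that identity (the particular probabilities are arbitrary):

```python
import math

P_H, P_E_given_H, P_E_given_notH = 0.3, 0.8, 0.2  # arbitrary toy values

P_H_given_E = P_E_given_H * P_H / (P_E_given_H * P_H + P_E_given_notH * (1 - P_H))

phi_H = math.log(P_H / (1 - P_H))
phi_H_given_E = math.log(P_H_given_E / (1 - P_H_given_E))
C = math.log(P_E_given_H / P_E_given_notH)  # log-Bayes'-ratio confirmation

assert abs(phi_H_given_E - (phi_H + C)) < 1e-12
```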

And it gets better.  Suppose E1,...,En are pieces of evidence that are conditionally independent given H and conditionally independent given ~H.  (One can think of these pieces of evidence as independent tests for H versus ~H.  For instance, if our two hypotheses are that our coin is fair or that it is biased 9:1 in favor of heads, then E1,...,En can be the outcomes of successive tosses.)  Then:
  • φ(H|E1&...&En) = φ(H)+C(E1,H)+...+C(En,H).
In other words, φ linearizes the effect of independent evidence.  (Doesn't this make nicely plausible the claim that C(E,H) is the correct measure of confirmation, or at least is ordinally equivalent to it?)
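And here is the coin example coded up, checking that summing the per-toss confirmations agrees with computing the posterior directly (the seed and number of tosses are arbitrary):

```python
import math, random

p_heads = {"H": 0.5, "notH": 0.9}  # H: the coin is fair; ~H: biased 9:1 toward heads

def C(toss):
    # confirmation contributed by a single toss ('h' or 't')
    lik_H = p_heads["H"] if toss == "h" else 1 - p_heads["H"]
    lik_notH = p_heads["notH"] if toss == "h" else 1 - p_heads["notH"]
    return math.log(lik_H / lik_notH)

random.seed(0)
tosses = ["h" if random.random() < 0.5 else "t" for _ in range(20)]  # coin is in fact fair

prior = 0.5
lik_H = math.prod(p_heads["H"] if t == "h" else 1 - p_heads["H"] for t in tosses)
lik_notH = math.prod(p_heads["notH"] if t == "h" else 1 - p_heads["notH"] for t in tosses)
posterior = prior * lik_H / (prior * lik_H + (1 - prior) * lik_notH)

direct = math.log(posterior / (1 - posterior))
additive = math.log(prior / (1 - prior)) + sum(C(t) for t in tosses)
assert abs(direct - additive) < 1e-9
```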

Jim Hawthorne tells me that L. J. Savage used φ to prove a Bayesian convergence theorem, and it's not that hard to see from the above formulae how might go about doing that.

Moreover, there is a rather interesting utility-related fact about φ.  Suppose we're performing exactly similar independent tests for H versus ~H that provide only a very small incremental change in probabilities.  Suppose each test has a fixed cost to perform.  Suppose that in fact the hypothesis H is true, and we start with a φ-value of 0 (corresponding to a probability of 1/2).  Then, assuming that the conditional probabilities are such that one can confirm H by these tests, the expected cost of getting to a φ-value of y by using such independent tests turns out to be, roughly speaking, proportional to y.  Suppose, on the other hand, that you have a negative φ-value y and you want to know just how unfortunate that is, in light of the fact that H is actually true.  You can quantify the badness of the negative φ-value by looking at how much you should expect it to cost to perform the experiments needed to get to the neutral φ-value of zero.  It turns out that the cost is, again roughly speaking, proportional to |y|.  In other words, φ quantifies experimental costs.
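Here is a rough simulation sketch of that cost claim, using the fair-versus-9:1-biased coin tests and charging one unit per toss (the target values are arbitrary):

```python
import math, random, statistics

def c(toss):
    # confirmation from one toss, with H the fair-coin hypothesis
    return math.log(0.5 / (0.9 if toss == "h" else 0.1))

def tosses_needed(target_phi):
    # toss a genuinely fair coin until phi first climbs from 0 to the target
    phi, n = 0.0, 0
    while phi < target_phi:
        phi += c("h" if random.random() < 0.5 else "t")
        n += 1
    return n

random.seed(1)
drift = 0.5 * c("h") + 0.5 * c("t")  # expected phi gained per toss, given H
for y in (2, 4, 8, 16):
    mean_cost = statistics.mean(tosses_needed(y) for _ in range(2000))
    print(y, round(mean_cost, 1), round(y / drift, 1))
# The mean cost tracks y/drift, i.e., it is roughly proportional to y; by the same
# token, climbing from a negative phi-value y back up to 0 costs roughly |y|.
```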

This in turn leads to the following intuition.  If H is true, the epistemic utility of having a negative φ-value of y is going to be proportional to y, since the cost of moving from y to 0 is proportional to |y|.  Then, assuming our epistemic utilities are proper, I have a theorem that shows that this forces (at least under some mild assumptions on the epistemic utility) a particular value for the epistemic utility for positive y.

Putting this in terms of credences rather than φ-values, it turns out that our measure of the epistemic utility of assigning credence r to a truth is proportional to:
  • −log(1/r − 1) for r ≤ 1/2
  • 2 − 1/r for r ≥ 1/2.
In particular, it's infinitely worse to be certain that a truth p is false than it is good to be certain that the truth p is true (cf. a much weaker result I argued for earlier).  (Hence, if faith requires certainty, then faith is only self-interestedly rational when there is some other infinite benefit of that certainty--which there may well be.)

The plot to the right shows the above two-part function.  (It may be of interest to note that the graph is concave--concavity is a property discussed in the scoring-rule literature.)  Notice how very close to linear it is in the region between around 0.25 and 0.6. 
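For those who prefer numbers to the plot, here is a quick tabulation (natural logarithm assumed; the two branches agree at r = 1/2, where both give 0):

```python
import math

def epistemic_utility(r):
    # the two-part function above
    return -math.log(1 / r - 1) if r <= 0.5 else 2 - 1 / r

for r in (1e-6, 0.25, 0.4, 0.5, 0.6, 0.99, 1.0):
    print(r, round(epistemic_utility(r), 3))
# At r = 1 the value is a finite 1, while as r -> 0 it diverges to minus infinity:
# the asymmetry claimed above.
```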

My one worry about this is that by quantifying the disvalue of below-1/2 credence of a truth in terms of the experimental costs of getting out of the credence, one may be getting at practical rather than epistemic utility.  I am not very worried about this.

Tuesday, January 24, 2012

Beating Condorcet (well, sort of)

This builds on, but also goes back over the ground of, my previous post.

I've been playing with voting methods, or as I might prefer to call them, "utility estimate aggregation methods." My basic model is that there are n options (say, candidates) to choose between and m evaluators ("voters"). The evaluators would like to choose the option that has the highest utility. Unfortunately, the actual utilities of the options are not known, and all we have are the evaluators' estimates of the utilities.

A standard method for this is the Condorcet method. An option is a Condorcet winner provided that it "beats" every other option, when an option x "beats" an option y provided that a majority of the evaluators estimates x more highly than y. If there is no Condorcet winner, there are further resolution methods, but I will only be looking at cases where there is a Condorcet winner.

My first method is

  • Method A: Estimate each option's utility with the arithmetical average of the reported utilities assigned to it by all the evaluators, and choose the option with the highest estimated utility.
(I will be ignoring tie-resolution in this post, because all the utilities I will work with are real-numbered, and the probability of a tie will be zero.) This method can be proved to maximize epistemically expected utility under the
  • Basic Setup: Each evaluator's reported estimate of each option's utility is equal to the actual utility plus an error term. The error terms are (a) independent of the actual utilities and (b) normally distributed with mean zero. Moreover, (c) our information as to the variances of the error terms is symmetric between the evaluators, but need not be symmetric between the options (thus, we may know that option 3 has a higher variance in its error terms than option 7; we may also know that some evaluators have a greater variance in their error terms; but we do not know which evaluators have a greater variance than which).

Unfortunately, it is really hard to estimate absolute utility numbers. It is a lot easier to rank order utilities. And that's all Condorcet needs. So in that way at least, Condorcet is superior to Method A. To fix this, modify the Basic Setup to:

  • Modified Setup: Just like the Basic Setup, except that what is reported by each evaluator is not the actual utility plus error term, but the rank order of the actual utility plus error term.
In particular, we still assume that beneath the surface—perhaps implicitly—there is a utility estimate subject to the same conditions. Our method now is
  • Method B: Replace each evaluator's rank ordering with roughly estimated Z-scores by using the following algorithm: a rank of k (between 1 and n) is transformed to f((n+1/2−k)/n), where f is the inverse of the cumulative normal distribution function. Each option's utility is then estimated as the arithmetical average of the roughly estimated Z-scores across the evaluators, and the option with the highest estimated utility is chosen.

Now time for some experiments. Add to the Basic Setup the assumptions that (d) the actual utilities in the option pool are normally distributed with mean zero and variance one, and (e) the variances of all the evaluators' error terms are equal to 1/4 (i.e., standard deviation 1/2). All the experiments use 2000 runs. Because I developed this when thinking about grad admissions, the cases that interest me most are ones with a small number of evaluators and a large number of options, which is the opposite of how political cases work (though unlike in admissions, I am simplifying by looking for just the best option).

Moreover, it doesn't really matter whether we choose the optimal option. What matters is how close the actual utility of the chosen option is to the actual utility of the optimal option. The difference in these utilities will be called the "error". If the error is small enough, there is no practically significant difference. Given the normal distribution of option utilities, about 95% of actual utilities are between -2 and 2, so if we have about 20 options, we can expect the best option to have a utility somewhere of the order of magnitude of 2. Choosing at random would then give us an average error of the order of magnitude of 2. The tables below give the average errors over the 2000 runs of the experiments. Moreover, so as to avoid having to choose between different resolution methods, I am discarding data from runs in which there was no Condorcet winner, and hence comparing Method A and Method B to Condorcet at its best (interestingly, Method A and Method B also work less well when there is no Condorcet winner). Discarded runs were approximately 2% of runs. Source code is available on request.
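For concreteness, here is a minimal Python sketch of the kind of simulation involved (not my actual source code, but the same setup: standard normal option utilities, error standard deviation 1/2, 2000 runs, runs without a Condorcet winner discarded):

```python
import random
import statistics
from statistics import NormalDist

def condorcet_winner(rankings, n_opt):
    # rankings: one list per evaluator, option indices ordered best-first
    pos = [{opt: k for k, opt in enumerate(r)} for r in rankings]
    for i in range(n_opt):
        if all(i == j or 2 * sum(p[i] < p[j] for p in pos) > len(rankings)
               for j in range(n_opt)):
            return i
    return None  # no Condorcet winner

def run_once(n_eval, n_opt, noise_sd=0.5):
    true_u = [random.gauss(0, 1) for _ in range(n_opt)]                  # assumption (d)
    reports = [[x + random.gauss(0, noise_sd) for x in true_u]           # assumption (e)
               for _ in range(n_eval)]
    rankings = [sorted(range(n_opt), key=lambda i: -rep[i]) for rep in reports]

    cw = condorcet_winner(rankings, n_opt)
    if cw is None:
        return None  # discard, as in the post

    # Method A: highest average reported utility (the sum gives the same winner)
    a = max(range(n_opt), key=lambda i: sum(rep[i] for rep in reports))

    # Method B: ranks -> rough Z-scores f((n + 1/2 - k)/n), then average
    f = NormalDist().inv_cdf
    z = [0.0] * n_opt
    for r in rankings:
        for k, opt in enumerate(r, start=1):
            z[opt] += f((n_opt + 0.5 - k) / n_opt)
    b = max(range(n_opt), key=lambda i: z[i])

    best = max(true_u)
    return (best - true_u[cw], best - true_u[a], best - true_u[b])

def experiment(n_eval, n_opt, runs=2000):
    results = [r for r in (run_once(n_eval, n_opt) for _ in range(runs)) if r is not None]
    for name, col in zip(("Condorcet", "Method A", "Method B"), zip(*results)):
        print(name, round(statistics.mean(col), 4))

experiment(3, 50)   # Experiment 1; similarly (50, 50), (3, 3) and (50, 3) below
```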

Experiment 1: 3 evaluators, 50 options.

Condorcet: 0.030
Method A: 0.023
Method B: 0.029
So, with a small number of evaluators and a large number of options, Method A significantly beats Condorcet. Method B slightly beats Condorcet.

Experiment 2: 50 evaluators, 50 options.

Condorcet: 0.0017
Method A: 0.0011
Method B: 0.0015
So we have a similar distribution of values, but of course with a larger number of evaluators, the error is smaller. It is interesting, however, that even with only three evaluators, the error was already pretty small, about 0.03 sigma for all the methods.

Experiment 3: 3 evaluators, 3 options.

Condorcet: 0.010
Method A: 0.007
Method B: 0.029
Method B is much worse than Condorcet and Method A in this case. That's because with three options, the naive Z-score estimation method in Method B fails miserably. With 3 options Method B is equivalent to a very simple method we might call Method C where we simply average the rank order numbers of the options across the evaluators. At least with 3 options, that is a bad way to go. Condorcet is much better, and Method A is even better if it is workable.

Experiment 4: 50 evaluators, 3 options.

Condorcet: 0.0003
Method A: 0.0002
Method B: 0.0159
The badness of Method B for a small number of options really comes across here. Condorcet and Method A really benefit from boosting the number of evaluators, but with only 3 options, Method B works miserably.

So, one of the interesting consequences is that Method B is strongly outperformed by Condorcet when the number of options is small. How small? A bunch of experiments suggests that it's kind of complicated. For three evaluators, Method B catches up with Condorcet at around 12 options. Somewhat surprisingly, for a greater number of evaluators, it needs more options for Method B to catch up with Condorcet. I conjecture that Method B works better than Condorcet when the number of options is significantly greater than the number of evaluators. In particular, in political cases where the opposite inequality holds, Condorcet far outperforms Method B.

One could improve on Method B, whose Achilles heel is the Z-score estimation, by having the evaluators include in their rankings options that are not presently available. One way to do that would be to increase the size of the option pool by including fake options. (In the case of graduate admissions, one could include a body of fake applications generated by a service.) Another way would be to include options from past evaluations (e.g., applicants from previous years). Then these would enter into the Z-score estimation, thereby improving Method B significantly. Of course, the downside is that it would be a lot more work for the evaluators, thereby making this unworkable.

Method A is subject to extreme evaluator manipulation, i.e., "strategic voting": any evaluator can produce any result she desires just by reporting utilities extreme enough to swamp those reported by the others. (The Basic Setup's description of the errors rules this out.) Method B is subject to more moderate evaluator manipulation. Condorcet, I am told, does fairly well. If anything like Method A is used, what is absolutely required is a community of justified mutual trust and reasonableness. Such mutual trust does, however, make possible noticeably better joint choices, which is an interesting result of the above.

So, yes, in situations of great trust where all evaluators can accurately report their utility estimates, we can beat Condorcet by adopting Method A. But that's a rare circumstance. In situations of moderate trust and where the number of candidates exceeds the number of evaluators, Method B might be satisfactory, but its benefits over Condorcet are small.

One interesting method that I haven't explored numerically would be this:

  • Method D: Have each evaluator assign a numerical evaluation to each option on a fixed scale (say, integers from 1 to 50). Convert the numerical evaluations to Z-scores, using some good statistical method applied to data from the evaluator's present and past evaluations. Average these estimated Z-scores across evaluators and choose the option with the highest average.
Under appropriate conditions, this method should converge to Method A over time in the Modified Setup. There would be possibilities for manipulation, but they would require planning ahead, beyond the particular evaluation (e.g., one could keep all one's evaluations in a small subset of the scale, and then when one really wants to make a difference, one jumps outside of that small subset).
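Here is a rough sketch of the kind of Z-score adjustment Method D needs, simply standardizing each evaluator's raw scores against that evaluator's own pool of present and past scores (a crude stand-in for "some good statistical method"):

```python
import statistics

def adjusted_z_scores(current_scores, evaluator_history):
    # current_scores: {option: raw score on the fixed scale} for one evaluator
    # evaluator_history: that evaluator's raw scores from present and past evaluations
    mu = statistics.mean(evaluator_history)
    sd = statistics.pstdev(evaluator_history) or 1.0  # guard against a zero spread
    return {opt: (raw - mu) / sd for opt, raw in current_scores.items()}

def method_d_pick(evaluations, histories):
    # evaluations: one {option: raw score} dict per evaluator
    # histories: one list of present-and-past raw scores per evaluator
    totals = {}
    for scores, hist in zip(evaluations, histories):
        for opt, z in adjusted_z_scores(scores, hist).items():
            totals[opt] = totals.get(opt, 0.0) + z / len(evaluations)
    return max(totals, key=totals.get)
```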