
Tuesday, November 22, 2022

Hyperreal expected value

I think I have a hyperreal solution, not entirely satisfactory, to three problems.

  1. The problem of how to value the St Petersburg paradox. The particular version that interests me is one from Russell and Isaacs which says that any finite value is too small, but any infinite value violates strict dominance (since, no matter what, the payoff will be less than infinity).

  2. How to value gambles on a countably infinite fair lottery where the gamble is positive and asymptotically approaches zero at infinity. The problem is that any positive non-infinitesimal value is too big and any infinitesimal value violates strict dominance.

  3. How to evaluate expected utilities of gambles whose values are hyperreal, where the probabilities may be real or hyperreal, which I raise in Section 4.2 of my paper on accuracy in infinite domains.
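For a concrete sense of problem (1): in the standard version of the St Petersburg gamble (which I take to be the one at issue, though the post does not spell it out — payoff 2n units with probability 2−n), every truncation has expected value equal to the truncation point, so no finite value can be big enough:

```python
from fractions import Fraction

def truncated_st_petersburg_ev(N):
    """Expected value of the St Petersburg gamble truncated at N tosses:
    payoff 2^n with probability 2^(-n) for n = 1..N."""
    return sum(Fraction(2) ** n * Fraction(1, 2) ** n for n in range(1, N + 1))

# Each term contributes exactly 1, so the truncated value is N:
# any proposed finite value is eventually exceeded.
print(truncated_st_petersburg_ev(10))   # 10
print(truncated_st_petersburg_ev(100))  # 100
```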

The apparent solution works as follows. For any gamble with values in some real or hyperreal field V and any finitely-additive probability p with values in V, we generate a hyperreal expected value Ep, which satisfies these plausible axioms:

  1. Linearity: Ep(af+bg) = aEpf + bEpg for a and b in V

  2. Probability-match: Ep1A = p(A) for any event A, where 1A is 1 on A and 0 elsewhere

  3. Dominance: if f ≤ g everywhere, then Epf ≤ Epg, and if f < g everywhere, then Epf < Epg.

How does this get around the arguments I link to in (1) and (2) that seem to say that this can’t be done? The trick is this: the expected value has values in a hyperreal field W which will be larger than V, while axioms (1)–(3) only hold for gambles with values in V. The idea is that we distinguish between what one might call primary values, which are particular goods in the world, and what one might call distribution values, which specify how much a random distribution of primary values is worth. We do not allow the distribution values themselves to be the values of a gamble. This has some downsides, but at least we can have axioms (1)–(3) on all V-valued gambles.

How is this trick done?

I think like this. First, it looks like the Hahn-Banach dominated extension theorem holds for V2-valued V1-linear functionals on V1-vector spaces, where V1 ⊆ V2 are real or hyperreal fields, except that our extending functional may need to take values in a field of hyperreals even larger than V2. The crucial thing to note is that any subset of a real or hyperreal field has a supremum in a larger hyperreal field. Then where the proof of the Hahn-Banach theorem uses infima and suprema, you move to a larger hyperreal field to get them.

Now, embed V in a hyperreal field V2 that contains a supremum for every subset of V, and embed V2 in V3 which has a supremum for every subset of V2. Let Ω be our probability space.

Let X be the space of bounded V2-valued functions on Ω and let M ⊆ X be the subspace of simple functions (with respect to the algebra of sets on which Ω’s events are defined). For f ∈ M, let ϕ(f) be the integral of f with respect to p, defined in the obvious way. The pointwise supremum f ↦ sup f (which takes values in V3) is then a sublinear functional dominating ϕ on M. Extend ϕ to a V-linear functional on X dominated by this supremum functional. Note that if f > 0 everywhere for f with values in V, then f > α > 0 everywhere for some α ∈ V2, and hence ϕ(−f) ≤ sup(−f) ≤ −α by domination, hence 0 < α ≤ ϕ(f). Letting Ep be ϕ restricted to the V-valued functions, our construction is complete.
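In symbols, a sketch of the extension step (here the dominating sublinear functional is the pointwise supremum, with values in V3; this is my rendering of the construction, to be checked along with the other details):

```latex
% Prevision of a simple function f = \sum_i c_i 1_{A_i} in M:
\phi\Big(\sum_i c_i 1_{A_i}\Big) = \sum_i c_i \, p(A_i).
% Dominating sublinear functional on X, with values in V_3:
q(f) = \sup_{\omega \in \Omega} f(\omega).
% Hahn-Banach gives a linear extension \bar\phi of \phi to X with
% \bar\phi \le q. If f takes values in V and f > \alpha > 0 everywhere:
\bar\phi(-f) \le q(-f) = \sup_{\omega}\bigl(-f(\omega)\bigr) \le -\alpha
\quad \Longrightarrow \quad \bar\phi(f) \ge \alpha > 0.
```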

I should check all the details at some point, but not today.

Monday, November 21, 2022

Dominance and countably infinite fair lotteries

Suppose we have a finitely-additive probability assignment p (perhaps real, perhaps hyperreal) for a countably infinite lottery with tickets 1, 2, ..., such that each ticket has infinitesimal probability (where zero counts as an infinitesimal). Now suppose we want to calculate the expected value or prevision EpU of any bounded wager U on the outcome of the lottery, where we think of the wager as assigning a value to each ticket, and the wager is bounded if there is a finite M such that |U(n)| < M for all n.

Here are plausible conditions on the expected value:

  1. Dominance: If U1 < U2 everywhere, then EpU1 < EpU2.

  2. Binary Wagers: If U is 0 outside A and c on A, then EpU = c⋅p(A).

  3. Disjoint Additivity: If U1 and U2 are wagers supported on disjoint events (i.e., there is no n such
    that U1(n) and U2(n) are both non-zero), then Ep(U1+U2) = EpU1 + EpU2.

But we can’t. For suppose we have it. Let U(n) = 1/(2n). Fix a positive integer m. Let U1(n) be 2 for n ≤ m + 1 and 0 otherwise. Let U2(n) be 1/(m+1) for n > m + 1 and 0 for n ≤ m + 1. Then by Binary Wagers and the fact that each ticket has infinitesimal probability, EpU1 is an infinitesimal α (since the probability of any finite set will be infinitesimal). By Binary Wagers and Dominance, EpU2 ≤ 1/(m+1) (compare U2 with the constant wager c for each c > 1/(m+1)). Thus by Disjoint Additivity, Ep(U1+U2) ≤ α + 1/(m+1) < 1/m. But U < U1 + U2 everywhere, so by Dominance we have EpU < 1/m. Since 0 < U everywhere, by Dominance and Binary Wagers we have 0 < EpU.

Thus, EpU is a non-zero infinitesimal β. But then β < U(n) for all n, and so by Binary Wagers and Dominance, β < EpU, a contradiction.
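A quick sanity check of the inequalities in the argument (a sketch of mine, not the author’s; I take U2’s plateau to be 1/(m+1), which is what the bound EpU2 ≤ 1/(m+1) uses):

```python
from fractions import Fraction

def U(n):
    return Fraction(1, 2 * n)                      # the wager U(n) = 1/(2n)

def U1(n, m):
    return Fraction(2) if n <= m + 1 else Fraction(0)

def U2(n, m):
    return Fraction(1, m + 1) if n > m + 1 else Fraction(0)

# Pointwise dominance U < U1 + U2 on an initial segment, for several m.
# (The tail is easy: for n > m + 1, 1/(2n) <= 1/(2(m+2)) < 1/(m+1).)
for m in (1, 2, 5, 10, 50):
    assert all(U(n) < U1(n, m) + U2(n, m) for n in range(1, 2001))
    # The key bound: an infinitesimal alpha plus 1/(m+1) stays below 1/m.
    assert Fraction(1, m + 1) < Fraction(1, m)
print("dominance checks pass")
```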

I think we should reject Dominance.

Thursday, March 18, 2021

Valuations and credences

One picture of credences is that they are derived from agents’ valuations of wagers (i.e., previsions) as follows: the agent’s credence in a proposition p is equal to the agent’s valuation of a gamble that pays one unit if p is true and 0 units if p is false.

While this may give the right answer for a rational agent, it does not work for an irrational agent. Here are two closely related problems. First, note that the above definition of credences is dependent on the unit system in which the gambles are denominated. A rational agent who values a gamble that pays one dollar on heads and zero dollars otherwise at half a dollar will also value a gamble that pays one yen on heads and zero yen otherwise at half a yen, and we can attribute a credence of 1/2 in heads to the agent. In general, the rational agent’s valuations will be invariant under affine transformations, and so we do not have a problem. But Bob, an irrational agent, might value the first gamble at $0.60 and the second at 0.30 yen. What, then, is Bob’s credence in heads?

If there were a privileged unit system for utilities, we could use that, and equate an agent’s credence in p with their valuation of a wager that pays one privileged unit on p and zero on not-p. But there are many units of utility, none of them privileged: dollars, yen, hours of rock climbing, glazed donuts, etc.

And even if there were a privileged unit system, there is a second problem. Suppose Alice is an irrational agent. Suppose Alice has two different probability functions, P and Q. When Alice needs to calculate the value of a gamble that pays exactly one unit on some proposition and exactly zero units on the negation of that proposition, she uses classical mathematical expectation based on P. When Alice needs to calculate the value of any other gamble—i.e., a gamble that has fewer than or more than two possible payoffs or a gamble that has two payoffs but at values other than exactly one or zero—she uses classical mathematical expectation based on Q.

Then the proposed procedure attributes to Alice the credence function P. But it is in fact Q that is predictive of Alice’s behavior. For we are never in practice offered gambles that have exactly two payoffs. Coin-toss games are rare in real life, and even they have more than two payoffs. For instance, suppose I tell you that I will give you a dollar on heads and zero otherwise. Well, a dollar is worth a different amount depending on when exactly I give it to you: a dollar given earlier is typically more valuable, since you can invest it for longer. And it’s random when exactly I will pay you. So on heads, there are actually infinitely many possible payoffs, some slightly larger than others. Moreover, there is a slight chance of the coin landing on the edge. While that eventuality is extremely unlikely, it has a payoff that’s likely to be more than a dollar: if you ever see a coin landing on edge, you will get pleasure out of telling your friends about it afterwards. Moreover, even if we were offered a gamble that had exactly two payoffs, it is extremely unlikely that these payoffs would be exactly one and zero in the privileged unit system.

The above cases do not undercut a more sophisticated story about the relationship between credences and valuations, a story on which one counts as having the credence that best fits one’s practical valuations of two-valued gambles, and where there is a tie, one’s credences are underdetermined or interval-valued. In Alice’s case, for instance, it is easy to say that Q best fits her valuations, while in Bob’s case, the credence for heads might be a range from 0.3 to 0.6.

But we can imagine a variant of Alice where she uses P whenever she has a gamble that has only two payoffs, and she uses Q at all other times. Since in practice two-payoff gambles don’t occur, she always uses Q. But if we use two-payoff gambles to define credences, then Alice will get P attributed to her as her credences, despite her never using P.

Can we have a more sophisticated story that allows credences to be defined in terms of valuations of gambles with more payoffs than two? I doubt it. For there are multiple ways of relating a prevision to a credence when we are dealing with an inconsistent agent, and none of them seem privileged. Even my favorite way, the Level Set Integral, comes in two versions: the Split and Shifted versions.

Thursday, February 13, 2020

Domination and probabilistic consistency

Suppose that ≼ is a total preorder on simple utility functions on some space Ω with an algebra of subsets. Define f ≺ g iff f ≼ g but not g ≼ f. Think of ≼ as a decision procedure: you are required to choose g over f iff f ≺ g, and permitted to do so iff f ≼ g.

Suppose ≼ doesn’t allow choosing a dominated wager:

  1. If f < g everywhere, then f ≺ g.

Let 1A be the function that is 1 on A and 0 outside A. Define Ef = sup{c : c ⋅ 1Ω ≼ f}. Here are some facts about E:

  1. If c < Ef < c′, then c ⋅ 1Ω ≺ f ≺ c′⋅1Ω.

  2. E(c ⋅ 1Ω)=c

  3. If f ≤ g everywhere, then Ef ≤ Eg.

  4. If Ef < Eg, then f ≺ g.

(But we can’t count on its being the case that f ≺ g if and only if Ef < Eg.)
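Facts (2)–(4) can be made concrete in a toy case of my own devising, where the preorder comes from classical expectation on a two-point space; there the sup defining Ef is attained and equals the classical expectation:

```python
from fractions import Fraction

# Toy case: Omega = {0, 1}; preorder induced by classical expectation
# with respect to the consistent P = (1/3, 2/3).
P = (Fraction(1, 3), Fraction(2, 3))

def ev(f):                 # classical expectation of a wager f = (f(0), f(1))
    return P[0] * f[0] + P[1] * f[1]

def preceq(f, g):          # f ≼ g
    return ev(f) <= ev(g)

def E(f):
    # Ef = sup{c : c*1 ≼ f}; here c*1 ≼ f iff c <= ev(f), so the sup is ev(f)
    return ev(f)

f = (Fraction(0), Fraction(3))
g = (Fraction(1), Fraction(4))
assert E((Fraction(5), Fraction(5))) == 5      # fact 2: E(c*1) = c
assert E(f) <= E(g)                            # fact 3: f <= g everywhere
assert preceq(f, g) and not preceq(g, f)       # fact 4: Ef < Eg gives f ≺ g
```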

Now consider what I’ve called the independent and cumulative decision procedures for sequences of choices. On an independent decision procedure, at each stage you must choose a wager that is ≼-maximal (and you may choose any of the maximal ones). On a cumulative decision procedure, at each stage you must choose a wager that when added to what you’ve already chosen yields a ≼-maximal wager (and you may choose any of the maximal ones).

I think (I haven’t written it all down) I can prove that the following conditions are equivalent:

  1. E is an expected value with respect to a finitely-additive probability P on Ω.

  2. The independent decision procedure applied to a sequence of binary choices never permits you to choose a sequence of wagers whose sum is strictly dominated by the sum of a different sequence of wagers you could have chosen.

  3. The cumulative decision procedure applied to a sequence of binary choices never permits you to choose a sequence of wagers whose sum is strictly dominated by the sum of a different sequence of wagers you could have chosen.

The probability P is defined by P(A)=E(1A).

All that said, I think that when your credences are inconsistent, you may need to decide neither independently nor cumulatively, but holistically, taking into account what wagers you made and what wagers you expect you will make.

Monday, February 3, 2020

A problem for Level Set Integrals

Suppose that you have inconsistent but monotone credences: if p entails q, then P(q)≥P(p). Level Set Integrals (LSI) provide a way of evaluating expected utilities that escapes Dutch Books and avoids domination failure: if E(f)≥E(g), then g cannot dominate f.

Sadly, by running this simple script, I’ve just discovered that LSI need not escape multi-shot domination failure. Suppose you have these monotonic credences for a coin toss:

  • Heads: 1/4

  • Tails: 1/4

  • Heads or Tails: 1

  • Neither: 0.

Suppose you’ll first be offered a choice between these two wagers:

  • A: $1 if Heads and $1 if Tails
  • A′: $3 if Heads and $0 if Tails

and then second you will be offered a choice between these two wagers:

  • B: $1 if Heads and $1 if Tails
  • B′: $0 if Heads and $3 if Tails.

You will first choose A over A′ and then B over B′. This is true regardless of whether you use the independent multi-shot decision procedure where you ignore previous wagers or the cumulative method where you compare the expected utilities of the sum of all the unresolved wagers. The net result of your choices is getting $2 no matter what. But if you chose A′ over A and B′ over B, you’d have received $3 no matter what.
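The linked script isn’t reproduced in the post, but the computation is easy to redo. Here is a minimal Level Set Integral for non-negative wagers on a finite space — a sketch with my own function names, using LSI(f) = ∫0∞ P({f > t}) dt:

```python
from fractions import Fraction

# The monotone but inconsistent credences from the post, on {H, T}:
P = {
    frozenset(): Fraction(0),
    frozenset({"H"}): Fraction(1, 4),
    frozenset({"T"}): Fraction(1, 4),
    frozenset({"H", "T"}): Fraction(1),
}

def lsi(wager):
    """Level Set Integral of a non-negative wager: the integral of
    P({wager > t}) dt, i.e. sum of (v_k - v_{k-1}) * P({wager >= v_k})
    over the sorted positive values v_k, with v_0 = 0."""
    total, prev = Fraction(0), Fraction(0)
    for v in sorted(set(wager.values())):
        if v <= 0:
            continue
        event = frozenset(s for s in wager if wager[s] >= v)
        total += (v - prev) * P[event]
        prev = v
    return total

A  = {"H": Fraction(1), "T": Fraction(1)}
Ap = {"H": Fraction(3), "T": Fraction(0)}
B  = {"H": Fraction(1), "T": Fraction(1)}
Bp = {"H": Fraction(0), "T": Fraction(3)}

print(lsi(A), lsi(Ap))    # 1 3/4 -> A is chosen over A'
# Cumulative second round: compare A+B against A+B'
AB  = {s: A[s] + B[s]  for s in A}
ABp = {s: A[s] + Bp[s] for s in A}
print(lsi(AB), lsi(ABp))  # 2 7/4 -> B is chosen over B'
```

So the cumulative comparison also picks A+B (value 2) over A+B′ (value 7/4), even though A′+B′ would have guaranteed $3.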

Stepping back, classical expected utility with consistent credences has the following nice property: When you are offered a sequence of choices between wagers, with the offers not known in advance to you but also not dependent on your choices, and you choose by classical expected utility, you won’t choose a sequence of wagers dominated by another sequence you could have chosen.

Level Set Integrals, at least in my above case (and I strongly suspect more generally, but I don’t have a theorem yet), do not have this property.

I wonder how much it matters that one does not have this property. The property is one that involves choosing sequentially without knowing what choices are coming in the future, but the choices available in the future are not dependent on the choices already made. This seems a pretty gerrymandered situation.

If you do know what choices will be available in the future, it is easy to avoid being dominated: you figure out what sequence of choices has the overall best Level Set Integral, and you stick to that sequence.

Wednesday, January 22, 2020

Lebesgue-sum previsions don't always lead to Dutch Books for inconsistent credences

Suppose E is the Lebesgue-sum prevision. Namely, if W is a wager on a finite space Ω with a (perhaps inconsistent) credence P, and UW is the utility function corresponding to W, then EW = ∑y y⋅P({ω : UW(ω)=y}), where the sum ranges over the values y of UW.

Suppose your decision procedure for repeated wagers is to accept a wager if and only if the wager’s value is non-negative (independently of whatever other wagers you might have accepted). Suppose, further, that Ω has exactly two points and the credence of each point is non-negative and of at least one it is positive.

Proposition: Then, no finite sequence of wagers forms a Dutch Book.

Proof: Consider the sequence of utility functions U1, ..., Un that corresponds to a putative Dutch Book sequence of wagers W1, ..., Wn. Then U1 + ... + Un < 0 everywhere on Ω and yet EWi ≥ 0 for all i. Let a and b be the two points of Ω. Reordering the wagers if necessary (the order doesn’t matter on this decision procedure), we can assume that the wagers W1, ..., Wm are such that Ui(a)≠Ui(b) for i ≤ m, and that Wm + 1, ..., Wn are such that Ui(a)=Ui(b) for i > m. Then EWi = Ui(a)=Ui(b) for i > m. Hence, Ui is non-negative everywhere on Ω for i > m. So, if W1, ..., Wn form a Dutch Book, so do W1, ..., Wm. Now, EWi = αUi(a)+βUi(b) for i ≤ m, where α and β are the probabilities of a and b respectively. It follows that EW1 + ... + EWm = α(U1(a)+...+Um(a)) + β(U1(b)+...+Um(b)). Since W1, ..., Wm form a Dutch Book, the two sums on the right-hand side are both negative. Since α and β are non-negative and at least one is positive, it follows that EW1 + ... + EWm < 0, contradicting the fact that EWi ≥ 0 for each i. So there is no Dutch Book.
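An empirical spot-check of the proposition (the credence values below are hypothetical: both point credences positive, and P(Ω) positive, which the constant-wager step of the proof relies on):

```python
import random
from fractions import Fraction

# Hypothetical inconsistent credence on a two-point space {a, b}.
Pa, Pb, Pab = Fraction(1, 3), Fraction(1, 2), Fraction(9, 10)

def lebesgue_sum_prevision(Ua, Ub):
    """EW = sum over values y of y * P(U = y)."""
    if Ua == Ub:
        return Ua * Pab          # constant wager: single value on {a, b}
    return Ua * Pa + Ub * Pb

random.seed(0)
for _ in range(2000):
    wagers = [(Fraction(random.randint(-5, 5)), Fraction(random.randint(-5, 5)))
              for _ in range(6)]
    # Accept a wager iff its prevision is non-negative, independently.
    accepted = [(ua, ub) for ua, ub in wagers
                if lebesgue_sum_prevision(ua, ub) >= 0]
    total_a = sum(ua for ua, _ in accepted)
    total_b = sum(ub for _, ub in accepted)
    # Proposition: the accepted wagers never form a Dutch Book.
    assert not (total_a < 0 and total_b < 0)
print("no Dutch Book found")
```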

Monday, December 16, 2019

Previsions for inconsistent credences and arguments for probabilism

Fix a sample space Ω and an algebra F of events on Ω. A gamble is an F-measurable real-valued function on Ω. A credence function is a function from F to the reals. A prevision or price function on a set G of gambles is just a function from G to the real numbers. A previsory method E on a set of gambles G and a set of credence functions C assigns to each credence function P ∈ C a prevision EP on G.

A previsory method on G and C has the weak domination property provided that whenever f and g in G satisfy f ≤ g everywhere on Ω, we have EP(f)≤EP(g) for every P in C. It has the strong domination property provided that it has the weak domination property and, moreover, EP(f)<EP(g) whenever f < g everywhere on Ω. It has the zero property provided that EP(0)=0.

Mathematical expectation is a previsory method on the set of all bounded gambles and all probability functions. It has the zero and strong domination properties.

The level set integral is a previsory method on the set of all bounded gambles and all monotonic credence functions (P is monotonic iff P(⌀)=0, P(Ω)=1 and P(A)≤P(B) whenever A ⊆ B). It has the zero and weak domination properties.

The level set integral has the strong domination property on the set of weakly countably additive monotonic credence functions, where P is weakly countably additive provided that Ω cannot be written as a countable union of sets each of credence 0. If F (or Ω) is finite, we get weak countable additivity for free from monotonicity.

A previsory method E requires (permits) a gamble f given a credence P provided that EP(f)>0 (EP(f)≥0); it requires (permits) it over some set S of gambles provided that EP(f)>EP(g) (EP(f)≥EP(g)) for every g in S.

A previsory method with the zero and weak domination properties cannot be strongly Dutch-Booked in a single wager: i.e., there is no gamble U such that U < 0 everywhere and the method requires U. If it also has the strong domination property, it cannot be weakly Dutch-Booked in a single wager: there is no U such that U < 0 everywhere and the method permits U.

Suppose we combine a previsory method with the following method of choosing which gambles to adopt in a sequence of offered gambles: you are required (permitted) to accept gamble g provided that EP(g1 + ... + gn + g)>EP(g1 + ... + gn) (≥, respectively), where g1, ..., gn are the gambles already accepted. Then given the zero and weak domination properties, we cannot be strongly Dutch-Booked by a sequence of wagers, and given additionally the strong domination property, we cannot be weakly Dutch-Booked, either.

Given that level set integrals provide a non-trivial and mathematically natural previsory method with the zero and strong domination properties on a set of credence functions strictly larger than the consistent ones, Dutch-Book arguments for consistency fail.

What about epistemic utility, i.e., scoring-rule, arguments? I think these also fail. A scoring rule assigns a number s(p, q) to a credence function p and a truth function q (i.e., a probability function whose values are always 0 or 1). Let T be truth, i.e., a function from Ω to truth functions such that T(ω)(A)=1 if and only if ω ∈ A. Thus, T(ω) is the truth function that says “we are at ω”, and we can think of s(p, T) as a gamble that measures how far p is from the truth.

If E is a previsory method on a set of gambles G and a set of credence functions C, then we say that s is an E-proper scoring rule provided that s(p, T) is in G for every p in C and Eps(p, T)≤Eps(q, T) for every p and q in C. We say that it is strictly E-proper if additionally we have strict inequality whenever p and q are different.

If E is mathematical expectation, then E-propriety and strict E-propriety are just propriety and strict propriety.

It is thought (Joyce and others) that one can make use of the concept of strict propriety to argue that credence functions should be consistent. This uses a domination theorem that says that if s is a strictly proper additive scoring rule, then for any inconsistent credence function p there is a consistent function q such that s(q, T(ω)) < s(p, T(ω)) for all ω. (Roughly, an additive scoring rule adds up scores point-by-point over Ω.)

However, I think the requirement of additivity is one that someone sceptical of the consistency requirement can reasonably reject. There are mathematically natural previsory methods E that apply to some inconsistent credences, such as the monotonic ones, and these can be used to define (at least under some conditions) strictly E-proper scoring rules. And the domination theorem won’t apply to these rules because they won’t be additive. Indeed, that is one of the things the domination theorem shows: if C includes an inconsistent credence function and E has the strong domination property, then no strictly E-proper scoring rule is additive.

So, really, how helpful the domination theorem is for arguing for consistency depends on whether additivity is a reasonable condition to require of a scoring rule. It seems that someone who thinks that it is OK to reason with a broader set of credences than the consistent ones, and who has a natural previsory method E with the strong domination property for these credences, will just say: I think the relevant notion isn’t propriety but E-propriety, and there are no strictly E-proper scoring rules that are additive. So, additivity is not a reasonable condition.

Are there any strictly E-proper scoring rules in such cases?

[The rest of the post is based on the mistake that E-propriety is additive and should be dismissed. See my discussion with Ian in the comments.]

Sometimes, yes.

Suppose E is a previsory method with the weak domination condition on the set of all bounded gambles on Ω. Suppose that E has the scaling property that Ep(cf)=cEp(f) for any real constant c. (Level Set Integrals have scaling.) Further, assume the separability property that there is a countable set B of bounded gambles such that for any two distinct credences p and q, there is a bounded gamble f in B such that Epf ≠ Eqf. (Level Set Integrals on a finite Ω—or on a finite field of events—have separability: just let B be all functions whose values are either 0 or 1, and note that Ep1A = p(A) where 1A is the function that is 1 on A and 0 outside it.) Finally, suppose normalization, namely that Ep1Ω = 1. (Level Set Integrals clearly have that.)

Note that given separability, scaling and normalization, there is a countable set H of bounded gambles such that if p and q are distinct, there exist f and g in H such that Ep requires f over g (i.e., Epf > Epg) and Eq does not or vice versa. To see this, let H consist of B together with all constant rational-valued functions, and note that if Epf < Eqf, then we can choose a rational number r such that r lies between Epf and Eqf, and then Ep and Eq will disagree on whether f is required over r ⋅ 1Ω.
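The separating step is elementary but worth spelling out with (made-up) rational values; for values with a non-infinitesimal gap, the density of the rationals supplies r:

```python
from fractions import Fraction

# Hypothetical expected values assigned to the same gamble f by two previsions:
Epf, Eqf = Fraction(1, 3), Fraction(2, 5)

# A rational strictly between two distinct rationals: the midpoint works.
r = (Epf + Eqf) / 2

# Ep does not require f over the constant wager r*1 (since Epf < r),
# while Eq does (since Eqf > r): the two previsions disagree.
assert Epf < r < Eqf
assert (Epf > r) != (Eqf > r)
print(r)  # 11/30
```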

Let H be the countable set in the above remark. By scaling, we may assume that all the gambles in H are bounded by 1. Let (f1, g1),(f2, g2),... be an enumeration of all pairs of members of H. Define sn(p, T(ω)) for a credence function p in C as follows: if Ep requires fn over gn then sn(p, T(ω)) = −fn(ω), and otherwise sn(p, T(ω)) = −gn(ω).

Note that sn is an E-proper scoring rule. For suppose that q is a different credence function from p and Epsn(p, T)>Epsn(q, T). Now there are four possibilities depending on whether Ep and Eq require fn over gn and it is easy to see that each possibility leads to a contradiction. So, we have E-propriety.

Now, let s(p, T) be Σn = 1∞ 2−nsn(p, T); since each sn is bounded by 1, the weights 2−n make the sum converge. The sum of E-proper scoring rules is E-proper, so this is an E-proper scoring rule.

What about strict E-propriety? Suppose that p and q are credence functions in C and Eps(q, T)≤Eps(p, T). By the E-propriety of each of the sn, we must have Epsn(p, T)=Epsn(q, T) for all n. Thus, for all pairs of members of H, the requirements of Ep and Eq must agree, and by the choice of H, p and q cannot be different.