
Monday, January 20, 2025

Open-mindedness and epistemic thresholds

Fix a proposition p, and let T(r) and F(r) be the utilities of assigning credence r to p when p is true and false, respectively. The utilities here might be epistemic or of some other sort, like prudential, overall human, etc. We can call the pair T and F the score for p.

Say that the score T and F is open-minded provided that expected utility calculations based on T and F can never require you to ignore evidence, assuming that evidence is updated on in a Bayesian way. Assuming the technical condition that there is another logically independent event (else it doesn’t make sense to talk about updating on evidence), this turns out to be equivalent to saying that the function G(r) = rT(r) + (1−r)F(r) is convex. The function G(r) represents your expected value for your utility when your credence is r.
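
Here is a quick numerical sketch of what this convexity condition amounts to. It uses the negative Brier score T(r) = −(1−r)² and F(r) = −r² purely as a stand-in example (any candidate pair can be substituted) and checks the sign of second differences of G on a grid:

    import numpy as np

    # Minimal sketch: check numerically whether G(r) = r*T(r) + (1-r)*F(r) is convex.
    # T and F here are the negative Brier score, chosen only as an illustration;
    # substitute any candidate score of interest.
    T = lambda r: -(1 - r)**2   # utility of credence r when p is true
    F = lambda r: -r**2         # utility of credence r when p is false

    def G(r):
        return r * T(r) + (1 - r) * F(r)

    r = np.linspace(0.001, 0.999, 2001)
    g = G(r)
    second_diff = g[:-2] - 2 * g[1:-1] + g[2:]
    print("G convex on the grid:", bool(np.all(second_diff >= -1e-12)))

For this pair, G(r) = −r(1−r), which is convex, so the check prints True.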

If G is a convex function, then it is continuous on the open interval (0,1). This implies that if one of the functions T or F has a discontinuity somewhere in (0,1), then the other function has a discontinuity at the same location. In particular, the points I made in yesterday’s post about the value of knowledge and anti-knowledge carry through for open-minded and not just proper scoring rules, assuming our technical condition.

Moreover, we can quantify this discontinuity. Given open-mindedness and our technical condition, if T has a jump of size δ at credence r (e.g., in the sense that the one-sided limits exist and differ by δ), then F has a jump of size rδ/(1−r) at the same point. In particular, if r > 1/2, then if T has a jump of a given size at r, F has a larger jump at r.
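
To see where the factor rδ/(1−r) comes from: since G(r) = rT(r) + (1−r)F(r) is continuous at r, the one-sided limits of G there must agree, and because the factors r and 1−r are themselves continuous, this forces r·(jump in T) + (1−r)·(jump in F) = 0. So the jump in F is −r/(1−r) times the jump in T: the two jumps have opposite signs, and the F-jump has size rδ/(1−r).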

I think this gives one some reason to deny that there are epistemically important thresholds strictly between 1/2 and 1, such as the threshold between non-belief and belief, or between non-knowledge and knowledge, even if the location of the thresholds depends on the proposition in question. For if there are such thresholds, imagine a proposition p with the property that it is very important to reach the threshold if p is true while one’s credence matters very little if p is false. In such a case, T will have a larger jump at the threshold than F, and so we will have a violation of open-mindedness.

Here are three examples of such propositions:

  • There are objective norms

  • God exists

  • I am not a Boltzmann brain.

There are two directions to move from here. The first is to conclude that because open-mindedness is so plausible, we should deny that there are epistemically important thresholds. The second is to say that in the case of such special propositions, open-mindedness is not a requirement.

I wondered initially whether a similar argument doesn’t apply in the absence of discontinuities. Could one have T and F be open-minded even though T continuously increases a lot faster than F decreases? The answer is positive. For instance, the pair T(r) = e^(10r) and F(r) = −r is open-minded (though not proper), even though T increases a lot faster than F decreases. (Of course, there are other things to be said against this pair. If that pair is your utility, and you find yourself with credence 1/2, you will increase your expected utility by switching your credence to 1 without any evidence.)
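
Here is a quick numerical check of this example (a sketch of mine): it verifies that G(r) = rT(r) + (1−r)F(r) is convex on a grid, and that an agent with credence 1/2 does best, by this pair’s lights, by moving to credence 1:

    import numpy as np

    # Sketch: the pair T(r) = exp(10r), F(r) = -r from the example above.
    T = lambda r: np.exp(10 * r)
    F = lambda r: -r

    r = np.linspace(0.001, 0.999, 2001)
    g = r * T(r) + (1 - r) * F(r)
    print("G convex on grid:", bool(np.all(g[:-2] - 2 * g[1:-1] + g[2:] >= -1e-9)))

    # Impropriety: with credence 1/2, the expected utility of moving one's
    # credence to x is 0.5*T(x) + 0.5*F(x), which is maximized at x = 1.
    x = np.linspace(0, 1, 1001)
    print("best credence to switch to:", x[np.argmax(0.5 * T(x) + 0.5 * F(x))])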

Wednesday, May 15, 2024

Very open-minded scoring rules

An accuracy scoring rule is open-minded provided that the expected value of the score after a Bayesian update on a prospective observation is always greater than or equal to the current expected value of the score.

Now consider a single-proposition accuracy scoring rule for a hypothesis H. This can be thought of as a pair of functions T and F where T(p) is the score for assigning credence p when H is true and F(p) is the score for assigning credence p when H is false. We say that the pair (T,F) is very open-minded provided that the conditional-on-H expected value of the T score after a Bayesian update on a prospective observation is greater than or equal to the current expected value of the T score and provided that the same is true for the F score with the expected value being conditional on not-H.

An example of a very open-minded scoring rule is the logarithmic rule where T(p) = log p and F(p) = log (1−p). The logarithmic rule has some nice philosophical properties which I discuss in this post, and it is easy to see that any very open-minded scoring rule has these properties. Basically, the idea is that if I measure epistemic utilities using a very open-minded scoring rule, then I will not be worried about Bayesian update on a prospective observation damaging other people’s epistemic utilities, as long as these other people agree with me on the likelihoods.

One might wonder if there are any other non-trivial proper and very open-minded scoring rules besides the logarithmic one. There are. Here’s a pretty easy-to-verify fact (see the Appendix):

  • A scoring rule (T,F) is very open-minded if and only if the functions xT(x) and (1−x)F(x) are both convex.
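
For the logarithmic rule, this criterion asks for the convexity of x log x and of (1−x) log (1−x). Here is a quick numerical sketch of mine, using second differences on a grid:

    import numpy as np

    # Sketch: check the criterion for the logarithmic rule T(x) = log x, F(x) = log(1-x).
    x = np.linspace(0.001, 0.999, 2001)

    def convex_on_grid(y, tol=1e-12):
        # nonnegative second differences on a uniform grid
        return bool(np.all(y[:-2] - 2 * y[1:-1] + y[2:] >= -tol))

    print(convex_on_grid(x * np.log(x)))             # x T(x)
    print(convex_on_grid((1 - x) * np.log(1 - x)))   # (1 - x) F(x)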

Here’s a cute scoring rule that is proper and very open-minded:

  • T(x) = −((1−x)/x)^(1/2) and F(x) = T(1−x).

(For propriety, use Fact 1 here. For very open-mindedness, note that the graph of xT(x) is the lower semicircle of the circle with radius 1/2 and center (1/2,0), and hence is convex; since (1−x)F(x) = xT(x), the other convexity condition holds as well.)
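
Both claims can also be confirmed numerically; here is a sketch of mine that checks propriety on a grid and the convexity of xT(x):

    import numpy as np

    # Sketch: T(x) = -sqrt((1-x)/x) and F(x) = T(1-x) = -sqrt(x/(1-x)).
    T = lambda x: -np.sqrt((1 - x) / x)
    F = lambda x: T(1 - x)

    r = np.linspace(0.001, 0.999, 999)

    # Propriety: for each credence p, the expected score p*T(r) + (1-p)*F(r)
    # of having credence r should peak (on the grid) at r = p.
    proper = all(np.argmax(p * T(r) + (1 - p) * F(r)) == i for i, p in enumerate(r))
    print("proper on grid:", proper)

    # Very open-mindedness: x*T(x) = -sqrt(x(1-x)) is convex, and by the symmetry
    # of this rule, (1-x)*F(x) is the very same function.
    g = r * T(r)
    print("x T(x) convex on grid:", bool(np.all(g[:-2] - 2 * g[1:-1] + g[2:] >= -1e-9)))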

What’s cute about this rule? Well, it is symmetric (F(x) = T(1−x)) and it has the additional symmetry property that xT(x) = (1−x)T(1−x) = (1−x)F(x). Alas, though, T is not concave, and I think a good scoring rule should have T concave (i.e., there should be diminishing returns from getting closer to the truth).

Appendix:

Suppose that the prospective observation reveals which cell of the partition E1, ..., En we are in. The very open-mindedness property with respect to T then requires:

  1. ∑iP(Ei|H)T(P(H|Ei)) ≥ T(P(H)).

Now P(Ei|H) = P(H|Ei)P(Ei)/P(H). Thus what we need is:

  2. ∑iP(Ei)P(H|Ei)T(P(H|Ei)) ≥ P(H)T(P(H)).

Given that P(H) = ∑iP(Ei)P(H|Ei), this follows immediately from the convexity of xT(x). The converse is easy, too.
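
In more detail: with φ(x) = xT(x), weights P(Ei), and points P(H|Ei), the last displayed inequality says that ∑iP(Ei)φ(P(H|Ei)) ≥ φ(∑iP(Ei)P(H|Ei)), which is just Jensen’s inequality for the convex function φ. The F half of very open-mindedness is handled in exactly the same way with H replaced by not-H: since P(Ei|not-H) = (1−P(H|Ei))P(Ei)/(1−P(H)), it reduces to Jensen’s inequality for the function (1−x)F(x), which is where the second convexity condition in the fact above comes from.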

Wednesday, February 1, 2023

Open-mindedness and propriety

Suppose we have a probability space Ω with algebra F of events, and a distinguished subalgebra H of events on Ω. My interest here is in accuracy H-scoring rules, which take a (finitely-additive) probability assignment p on H and assign to it a score function s(p) on Ω with values in [−∞,M] for some finite M, subject to the constraint that s(p) is H-measurable. I will take the score of a probability assignment to represent the epistemic utility or accuracy of p.

For a probability p on F, I will take the score of p to be the score of the restriction of p to H. (Note that any finitely-additive probability on H extends to a finitely-additive probability on F by the Hahn-Banach theorem, assuming Choice.)

The scoring rule s is proper provided that Eps(q) ≤ Eps(p) for all p and q, and strictly so if the inequality is strict whenever p ≠ q. Propriety says that one never expects a different probability from one’s own to have a better score (if one did, wouldn’t one have switched to it?).

Say that the scoring rule s is open-minded provided that for any probability p on F and any finite partition V of Ω into events in F with non-zero p-probability, the p-expected score of finding out where in V we are and conditionalizing on that is at least as big as the current p-expected score. If the scoring rule is open-minded, then expected-score maximization never requires a Bayesian conditionalizer to refuse free information. Say that the scoring rule s is strictly open-minded provided that the p-expected score of finding out where in V we are and conditionalizing strictly increases whenever there is at least one event E in V such that p(⋅|E) differs from p on H and p(E) > 0.

Given a scoring rule s, let the expected score function Gs on the probabilities on H be defined by Gs(p) = Eps(p), extended to probabilities on F in the same way that scores were.

It is well-known that:

  1. The (strict) propriety of s entails the (strict) convexity of Gs.

It is easy to see that:

  2. The (strict) convexity of Gs implies the (strict) open-mindedness of s.

Neither implication can be reversed. To see this, consider the single-proposition case, where Ω has two points, say 0 and 1, and H and F are the powerset of Ω, and we are interested in the proposition that one of these points, say 1, is the actual truth. The scoring rule s is then equivalent to a pair of functions T and F on [0,1] where T(x) = s(px)(1) and F(x) = s(px)(0), where px is the probability that assigns x to the point 1. Then Gs corresponds to the function xT(x) + (1−x)F(x), and each is convex if and only if the other is.

To see that the non-strict version of (1) cannot be reversed, suppose (T,F) is a non-trivial proper scoring rule with the limit of F(x)/x as x goes to 0 finite. Now form a new scoring rule by letting T*(x) = T(x) + (1−x)F(x)/x, and consider the scoring rule (T*,0). The corresponding function xT*(x) = xT(x) + (1−x)F(x) = Gs(x) is convex by (1), but (T*,0) isn’t going to be proper unless T* is constant, which isn’t going to be true in general. The strict version is similar.
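
For a concrete instance (my illustration), take the Brier pair T(x) = −(1−x)² and F(x) = −x², for which F(x)/x = −x has a finite limit at 0. Then T*(x) = −(1−x)² − x(1−x) = −(1−x), so xT*(x) = x² − x is convex, yet the expected score of credence r by the lights of credence p under (T*,0) is p(r−1), which is maximized by r = 1 no matter what p is, so (T*,0) is not proper.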

To see that (2) cannot be reversed, note that the only non-trivial partition is {{0}, {1}}. If our current probability for 1 is x, the expected score upon learning where we are is xT(1) + (1−x)F(0). Strict open-mindedness thus requires precisely that xT(x) + (1−x)F(x) < xT(1) + (1−x)F(0) whenever x is neither 0 nor 1. It is clear that this is not enough for convexity: we can have wild oscillations of T and F on (0,1) as long as T(1) and F(0) are large enough.
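
For a concrete instance (my illustration): let T(x) = sin(20x) and F(x) = cos(20x) on (0,1), with T(1) = F(0) = 10 and, say, T(0) = F(1) = 0. Then xT(x) + (1−x)F(x) ≤ 1 < 10 = xT(1) + (1−x)F(0) for every x in (0,1), so the rule is strictly open-minded in this two-point setting, while Gs(x) = xT(x) + (1−x)F(x) oscillates and is nowhere close to convex.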

Nonetheless, (2) can be reversed (both in the strict and non-strict versions) on the following technical assumption:

  3. There is an event Z in F such that Z ∩ A is a non-empty proper subset of A for every non-empty A in H.

This technical assumption basically says that there is a non-trivial event that is logically independent of everything in H. In real life, the technical assumption is always satisfied, because there will always be something independent of the algebra H of events we are evaluating probability assignments to (e.g., in many cases Z can be the event that the next coin toss by the investigator’s niece will be heads). I will prove that (2) can be reversed in the Appendix.

It is easy to see that adding (3) to our assumptions doesn’t help reverse (1).

Since open-mindedness is pretty plausible to people of a Bayesian persuasion, this means that convexity of Gs can be motivated independently of propriety. Perhaps instead of focusing on propriety of s as much as the literature has done, we should focus on the convexity of Gs?

Let’s think about this suggestion. One of the most important uses of scoring rules could be to evaluate the expected value of an experiment prior to doing the experiment, and hence decide which experiment we should do. If we think of an experiment as a finite partition V of the probability space with each cell having non-zero probability by one’s current lights p, then the expected value of the experiment is:

  4. ∑A ∈ Vp(A)EpAs(pA) = ∑A ∈ Vp(A)Gs(pA),

where pA is the result of conditionalizing p on A. In other words, to evaluate the expected values of experiments, all we care about is Gs, not s itself, and so the convexity of Gs is a very natural condition: we are never obligated to refuse to know the results of free experiments.
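
To illustrate, here is a small sketch of mine, with toy numbers and the log score on the algebra generated by a single hypothesis H, that computes the expected value of an experiment from Gs alone:

    import numpy as np

    # Sketch: expected value of an experiment, computed from G_s as in the formula above.
    # Toy joint distribution over four outcomes: (H or not-H) x (E1 or E2).
    p = {("H", "E1"): 0.30, ("H", "E2"): 0.20,
         ("notH", "E1"): 0.10, ("notH", "E2"): 0.40}

    def G(q):
        # G_s for the log score on the algebra generated by H
        return q * np.log(q) + (1 - q) * np.log(1 - q)

    pH = sum(v for (h, e), v in p.items() if h == "H")
    print("current expected score:", G(pH))

    value = 0.0
    for E in ("E1", "E2"):
        pE = sum(v for (h, e), v in p.items() if e == E)
        value += pE * G(p[("H", E)] / pE)
    print("expected score after the experiment:", value)  # at least as big, since G is convex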

However, at least in the case where Ω is finite, it is known that any (strictly) convex function (maybe subject to some growth conditions?) is equal to Gu for some (strictly) proper scoring rule u. So we don’t really gain much generality by moving from propriety of s to convexity of Gs. Indeed, the above observations show that for finite Ω, a (strictly) open-minded way of evaluating the expected epistemic values of experiments in a setting rich enough to satisfy (3) is always generatable by a (strictly) proper scoring rule.

In other words, if we have a scoring rule that is open-minded but not proper, we can find a proper scoring rule that generates the same prospective evaluations of the value of experiments (assuming no special growth conditions are needed).

Appendix: We now prove the converse of (2) assuming (3).

Assume open-mindedness. Let p1 and p2 be two distinct probabilities on H and let t ∈ (0,1). We must show that if p = tp1 + (1−t)p2, then

  5. Gs(p) ≤ tGs(p1) + (1−t)Gs(p2)

with the inequality strict if the open-mindedness is strict. Let Z be as in (3). Define

  6. p′(A∩Z) = tp1(A)

  7. p′(A∩Zc) = (1−t)p2(A)

  8. p′(A) = p(A)

for any A ∈ H. Then p′ is a probability on the algebra generated by H and Z extending p. Extend it to a probability on F by Hahn-Banach. By open-mindedness:

  9. Gs(p′) ≤ p′(Z)EpZs(pZ) + p′(Zc)EpZcs(pZc).

Here pZ = p′(⋅|Z) and pZc = p′(⋅|Zc) are the results of conditionalizing p′ on Z and Zc. Now p′(Z) = p′(Ω∩Z) = tp1(Ω) = t and p′(Zc) = 1 − t. Moreover, pZ = p1 on H and pZc = p2 on H. Since H-scores don’t care what the probabilities are doing outside of H, we have s(pZ) = s(p1) and s(pZc) = s(p2) and Gs(p′) = Gs(p). Moreover our scores are H-measurable, so EpZs(p1) = Ep1s(p1) and EpZcs(p2) = Ep2s(p2). Thus (9) becomes:

  10. Gs(p) ≤ tGs(p1) + (1−t)Gs(p2).

Hence we have convexity. And given strict open-mindedness, the inequality will be strict, and we get strict convexity.