General Theory of Processes

General Theory of Processes - Part 1

Introduction
Pavings and Mosaics
1. Properties of pavings and mosaics
2. Compact pavings
Envelopes
1. Properties of envelopes
Capacitance
Scrapers
Choquet Capacities
Measurable Projection
Measurable Graph
Debut
Measurable Section
Epilogue
References

I have been attending a reading course on stochastic analysis led by Professor Ioannis Karatzas, where the students take turns presenting a topic of their choice. I recently presented on Choquet's theory of capacities and its applications to measure theory and the general theory of processes. This blog post is based on that presentation. I have freely copied^[1] from many sources, but my primary reference is the (unfortunately unpublished) set of lecture notes by Prof. Karatzas.

Introduction

Kolmogorov laid the modern axiomatic foundations of probability theory with the German monograph Grundbegriffe der Wahrscheinlichkeitsrechnung which appeared in Ergebnisse Der Mathematik in 1933. This was a period of intense discussions on the foundations of probability, and a majority of probabilists at the time considered measure theory not only a waste of time, but also an offense to "probabilistic intuition" (Meyer, 2009). But by 1950, with the work of Doob in particular, these discussions of foundations had been settled.

Continuous-time processes, on the other hand, were difficult to tame even with measure theory: if a particle is subject to random evolution, to show that its trajectory is continuous, or bounded, requires that all time values be considered, whereas classical measure theory can only handle a countable infinity of time values. Thus, not only does probability depend on measure theory, but it also requires more of measure theory than the rest of analysis (Meyer, 2009).

The missing pieces of the puzzle, which will be the highlight of this and the next blog post, are the debut, section and projection theorems. These theorems are also indispensable in many applications, for instance in dynamic programming and stochastic control (Karoui and Tan, 2013).

To get a taste of these theorems, let's recall a famous error made by Lebesgue in the paper Sur les fonctions représentables analytiquement published in 1905. Consider the measurable space \((\mathbb R^2, \mathcal{B}(\mathbb R^2))\) and the projection map \(\pi\) given by \(\mathbb R^2 \ni (x,y) \mapsto \pi(x,y) = y \in \mathbb R.\) It is easy to see that for any open set \(O\) in \(\mathbb R^2\), the set \(\pi(O)\) is also open in \(\mathbb R\): Recall that the standard topology on \(\mathbb R^2\) is the same as the product topology on \(\mathbb R^2.\) By the definition of the product topology, and because a finite intersection of open rectangles is again an open rectangle, an open set \(O\) in \(\mathbb R^2\) is of the form \(O = \bigcup_{i \in I} U_i \times V_i\) for nonempty open \(U_i, V_i\) in \(\mathbb R\) and an arbitrary index set \(I.\) Since projection commutes with unions, \(\pi(O) = \bigcup_{i \in I} V_i\), which is open in \(\mathbb R.\) In fact, more generally, projection from any product space (with the product topology) is an open map. Now it seems reasonable to expect that for any Borel set \(B \in \mathcal{B}(\mathbb R^2)\) its projection is also a Borel set in \(\mathcal{B}(\mathbb R)\), and Lebesgue assumed this in his paper. But, in fact, this is FALSE! The error was spotted around 1917 by Mikhail Suslin, who realised that the projection of a Borel set need not be Borel, and this led to his investigation of analytic sets and to the beginning of what is now known as descriptive set theory (Almost Sure blog).

The problem is that projection does not commute with countable decreasing intersections. For example, consider the decreasing sequence of sets \(S_n = (0, 1/n) \times \mathbb R.\) Then \(\pi(S_n) = \mathbb R\) for all \(n\), giving \(\bigcap_{n \in \mathbb N} \pi(S_n) = \mathbb R\), but \(\bigcap_{n \in \mathbb N} S_n = \varnothing\), giving \(\pi \left( \bigcap_{n \in \mathbb N} S_n \right) = \varnothing.\) The measurable projection theorem stated next will be one of the highlights of this post.

Measurable Projection Theorem: Let \((\Omega, \mathcal{F}, \mathbb{P})\) be a complete probability space, let \((K, \mathcal{B}(K))\) be a locally compact separable metric space endowed with the collection of its Borel sets, and denote by \(\pi\) the projection of \(K \times \Omega\) onto \(\Omega.\) Then, for every \(B \in \mathcal{B}(K) \otimes \mathcal{F}\), the projection \(\pi(B) \in \mathcal{F}.\)

Why is proving such results difficult? As mentioned above it's because projection doesn't behave nicely with intersections. Nevertheless, let us try to see how one might try to prove the above theorem. A standard approach in measure theory is to construct a collection like

\[\begin{aligned} \mathscr{E} = \{S \subseteq K \times \Omega\,:\,\pi(S) \in \mathcal{F}\}\end{aligned}\]

which contains the sets satisfying the desired property, and show that it is a \(\sigma\)-algebra containing a simple collection, say \(\mathscr{A}\), such that it is easy to show that the elements of \(\mathscr{A}\) satisfy the desired property and \(\mathscr{A}\) generates \(\mathcal{B}(K) \otimes \mathcal{F}\), because then we will have \(\mathcal{B}(K) \otimes \mathcal{F} \subseteq \mathscr{E}.\) To this end, let

\[\begin{aligned} \mathscr{A} = \{S \subseteq K \times \Omega\,:\, S = E \times F \text{ for } E \in \mathcal{B}(K),\, F \in \mathcal{F}\}.\end{aligned}\]

Then it is easy to see that \(\mathscr{A} \subseteq \mathscr{E}\) and that \(\mathscr{A}\) generates \(\mathcal{B}(K) \otimes \mathcal{F}.\) Moreover, the finite unions of elements of \(\mathscr{A}\) form an algebra, which is again contained in \(\mathscr{E}\) because projection commutes with unions. If we could show that \(\mathscr{E}\) is a monotone class, then we would be done on account of the monotone class theorem. Increasing sequences are easily handled; in fact, if \(\{S_n\} _ {n \in \mathbb N}\) is any sequence of subsets of \(K \times \Omega\), then

\[\begin{aligned} \pi\left(\bigcup_{n=1}^\infty S_n\right) = \bigcup_{n=1}^\infty \pi(S_n).\end{aligned}\]

But if \(S_1 \supseteq S_2 \supseteq \cdots\), then in general we cannot say

\[\begin{aligned} \pi\left(\bigcap_{n=1}^\infty S_n\right) = \bigcap_{n=1}^\infty \pi(S_n).\end{aligned}\]
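The contrast between unions and intersections can already be seen on a finite toy example. The sketch below is only an illustration under simplifying assumptions: it uses two non-nested sets rather than a decreasing sequence (on a finite set a decreasing sequence stabilizes, so the failure for decreasing sequences genuinely needs infinite sets such as the \(S_n\) above), and the names `project`, `A` and `B` are made up for this example.

```python
def project(s):
    """Project a set of (x, y) pairs onto the second coordinate, like pi(x, y) = y."""
    return {y for (_, y) in s}

# Two hypothetical subsets of {0, 1} x {0}: same y-value, but no common point.
A = {(0, 0)}
B = {(1, 0)}

# Projection commutes with unions.
union_commutes = project(A | B) == project(A) | project(B)

# The projection of the intersection can be strictly smaller than
# the intersection of the projections.
proj_of_intersection = project(A & B)
intersection_of_proj = project(A) & project(B)

print(union_commutes, proj_of_intersection, intersection_of_proj)
```

Here the inclusion \(\pi(A \cap B) \subseteq \pi(A) \cap \pi(B)\) holds but is strict, which is the same obstruction that blocks the monotone class argument.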

Enter Choquet's theory of capacities. It provides the language to prove results like these. We know that every finite Borel measure \(\mu\) on \(\mathbb R^d\) (in fact, on any Polish space) has the interior regularity (also known as tightness) property

\[\begin{aligned} \mu(B) = \sup_{\substack{K \in \mathscr{K}(\mathbb R^d) \\ K \subseteq B}} \mu(K), \quad \forall \; B \in \mathcal{B}(\mathbb R^d),\end{aligned}\]

where \(\mathscr{K}(\mathbb R^d)\) is the collection of all compact sets in \(\mathbb R^d.\) Choquet's theory of capacities generalizes this approximation-from-below property, and distills those properties of a measure that allow such approximation to hold in very general settings. As we will see, monotonicity and continuity from above and below are the properties at play here, and notions corresponding to complements or differences will be absent.

The next few sections will be very abstract and it is easy to lose sight of our goal. Some people enjoy these mental gymnastics, but even if you find this dry, the reward at the end will be worth the initial struggle. We start with Choquet's theory of capacities. The highlight of this part will be Choquet's capacitability theorem. To prove this major result we will need to define a lot of new terminology and prove some major results like Sierpiński's theorem and Sion's theorem. Armed with Choquet's capacitability theorem we will prove the Measurable Section theorem, which in turn will form the backbone of various other results in measure theory. These results in measure theory will then help us prove results in the general theory of processes, but we will discuss this part in the next blog post.

Pavings and Mosaics

Definition 1: Let \(E\) be a nonempty set. A collection \(\mathcal E\) of subsets of \(E\) is called a paving if it is closed under finite unions and finite intersections. The pair \((E, \mathcal E)\) is then called a paved space.

The concept of paving generalizes the concept of algebra. As an example, if \(E\) is a topological space, then the collection of closed subsets of \(E\) forms a paving. As another example, if \(E\) is a Hausdorff space, then \((E, \mathscr K(E))\) is a paved space, where as before \(\mathscr{K}(E)\) denotes the collection of all compact sets in \(E.\)

It is easy to check that an arbitrary intersection of pavings is also a paving, and that the collection \(\mathfrak{P}(E)\) of all subsets of \(E\) is a paving. Thus for any collection \(\mathcal{A}\) of subsets of \(E\), we can define the notion of paving generated by \(\mathcal{A}\) as the smallest paving of subsets of \(E\) that contains \(\mathcal{A}\) by simply defining it to be the intersection of all pavings of \(E\) containing \(\mathcal{A}.\)
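On a finite set, the generated paving can also be computed directly by closing the collection under pairwise unions and intersections. The following is a minimal sketch under that finiteness assumption; the function name `generated_paving` and the sample collection are hypothetical and chosen only for illustration.

```python
from itertools import combinations

def generated_paving(collection):
    """Smallest family containing `collection` that is closed under
    pairwise (hence finite) unions and intersections."""
    paving = {frozenset(a) for a in collection}
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so that new sets can be added safely.
        for a, b in combinations(list(paving), 2):
            for c in (a | b, a & b):
                if c not in paving:
                    paving.add(c)
                    changed = True
    return paving

# A hypothetical collection of subsets of E = {1, 2, 3}.
paving = generated_paving([{1}, {2}, {2, 3}])
print(sorted(sorted(s) for s in paving))
```

The loop terminates because a finite set has finitely many subsets. Note that the result need not contain complements: in this example \(\{3\}\) is not in the generated paving.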

Definition 2: For two paved spaces \((E,\mathcal{E})\) and \((F,\mathcal{F})\), the product paving of \(\mathcal{E}\) and \(\mathcal{F}\), denoted by \(\mathcal{E} \otimes_p \mathcal{F}\), is the paving on \(E \times F\) generated by the collection of all rectangles \(\mathcal{R} = \{A \times B \,:\, A \in \mathcal{E}, B \in \mathcal{F}\}.\)

Using the fact that \((A_1 \times B_1) \cap (A_2 \times B_2) = (A_1 \cap A_2) \times (B_1 \cap B_2)\) we see that \(\mathcal{R}\) is stable under finite intersections. Therefore, any element of \(\mathcal{E} \otimes_p \mathcal{F}\) is of the form \(\bigcup_{i=1}^n A_i \times B_i\), where \(A_i \in \mathcal{E}\) and \(B_i \in \mathcal{F}\) for all \(i=1,\ldots,n.\)

Definition 3: Let \(E\) be a nonempty set. A collection of subsets of \(E\) which is closed under countable unions and countable intersections is called a mosaic.

The concept of mosaic generalizes the concept of \(\sigma\)-algebra. As with pavings, it is easy to define the notion of the mosaic generated by a collection. Mosaics will always occur in the context of a paving \(\mathcal{E}\) on \(E.\) We denote by \(\widehat{\mathcal{E}}\) the mosaic generated by \(\mathcal{E}.\) \(\mathcal{E}\ \widehat{\otimes}_p \mathcal{F}\) will denote the mosaic generated by the product paving \(\mathcal{E} \otimes_p \mathcal{F}.\)

Just like with monotone class arguments in measure theory, the paving \(\mathcal E\) will be a simple collection of subsets of \(E\) for which it is easy to prove a property \(P.\) From here we will show that the elements of the mosaic \(\widehat{\mathcal E}\) also satisfy \(P.\) Often \(\widehat{\mathcal E}\) will be a \(\sigma\)-algebra.

Henceforth the notation \(\mathcal{E}\) will be used for a paving on \(E.\)

Properties of pavings and mosaics

Just like the results connecting algebra and \(\sigma\)-algebra, we have results connecting pavings and mosaics.

Lemma 1: If \(A \in \mathcal{E}\) implies \(A^\mathsf{c} = E \setminus A \in \widehat{\mathcal{E}}\), then \(\widehat{\mathcal{E}} = \sigma(\mathcal{E}).\)

Follows immediately from the monotone class theorem.

As an example, if \(E\) is a separable, locally compact metric space, and \(\mathcal{E} = \mathscr{K}(E)\) is the collection of compact subsets of \(E\), then this property holds. In fact, \(E\) can be any second-countable locally compact Hausdorff space. In such a space every compact subset is closed, so the complement of a compact set is open. Moreover, every open subset of a second-countable locally compact Hausdorff space is \(\sigma\)-compact, i.e., a countable union of compact sets. Combining the two, if \(A \in \mathcal E\), then \(A^\mathsf{c} \in \widehat{\mathcal E}.\)

Lemma 2: The mosaic \(\widehat{\mathcal{E}}\) is the smallest collection of subsets of \(E\) that contains \(\mathcal{E}\) and is closed under countable increasing unions and under countable decreasing intersections. In other words, if \(\mathcal M(\mathcal{E})\) denotes the monotone class generated by \(\mathcal{E}\), then \(\widehat{\mathcal{E}} = \mathcal M(\mathcal{E}).\)

Since a mosaic is a monotone class and since \(\mathcal{E} \subseteq \widehat{\mathcal{E}}\), we have \(\mathcal M(\mathcal{E}) \subseteq \widehat{\mathcal{E}}.\) For the other side, we will be done if we show that \(\mathcal M(\mathcal{E})\) is a mosaic, since \(\mathcal{E} \subseteq \mathcal M(\mathcal{E}).\)

The first property to note is that a monotone paving is a mosaic. To see this, let \(\mathcal{R}\) be a monotone paving and \(A_1, A_2, \ldots \in \mathcal{R}.\) Then since \(\mathcal{R}\) is a paving, \(B_n = \bigcup_{i=1}^n A_i \in \mathcal{R}\) for all \({n \in \mathbb N}.\) But \(\{B_n\}_{n \in \mathbb N}\) is an increasing sequence and \(\mathcal{R}\) is a monotone class, and therefore their union \(\bigcup_{n \in \mathbb N} A_n = \bigcup_{n \in \mathbb N} B_n \in \mathcal{R}\), and hence \(\mathcal{R}\) is closed under countable unions. Similarly for countable intersections.

Therefore, we will be done if we show that \(\mathcal M(\mathcal{E})\) is a paving. To this end, for any \(B \subseteq E\), let

\[\begin{aligned} \mathcal{K}(B) := \{A \subseteq E\,:\, A \cup B,\, A \cap B \in \mathcal M(\mathcal{E})\}.\end{aligned}\]

Notice that by symmetry, \(A \in \mathcal{K}(B) \iff B \in \mathcal{K}(A).\) If \(A_1 \subseteq A_2 \subseteq \cdots\) is an increasing sequence in \(\mathcal{K}(B)\), then

\[\begin{aligned} \left(\bigcup_{n \in \mathbb N} A_n\right) \cup B &= \bigcup_{n \in \mathbb N} (A_n \cup B) \in \mathcal M(\mathcal{E}) \\ \left(\bigcup_{n \in \mathbb N} A_n\right) \cap B &= \bigcup_{n \in \mathbb N} (A_n \cap B) \in \mathcal M(\mathcal{E}),\end{aligned}\]

and similarly for decreasing sequences. Therefore, if \(\mathcal{K}(B) \neq \varnothing\), then it is a monotone class. Now fix \(B \in \mathcal{E}.\) For every \(A \in \mathcal{E}\), the definition of a paving gives \(A \cup B,\, A \cap B \in \mathcal{E} \subseteq \mathcal M(\mathcal{E})\), that is, \(A \in \mathcal{K}(B).\) Hence \(\mathcal{E} \subseteq \mathcal{K}(B)\), and \(\mathcal{K}(B)\) is a monotone class containing \(\mathcal{E}.\) Since \(\mathcal M(\mathcal{E})\) is the smallest monotone class containing \(\mathcal{E}\), we have

\[\begin{aligned} \mathcal M(\mathcal{E}) \subseteq \mathcal{K}(B).\end{aligned}\]

Hence if \(A \in \mathcal M(\mathcal{E})\) and \(B \in \mathcal{E}\), then \(A \in \mathcal{K}(B)\), and therefore \(B \in \mathcal{K}(A).\) Since this is true for every \(B \in \mathcal{E}\), it follows that

\[\begin{aligned} \mathcal M(\mathcal{E}) \subseteq \mathcal{K}(A).\end{aligned}\]

The validity of this relation for every \(A \in \mathcal M(\mathcal{E})\) is equivalent to the assertion that \(\mathcal M(\mathcal{E})\) is a paving.

Compact pavings

Recall a standard result from topology: a topological space \(X\) is compact if and only if for every collection \(\mathcal C\) of closed subsets of \(X\) having the finite intersection property (i.e., every finite sub-collection of \(\mathcal C\) has nonempty intersection), the intersection \(\bigcap_{C \in \mathcal C} C\) is nonempty. For our use case, we consider countable sub-collections.

Definition 4: Consider an arbitrary collection \(\mathcal A\) of subsets of \(E.\) It is called a compact collection if every countable sub-collection of elements in \(\mathcal A\) having the finite intersection property has non-empty intersection. A paving \(\mathcal{E}\) is a compact paving, if every decreasing sequence of nonempty elements of \(\mathcal{E}\) has a nonempty intersection.

It is easy to verify that a compact paving is a compact collection. As an example, if \(E\) is a separable metric space, the collection, \(\mathscr{K}(E)\), of all compact subsets of \(E\) is a compact paving. This follows immediately from the finite intersection characterization of a compact topological space stated above.

We define

\[\begin{aligned} \mathcal{E}_\delta := \left\{\bigcap_{n \in \mathbb N} A_n\,:\,A_n \in \mathcal{E} \text{ for all } n \in \mathbb N\right\}\end{aligned}\]

and note that if \(\mathcal{E}\) is a compact paving then so is \(\mathcal{E}_\delta.\) Indeed, \(\mathcal{E}_\delta\) is a paving because for \(A = \bigcap_{n \in \mathbb N} A_n\) and \(B = \bigcap_{m \in \mathbb N} B_m\) in \(\mathcal{E}_\delta\),

\[\begin{aligned} A \cup B &= \bigcap_{n \in \mathbb N} \bigcap_{m \in \mathbb N} (A_n \cup B_m) \in \mathcal{E}_\delta \\ A \cap B &= \bigcap_{n \in \mathbb N} \bigcap_{m \in \mathbb N} (A_n \cap B_m) \in \mathcal{E}_\delta,\end{aligned}\]

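and \(\mathcal{E}_\delta\) is a compact paving by the following argument. Let \(\{A^k\}_{k \in \mathbb N}\) be a decreasing sequence of nonempty elements of \(\mathcal{E}_\delta\), with \(A^k = \bigcap_{n \in \mathbb N} A^k_n\) and \(A^k_n \in \mathcal{E}.\) The sets \(C_m := \bigcap_{k, n \le m} A^k_n\) belong to \(\mathcal{E}\), decrease in \(m\), and are nonempty because \(C_m \supseteq A^1 \cap \cdots \cap A^m = A^m.\) Since \(\mathcal{E}\) is a compact paving, \(\bigcap_{k \in \mathbb N} A^k = \bigcap_{m \in \mathbb N} C_m \neq \varnothing.\)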

The next lemma tells us when it is acceptable to commute projection with countable intersections.

Lemma 3: Let \(K\) and \(E\) be two nonempty sets, and denote by \(\pi\) the projection of \(K \times E\) onto \(E.\) Suppose that \(\mathcal{H}\) is a paving of subsets of \(K \times E\) with the property that, for every \(x \in E\), the collection \(\mathcal{H}(x) := \{H(x) \,:\, H \in \mathcal{H}\}\) is a compact paving on \(K.\) Here \(H(x) := \{y \in K \,:\, (y,x) \in H\}\) denotes the \(x\)-section of \(H \subseteq K \times E.\) Then, for every decreasing sequence \(\{H_n\} _ {n \in \mathbb N}\) of sets in the paving \(\mathcal{H}_\delta\), we have

\[\begin{aligned} \pi\left(\bigcap _ {n \in \mathbb N} H_n\right) = \bigcap _ {n \in \mathbb N} \pi(H_n).\end{aligned}\]

The inclusion \(\pi\left(\bigcap_{n \in \mathbb N} H_n\right) \subseteq \bigcap_{n \in \mathbb N} \pi(H_n)\) is easy to see: if \(x \in \pi\left(\bigcap_{n \in \mathbb N} H_n\right)\), then there exists \(y \in K\) such that \((y,x) \in \bigcap_{n \in \mathbb N} H_n\), which implies \(x \in \pi(H_n)\) for all \(n \in \mathbb N.\)

For the other side, let \(x \in \bigcap_{n \in \mathbb N} \pi(H_n).\) Then \(\{H_n(x)\}_{n \in \mathbb N}\) is a decreasing sequence of nonempty sets in \(\mathcal{H}(x)_\delta.\) Since \(\mathcal{H}(x)\) is a compact paving by assumption, so is \(\mathcal{H}(x)_\delta\) by the remark preceding the lemma. Hence \(\bigcap_n H_n(x)\) must be nonempty, implying \(x \in \pi\left(\bigcap_{n \in \mathbb N} H_n\right).\)

Envelopes

Definition 5: Let \((E, \mathcal{E})\) be a paved space, and fix a subset \(A \subseteq E\) as well as a decreasing sequence \(\{A_k\} _ {k \in \mathbb N} \subseteq \mathfrak{P}(E).\) We say that \(A\) is an \(\mathcal{E}\)-envelope of \(\{A_k\} _ {k \in \mathbb N}\), if there exists a decreasing sequence \(\{C_k\}_{k \in \mathbb N} \subseteq \mathcal{E} \cup \{E\}\) such that \[\begin{aligned} A_k \subseteq C_k, \; \forall \; k \in \mathbb N \quad \text{ and } \quad \bigcap _ {k \in \mathbb N} C_k \subseteq A.\end{aligned}\]

Examples:

1. Let \(E\) be a separable metric space, and \(\mathcal{E}\) be the paving consisting of all closed subsets of \(E.\) Then a subset \(A\) of \(E\) is an \(\mathcal{E}\)-envelope of a given decreasing sequence \(\{A_k\} _ {k \in \mathbb N} \subseteq \mathfrak{P}(E)\) if, and only if, \(A\) contains \(\bigcap_{k \in \mathbb N} \overline{A}_k\), the intersection of the closures of the sets \(A_k, k \in \mathbb N\) in the sequence. Indeed, if \(A\) is an \(\mathcal{E}\)-envelope of \(\{A_k\} _ {k \in \mathbb N}\), with a sequence \(\{C_k\}_{k \in \mathbb N}\) as in Definition 5 above, then \(\overline{A}_k \subseteq C_k\) for each \(k \in \mathbb N\) because \(C_k\) is closed and contains \(A_k\), and therefore \(\bigcap_{k \in \mathbb N} \overline{A}_k \subseteq \bigcap_{k \in \mathbb N} C_k \subseteq A.\) The other side is easy to see if we let \(C_k = \overline{A}_k\) for each \(k \in \mathbb N.\)
2. An abstract version of the example above: Let \((E, \mathcal{E})\) be a paved space; for every subset \(A\) of \(E\), introduce the collection of sets \(\mathcal{A} := \{B \in \mathcal{E} \cup \{E\} \,:\, A \subseteq B\}\) and assume that the intersection \(\overline{A} := \bigcap_{B \in \mathcal{A}} B\), called the adherent of \(A\) in the paving \(\mathcal{E}\), belongs to \(\mathcal{E}_\delta \cup \{E\}\), i.e., \(\overline{A}\) is a countable intersection of sets in \(\mathcal{E} \cup \{E\}.\) We claim, and show next, that \(A\) is an \(\mathcal{E}\)-envelope of a given decreasing sequence \(\{A_k\}_{k \in \mathbb N} \subseteq \mathfrak{P}(E)\) if, and only if, \(A\) contains \(\bigcap_{k \in \mathbb N} \overline{A}_k.\)

Lemma 4: In the setting of the last example, a subset \(A\) of \(E\) is an \(\mathcal{E}\)-envelope of a given decreasing sequence \(\{A_k\} _ {k \in \mathbb N} \subseteq \mathfrak{P}(E)\) if, and only if, \(A\) contains \(\bigcap_{k \in \mathbb N} \overline{A}_k.\)

The necessity is clear: if there exists a decreasing sequence \(\{C_k\}_{k \in \mathbb N} \subseteq \mathcal{E} \cup \{E\}\) such that the conditions of Definition 5 are satisfied, then \(\overline{A}_k \subseteq C_k\) and therefore \(\bigcap_{k \in \mathbb N} \overline{A}_k \subseteq \bigcap_{k \in \mathbb N} C_k \subseteq A.\)

To see the sufficiency, for every \(k \in \mathbb N\), let \(\left\{B_n^k\right\}_{n \in \mathbb N} \subseteq \mathcal{E} \cup \{E\}\) be a decreasing sequence such that \(\overline{A}_k = \bigcap_{n \in \mathbb N} B_n^k\) (such a sequence exists because of the standing assumption on adherents, and because \(\mathcal{E} \cup \{E\}\) is closed under finite intersections). Then

\[\begin{aligned} C_k := B_k^1 \cap B_k^2 \cap \cdots \cap B_k^k, \quad k \in \mathbb N\end{aligned}\]

defines a decreasing sequence of elements in \(\mathcal{E} \cup \{E\}\) with \(A_k \subseteq \overline{A}_k \subseteq C_k\) and \(\bigcap_{k \in \mathbb N} \overline{A}_k =\bigcap_{k \in \mathbb N} C_k.\) It follows that the set \(A\) envelops the sequence \(\{A_k\}_{k \in \mathbb N}\) if \(A\) contains the countable intersection \(\bigcap_{k \in \mathbb N} \overline{A}_k\), for then the decreasing sequence \(\{C_k\}_{k \in \mathbb N} \subseteq \mathcal{E} \cup \{E\}\) satisfies the requirements of Definition 5.

Properties of envelopes

The next lemma lists some properties of envelopes which we will be using frequently.

Lemma 5:

1. If \(A\) is an envelope of a given decreasing sequence \(\{A_n\} _ {n \in \mathbb N} \subseteq \mathfrak{P}(E)\), then every subset of \(E\) that contains \(A\) is also an envelope of \(\{A_n\} _ {n \in \mathbb N}.\)
2. Two decreasing sequences of subsets of \(E\) that possess a common subsequence, admit the exact same envelopes.
3. The collection of envelopes of a given decreasing sequence of subsets of \(E\), is closed under countable intersections.

Parts 1. and 2. are trivial, and follow immediately from the definition.

For part 3., let \(\{A^k\} _ {k \in \mathbb N}\) be a sequence of envelopes of a given decreasing sequence \(\{A_n\} _ {n \in \mathbb N} \subseteq \mathfrak{P}(E).\) For each \(k \in \mathbb N\), let \(\{B_n^k\} _ {n \in \mathbb N} \subseteq \mathcal{E} \cup \{E\}\) be a decreasing sequence, such that \(A_n \subseteq B_n^k\) for all \(n \in \mathbb N\) and \(\bigcap_{n \in \mathbb N} B_n^k \subseteq A^k.\) Then

\[\begin{aligned} C_n := B_n^1 \cap B_n^2 \cap \cdots \cap B_n^n, \quad n \in \mathbb N\end{aligned}\]

defines a decreasing sequence of elements in \(\mathcal{E} \cup \{E\}\) that satisfies \(A_n \subseteq C_n\) and \(\bigcap_{n \in \mathbb N} C_n = \bigcap_{(k,n) \in \mathbb N^2} B_n^k \subseteq \bigcap_{k \in \mathbb N} A^k.\) It follows that \(\bigcap_{k \in \mathbb N} A^k\) is an \(\mathcal{E}\)-envelope of \(\{A_n\}_{n \in \mathbb N}.\)

Capacitance

Definition 6: Let \(E\) be a nonempty set. A collection \(\mathcal C\) of subsets of \(E\) is called a capacitance, if

whenever \(A \in \mathcal{C}\) and \(A \subseteq B\), then \(B \in \mathcal{C}\), and
whenever \(\{A_n\} _ {n \in \mathbb N}\) is an increasing sequence of subsets of \(E\) such that \(\bigcup_{n \in \mathbb N} A_n \in \mathcal{C}\), there is an integer \(m\) such that \(A_m \in \mathcal{C}.\)

Intuitively, a capacitance is a collection of “big” sets: the power set \(\mathfrak{P}(E)\) is a capacitance, and so are the collections of nonempty and of uncountable subsets of \(E.\) The notion of pre-capacity, defined next, gives a more useful example.

Definition 7: A function \(I \colon \mathfrak{P}(E) \to \overline{\mathbb R}\) is called a pre-capacity, if it is

monotone increasing, i.e., \(I(A) \le I(B)\) holds for every \(A \subseteq B\), and
ascending, i.e., for every increasing sequence \(\{A_n\} _ {n \in \mathbb N}\) we have

\[\begin{aligned} I\left(\bigcup_{n \in \mathbb N} A_n\right) = \sup_{n \in \mathbb N} I(A_n).\end{aligned}\]

If \(I \colon \mathfrak{P}(E) \to \overline{\mathbb R}\) is a pre-capacity, then for every given real number \(t\) the collection

\[\begin{aligned} \mathcal{C} = \{A \in \mathfrak{P}(E) \,:\, I(A) > t\}\end{aligned}\]

is a capacitance. Conversely, given a capacitance \(\mathcal{C}\), one can associate to it a pre-capacity by defining

\[\begin{aligned} I(A) = \begin{cases} 1 &\text{if } A \in \mathcal{C} \\ 0 &\text{if } A \notin \mathcal{C}. \end{cases}\end{aligned}\]

This then leads to the identification \(\mathcal{C} = \{A \in \mathfrak{P}(E) \,:\, I(A) > 0\}.\)
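The correspondence between pre-capacities and capacitances can be checked by brute force on a small finite set. The sketch below is an illustration under simplifying assumptions: on a finite set every increasing sequence is eventually constant, so the second property of a capacitance holds automatically and only upward closure needs checking. The counting function used as the pre-capacity, the threshold \(t = 1\), and all function names are hypothetical choices.

```python
from itertools import chain, combinations

def powerset(ground):
    """All subsets of a finite set, as frozensets."""
    items = sorted(ground)
    return [frozenset(c) for c in chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))]

def is_upward_closed(family, ground):
    """First capacitance property: A in family and A subset of B imply B in family."""
    subsets = powerset(ground)
    return all(b in family for a in family for b in subsets if a <= b)

E = {1, 2, 3}
I = len   # counting function: a monotone, hypothetical pre-capacity on finite sets
t = 1
C = {a for a in powerset(E) if I(a) > t}

def indicator(a):
    """Pre-capacity associated with the capacitance C (1 on C, 0 off C)."""
    return 1 if frozenset(a) in C else 0

# Recover C from its indicator pre-capacity via the threshold 0.
recovered = {a for a in powerset(E) if indicator(a) > 0}
print(is_upward_closed(C, E), recovered == C)
```

In this toy case \(\mathcal{C}\) consists of the subsets with at least two points, and the indicator pre-capacity recovers it exactly.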

Henceforth assume that there is an underlying paved space \((E, \mathcal{E})\) and a capacitance \(\mathcal{C}\) of subsets of \(E.\)

Scrapers

Definition 8: A sequence \(\mathfrak{f} = \{f_n\} _ {n \in \mathbb N}\) of mappings \(f_n \colon \left(\mathfrak{P}(E)\right)^n \to \mathfrak{P}(E)\) is called a Sierpiński’s \(\mathcal{C}\)-scraper, or simply a \(\mathcal{C}\)-scraper, if

\(f_n(B_1, B_2, \ldots, B_n) \subseteq B_n\) for all \(n \in \mathbb N\) and for all sets \(B_1, \ldots, B_n \in \mathfrak{P}(E)\), and
whenever \(B_n \in \mathcal{C}\), then \(f_n(B_1, B_2, \ldots, B_n) \in \mathcal{C}.\)

The first property expresses the intuitive notion that \(f_n(B_1, B_2, \ldots, B_n)\) “scrapes” \(B_n\), and the second property ensures that “the scraping does not remove too big a chunk” from \(B_n.\) In French, scraping is called rabotage, which can also be translated as planing. A simple example of a scraper is the identity scraper: \(f_n(B_1, \ldots, B_n) = B_n\) for all \(n \in \mathbb N\) and for all sets \(B_1, \ldots, B_n \in \mathfrak{P}(E).\)

Definition 9: Given a \(\mathcal{C}\)-scraper \(\mathfrak{f} = \{f_n\} _ {n \in \mathbb N}\), a (necessarily decreasing) sequence \(\{B_n\} _ {n \in \mathbb N}\) of subsets of \(E\) will be called \(\mathfrak{f}\)-scraped, if for all \(n \in \mathbb N\) we have

\[\begin{aligned} B_{n+1} \subseteq f_n(B_1, B_2, \ldots, B_n) \quad \text{and} \quad B_n \in \mathcal{C}.\end{aligned}\]

Definition 10: For any \(B \in \mathcal{C}\) and \(\mathcal{C}\)-scraper \(\mathfrak{f} = \{f_n\} _ {n \in \mathbb N}\), the sequence \(\{P_n\} _ {n \in \mathbb N} \subseteq \mathcal{C}\) defined by \(P_1 := B\) and \(P_{n+1} := f_n(P_1, \ldots, P_n)\) for all \(n \in \mathbb N\) is \(\mathfrak{f}\)-scraped. It is called the \(\mathfrak{f}\)-scraped orbit of \(B\).

Dellacherie (1972) calls \(\{P_n\} _ {n \in \mathbb N}\) above the \(\mathfrak f\)-scraped sequence deduced from \(B.\)
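To make Definitions 8 to 10 concrete, here is a small finite sketch. Everything in it is an assumption made for illustration: the capacitance consists of the subsets with at least two points, and the hypothetical scraper `trim_scraper` removes the largest point of its last argument as long as the result stays in the capacitance.

```python
def in_capacitance(a):
    """Hypothetical capacitance on a finite set: sets with at least two points."""
    return len(a) >= 2

def trim_scraper(*sets):
    """f_n(B_1, ..., B_n): drop the largest point of B_n while staying in C.

    The result is always a subset of B_n, and it lies in C whenever B_n does.
    """
    last = frozenset(sets[-1])
    if len(last) > 2:
        return last - {max(last)}
    return last

def scraped_orbit(start, scraper, length):
    """First `length` terms of the orbit: P_1 = start, P_{n+1} = f_n(P_1, ..., P_n)."""
    orbit = [frozenset(start)]
    while len(orbit) < length:
        orbit.append(scraper(*orbit))
    return orbit

orbit = scraped_orbit({1, 2, 3, 4, 5}, trim_scraper, 5)
print([sorted(p) for p in orbit])
```

The orbit shrinks one point at a time and then stabilizes at a two-point set, so every term remains in the capacitance, as Definition 9 requires.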

Definition 11: A \(\mathcal{C}\)-scraper \(\mathfrak{f} = \{f_n\}_{n \in \mathbb N}\) is called compatible with a given set \(A \in \mathfrak{P}(E)\), if \(A\) envelops every \(\mathfrak{f}\)-scraped sequence \(\{B_n\} _ {n \in \mathbb N}\) with \(B_1 \subseteq A.\)

A set \(A \in \mathfrak{P}(E)\) is smooth for the capacitance \(\mathcal{C}\), if there exists a \(\mathcal{C}\)-scraper compatible with it.

If \(A \notin \mathcal C\), then no subset of \(A\) can be in \(\mathcal C\) either, by the first defining property of a capacitance. Since every term of an \(\mathfrak f\)-scraped sequence belongs to \(\mathcal C\), there is no \(\mathfrak f\)-scraped sequence \(\{B_n\}_{n \in \mathbb N}\) satisfying \(B_1 \subseteq A\), whatever the scraper \(\mathfrak f.\) Hence every \(\mathcal C\)-scraper, for instance the identity scraper, is vacuously compatible with \(A.\) Thus \(A\) is smooth.

If \(A \in \mathcal C\) and is smooth, then there always exists a sequence \(\{B_n\}_{n \in \mathbb N}\) of which \(A\) is an envelope. Indeed, by assumption there exists a scraper \(\mathfrak f\) compatible with \(A.\) Let \(\{B_n\}_{n \in \mathbb N}\) be the \(\mathfrak f\)-scraped orbit of \(A.\) It is an \(\mathfrak f\)-scraped sequence with \(B_1 = A \subseteq A\), so \(A\) envelops \(\{B_n\}_{n \in \mathbb N}\) by compatibility.

Sierpiński’s Theorem

The next result is central in this theory.

Theorem 1 [Sierpiński]: Let \((E, \mathcal{E})\) be a paved space, and \(\mathcal{C}\) a capacitance. The collection of subsets of \(E\) which are smooth for the capacitance, is closed under countable increasing unions and under countable intersections.

We will come back to its proof later. Let's prove some of its useful consequences first.

Theorem 2: Let \((E ,\mathcal{E})\) be a paved space, and \(\mathcal{C}\) a capacitance. The elements of the mosaic \(\widehat{\mathcal{E}}\) generated by \(\mathcal{E}\) are smooth.

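An easy consequence of Theorem 1, Lemma 2 and the fact that every element of \(\mathcal{E}\) is smooth, because the identity scraper is compatible with each of them (for \(A \in \mathcal{E}\) and a scraped sequence with \(B_1 \subseteq A\), take \(C_k = A\) for all \(k\) in Definition 5). \(\square\)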

This theorem is very useful in proving some important results. We will discuss two of them. The first one is the metric space version of Choquet's capacitability theorem. The proof of it will be very similar to the general Choquet's capacitability theorem, but we will need Sion's theorem for the general version, which will be the second result.

Definition 12: Let \(E\) be a compact metric space, endowed with the paving \(\mathcal{E} = \mathscr{K}(E)\) of its compact sets. \(I\) is called a metric capacity on \((E, \mathcal{E})\) if it is a pre-capacity that "descends on compacts" in the sense that for every decreasing sequence \(\{K_n\} _ {n \in \mathbb N} \subseteq \mathscr{K}(E)\) it satisfies

\[\begin{aligned} I\left(\bigcap _ {n \in \mathbb N}K_n\right) = \inf _ {n \in \mathbb N} I(K_n).\end{aligned}\]

Theorem 3 [Metric space version of Choquet's capacitability theorem]: For every Borel subset \(B\) of a compact metric space \(E\), and any metric capacity \(I \colon \mathfrak{P}(E) \to \overline{\mathbb R}\), we have

\[\begin{aligned} I(B) = \sup_{\substack{K \in \mathscr{K}(E) \\ K \subseteq B}} I(K).\end{aligned}\]

Fix an arbitrary \(B \in \mathcal{B}(E).\) If \(I(B) = -\infty\) then \(I(K) = -\infty\) for any \(K \subseteq B\), and we have our desired equality trivially. Otherwise, we need to show that whenever \(I(B) > t\) holds for some given real number \(t\), there exists a compact set \(K \subseteq B\) such that \(I(K) \ge t.\) Recall that

\[\begin{aligned} \mathcal{C} = \{A \in \mathfrak{P}(E)\,:\,I(A) > t\}\end{aligned}\]

is a capacitance. Also recall that a subset \(A\) of \(E\) is an \(\mathcal{E}\)-envelope of a decreasing sequence \(\{A_n\}_{n \in \mathbb N} \subseteq \mathfrak{P}(E)\) if and only if \(A\) contains \(\bigcap_{n \in \mathbb N}\overline{A}_n.\)

Lemma 1 gives that the mosaic \(\widehat{\mathcal{E}}\) generated by \(\mathcal{E} = \mathscr{K}(E)\) coincides with the Borel \(\sigma\)-algebra \(\mathcal{B}(E).\) Hence by Theorem 2 every Borel set is smooth. Thus, there exists a \(\mathcal{C}\)-scraper \(\mathfrak{f} = \{f_n\}_{n \in \mathbb N}\) compatible with the set \(B.\)

Consider the \(\mathfrak{f}\)-scraped orbit \(\{P_n\} _ {n \in \mathbb N} \subseteq \mathcal{C}\) of \(B.\) By compatibility, \(B\) is an envelope of \(\{P_n\} _ {n \in \mathbb N}\), and hence it contains \(K := \bigcap _ {n \in \mathbb N} \overline{P}_n.\) The set \(K\) is closed, and hence compact as a closed subset of the compact space \(E\); similarly for \(\overline{P}_n\) for all \(n \in \mathbb N.\) Since \(\{P_n\} _ {n \in \mathbb N} \subseteq \mathcal{C}\) and \(I\) is monotone, we have \(I(\overline{P}_n) \ge I(P_n) > t\) for all \(n \in \mathbb N.\) Now use the descending on compacts property of \(I\) to get

\[\begin{aligned} I(K) = I\left(\bigcap _ {n \in \mathbb N} \overline{P}_n\right) = \inf _ {n \in \mathbb N} I(\overline{P}_n) \ge t.\end{aligned}\]

Theorem 4 [Sion's theorem]: Let \((E,\mathcal{E})\) be a paved space, and \(\mathcal{C}\) a capacitance. For every element \(B\) of \(\mathcal{C} \cap \widehat{\mathcal{E}}\), there exists a decreasing sequence \(\{K_n\} _ {n \in \mathbb N} \subseteq \mathcal{C} \cap \mathcal{E}\) such that \(\bigcap _ {n \in \mathbb N} K_n \subseteq B.\)

Theorem 2 implies that the set \(B\) is smooth, and thus there exists a \(\mathcal{C}\)-scraper \(\mathfrak{f} = \{f_n\} _ {n \in \mathbb N}\) compatible with it. Let \(\{P_n\} _ {n \in \mathbb N} \subseteq \mathcal{C}\) be the \(\mathfrak{f}\)-scraped orbit of \(B.\) Then \(B\) is an envelope of \(\{P_n\} _ {n \in \mathbb N}\), so there exists a decreasing sequence \(\{B_n\} _ {n \in \mathbb N}\) of elements of \(\mathcal{E} \cup \{E\}\) such that \(\bigcap _ {n \in \mathbb N} B_n \subseteq B\) and \(P_n \subseteq B_n\) for all \({n \in \mathbb N}.\) Notice that \(B_n \in \mathcal{C}\), because \(P_n \in \mathcal{C}\) and \(P_n \subseteq B_n.\)

If the sets \(B_n\) belong to the paving \(\mathcal{E}\) from a certain index \(m\) onward, we take \(K_n := B_{m+n}, n \in \mathbb N\) as our sequence. Otherwise \(B_n = E\) holds for all integers \(n\), so that \(E = \bigcap _ {n \in \mathbb N} B_n \subseteq B\) and hence \(B = E.\) Since \(B \in \widehat{\mathcal{E}}\), Lemma 2 shows that \(B\) is the union of an increasing sequence of sets in \(\mathcal{E}.\) Therefore, the fact that \(B \in \mathcal{C}\) implies that \(B\) contains a set \(K \in \mathcal{C} \cap \mathcal{E}\); it then suffices to take \(K_n = K\) for all integers \(n.\)

We now come back to the proof of Theorem 1. But first we will need the following clever operation on scrapers, and a couple of results.

Mixing of Scrapers

Definition 13: Consider a sequence \(\left\{\mathfrak{f}^k, k \in \mathbb N\right\} = \left\{\{f^k_n\} _ {n \in \mathbb N}, k \in \mathbb N\right\}\) of scrapers, and a bijection

\[\begin{aligned} \mathbb N^2 \ni (p,q) \mapsto \beta(p,q) = p \star q \in \mathbb N,\end{aligned}\]

which is strictly increasing in each of its arguments. For every integer \({n \in \mathbb N}\) and sets \(P_1, P_2, \ldots, P_n\), if \(n = p \star q\), let

\[\begin{aligned} f_n(P_1, P_2, \ldots, P_n) := f^p_q(P_{p \star 1}, P_{p \star 2}, \ldots, P_{p \star q}).\end{aligned}\]

It is easy to see that this defines a new scraper \(\mathfrak{f} = \{f_n\}_{n \in \mathbb N}\), called the mixing of the scrapers \(\left\{\mathfrak{f}^k, k \in \mathbb N\right\}\) via the bijection \(\beta\).
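To make Definition 13 concrete, here is a minimal Python sketch of one bijection with the required monotonicity, namely \(p \star q = 2^{p-1}(2q-1)\) on \(\mathbb N = \{1, 2, \ldots\}.\) This particular choice of \(\beta\), as well as the helper names `star`, `unstar` and `mixed_arguments`, are assumptions made for illustration only; the definition merely requires that some such bijection be fixed.

```python
from typing import List, Tuple


def star(p: int, q: int) -> int:
    """Illustrative bijection N^2 -> N: p * q = 2^(p-1) * (2q - 1)."""
    if p < 1 or q < 1:
        raise ValueError("p and q must be positive integers")
    return 2 ** (p - 1) * (2 * q - 1)


def unstar(n: int) -> Tuple[int, int]:
    """Inverse of star: write n = 2^(p-1) * (2q - 1) and return (p, q)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    p = 1
    while n % 2 == 0:  # strip factors of two to recover p
        n //= 2
        p += 1
    return p, (n + 1) // 2  # the remaining odd part equals 2q - 1


def mixed_arguments(n: int) -> Tuple[int, int, List[int]]:
    """For n = p * q, return (p, q, [p*1, ..., p*q]).

    The n-th map of the mixed scraper applies f^p_q to the sets P_i
    whose indices i appear in the returned list.
    """
    p, q = unstar(n)
    return p, q, [star(p, i) for i in range(1, q + 1)]
```

For instance, \(6 = 2 \star 2\) under this choice, so `mixed_arguments(6)` returns `(2, 2, [2, 6])`, which corresponds to \(f_6(P_1, \ldots, P_6) = f^2_2(P_2, P_6).\) Note also that \(1 \star 1 = 1\) and \(k \star (n+1) \ge 1 + k \star n\), the two facts about \(\beta\) that are used when working with mixed scrapers.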

Theorem 5: Let \(\left\{\mathfrak{f}^k, k \in \mathbb N\right\}\) be a sequence of scrapers, and denote by \(\mathfrak{f}\) its mixing by a bijection \(\beta.\) In order for a subset \(A\) of \(E\) to be compatible with \(\mathfrak{f}\), it suffices that it be compatible with one of the scrapers \(\mathfrak{f}^k, k \in \mathbb N.\)

Let \(A \in \mathfrak{P}(E)\) be compatible with \(\mathfrak{f}^k\) for some arbitrary but fixed \(k.\) Consider also a sequence of sets \(\{P_n\} _ {n \in \mathbb N}\), which is \(\mathfrak{f}\)-scraped and whose first term \(P_1\) is contained in \(A.\) We need to show that the set \(A\) envelops \(\{P_n\} _ {n \in \mathbb N}.\)

To do this, we exploit Lemma 5 (ii) and construct a decreasing sequence \(\{Q_n\} _ {n \in \mathbb N} \subseteq \mathfrak{P}(E)\) which is a subsequence of \(\{P_n\} _ {n \in \mathbb N}\) and show that \(A\) envelops \(\{Q_n\} _ {n \in \mathbb N}.\) This will then imply \(A\) envelops \(\{P_n\} _ {n \in \mathbb N}.\) To this end, define

\[\begin{aligned} Q_n := P_{k \star n} \quad \forall \; {n \in \mathbb N}.\end{aligned}\]

Because \(Q_1 = P_{k \star 1} \subseteq P_{1 \star 1} = P_1 \subseteq A\) and \(A\) is compatible with \(\mathfrak{f}^k\), to show that \(A\) envelops \(\{Q_n\}_{n \in \mathbb N}\) it suffices to show \(\{Q_n\} _ {n \in \mathbb N}\) is \(\mathfrak{f}^k\)-scraped.

Now, \(Q_n = P_{k \star n} \in \mathcal{C}\) for all \({n \in \mathbb N}\), so all that remains to be shown is that \(Q_{n+1} \subseteq f^k_n(Q_1, Q_2, \ldots, Q_n)\) holds for all \({n \in \mathbb N}.\) Because \(\{P_n\} _ {n \in \mathbb N}\) is \(\mathfrak{f}\)-scraped we have

\[\begin{aligned} Q_{n+1} = P_{k \star (n+1)} \subseteq P_{1 + k\star n} \subseteq f_{k \star n}(P_1, P_2, \ldots, P_{k \star n}) = f^k_n(Q_1, Q_2, \ldots, Q_n),\end{aligned}\]

giving the desired result.

An immediate corollary of this theorem: If \(\{A_n\}_{n \in \mathbb N}\) is a sequence of smooth subsets of \(E\), there exists a scraper \(\mathfrak{f}\) which is compatible with all the sets \(A_n, {n \in \mathbb N}.\)

Proof of Sierpiński’s Theorem

We now prove the two closure properties of smooth sets asserted in Theorem 1.

Closure under countable intersections:

Suppose \(\left\{A^k\right\} _ {k \in \mathbb N}\) is a sequence of smooth sets, \(A = \bigcap _ {k \in \mathbb N} A^k\), and \(\mathfrak{f}\) is a \(\mathcal{C}\)-scraper compatible with all of the sets \(A^k, k \in \mathbb N.\) If \(\{P_n\} _ {n \in \mathbb N}\) is an \(\mathfrak{f}\)-scraped sequence of sets such that \(P_1 \subseteq A\), then \(P_1 \subseteq A^k\) for all \(k \in \mathbb N.\) Compatibility of \(\mathfrak{f}\) with \(A^k\) then implies that \(A^k\) is an \(\mathcal{E}\)-envelope of \(\{P_n\} _ {n \in \mathbb N}\) for all \(k \in \mathbb N.\) Lemma 5 (iii) now implies that \(A\) is also an \(\mathcal{E}\)-envelope of \(\{P_n\} _ {n \in \mathbb N}\), showing that \(A\) is compatible with \(\mathfrak{f}\), and hence smooth.

Closure under countable increasing unions:

Suppose \(\left\{A^k\right\}_{k \in \mathbb N}\) is an increasing sequence of smooth sets, \(A = \bigcup _ {k \in \mathbb N} A^k\), and \(\mathfrak{f}\) is a \(\mathcal{C}\)-scraper compatible with all of the sets \(A^k, k \in \mathbb N.\) The scraper \(\mathfrak{f}\) doesn't work for this case and so we create a new one. For any \(n \in \mathbb N\) and sets \(P_1, P_2, \ldots, P_n\) we define

\[\begin{aligned} \varphi_n(P_1, P_2, \ldots, P_n) = \begin{cases} P_n &\text{if } A \cap P_1 \notin \mathcal{C} \\ f_n(A^p \cap P_1, P_2, \ldots, P_n) &\text{if } A \cap P_1 \in \mathcal{C}, \end{cases}\end{aligned}\]

where \(p\) is the smallest integer such that \(A^p \cap P_1 \in \mathcal{C}.\) Such an integer exists by part 2 of the definition of capacitance, applied to the increasing sequence \(A^k \cap P_1 \uparrow A \cap P_1 \in \mathcal C.\) It is easy to see that \(\Phi = \{\varphi_n\} _ {n \in \mathbb N}\) is a \(\mathcal{C}\)-scraper. To finish the proof, it suffices to show that \(\Phi\) is compatible with \(A.\)

Let \(\{P_n\} _ {n \in \mathbb N}\) be a \(\Phi\)-scraped sequence of sets such that \(P_1 \subseteq A.\) By definition \(P_1 \in \mathcal{C}\) and \(A \cap P_1 = P_1\), and so from our construction \(\varphi_n(P_1, P_2, \ldots, P_n) = f_n(A^p \cap P_1, P_2, \ldots, P_n).\) All elements of the sequence \(A^p \cap P_1, P_2, \ldots, P_n, \ldots\) are in \(\mathcal{C}\) and for all \(n \in \mathbb N\)

\[\begin{aligned} P_{n+1} \subseteq \varphi_n(P_1, P_2, \ldots, P_n) = f_n(A^p \cap P_1, P_2, \ldots, P_n),\end{aligned}\]

and thus it follows that this sequence is \(\mathfrak{f}\)-scraped. Now since \(A^p\) is compatible with \(\mathfrak{f}\), \(A^p\) is an envelope of this sequence, and also of \(\{P_n\} _ {n \in \mathbb N}\) by Lemma 5 (ii). It follows that \(A\) is an envelope of \(\{P_n\} _ {n \in \mathbb N}\) by Lemma 5 (i) because \(A^p \subseteq A.\)

Choquet Capacities

Definition 14: A mapping \(I \colon \mathfrak{P}(E) \to \overline{\mathbb R}\) is called a Choquet \(\mathcal{E}\)-capacity, or simply \(\mathcal{E}\)-capacity, if it is

1. monotone increasing, i.e., \(I(A) \le I(B)\) holds for every \(A \subseteq B\),
2. ascending, i.e., for every increasing sequence \(\{A_n\} _ {n \in \mathbb N} \subseteq \mathfrak{P}(E)\) we have

\[\begin{aligned} I\left(\bigcup _ {n \in \mathbb N} A_n\right) = \sup _ {n \in \mathbb N} I(A_n),\end{aligned}\]

3. descending on pavings, i.e., for every decreasing sequence \(\{E_n\} _ {n \in \mathbb N} \subseteq \mathcal{E}\) we have

\[\begin{aligned} I\left(\bigcap _ {n \in \mathbb N} E_n\right) = \inf _ {n \in \mathbb N} I(E_n).\end{aligned}\]

Definition 15: A set \(A \in \mathfrak{P}(E)\) is called \(I\)-capacitable if

\[\begin{aligned} I(A) = \sup_{\substack{K \in \mathcal{E}_\delta \\ K \subseteq A}} I(K).\end{aligned}\]

Examples:

Consider a paved space \((E, \mathcal{E})\), where \(\mathcal E\) is a compact paving, and define \(I(A) = 0\) if \(A = \varnothing\), \(I(A) = 1\) if \(A \neq \varnothing.\) Then \(I\) is a Choquet \(\mathcal{E}\)-capacity. Property 3 in the definition of a capacity now reflects the assumption that the paving \(\mathcal{E}\) is compact.
Consider a probability space \((\Omega, \mathcal{F}, \mathbb{P}).\) Then the outer measure
\[\begin{aligned} \mathbb{P}^*(A) := \inf_{\substack{B \in \mathcal{F} \\ A \subseteq B}} \mathbb{P}(B),\end{aligned}\]
which is well-known to be a monotone, countably sub-additive set function taking the value \(0\) on the empty set \(\varnothing\), is a Choquet \(\mathcal{F}\)-capacity. The proof is a standard exercise in measure theory, albeit phrased there in the terminology of continuity from below.
Consider a locally compact, separable metric space \(K\), and the paving \(\mathscr{K}\) of its compact subsets. If \(\pi\) denotes the projection of \(K \times \Omega\) onto \(\Omega\) and
\[\begin{aligned} I(A) := \mathbb{P}^*(\pi(A)) \text{ for all } A \in \mathfrak{P}(K \times \Omega),\end{aligned}\]
then \(I\) is a Choquet \((\mathscr{K} \otimes_p \mathcal{F})\)-capacity: The monotone increasing property is trivial. To see the ascending property, note that for every increasing sequence \(\{A_n\} _ {n \in \mathbb N} \subseteq \mathfrak{P}(K \times \Omega)\) we have
\[\begin{aligned} x \in \pi \left( \bigcup_n A_n \right) \iff \exists \; y \in K \text{ such that } (y,x) \in \bigcup_n A_n \iff x \in \bigcup_n \pi(A_n),\end{aligned}\]
showing \(\pi \left( \bigcup_n A_n \right) = \bigcup_n \pi(A_n).\) Thus,
\[\begin{aligned} I\left( \bigcup_n A_n \right) = \mathbb{P}^*\left( \bigcup_n \pi(A_n) \right) = \sup_n \mathbb{P}^*(\pi(A_n)) = \sup_n I(A_n).\end{aligned}\]
Finally, for the descending on pavings property, compactness implies that for all \(\omega \in \Omega\), \((\mathscr{K} \otimes_p \mathcal{F})(\omega)\) is a compact paving, and thus we can apply Lemma 3 to get
\[\begin{aligned} \pi\left(\bigcap _ {n \in \mathbb N} E_n\right) = \bigcap _ {n \in \mathbb N} \pi(E_n)\end{aligned}\]
for every decreasing sequence \(\{E_n\} _ {n \in \mathbb N} \subseteq \mathscr{K} \otimes_p \mathcal{F}.\) Therefore,
\[\begin{aligned} I\left(\bigcap _ {n \in \mathbb N} E_n\right) = \mathbb{P}^* \left(\bigcap _ {n \in \mathbb N} \pi(E_n)\right).\end{aligned}\]
Since each \(E_n\) is a finite union of rectangles, \(\pi(E_n) \in \mathcal{F}\); the sets \(\pi(E_n)\) decrease with \(n\), so continuity from above of \(\mathbb{P}\) gives
\[\begin{aligned} \mathbb{P}^* \left(\bigcap _ {n \in \mathbb N} \pi(E_n)\right) = \inf_n \mathbb{P}^* (\pi(E_n)) = \inf_n I(E_n).\end{aligned}\]
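The capacity \(I(A) = \mathbb{P}^*(\pi(A))\) of the third example can be explored on a finite toy model. The sketch below is only an illustration under stated assumptions: \(\Omega = \{0,1,2,3\}\), the \(\sigma\)-algebra \(\mathcal{F}\) is generated by a partition with made-up probabilities, and the names `BLOCKS`, `projection`, `outer_measure` and `capacity` are hypothetical. In a finite setting the descending property is automatic, so the sketch mainly shows how the outer measure and the projection interact.

```python
from fractions import Fraction
from typing import FrozenSet, Iterable, List, Set, Tuple

# Toy setting (illustrative assumption): Omega = {0, 1, 2, 3} and F is the
# sigma-algebra generated by the partition below, with these probabilities.
BLOCKS: List[Tuple[FrozenSet[int], Fraction]] = [
    (frozenset({0, 1}), Fraction(1, 2)),
    (frozenset({2}), Fraction(1, 3)),
    (frozenset({3}), Fraction(1, 6)),
]


def projection(a: Iterable[Tuple[int, int]]) -> Set[int]:
    """pi(A): all omega such that (y, omega) lies in A for some y."""
    return {omega for (_, omega) in a}


def outer_measure(s: Set[int]) -> Fraction:
    """P*(S): the smallest F-measurable superset of S is the union of the
    partition blocks meeting S, so we add up their probabilities."""
    return sum((prob for block, prob in BLOCKS if block & s), Fraction(0))


def capacity(a: Iterable[Tuple[int, int]]) -> Fraction:
    """I(A) = P*(pi(A)) for a finite set A of pairs (y, omega)."""
    return outer_measure(projection(a))
```

With these numbers the singleton \(\{0\}\) is not in \(\mathcal{F}\), yet `outer_measure({0})` equals \(1/2.\) For the increasing sets \(A_1 = \{(0,0)\}\), \(A_2 = A_1 \cup \{(2,2)\}\), \(A_3 = A_2 \cup \{(1,1)\}\), the capacities are \(1/2, 5/6, 5/6\), and the capacity of the union \(A_3\) is indeed their supremum.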

Theorem 6 [Choquet's capacitability theorem]: Consider a paved space \((E, \mathcal{E})\), and let \(I \colon \mathfrak{P}(E) \to \overline{\mathbb R}\) be a Choquet \(\mathcal{E}\)-capacity. Then every set \(A \in \widehat{\mathcal{E}}\) is \(I\)-capacitable.

Fix an arbitrary set \(A \in \widehat{\mathcal{E}}.\) If \(I(A) = -\infty\) then \(I(K) = -\infty\) for any \(K \subseteq A\), and we have our desired equality trivially. Otherwise, we need to show that whenever \(I(A) > t\) holds for some given real number \(t\), there exists a set \(K \in \mathcal{E}_\delta\) with \(K \subseteq A\) and \(I(K) \ge t.\)

Recall that

\[\begin{aligned} \mathcal{C} = \{B \in \mathfrak{P}(E)\,:\,I(B) > t\}\end{aligned}\]

is a capacitance. Then \(A \in \mathcal{C}\), and from Sion's Theorem (Theorem 4) there exists a decreasing sequence \(\{K_n\} _ {n \in \mathbb N}\) of elements in \(\mathcal{E} \cap \mathcal{C}\) such that \(\bigcap _ {n \in \mathbb N} K_n \subseteq A.\) But then

\[\begin{aligned} I\left(\bigcap _ {n \in \mathbb N} K_n\right) = \inf _ {n \in \mathbb N} I(K_n) \ge t,\end{aligned}\]

and thus we can take \(K = \bigcap_{n \in \mathbb N} K_n.\)

We are now ready to prove some major results in measure theory.

Measurable Projection

Theorem 7 [Measurable Projection]: Let \((\Omega, \mathcal{F}, \mathbb{P})\) be a complete probability space, let \((K, \mathcal{B}(K))\) be a locally compact separable metric space endowed with the collection of its Borel sets, and denote by \(\pi\) the projection of \(K \times \Omega\) onto \(\Omega.\) Then, for every \(B \in \mathcal{B}(K) \otimes \mathcal{F}\), the projection \(\pi(B) \in \mathcal{F}.\)

We start by noticing \(\mathcal{B}(K) \widehat{\otimes}_p \mathcal{F} = \mathcal{B}(K) \otimes \mathcal{F}.\) Indeed, if \(A \in \mathcal{B}(K) \otimes_p \mathcal{F}\), then \(A = \bigcup _ {i=1}^n U_i \times V_i\) for some \(U_i \in \mathcal{B}(K)\) and \(V_i \in \mathcal{F}\), from which one can show \(A^\mathsf{c} \in \mathcal{B}(K) \widehat{\otimes}_p \mathcal{F}.\) Lemma 1 then gives

\[\begin{aligned} \mathcal{B}(K) \widehat{\otimes}_p \mathcal{F} = \sigma(\mathcal{B}(K) \otimes_p \mathcal{F}) = \mathcal{B}(K) \otimes \mathcal{F}.\end{aligned}\]

Consider the paving \(\mathscr{K}\) on \(K\) consisting of all compact subsets of \(K\), and introduce the \((\mathscr{K} \otimes_p \mathcal{F})\)-capacity \(I(A) = \mathbb{P}^*(\pi(A))\), \(A \in \mathfrak{P}(K \times \Omega)\), that we saw before. Now note

\[\begin{aligned} \mathcal{B}(K) \widehat{\otimes}_p \mathcal{F} = \mathscr{K} \widehat{\otimes}_p \mathcal{F},\end{aligned}\]

the mosaic generated by the paving \(\mathscr{K} \otimes_p \mathcal{F}.\) Showing \(\supseteq\) is trivial; for \(\subseteq\) let \(A \in \mathcal{B}(K) \otimes_p \mathcal{F}\) and show \(A \in \mathscr{K} \widehat{\otimes}_p \mathcal{F}\) by recalling \(\mathcal{B}(K) = \widehat{\mathscr{K}}.\)

Choquet's capacitability theorem (Theorem 6) thus guarantees that every set in \(\mathcal{B}(K) \otimes \mathcal{F}\) is \(I\)-capacitable. In particular, for every integer \(n \in \mathbb N\), there exists a set \(C_n \in (\mathscr{K} \otimes_p \mathcal{F})_\delta\) contained in \(B\) and such that \[\begin{aligned} I(C_n) \le I(B) \le I(C_n) + (1/n).\end{aligned}\] Because \(C_n \in (\mathscr{K} \otimes_p \mathcal{F})_\delta\), \(C_n\) is a countable intersection \(C_n = \bigcap _ {m \in \mathbb N} G_n^m\), where each \(G_n^m\) is a finite union of sets of the form \(U \times V\) with \(U \in \mathscr{K}, V \in \mathcal{F}.\) Letting \(H_n^m = \bigcap _ {i=1}^m G_n^i\), we see that \(H_n^1 \supseteq H_n^2 \supseteq \cdots\) and \(C_n = \bigcap _ {m \in \mathbb N} H_n^m\), where now \(H_n^m\) is also a finite union of sets of the form \(U \times V\) with \(U \in \mathscr{K}, V \in \mathcal{F}.\)

The form of \(H_n^m\) immediately implies \(\pi(H_n^m) \in \mathcal{F}\) for all \((m,n) \in \mathbb N^2.\) Lemma 3 implies

\[\begin{aligned} \pi(C_n) = \pi\left(\bigcap _ {m \in \mathbb N} H_n^m\right) = \bigcap _ {m \in \mathbb N} \pi(H_n^m) \in \mathcal{F}, \quad \forall \; n \in \mathbb N,\end{aligned}\]

which further implies

\[\begin{aligned} \pi\left(\bigcup _ {n \in \mathbb N} C_n\right) = \bigcup _ {n \in \mathbb N} \pi(C_n) \in \mathcal{F}.\end{aligned}\]

On the other hand, \(C_n \subseteq B\) for all \({n \in \mathbb N}\) and thus \(\pi\left(\bigcup _ {n \in \mathbb N} C_n\right) \subseteq \pi(B).\) But the bounds \(I(C_n) \le I(B) \le I(C_n) + (1/n)\) imply that the difference of these two sets is contained in a \(\mathbb{P}\)-null set, and the completeness of the probability space gives the desired conclusion \(\pi(B) \in \mathcal{F}.\)

It is not easy to construct an example of a Borel set in the product space whose projection is not Borel. It requires study of analytic sets. Check out Corollary 8.2.17 in (Cohn, 2013) for more details.

Measurable Graph

The next theorem is easy to visualize, especially in the case where \(K\) and \(\Omega\) are both \(\mathbb R.\)

Definition 16: For a map \(f \colon X \to Y\), we define its graph \(\llbracket f \rrbracket\) to be the product set

\[\begin{aligned} \llbracket f \rrbracket := \{(y,x) \in Y \times X \,:\, f(x) = y \}.\end{aligned}\]

We call a set \(G \in \mathcal{B}(K) \otimes \mathcal{F}\) a measurable graph, if for every \(\omega \in \Omega\) its section \(G(\omega) = \{y \in K\,:\,(y,\omega) \in G\}\) contains at most one point.

Theorem 8 [Measurable Graph]: A subset \(G\) of \(K \times \Omega\) is a measurable graph, if and only if, there exists a set \(\Xi \in \mathcal{F}\) and a measurable mapping \(g \colon \Xi \to K\), such that \(G = \llbracket g \rrbracket.\)

Sufficiency: If \(\Xi\) and \(g\) are as stated, the set \(G = \{(y,\omega) \in K \times \Xi\,:\,y = g(\omega)\}\) equals the pre-image \(\varphi^{-1}(\Delta)\) of the diagonal \(\Delta = \{(y,y) \in K \times K\,:\,y \in K\}\) under the mapping

\[\begin{aligned} K \times \Xi \ni (y,\omega) \mapsto \varphi(y,\omega) := (y, g(\omega)) \in K \times K.\end{aligned}\]

This mapping \(\varphi\) is \((\mathcal{B}(K) \otimes \mathcal{F})\)-measurable because of the facts that \(\mathcal{B}(K \times K) = \mathcal{B}(K) \otimes \mathcal{B}(K)\), on account of \(K\) being separable, and that \(g\) is measurable. Since \(\Delta\) is a closed set (because \(K\) is Hausdorff), \(\Delta \in \mathcal{B}(K \times K)\) and thus \(G\) is a measurable graph.

Necessity: Suppose that \(G\) is a measurable graph, and let \(\Xi := \pi(G).\) Then \(\Xi \in \mathcal{F}\) by the Measurable Projection theorem. For every \(\omega \in \Xi\), define \(g(\omega)\) to be the unique element of the set \(G(\omega).\) We want to show \(g \colon \Xi \to K\) is measurable. Indeed for any \(H \in \mathcal{B}(K)\), it is easy to see that

\[\begin{aligned} g^{-1}(H) = \pi\left(G \cap (H \times \Omega)\right) \in \mathcal{F},\end{aligned}\]

where the membership in \(\mathcal{F}\) follows from the Measurable Projection theorem.
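The identity \(g^{-1}(H) = \pi\left(G \cap (H \times \Omega)\right)\) used in the necessity part can be checked by hand on a finite example. The following Python sketch does this for finite sets of pairs \((y, \omega)\); the function names are hypothetical, and measurability plays no role in a finite setting, so the sketch only illustrates the set identity.

```python
from typing import Dict, Hashable, Set, Tuple

Point = Tuple[Hashable, Hashable]  # a pair (y, omega) in K x Omega


def projection(a: Set[Point]) -> Set[Hashable]:
    """pi(A): all omega such that (y, omega) lies in A for some y."""
    return {omega for (_, omega) in a}


def function_from_graph(g: Set[Point]) -> Dict[Hashable, Hashable]:
    """Recover g : Xi -> K from a set G whose sections have at most one point."""
    result: Dict[Hashable, Hashable] = {}
    for y, omega in g:
        if omega in result and result[omega] != y:
            raise ValueError(f"section over {omega!r} has more than one point")
        result[omega] = y
    return result


def preimage_via_projection(g: Set[Point], h: Set[Hashable]) -> Set[Hashable]:
    """pi(G intersected with H x Omega), the right-hand side of the identity."""
    return projection({(y, omega) for (y, omega) in g if y in h})
```

For \(G = \{(1.5, a), (2.0, b), (1.5, c)\}\) and \(H = \{1.5\}\), both \(g^{-1}(H)\) computed from the recovered function and `preimage_via_projection(G, H)` give \(\{a, c\}\), while the domain \(\Xi = \pi(G)\) is \(\{a, b, c\}.\)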

Debut

Now let \(K = [0, \infty)\), the case important in stochastic processes.

Definition 17: Let \(A \subseteq [0, \infty) \times \Omega.\) The debut of \(A\) is the nonnegative function \(D_A \colon \Omega \to [0, \infty]\) defined as

\[\begin{aligned} D_A(\omega) = \inf\{t \in [0, \infty)\,:\,(t, \omega) \in A\}.\end{aligned}\]

Recall the convention \(\inf \varnothing = \infty.\) It is easy to see that \(\{D_A < \infty\} = \pi(A).\)

Theorem 9 [Measurable Debut]: Let \((\Omega, \mathcal{F}, \mathbb{P})\) be a complete probability space, and consider a measurable set \(A \in \mathcal{B}([0, \infty)) \otimes \mathcal{F}.\) Then the debut \(D_A\) of this set is a random variable.

For any given real number \(t > 0\), the set \(D_A^{-1}([0,t)) = \{\omega \in \Omega\,:\,D_A(\omega) < t\}\) is the projection onto \(\Omega\) of the measurable subset \(A \cap ([0, t) \times \Omega) \in \mathcal{B}([0, \infty)) \otimes \mathcal{F}.\) To see this, note

\[\begin{aligned} \omega \in D_A^{-1}([0,t)) \iff 0 \le D_A(\omega) < t \iff \exists \, s \in [0,t) \text{ such that } (s,\omega) \in A \iff \omega \in \pi(A \cap ([0, t) \times \Omega)).\end{aligned}\]

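The Measurable Projection theorem (Theorem 7) shows that this set is in \(\mathcal{F}.\) Since the intervals \([0,t)\), \(t > 0\), generate the Borel \(\sigma\)-algebra of \([0, \infty]\), it follows that \(D_A\) is a random variable.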
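The set identity at the heart of this proof, \(\{D_A < t\} = \pi(A \cap ([0,t) \times \Omega))\), can be illustrated on a finite set \(A\), where the infimum is a minimum. The Python sketch below is a toy check under that assumption; the function names are hypothetical and the finite setting says nothing about measurability itself.

```python
import math
from typing import Hashable, Set, Tuple

Point = Tuple[float, Hashable]  # a pair (t, omega) in [0, inf) x Omega


def debut(a: Set[Point], omega: Hashable) -> float:
    """D_A(omega) = inf{t : (t, omega) in A}, with inf of the empty set = +inf."""
    return min((t for (t, w) in a if w == omega), default=math.inf)


def debut_below(a: Set[Point], omegas: Set[Hashable], t: float) -> Set[Hashable]:
    """The set {omega : D_A(omega) < t}."""
    return {w for w in omegas if debut(a, w) < t}


def projection_of_truncation(a: Set[Point], t: float) -> Set[Hashable]:
    """pi(A intersected with [0, t) x Omega)."""
    return {w for (s, w) in a if 0 <= s < t}
```

Taking \(A = \{(0.5, a), (2.0, a), (1.0, b)\}\) and \(\Omega = \{a, b, c\}\), the debut is \(0.5\) at \(a\), \(1.0\) at \(b\) and \(\infty\) at \(c\), and the two functions `debut_below` and `projection_of_truncation` agree for every threshold \(t.\)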

Measurable Section

Let \((\Omega, \mathcal{F}, \mathbb{P})\) be a complete probability space, and consider a set \(A \subseteq [0, \infty) \times \Omega.\) Then for every \(\omega \in \pi(A),\) there exists a \(t \in [0, \infty)\) such that \((t, \omega) \in A.\) In other words, we can define a mapping \(Z \colon \pi(A) \to [0, \infty)\) with \((Z(\omega), \omega) \in A\) for every \(\omega \in \pi(A).\) It is convenient to extend \(Z\) to the whole of \(\Omega\) by setting \(Z = \infty\) on \(\Omega \setminus \pi(A).\) When is it possible to choose \(Z\) such that it is measurable? The measurable section theorem (also known as the measurable selection theorem) says that it is possible to define \(Z\) to be measurable if \(A \in \mathcal{B}([0, \infty)) \otimes \mathcal{F}.\)

Recall that for \(Z \colon \Omega \to [0, \infty]\) its graph is the product set

\[\begin{aligned} \llbracket Z \rrbracket = \{(t,\omega) \in [0, \infty) \times \Omega \,:\, Z(\omega) = t\}.\end{aligned}\]

The condition \((Z(\omega), \omega) \in A\) whenever \(Z < \infty\) can then be expressed by saying \(\llbracket Z \rrbracket \subseteq A.\)

Other than stochastic processes, measurable section theorems have applications in optimal control and game theory.

Theorem 10 [Measurable Section]: Let \((\Omega, \mathcal{F}, \mathbb{P})\) be a complete probability space, and consider a measurable set \(A \in \mathcal{B}([0, \infty)) \otimes \mathcal{F}.\) Then there exists a random variable \(Z \colon \Omega \to [0, \infty]\) with \(\llbracket Z \rrbracket \subseteq A\) and \(\{Z < \infty\} = \pi(A).\)

We divide our analysis into two parts. We shall first show that for every \(\varepsilon > 0\) there exists a random variable \(Z_\varepsilon \colon \Omega \to [0, \infty]\) with \(\llbracket Z_\varepsilon \rrbracket \subseteq A\) and

\[\begin{aligned} \mathbb{P}(\pi(A)) \le \mathbb{P}(Z_\varepsilon < \infty) + \varepsilon.\end{aligned}\]

To this end, recall that if \(K = [0, \infty)\) and \(\mathscr{K}\) is the paving of all compact subsets of \(K\), then we have the \(\mathscr{K} \otimes_p \mathcal{F}\)-capacity \(I(A) = \mathbb{P}^*(\pi(A)), A \in \mathfrak{P}(K \times \Omega).\) Therefore, every \(A \in \mathcal{B}(K) \otimes \mathcal{F}\) is \(I\)-capacitable, and fixing \(A\), for every \(\varepsilon > 0,\) there exists a set

\[\begin{aligned} C_\varepsilon \in (\mathscr{K} \otimes_p \mathcal{F})_\delta \text{ such that } C_\varepsilon \subseteq A, \quad I(C_\varepsilon) \le I(A) \le I(C_\varepsilon) + \varepsilon.\end{aligned}\]

Let

\[\begin{aligned} Z_\varepsilon := D_{C_\varepsilon}\end{aligned}\]

be the debut of this set \(C_\varepsilon.\) Then the Measurable Debut theorem (Theorem 9) implies \(Z_\varepsilon\) is a random variable.

For every \(\omega \in \Omega\), the section \(C_\varepsilon(\omega)\) is a compact subset of \(K\) (use the facts that compact \(\iff\) closed and bounded here, and \(C_\varepsilon \in (\mathscr{K} \otimes_p \mathcal{F})_\delta\)). Notice that if \((t, \omega) \in \llbracket D_{C_\varepsilon} \rrbracket\) then \(D_{C_\varepsilon}(\omega) = t\), which is the same as saying \(t = \inf \{s \in [0, \infty)\,:\,(s,\omega) \in C_\varepsilon\}\), but since \(C_\varepsilon(\omega)\) is closed this implies \((t, \omega) \in C_\varepsilon.\) Therefore, \(\llbracket Z_\varepsilon \rrbracket \subseteq C_\varepsilon \subseteq A\), showing the first requirement.

The second requirement \(\mathbb{P}\left(\pi(A)\right) \le \mathbb{P}\left(Z_\varepsilon < \infty\right) + \varepsilon\) is true because \(\pi(A) \in \mathcal{F}\) by the Measurable Projection theorem (Theorem 7) and \(\{Z_\varepsilon < \infty\} = \pi(C_\varepsilon) \in \mathcal{F}\), so that the bound \(I(A) \le I(C_\varepsilon) + \varepsilon\) reads \(\mathbb{P}\left(\pi(A)\right) \le \mathbb{P}\left(Z_\varepsilon < \infty\right) + \varepsilon.\)

Let us now construct a random variable \(Z\) to satisfy the properties claimed in the theorem. Define

\[\begin{aligned} A_1 := A.\end{aligned}\]

Then from above there exists a random variable

\[\begin{aligned} Z_1 \colon \Omega \to [0, \infty] \text{ with } \llbracket Z_1 \rrbracket \subseteq A_1 \text{ and } \mathbb{P}(Z_1 < \infty) \ge \frac{1}{2}\mathbb{P}(\pi(A_1)),\end{aligned}\]

by taking \(\varepsilon = \frac{1}{2}\mathbb{P}\left(\pi(A_1)\right)\) when this quantity is positive; if it happens that \(\mathbb{P}\left(\pi(A_1)\right) = 0\), then any \(Z_\varepsilon\) from the first part will do and the inequality is trivially true. Now define

\[\begin{aligned} A_2 := A_1 \setminus \left([0, \infty) \times \{Z_1 < \infty\}\right),\end{aligned}\]

and, reasoning as before, construct a random variable

\[\begin{aligned} Z_2 \colon \Omega \to [0, \infty] \text{ with } \llbracket Z_2 \rrbracket \subseteq A_2 \text{ and } \mathbb{P}(Z_2 < \infty) &\ge \frac{1}{2}\mathbb{P}(\pi(A_2)) \\ &= \frac{1}{2} \left( \mathbb{P}(\pi(A_1)) - \mathbb{P}(Z_1 < \infty) \right).\end{aligned}\]

Continuing this way, we obtain a sequence \(\{Z_n\} _ {n \in \mathbb N}\) such that \(\llbracket Z_n \rrbracket \subseteq A\), the projections \(\pi(\llbracket Z_n \rrbracket)\) are disjoint, and we have \[\begin{aligned} \sum_{k=1}^n \mathbb{P}(Z_k < \infty) \ge \left( \frac{1}{2} + \frac{1}{2^2} + \cdots + \frac{1}{2^n} \right) \mathbb{P}(\pi(A_1)) = (1 - 2^{-n})\mathbb{P}(\pi(A)), \quad \forall n \in \mathbb N.\end{aligned}\] Therefore, the random variable \(Z\) defined as

\[\begin{aligned} Z(\omega) = \begin{cases} Z_k(\omega) &\text{if } \omega \in \{Z_k < \infty\} \text{ for some } k \in \mathbb N \\ \infty &\text{otherwise,} \end{cases}\end{aligned}\]

satisfies \(\llbracket Z \rrbracket \subseteq A\), thus also \(\{Z < \infty\} \subseteq \pi(A).\)

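On the other hand, letting \(n \to \infty\) in the bound \(\sum_{k=1}^n \mathbb{P}(Z_k < \infty) \ge (1 - 2^{-n})\mathbb{P}(\pi(A))\), we see that \(\{Z < \infty\}\) and \(\pi(A)\) have the same probability, so \(N := \pi(A) \setminus \{Z < \infty\}\) is a \(\mathbb{P}\)-null set. Redefining \(Z(\omega)\) on \(N\) to be an arbitrary element of the section \(A(\omega)\) keeps \(\llbracket Z \rrbracket \subseteq A\) and makes the two sets equal; \(Z\) remains a random variable because, by the completeness of the probability space, every subset of \(N\) belongs to \(\mathcal{F}.\)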
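The geometric bound driving this exhaustion argument can be verified with exact arithmetic. The Python sketch below assumes the worst case allowed by the construction, in which each \(Z_k\) captures exactly half of the remaining mass \(\mathbb{P}(\pi(A_k)) = \mathbb{P}(\pi(A)) - \sum_{j < k} \mathbb{P}(Z_j < \infty)\); the function name `worst_case_partial_sums` and the sample value of \(\mathbb{P}(\pi(A))\) are hypothetical.

```python
from fractions import Fraction
from typing import List


def worst_case_partial_sums(p: Fraction, n: int) -> List[Fraction]:
    """Partial sums s_1, ..., s_n of P(Z_k < inf), where p = P(pi(A)) and
    every step captures exactly half of the remaining mass p - s_{k-1}."""
    sums: List[Fraction] = []
    s = Fraction(0)
    for _ in range(n):
        s += (p - s) / 2  # P(Z_k < inf) = (1/2) * P(pi(A_k))
        sums.append(s)
    return sums
```

With \(\mathbb{P}(\pi(A)) = 3/4\), the first three partial sums are \(3/8, 9/16, 21/32\), which match \((1 - 2^{-n})\,\mathbb{P}(\pi(A))\) for \(n = 1, 2, 3.\) Capturing more than half at any step can only increase the partial sums, which is why the bound in the proof is an inequality.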

Epilogue

With this we are done laying the foundations. In the next blog post, we will discuss applications of these results in the general theory of processes.

References

Cohn, Donald L. Measure Theory, second edition, Birkhäuser, 2013.
Dellacherie, Claude. Capacités et processus stochastiques, Springer-Verlag, 1972.
Dellacherie, Claude and Meyer, Paul-André. Probabilities and Potential, North-Holland Publishing Company, 1978.
Dellacherie, Claude. Capacities and analytic sets, Lecture Notes in Mathematics, volume 839, Springer-Verlag, 1981.
Karoui, Nicole El and Tan, Xiaolu. Capacities, Measurable Selection and Dynamic Programming Part I: Abstract Framework, 2013.
Lowther, George. Almost Sure blog.
Meyer, Paul-André. Stochastic Processes from 1950 to the Present (Translated from French by Jeanine Sedjro), 2009.

[1]	If you steal from one author, it’s plagiarism; if you steal from many, it’s research.