This is part 2 of a sequence of blog posts on the general theory of processes. For part 1, see here. I will be building upon the content presented there.
“Mathematics exists solely for the honour of the human mind.” — Jacobi in a letter to Legendre after the death of Fourier. Fourier had the opinion that the principal aim of mathematics was public utility and explanation of natural phenomena.
We previously discussed Choquet's theory of capacities and its applications in measure theory. The goal of this blog post is to discuss the debut, section and projection theorems in stochastic processes. These theorems form the core of the “general theory of processes”, developed primarily by the Strasbourg school of probability under the tutelage of Paul-André Meyer. At the risk of being overly simplistic, general theory of processes is the study of filtrations and stopping times—martingales will be absent in our discussion.
Unlike part 1 where there were no prerequisites other than basic measure theory, this part assumes a good amount of familiarity with stochastic processes, at the level of (Karatzas and Shreve, 1998a), (Le Gall, 2016). Without that the material here will feel unmotivated and difficult. On the other hand, the theorems proved here are skipped even in advanced courses in stochastic processes as they aren't the most useful for applications, but rather are necessary to fill in gaps if a rigorous treatment is warranted. Nevertheless the theory presented is extremely profound and beautiful, and reading it will be an honour for the mind, if nothing else.
Recall that a subset \(N \subseteq E\) is called \(\mu\)-negligible if there exists \(B \in \mathscr{E}\) with \(N \subseteq B\) and \(\mu(B) = 0.\) Thus the measure space \((E, \mathscr{E}, \mu)\) is complete if and only if every \(\mu\)-negligible subset of \(E\) belongs to \(\mathscr E.\)
Recall that an outer measure on \(E\) is a function \(\mu^* \colon \mathfrak{P}(E) \to [0, \infty]\) such that
\(\mu^*(\varnothing) = 0\),
if \(A \subseteq B \subseteq E\), then \(\mu^*(A) \le \mu^*(B)\), and
if \(\{A_n\}_{n \in \mathbb N}\) is a sequence of subsets of \(E\), then \[\begin{aligned} \mu^*\Big( \bigcup_{n \in \mathbb N} A_n \Big) \le \sum_{n \in \mathbb N} \mu^*(A_n).\end{aligned}\]
Also recall that a subset \(B \subseteq E\) is called \(\mu^*\)-measurable if \[\begin{aligned} \mu^*(C) = \mu^*(C \cap B) + \mu^*\left(C \cap B^\mathsf{c}\right) \quad \text{for all } C \subseteq E.\end{aligned}\]
It is a simple exercise to show that if \(B \subseteq E\) is such that either \(\mu^*(B) = 0\) or \(\mu^*(B^\mathsf{c}) = 0\), then \(B\) is \(\mu^*\)-measurable. Finally, recall that if \(\mathscr{M}_{\mu^*}\) denotes the collection of all \(\mu^*\)-measurable subsets of \(E\), then \(\mathscr{M}_{\mu^*}\) is a \(\sigma\)-algebra, and the restriction of \(\mu^*\) to \(\mathscr{M}_{\mu^*}\) is a measure on \(\mathscr{M}_{\mu^*}.\) Therefore, it follows that the measure space \((E, \mathscr{M}_{\mu^*}, \mu^*)\) is complete. In particular, the Lebesgue measure on the \(\sigma\)-algebra of Lebesgue subsets of \(\mathbb R\) is complete. It can be shown that the restriction of Lebesgue measure to the \(\sigma\)-algebra of Borel subsets of \(\mathbb R\) is not complete. We next discuss a result that allows us to complete any measure space.
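To make the Carathéodory condition concrete, here is a tiny finite sanity check (the four-point space, the weights, and the names `mu_star` and `caratheodory_measurable` are all mine, chosen purely for illustration). We induce an outer measure from a measure on the \(\sigma\)-algebra generated by the partition \(\{0,1\}, \{2,3\}\), and recover exactly that \(\sigma\)-algebra as the measurable sets:

```python
from itertools import combinations

E = frozenset(range(4))

def powerset(s):
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

# The sigma-algebra generated by the partition {0,1} | {2,3} carries a measure
# with weights 1 and 2; the induced outer measure of A is the measure of the
# smallest covering set, i.e. the total weight of the blocks meeting A.
blocks = {frozenset({0, 1}): 1.0, frozenset({2, 3}): 2.0}

def mu_star(A):
    return sum(w for b, w in blocks.items() if A & b)

def caratheodory_measurable(B):
    # B is mu_star-measurable iff it splits every test set C additively
    return all(mu_star(C) == mu_star(C & B) + mu_star(C - B) for C in powerset(E))

M = [B for B in powerset(E) if caratheodory_measurable(B)]
```

On this toy example `M` consists precisely of the four sets \(\varnothing, \{0,1\}, \{2,3\}, E\); a set such as \(\{0\}\) fails the condition with the test set \(C = \{0,1\}\).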
Theorem 11: Let \((E, \mathscr{E}, \mu)\) be an arbitrary measure space. Define the collection \(\mathscr{E}_\mu\) to consist of all sets \(B \subseteq E\) for which there exist \(B_1, B_2 \in \mathscr E\) such that \[\begin{aligned} B_1 \subseteq B \subseteq B_2 \text{ and } \mu(B_2 \setminus B_1) = 0.\end{aligned}\]
Then \(\mathscr{E}_\mu\) is a \(\sigma\)-algebra on \(E\) that includes \(\mathscr E.\) Define the function \(\overline \mu \colon \mathscr{E}_\mu \to [0, \infty]\) by \[\begin{aligned} \overline \mu(B) := \mu(B_1), \quad B \in \mathscr{E}_\mu.\end{aligned}\] Then \(\overline \mu\) is well-defined, and \((E, \mathscr{E}_\mu, \overline \mu)\) is a complete measure space, called the completion of \((E, \mathscr{E}, \mu).\)
Let us start by showing that \(\mathscr{E}_\mu\) is a \(\sigma\)-algebra. That \(\mathscr{E}_\mu\) includes \(\mathscr{E}\) is clear by taking \(B_1 = B_2 = B\) for any \(B \in \mathscr{E}.\) This, in particular, means that \(\varnothing \in \mathscr{E}_\mu.\) Now let \(B \in \mathscr{E}_\mu\) with \(B_1, B_2 \in \mathscr{E}\) as in the theorem statement. (1) implies \[\begin{aligned} B_2^\mathsf{c} \subseteq B^\mathsf{c} \subseteq B_1^\mathsf{c} \quad \text{and} \quad \mu\left(B_1^\mathsf{c} \setminus B_2^\mathsf{c}\right) = \mu(B_2 \setminus B_1) = 0,\end{aligned}\] so \(B^\mathsf{c} \in \mathscr{E}_\mu.\) Similarly, if \(\{B^n\}_{n \in \mathbb N} \subseteq \mathscr{E}_\mu\) with sandwiching sets \(B^n_1, B^n_2\), then \(\bigcup_n B^n_1 \subseteq \bigcup_n B^n \subseteq \bigcup_n B^n_2\) and \(\mu\left(\bigcup_n B^n_2 \setminus \bigcup_n B^n_1\right) \le \sum_n \mu\left(B^n_2 \setminus B^n_1\right) = 0,\) so \(\mathscr{E}_\mu\) is closed under countable unions.
Next, we show that \(\overline \mu\) is well-defined. Using the notation in the statement of the theorem, it follows immediately that \(\mu(B_1) = \mu(B_2).\) Furthermore, if \(A \subseteq B\) is such that \(A \in \mathscr{E}\), then \(A \subseteq B_2\), so \[\begin{aligned} \mu(A) \le \mu(B_2) = \mu(B_1).\end{aligned}\] Applying this with \(A = B_1'\) for any other pair \(B_1', B_2' \in \mathscr{E}\) sandwiching \(B\) shows \(\mu(B_1') \le \mu(B_1)\), and by symmetry \(\mu(B_1') = \mu(B_1)\); hence the value \(\overline \mu(B)\) does not depend on the choice of the pair \(B_1, B_2.\)
Next, we show that \(\overline \mu\) is a measure on \(\mathscr E_\mu.\) \(\overline \mu\) is clearly an extension of \(\mu\): take \(B_1 = B_2 = B\) for any \(B \in \mathscr{E}.\) This, in particular, implies that \(\overline \mu(\varnothing) = 0.\) Non-negativity of the measure \(\mu\) implies the non-negativity of \(\overline \mu.\) Finally, we check countable additivity. Let \(\{B^n\}_{n \in \mathbb N}\) be a sequence of disjoint sets in \(\mathscr E_\mu\), and for each \(n \in \mathbb N\) let \(B^n_1, B^n_2 \in \mathscr{E}\) be sets satisfying (1). The disjointness of the sets \(\{B^n\}_{n \in \mathbb N}\) implies the disjointness of the sets \(\{B^n_1\}_{n \in \mathbb N}\), and so we get \[\begin{aligned} \overline \mu\Big( \bigcup_{n \in \mathbb N} B^n \Big) = \mu\Big( \bigcup_{n \in \mathbb N} B^n_1 \Big) = \sum_{n \in \mathbb N} \mu(B^n_1) = \sum_{n \in \mathbb N} \overline \mu(B^n).\end{aligned}\]
If we denote by \(\mathscr{N}\) the collection of all \(\mu\)-negligible sets for the measure space \((E, \mathscr{E}, \mu)\), then it is easy to see that \[\begin{aligned} \mathscr E_\mu = \{B \cup N \,:\, B \in \mathscr E \text{ and } N \in \mathscr N\},\end{aligned}\] and \(\overline \mu (B \cup N) = \mu(B).\) From this it follows that if the measure space \((E, \mathscr{E}, \mu)\) is complete, then it is its own completion.
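The completion construction of Theorem 11 can likewise be checked by brute force on a four-point toy space of my own making (`script_E` plays the role of \(\mathscr{E}\)): with \(\mu(\{c,d\}) = 0\), the completion adjoins every subset of the negligible block and nothing else.

```python
from itertools import combinations

E = frozenset('abcd')

def powerset(s):
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

# sigma-algebra generated by the partition {a,b} | {c,d}, with mu({c,d}) = 0,
# so every subset of {c,d} is mu-negligible.
script_E = [frozenset(), frozenset('ab'), frozenset('cd'), E]
mu = {frozenset(): 0.0, frozenset('ab'): 1.0, frozenset('cd'): 0.0, E: 1.0}

def in_completion(B):
    # Theorem 11 criterion: B1 ⊆ B ⊆ B2 with mu(B2 \ B1) = 0; here B1 ⊆ B2
    # are both measurable, so mu(B2 \ B1) = mu(B2) - mu(B1).
    return any(B1 <= B <= B2 and mu[B2] - mu[B1] == 0
               for B1 in script_E for B2 in script_E)

completion = [B for B in powerset(E) if in_completion(B)]
```

Here `completion` has the eight elements \(B \cup N\) with \(B \in \{\varnothing, \{a,b\}\}\) and \(N \subseteq \{c,d\}\), in line with the description of \(\mathscr{E}_\mu\) via negligible sets.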
A very nice property that holds if we assume that the measure space \((E, \mathscr{E}, \mu)\) is complete is the following: if \(f\) and \(g\) are real-valued functions defined on \(E\) such that \(f\) is measurable and \(f = g\) holds \(\mu\)-a.e., then \(g\) is also measurable. To see this, let \(B \in \mathcal{B}(\mathbb R)\) be any Borel set. Then \(\{f \in B\} \in \mathscr E\) and \[\begin{aligned} \{g \in B\} \,\triangle\, \{f \in B\} \subseteq \{f \neq g\},\end{aligned}\] so \(\{g \in B\}\) differs from \(\{f \in B\}\) by a \(\mu\)-negligible set; completeness then gives \(\{g \in B\} \in \mathscr{E}.\)
The above property can fail if the measure space is not complete. Similar problems can arise in the study of stochastic processes, and therefore completeness assumptions are made. For this whole blog post, we shall place ourselves on a complete probability space \((\Omega, \mathcal{F}, \mathbb{P})\) endowed with a filtration \(\mathbb{F} = \{\mathcal{F}_t\} _ {0 \le t < \infty}\), i.e., a family of sub-\(\sigma\)-algebras of \(\mathcal{F}\) which is increasing in the sense that \[\begin{aligned} \mathcal{F}_s \subseteq \mathcal{F}_t \quad \text{for } 0 \le s \le t < \infty.\end{aligned}\] We assume throughout that the filtration satisfies the usual conditions, namely:
Right-continuity, i.e., \(\mathcal{F}_t = \mathcal{F}_{t+} := \bigcap_{s > t} \mathcal{F}_s\) for all \(t \in [0, \infty)\), and
\(\mathcal{F}_0\) contains all the \(\mathbb{P}\)-negligible events in \(\mathcal{F}\).
The structure of the filtration implies that we can write \(\mathcal{F}_{t+}\) equivalently as \(\bigcap_{n \in \mathbb N} \mathcal{F}_{t + 1/n}.\)
Often we will need to start with the natural filtration \[\begin{aligned} \mathcal{F}_t^X := \sigma(X_s; \;0 \le s \le t), \quad t \ge 0,\end{aligned}\] generated by the process \(X = \{X_t\}_{t \ge 0},\) which is the smallest filtration with respect to which the process \(X\) is adapted, i.e., \(X_t\) is \(\mathcal{F}_t^X\)-measurable for each \(t \ge 0.\) And then we will need to augment it to make it satisfy the usual conditions. To that end, we define the minimal augmented filtration generated by \(X\) to be the smallest filtration that is right continuous and complete and with respect to which the process \(X\) is adapted. This can be constructed in the following three steps:
First, let \(\left\{\mathcal{F}_t^{X}\right\}_{t \ge 0}\) be the natural filtration as in (3).
Let \(\mathscr{N}\) be the collection of all \(\mathbb{P}\)-negligible sets for the complete probability space \((\Omega, \mathcal{F}, \mathbb{P})\) (we can always complete it using Theorem 11 if it isn’t). For each \(t \ge 0,\) let \[\begin{aligned} \left(\mathcal{F}^X_t\right)_{\mathbb{P}} = \left\{F \cup N \,:\, F \in \mathcal{F}_t^X \text{ and } N \in \mathscr N\right\}\end{aligned}\] be the completion of \(\mathcal{F}_t^X\) just like in (2).
Finally, for each \(t \ge 0,\) let \[\begin{aligned} \mathcal{F}_t := \left( \left(\mathcal{F}^X_t\right)_{\mathbb{P}} \right)_+ = \bigcap_{s > t} \left(\mathcal{F}^X_s\right)_{\mathbb{P}}\end{aligned}\] define the right-continuous filtration. To show right-continuity of \(\{\mathcal{F}_t\}_{t \ge 0},\) we need to show that \(\mathcal{F}_{t+} = \mathcal{F}_t\) for every \(t \ge 0\), and indeed \[\begin{aligned} \mathcal{F}_{t+} = \bigcap_{u > t} \, \bigcap_{s > u} \left(\mathcal{F}^X_s\right)_{\mathbb{P}} = \bigcap_{s > t} \left(\mathcal{F}^X_s\right)_{\mathbb{P}} = \mathcal{F}_t.\end{aligned}\]
These steps give a filtration \(\{\mathcal{F}_t\}_{t \ge 0},\) which is right continuous and complete and with respect to which the process \(X\) is adapted. It is clear that it is also the smallest such filtration. Does it matter if we do step 3 before step 2, i.e., is it true that \[\begin{aligned} \left( \left(\mathcal{F}^X_t\right)_{+} \right)_{\mathbb{P}} = \left( \left(\mathcal{F}^X_t\right)_{\mathbb{P}} \right)_{+}?\end{aligned}\]
For the other side,
For our purposes, a stochastic process is simply a collection of real-valued random variables \(X = \{X_t\}_{t \ge 0}\) on \((\Omega, \mathcal{F}).\) The sample paths of \(X\) are the mappings \([0, \infty) \ni t \mapsto X_t( \omega) \in \mathbb R\) obtained when fixing \(\omega \in \Omega.\)
We need terminology to talk about when two stochastic processes are the “same”.
Definition 19: Consider two stochastic processes \(X = \left\{ X_t\right\}_{t \ge 0}\) and \(Y = \left\{ Y_t\right\}_{t \ge 0}\) defined on the same probability space \((\Omega, \mathcal{F}, \mathbb{P}).\) We say
\(X\) and \(Y\) have the same finite-dimensional distributions if, for any integer \(n \ge 1\), real numbers \(0 \le t_1 < t_2 < \cdots < t_n < \infty,\) and \(A \in \mathcal{B}(\mathbb R^n)\), we have \[\begin{aligned} \mathbb{P}\left\{(X_{t_1}, \ldots, X_{t_n}) \in A\right\} = \mathbb{P}\left\{(Y_{t_1}, \ldots, Y_{t_n}) \in A\right\};\end{aligned}\]
\(Y\) is a modification or a version of \(X\) if, for every \(t \in [0, \infty)\), we have \(\mathbb{P}\{X_t = Y_t\} = 1;\)
\(X\) and \(Y\) are indistinguishable if almost all their sample paths agree, i.e., \[\begin{aligned} \mathbb{P}\left\{X_t = Y_t \;\; \forall \, t \in [0, \infty)\right\} = 1.\end{aligned}\]
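The classical example separating the last two notions: on \(([0,1], \mathcal{B}([0,1]), \mathrm{Leb})\), take \(X \equiv 0\) and \(Y_t(\omega) = \mathbf{1}_{\{\omega\}}(t).\) Then:

```latex
\mathbb{P}\{X_t = Y_t\} = \mathrm{Leb}\{\omega \in [0,1] : \omega \neq t\} = 1
  \quad \text{for every fixed } t \in [0, \infty),
\qquad \text{but} \qquad
\mathbb{P}\left\{X_t = Y_t \;\; \forall \, t \ge 0\right\} = \mathrm{Leb}(\varnothing) = 0,
```

so \(Y\) is a modification of \(X\), yet \(X\) and \(Y\) are not indistinguishable: every sample path of \(Y\) disagrees with the zero path at the single time \(t = \omega.\)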
We will need to make stronger measurability assumptions for the random variables \(\{X_t\}_{t \ge 0}\) than just assuming that each \(X_t\) is measurable.
Definition 20: A stochastic process \(X = \{X_t\}_{t \ge 0}\) is called
adapted, if \(X_t\) is \(\mathcal{F}_t\)-measurable for every \(t \in [0, \infty)\);
measurable, if the mapping \[\begin{aligned} [0, \infty) \times \Omega \ni (t,\omega) \mapsto X_t(\omega) \in \mathbb R\end{aligned}\] is \(\mathcal{B}([0, \infty)) \otimes \mathcal{F}\)-measurable, when \(\mathbb R\) is endowed with its Borel \(\sigma\)-algebra;
progressively measurable, if for every \(t \in [0, \infty)\) the mapping \[\begin{aligned} [0, t] \times \Omega \ni (s,\omega) \mapsto X_s(\omega) \in \mathbb R\end{aligned}\] is \(\mathcal{B}([0, t]) \otimes \mathcal{F}_t\)-measurable.
Recall that every progressively measurable process is both measurable and adapted; and every adapted process with right-continuous or with left-continuous paths, is progressively measurable (Karatzas and Shreve, 1998a), (Le Gall, 2016). On the other hand, we have the following result due to Meyer (1966).
Consider the space \(\mathbb{L}^0\) of equivalence classes of measurable functions \(f : \Omega \to \mathbb R\), and endow it with the topology of convergence in probability, for example with the pseudo-metrics \(\rho(f, g) = \mathbb E (1 \wedge |f-g|)\) or \(\rho(f,g) = \mathbb E \left( \frac{|f-g|}{1 + |f-g|} \right).\) Recall that if a sequence \(\{f_n\} _ {n \in \mathbb N} \subseteq \mathbb{L}^0\) satisfies \(\sum _ {n \in \mathbb N} \rho(f _ n, f _ {n+1}) < \infty,\) then it converges in probability “fast”, thus also almost surely.
Every process \(Y = \{Y_t\} _ {t \ge 0}\) can be thought of as a mapping from \([0, \infty)\) into \(\mathbb{L}^0\), which associates to each element in \([0, \infty)\) the equivalence class \(\mathcal{Y} _ t\) of \(Y_t.\) Consider now the collection \(\mathscr{H}\) of processes \(Y\) such that the mapping \([0, \infty) \ni t \mapsto \mathcal{Y}_t \in \mathbb{L}^0\) satisfies
it takes values in a separable subspace of \(\mathbb{L}^0\);
under it, the inverse image of every open ball of \(\mathbb{L}^0\) is a Borel subset of \([0, \infty)\); and
it is the uniform limit of a sequence of simple measurable functions with values in the space \(\mathbb{L}^0.\)
Then note that property 3 implies the other two, and that conversely, properties 1 and 2 together imply property 3, because a real-valued function is measurable if and only if it is the pointwise limit of a sequence of simple measurable functions.
Next, note that \(\mathscr{H}\) is a real vector space on account of property 3, and is closed under sequential pointwise convergence on account of properties 1 and 2. It is easy to see that \(\mathscr{H}\) contains all processes of the form \(Y(t,\omega) = K(t) \xi(\omega)\), where \(K\) is the indicator of an interval in \([0, \infty)\) and \(\xi \colon \Omega \to \mathbb R\) is a bounded measurable function. Thus, the monotone class theorem implies that \(\mathscr{H}\) contains all bounded measurable processes. Now, using a standard slicing argument, it is easily seen that every measurable process \(Y\) has properties 1, 2 and 3.
Consider now a measurable and adapted process \(X = \{X_t\} _ {t \ge 0}.\) For each \({n \in \mathbb N}\), there exists a process \(X^{(n)} = \{X^{(n)}_t\} _ {t \ge 0}\) which is simple and measurable (in the sense that there exist a partition \(\left\{A^{(n)} _ k\right\} _ {k \in \mathbb N}\) of \([0, \infty)\) into Borel sets, and a sequence of random variables \(\left\{H^{(n)} _ k\right\} _ {k \in \mathbb N}\), such that \(X^{(n)}_t = H^{(n)} _ k\) for \(t \in A^{(n)} _ k\)), and satisfies \[\begin{aligned} \rho\left(X_t, X^{(n)}_t\right) \le 2^{-(n+2)} \quad \text{for all } t \in [0, \infty).\end{aligned}\]
Next, we define a sequence of random variables \(\left\{G^{(n)} _ k\right\} _ {k \in \mathbb N}\) for each \({n \in \mathbb N}\) using the sequence \(\left\{H^{(n)} _ k\right\} _ {k \in \mathbb N}\) to get “enhanced” measurability properties. To this end, define \(s^{(n)} _ k = \inf A^{(n)} _ k\), and define \(G^{(n)} _ k = X(s^{(n)} _ k)\) if \(s^{(n)} _ k \in A^{(n)} _ k\); if, on the other hand, \(s^{(n)} _ k \notin A^{(n)} _ k\), we pick any \(\mathcal{F}_{s^{(n)} _ k}\)-measurable \(G^{(n)} _ k\) that satisfies \[\begin{aligned} \rho\left(G^{(n)} _ k, H^{(n)} _ k\right) \le 2^{-(n+2)}.\end{aligned}\]
We now define the process \(Y^{(n)} = \{Y^{(n)}_t\} _ {t \ge 0}\) by \(Y^{(n)}_t := G^{(n)} _ k, \, t \in A^{(n)} _ k\), and check that it is progressively measurable and satisfies \(\rho\left(X_t, Y^{(n)}_t\right) \le 2^{-(n+1)}\) for all \(t \in [0, \infty)\) and \({n \in \mathbb N}.\)
Finally, we construct a progressively measurable process \(Y\) by \[\begin{aligned} Y_t := \limsup_{n \to \infty} Y^{(n)}_t, \quad t \in [0, \infty).\end{aligned}\] Being a pointwise \(\limsup\) of progressively measurable processes, \(Y\) is progressively measurable; and since \(\rho\left(X_t, Y^{(n)}_t\right) \le 2^{-(n+1)}\) for every \(n\), the sequence \(\left\{Y^{(n)}_t\right\} _ {n \in \mathbb N}\) converges to \(X_t\) “fast”, hence also almost surely, so that \(Y_t = X_t\) a.s. for every \(t \in [0, \infty).\) Thus \(Y\) is a progressively measurable modification of \(X.\)
It can be verified that \(\mathscr{P}_\star\) is the collection of all sets \(A \in \mathcal{B}([0, \infty)) \otimes \mathcal{F}\) such that the process \(X_t(\omega) = \mathbf{1}_A(t, \omega)\) is progressively measurable. This characterization gives the useful result that a subset \(A\) of \([0, \infty) \times \Omega\) belongs to \(\mathscr{P}_\star\) if and only if, for every \(t \in [0, \infty)\), \(A \cap ([0,t] \times \Omega) \in \mathcal{B}([0,t]) \otimes \mathcal{F}_t.\)
Recall that a map \(T \colon \Omega \to [0, \infty]\) is called a stopping time of the filtration \(\mathbb{F}\) if \(\{T \le t\} \in \mathcal{F}_t\) for all \(t \in [0, \infty).\) Because the filtration \(\mathbb{F}\) is right-continuous, this is equivalent to \(\{T < t\} \in \mathcal{F}_t\) for all \(t \in [0, \infty)\), as can be seen by writing \[\begin{aligned} \{T < t\} = \bigcup_{n \in \mathbb N} \left\{T \le t - \tfrac1n\right\} \quad \text{and} \quad \{T \le t\} = \bigcap_{n \in \mathbb N} \left\{T < t + \tfrac1n\right\}.\end{aligned}\]
It is easily checked that if \(T\) and \(S\) are stopping times, then so are \[\begin{aligned} T \wedge S, \quad T \vee S, \quad \text{and} \quad T + S.\end{aligned}\]
Definition 23: For a stopping time \(T\), we define
\(\sigma\)-algebra of events prior to it by \[\begin{aligned} \mathcal{F}(T) := \left\{A \in \mathcal{F} \,:\, A \cap \{T \le t\} \in \mathcal{F}_t \text{ for all } t \in [0, \infty)\right\};\end{aligned}\]
\(\sigma\)-algebra of events strictly prior to it by \[\begin{aligned} \mathcal{F}(T-) := \sigma\Big( \mathcal{F}_0 \cup \left\{A \cap \{t < T\} \,:\, t \in [0, \infty), \; A \in \mathcal{F}_t\right\} \Big).\end{aligned}\]
The following intuitive properties are useful and their proofs can be found in any standard textbook (Karatzas and Shreve, 1998a), (Le Gall, 2016).
Lemma 6: Let \(T\) and \(S\) be any two stopping times. Then
\(\mathcal{F}(T-) \subseteq \mathcal{F}(T)\);
\(T\) is measurable with respect to both \(\mathcal{F}(T)\) and \(\mathcal{F}(T-)\);
\(\mathcal{F}(T) \cap \mathcal{F}(S) = \mathcal{F}(T \wedge S)\);
\(A \cap \{T \le S\} \in \mathcal{F}(S)\), for all \(A \in \mathcal{F}(T)\);
\(A \cap \{T < S\} \in \mathcal{F}(S-)\), for all \(A \in \mathcal{F}(T)\);
If the stopping times satisfy \(T \le S\), then \(\mathcal{F}(T) \subseteq \mathcal{F}(S)\) and \(\mathcal{F}(T-) \subseteq \mathcal{F}(S-)\);
If the stopping times satisfy \(T < S\) on the set \(\{0 < S < \infty\}\), then \(\mathcal{F}(T) \subseteq \mathcal{F}(S-)\);
If \(\{T_n\} _ {n \in \mathbb N}\) is an increasing sequence of stopping times and \(T = \lim _ {n \in \mathbb N} \uparrow T_n\), then \[\begin{aligned} \mathcal{F}(T-) = \sigma\Big( \bigcup _ {n \in \mathbb N} \mathcal{F}(T_n-) \Big);\end{aligned}\]
If \(Z\) is an integrable random variable, we have \[\begin{aligned} \mathbb{E}\left(Z \mid \mathcal{F}(T)\right) = \mathbb{E}\left(Z \mid \mathcal{F}(T \wedge S)\right), \quad \mathbb{P}\text{-a.s. on } \{T \le S\};\end{aligned}\]
If \(\{T_n\}_{n \in \mathbb N}\) is a sequence of stopping times and \(T:= \inf_n T_n\), then \(T\) is again a stopping time and \(\mathcal{F}(T) = \bigcap_{n \in \mathbb N} \mathcal{F}(T_n).\)
A very important result states that every stopping time is the decreasing limit of a sequence of discrete stopping times. More concretely, if \(T\) is a stopping time, and we define \[\begin{aligned} T_n := \sum_{k \in \mathbb N} \frac{k}{2^n} \, \mathbf{1} _ {\left\{(k-1)/2^n \,\le\, T \,<\, k/2^n\right\}} + \infty \cdot \mathbf{1} _ {\{T = \infty\}}\end{aligned}\] for \({n \in \mathbb N},\) then each \(T_n\) is a discrete stopping time and we have \(T_n \downarrow T.\) When can we approximate a stopping time with an increasing sequence of stopping times? Of course, being able to do that means that our stopping time is a very special one. We give a name to such stopping times.
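The dyadic approximation from above is easy to see in action numerically (using the convention \(T_n = k/2^n\) on \(\{(k-1)/2^n \le T < k/2^n\}\); the helper name `dyadic_approx` is mine):

```python
import math

def dyadic_approx(T, n):
    # n-th dyadic approximation of the value T: the right endpoint k/2^n of
    # the dyadic interval [(k-1)/2^n, k/2^n) containing T
    return (math.floor(T * 2 ** n) + 1) / 2 ** n

T = 1 / 3
approx = [dyadic_approx(T, n) for n in range(1, 21)]
# approx is a decreasing sequence of dyadic rationals squeezing down to T,
# with dyadic_approx(T, n) - T <= 2 ** -n at every level n.
```

Note that \(T_n > T\) strictly on \(\{T < \infty\}\) with this convention, which is exactly why the approximation is from above and cannot announce \(T\) from below.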
Note \(\{T \le t\} = \bigcap_{n \in \mathbb N} \{T_n \le t\} \in \mathcal{F}(t)\) and therefore if in the definition above we had not required \(T\) to be a stopping time, it would still be a stopping time. A canonical example of a predictable stopping time is the first time Brownian motion \(W\) hits or exceeds a certain level \(b.\) It is announced by the sequence of first times \(W\) hits or exceeds the levels \(b - 1/n\), for \({n \in \mathbb N}.\) In fact, we have (taken from (Almost Sure blog)):
Of course, in the above proof we cheated by assuming the Debut Theorem for stopping times; proving it is our next order of business. In the same flavor as Lemma 6, we have
Lemma 7: If \(T\) is a predictable stopping time and \(S\) an arbitrary stopping time, then
\(A \cap \{T \le S\} \in \mathcal{F}(S-)\) for all \(A \in \mathcal{F}(T-)\);
\(A \cap \{S = \infty\} \in \mathcal{F}(S-)\) for all \(A \in \mathcal{F}(\infty)\).
In particular, both the events \(\{T \le S\}\) and \(\{T = S\}\) belong to \(\mathcal{F}(S-).\)
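Coming back to the example of the Brownian hitting time announced by the levels \(b - 1/n\): a crude discretized simulation (a scaled random walk standing in for \(W\); all names here are mine, and grid hitting times only approximate the continuous-time ones) at least exhibits the monotonicity \(S_n \uparrow\) and \(S_n \le S\):

```python
import random

random.seed(2024)

dt = 1e-3
steps = 20000

# one discretized Brownian-like path: a Gaussian random walk with variance dt per step
path = [0.0]
for _ in range(steps):
    path.append(path[-1] + random.gauss(0.0, dt ** 0.5))

def hitting_time(level):
    # first grid time at which the path reaches or exceeds `level`; inf if never
    return next((i * dt for i, x in enumerate(path) if x >= level), float('inf'))

b = 1.0
S = hitting_time(b)
announcing = [hitting_time(b - 1.0 / n) for n in range(1, 100)]
```

Whatever the sampled path, the hitting times of the increasing levels \(b - 1/n\) are nondecreasing in \(n\) and never exceed the hitting time of \(b\) itself; that monotone structure is the content of the announcement.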
Definition 25: For a random time \(Z\) and an event \(A \in \mathcal{F}\), the restriction of \(Z\) to \(A\) is the random time \[\begin{aligned} Z_A := Z \, \mathbf{1}_A + \infty \cdot \mathbf{1}_{A^\mathsf{c}}.\end{aligned}\] Notice that we can write \(\{Z_A \le Z\} = A \cup \left(A^\mathsf{c} \cap \{Z = \infty\}\right)\) as a disjoint union. If \(T\) is a stopping time and \(A \in \mathcal{F}(T)\), then since \(\{T_A \le t\} = \{T \le t\} \cap A \in \mathcal{F}(t)\) for all \(t \in [0, \infty)\), \(T_A\) is also a stopping time.
Since \(A \in \mathscr{P}_\star \subseteq \mathcal{B}([0, \infty)) \otimes \mathcal{F}\), Measurable Debut theorem (Theorem 9) implies \(D_A\) is a random variable.
Just like in the proof of Theorem 9, for any real number \(t > 0\), the set \(\{D_A < t\}\) is the projection onto \(\Omega\) of the set \(A_t := A \cap ([0, t) \times \Omega).\) Recall that \(A \in \mathscr{P}_\star\) implies \(A_t \in \mathcal{B}([0,t]) \otimes \mathcal{F}(t).\) Therefore, Measurable Projection theorem (Theorem 7) implies \(\{D_A < t\} = \pi(A_t) \in \mathcal{F}(t).\) Right continuity of the filtration now implies \(D_A\) is an \(\mathbb{F}\)-stopping time.
The following important theorem now becomes an easy corollary.
Since all right-continuous or left-continuous adapted processes are progressively measurable, all optional or predictable processes are adapted. We need the concept of stochastic intervals to ease some notation and see some examples of optional and predictable sets.
Notice that \(\llbracket U \rrbracket = \llbracket U,U \rrbracket = \llbracket 0, U \rrbracket \setminus \llbracket 0, U \llbracket\), and that \(0_A\) is predictable with the announcing sequence \(\{0_A \wedge n\} _ {n \in \mathbb N}.\) Denote by \(\mathfrak{S}\) the collection of all stopping times and by \(\mathfrak{S}^{(\mathscr{P})}\) the collection of all predictable stopping times (\(\mathfrak{S}\) is S in fraktur font).
Lemma 8: We collect some useful properties of stochastic intervals in relation to optional and predictable \(\sigma\)-algebras.
If \(S,T \in \mathfrak{S}\) such that \(S \le T\), then the stochastic intervals \(\llbracket S,T \rrbracket, \llbracket S,T \llbracket, \rrbracket S,T \rrbracket, \rrbracket S,T \llbracket\) and the graphs \(\llbracket S \rrbracket, \llbracket T \rrbracket\) are optional.
If \(S,T \in \mathfrak{S}\) such that \(S \le T\), then the stochastic interval \(\rrbracket S,T \rrbracket\) is predictable. If \(S \in \mathfrak{S}^{(\mathscr{P})}\), then the stochastic interval \(\llbracket S, \infty \llbracket\) is predictable.
The \(\sigma\)-algebra \(\mathscr{O}\) of optional sets can be generated from stochastic intervals in multiple equivalent ways
Similarly, \(\sigma\)-algebra \(\mathscr{P}\) of predictable sets can be generated from stochastic intervals in multiple equivalent ways
I will only be proving some results. For a complete picture see (He, Wang and Yan, 1992), (Almost Sure blog).
It is easy to see that \(\mathbf{1}_{\llbracket S,T \llbracket}\) is right-continuous and adapted, hence \(\llbracket S,T \llbracket\) is optional. If \(T_n = S + 1/n\) for \({n \in \mathbb N}\), then \(\llbracket S \rrbracket = \bigcap _ {n \in \mathbb N} \llbracket S, T_n\llbracket\), and thus \(\llbracket S \rrbracket\) is also optional. Similarly, for \(\llbracket T \rrbracket.\) These facts along with simple manipulation of sets immediately imply \(\llbracket S,T \rrbracket, \rrbracket S,T \rrbracket, \rrbracket S,T \llbracket\) are also optional.
It is easy to see that \(\mathbf{1}_{\rrbracket S,T \rrbracket}\) is left-continuous and adapted, hence \(\rrbracket S,T \rrbracket\) is predictable.
It is a simple exercise to show
Recall the setting of section "Measurable Section" from part 1. We showed there that if \(A \in \mathcal{B}([0, \infty)) \otimes \mathcal{F},\) then we can define a measurable mapping \(T : \Omega \to [0, \infty]\) such that for all \(\omega \in \pi(A)\) we have \((T(\omega), \omega) \in A\), or stated more succinctly, \(\llbracket T \rrbracket \subseteq A.\) Now, what if we want something stronger, say, \(T\) should be a stopping time? As we will see, we will have to make stronger assumptions on the set \(A\) and even then if we want \(T\) to be a stopping time we will have to let go of the nice property that \(\{T < \infty\}= \pi(A).\)
But first, we need to prove a certain property of debuts which we will need in proving the aforementioned section theorem, and that's the goal of this section.
It isn't difficult to see that \(\mathscr{P}\) coincides with the mosaic generated by the paving on \([0, \infty) \times \Omega\) that consists of finite unions of stochastic intervals of the form \(\llbracket S,T \rrbracket\), where \(S,T \in \mathfrak{S}^{(\mathscr{P})}.\) Generalizing this, let us consider a collection \(\mathfrak{A}\) (\(\mathfrak{A}\) is A in fraktur font) of stopping times that contains \(0\) and \(\infty\), and which is closed under a.s. equality and under finitely many lattice operations (\(\wedge\) and \(\vee\)). We denote by \(\mathcal{J} := \{\llbracket S,T \llbracket\,:\,S,T \in \mathfrak{A}\}\), and by \(\mathcal{T}\) the collection of finite unions of elements of \(\mathcal{J}.\)
Simple observations like \(\llbracket S,T \llbracket^\mathsf{c} = \llbracket 0, S \llbracket \, \cup \, \llbracket T, \infty \llbracket\) or \(\llbracket S,T \llbracket \, \cap \, \llbracket U,V \llbracket = \llbracket S \vee U, (S \vee U) \vee (T \wedge V) \llbracket\) can be used to show that \(\mathcal{J}\) is a paving on \([0, \infty) \times \Omega\) and \(\mathcal{T}\) is an algebra on \([0, \infty) \times \Omega.\)
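These pointwise identities are easy to stress-test numerically, since for fixed \(\omega\) each stochastic interval is just an interval of reals (the helper `member` and the random sampling are mine):

```python
import random

random.seed(1)
INF = float('inf')

def member(t, s, u):
    # is t in the half-open interval [s, u) ?
    return s <= t < u

def check(trials=5000):
    for _ in range(trials):
        S, T = sorted(random.uniform(0, 2) for _ in range(2))
        U, V = sorted(random.uniform(0, 2) for _ in range(2))
        t = random.uniform(0, 3)
        # complement identity: [S,T)^c = [0,S) ∪ [T,∞)
        if (not member(t, S, T)) != (member(t, 0.0, S) or member(t, T, INF)):
            return False
        # intersection identity: [S,T) ∩ [U,V) = [S∨U, (S∨U)∨(T∧V))
        a = max(S, U)
        if (member(t, S, T) and member(t, U, V)) != member(t, a, max(a, min(T, V))):
            return False
    return True

identities_hold = check()
```

The second identity is written with the extra \(\vee\) precisely so that the interval degenerates to \(\llbracket a, a \llbracket = \varnothing\) when \(T \wedge V < S \vee U\), i.e., when the two intervals are disjoint.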
For more structure, let us impose the following properties on the collection \(\mathfrak{A}\):
For any pair \(S,T \in \mathfrak{A}\), the stopping time \(S _ {\{S < T\}}\) (recall the notation of Definition 25 and the observation right after it) belongs to \(\mathfrak{A}\);
For any increasing sequence \(\{S_n\} _ {n \in \mathbb N} \subseteq \mathfrak{A}\), the limit \(\lim _ {n \in \mathbb N} \uparrow S_n\) belongs to \(\mathfrak{A}.\)
Property 1 ensures that the debut of an element of \(\mathcal{T}\) belongs to \(\mathfrak{A}.\) This is because \(S_{\{S < T\}} = D_{\llbracket S,T \llbracket}\) and because the debut of a union of finitely many elements of \(\mathcal{J}\) is equal to the minimum of the debuts of those elements.
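The identity \(S_{\{S < T\}} = D_{\llbracket S,T \llbracket}\) invoked here is a pointwise computation:

```latex
D_{\llbracket S,T \llbracket}(\omega)
  = \inf\{t \ge 0 : S(\omega) \le t < T(\omega)\}
  = \begin{cases}
      S(\omega) & \text{if } S(\omega) < T(\omega),\\
      +\infty   & \text{if } S(\omega) \ge T(\omega)
    \end{cases}
  \;=\; S_{\{S < T\}}(\omega),
```

since the section \(\{t : S(\omega) \le t < T(\omega)\}\) is nonempty (with infimum \(S(\omega)\), which it contains) exactly when \(S(\omega) < T(\omega)\), and \(\inf \varnothing = +\infty.\)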
Property 2 ensures that the debut of an element of \(\mathcal{T} _ \delta\) (which, recall, equals \(\left\{\bigcap _ {n \in \mathbb N} A_n \,:\, A_n \in \mathcal{T} \text{ for all } {n \in \mathbb N}\right\}\)) belongs to \(\mathfrak{A}.\) This is the main result of this section, and is proved below as Theorem 16.
Coming out of the abstraction for a bit, we note that the collection \(\mathfrak{S}\) of all stopping times satisfies all the conditions imposed above on \(\mathfrak{A}.\) Similarly for the collection \(\mathfrak{S}^{(\mathscr{P})}\) of all predictable stopping times. The first claim is trivial to see. To show the second claim, we first prove the following lemma.
If \(T_A\) is a predictable stopping time, then since \(A = \{T_A \le T\} \setminus \left(A^\mathsf{c} \cap \{T = \infty\}\right)\) and by Lemma 7 both \(\{T_A \le T\}, \left(A^\mathsf{c} \cap \{T = \infty\}\right) \in \mathcal{F}(T-)\), we get \(A \in \mathcal{F}(T-).\)
Conversely, suppose \(T\) is a predictable stopping time. Consider the collection
Thus, it suffices to show that \(T_A\) is predictable for any \(A\) in a collection of sets that generates \(\mathcal{F}(T-).\) To this end, suppose \(\{T^n\} _ {n \in \mathbb N}\) announces \(T\), and fix an arbitrary \(n \in \mathbb N\) and an arbitrary \(A \in \mathcal{F}(T^n).\) For every integer \(m \ge n\), the restriction \(T _ A ^ m\) is a stopping time and so is \(R _ m := T ^ m _ A \wedge m.\) The sequence \(\{R_m\} _ {m \ge n}\) then announces \(T_A\), showing \(T_A\) is predictable; similarly for \(T _ {A^\mathsf{c}}.\) We conclude \(T_A, T _ {A^\mathsf{c}}\) are predictable for every \(A \in \bigcup _ {n \in \mathbb N} \mathcal{F}(T^n).\) But recall \[\begin{aligned} \mathcal{F}(T-) = \sigma\Big( \bigcup _ {n \in \mathbb N} \mathcal{F}(T^n) \Big),\end{aligned}\] which completes the proof.
Coming back to the second claim above, if \(S,T \in \mathfrak{S}^{(\mathscr{P})}\), then by Lemma 7, \(\{S < T\} \in \mathcal{F}(S-)\), and thus by Lemma 10, \(S _ {\{S < T\}} \in \mathfrak{S}^{(\mathscr{P})}.\) The rest of the conditions are trivial to check.
We first show that \(\llbracket D_B\rrbracket \subseteq B.\) If \((s, \omega) \in\llbracket D_B\rrbracket\), then \(D_B(\omega) = s\), which is the same as saying
To show that \(D_B \in \mathfrak{A}\), consider the collection
Since \(B \in \mathcal{T}_\delta\) we can find a decreasing sequence \(\{B_n\} _ {n \in \mathbb N} \subseteq \mathcal{T}\) with \(\bigcap _ {n \in \mathbb N} B_n = B.\) Define
The next result is Proposition 1.2.26 from (Karatzas and Shreve, 1998a), and is left unproven there.
Write
We claim that all elements of the sequence \(\left\{D_{A_n}^p\right\}_{p,n \in \mathbb N}\) are stopping times. This is sufficient to prove our desired result.
Start by noting that the process \(X\) is progressively measurable and the process \((t, \omega) \mapsto X(t-, \omega)\) is left-continuous and adapted, and thus predictable. Thus, the sets \(\{A_n\}_{n \in \mathbb N}\) are progressively measurable. Fix an arbitrary \(n \in \mathbb N\), and note that for \(p=1\), the mapping
We are now ready to come back to proving section theorems. We start by proving a general section theorem.
Fix a set \(A \in \mathscr H\) and \(\varepsilon > 0.\) Measurable Section Theorem (Theorem 10 in part 1) implies there exists a random variable \(Z : \Omega \to [0, \infty]\) with \(\llbracket Z \rrbracket \subseteq A\) and \(\pi(A) = \{Z < \infty\},\) where recall \(\pi \colon [0, \infty) \times \Omega \to \Omega\) is the canonical projection map. We denote by \(\nu\) the measure on the measurable space \(\left([0, \infty) \times \Omega, \mathscr{H}\right)\) defined by
By first taking \(f = \mathbf{1}_{A}\) and then taking \(f = \mathbf{1}_{[0, \infty) \times \Omega}\) in the equation above, we see that
Choquet’s capacitability theorem (Theorem 6 in part 1) applied to the paving \(\mathcal{T}\) and to the capacity
As mentioned in the beginning of the section “Properties of Debuts”, we didn't get the nice equality \[\begin{aligned} \llbracket T \rrbracket \subseteq A \text{ and } \{T < \infty\} = \pi(A)\end{aligned}\] as we got in Measurable Section theorem, but instead obtained an approximation (7). The condition \(\llbracket T_\varepsilon \rrbracket \subseteq A\) makes sure that \(\{T_\varepsilon < \infty\} \subseteq \pi(A)\), and the \(\mathbb{P}\left[\pi(A)\right] \le \mathbb{P}(T_\varepsilon < \infty) + \varepsilon\) part ensures that the measure of the difference \(\mathbb{P}\left[ \pi(A) \setminus \{T_\varepsilon < \infty\} \right]\) of these two events can be made as small as desired.
To see that it is not always possible to choose a stopping time \(T\) that satisfies (8) if \(A \in \mathscr{O}\), we will construct a filtration \(\{\mathcal{F}_t\}_{t \ge 0}\) and choose a set \(A \in \mathscr{O}\) that forces \(\pi(A) \setminus \{T < \infty\} \neq \varnothing\) for every stopping time \(T\) (the argument is from (Almost Sure blog)).
To this end, let \(\tau : \Omega \to (0, \infty)\) be a random variable such that \(\mathbb{P}\{\tau < t\} > 0\) for all \(t > 0.\) For example, let \(\tau\) be such that its distribution is uniform on \((0,1).\) Let \(\{\mathcal{F}_t\}_{t \ge 0}\) be the completed filtration such that \(\mathcal{F}_t\) is generated by \(\{\{\tau \le s\} \,:\, s \le t\}\), and thus \(\tau\) becomes a stopping time. Let \(A = \rrbracket 0, \tau \llbracket\,,\) which we know from Lemma 8 belongs to \(\mathscr{O}.\) Note that \(\mathbb{P}\{\pi(A)\} = 1\) by our construction.
It is easy to see that \(\mathcal{F}_t\) is trivial when restricted to \(\{\tau > t\}\), i.e., contains only sets of measure \(0\) or \(1.\) So, every \(\mathcal{F}_t\)-measurable random variable is a.s. constant on the event \(\{\tau > t\}.\) Therefore, any stopping time \(T\) is deterministic on the event \(\{T < \tau\}.\) So, if \(\llbracket T \rrbracket \subseteq A\), we have
Recall the measurable graph theorem (Theorem 8 in part 1) which implies that a map \(T : \Omega \to [0, \infty]\) is measurable if and only if its graph \(\llbracket T \rrbracket\) is measurable. We also have the following neat result:
The necessity follows from the characterization of \(\mathscr{O}\) and \(\mathscr{P}\) in Lemma 8.
In the optional case, sufficiency follows from Theorem 14 and the fact that \(\mathscr{O} \subseteq \mathscr{P}_\star.\)
For the sufficiency in the predictable case, suppose \(\llbracket T \rrbracket\) is predictable. Apply Predictable Section theorem (Theorem 18) for \(A = \llbracket T \rrbracket\) to construct a sequence \(\{T_n\} _ {n \in \mathbb N}\) of predictable stopping times such that
Suppose \(X\) and \(Y\) are measurable processes such that
Remarkably, we have similar results for optional and predictable processes.
To motivate optional and predictable projections, let us start with a fundamental problem in filtering theory: Assume an underlying complete probability space \((\Omega, \mathcal{F}, \mathbb{P}).\) There is an underlying signal \(X = \{X_t\} _ {t \ge 0}\) which is modelled as a stochastic process and which we are interested in studying. Our observation process has noise and therefore instead of observing \(X\) we observe a process \(Y = \{Y_t\} _ {t \ge 0}\) such that
We could look at \[\begin{aligned} Z_t := \mathbb{E} \left( X_t \mid \mathcal{F}_t \right),\end{aligned}\] as an estimate for \(X_t\) at each time \(t \ge 0.\) The process \(Z = \{Z_t\}_{t \ge 0}\) is, of course, adapted. However, since conditional expectation is defined only up to \(\mathbb{P}\)-a.s. equality, which version of \(Z\) should we choose? (9) does not fix the paths of the process \(Z\), which requires specifying its values at uncountably many times in \([0, \infty).\) We would be very lucky if it were possible for us to choose a version of \(Z\) such that (9) holds not only for all \(t \in [0, \infty)\) but also for all finite stopping times. And indeed this is possible! This is part of the statement of the optional projection theorem.
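In discrete toy models, the projection at deterministic times is just the familiar conditional expectation, computed atom by atom. A sketch (the six-point sample space, the refining partitions, and the hypothetical signal \(X\) are all of my own choosing, purely to illustrate the adaptedness of \(Z\)):

```python
from fractions import Fraction

# Six equally likely outcomes; the observer's information at times t = 0, 1, 2
# is described by refining partitions (the atoms of F_t).
omega = range(6)
partitions = {
    0: [[0, 1, 2, 3, 4, 5]],
    1: [[0, 1, 2], [3, 4, 5]],
    2: [[0, 1], [2], [3, 4], [5]],
}

def X(t, w):
    # a hypothetical hidden signal, not adapted to the observation filtration
    return (w + 1) * (t + 1)

def Z(t, w):
    # E[X_t | F_t](w): the average of X(t, .) over the atom containing w
    atom = next(a for a in partitions[t] if w in a)
    return Fraction(sum(X(t, v) for v in atom), len(atom))
```

By construction \(Z(t, \cdot)\) is constant on each atom of \(\mathcal{F}_t\), i.e., adapted, and it has the same expectation as \(X(t, \cdot)\) by the tower property; the whole subtlety in continuous time is gluing these fibre-wise definitions into one process with good paths.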
On the other hand, if we want an estimate of \(X\) based on observable data strictly before time \(t\), then our estimate would be \[\begin{aligned} \mathbb{E} \left( X_t \mid \mathcal{F}_{t-} \right), \quad \text{where } \mathcal{F}_{t-} := \sigma\Big( \bigcup_{s < t} \mathcal{F}_s \Big).\end{aligned}\]
Theorem 22 [Optional and Predictable Projections]: Let \(X\) be a bounded, measurable (though not necessarily adapted) process.
There is a unique, modulo indistinguishability, optional process \(X^o\), called optional projection of \(X\), that satisfies for all stopping times \(T \in \mathfrak{S}\) the identity
There is a unique, modulo indistinguishability, predictable process \(X^p\), called predictable projection of \(X\), that satisfies for all predictable stopping times \(T \in \mathfrak{S}^{(\mathscr{P})}\) the identity
Remarks:
Taking expectations in (10) we get \[\begin{aligned} \mathbb{E} \left(X_T \mathbf{1}_{\{T < \infty\}}\right) = \mathbb{E} \left(X_T^o \mathbf{1}_{\{T < \infty\}}\right)\end{aligned}\] for all stopping times \(T \in \mathfrak{S}.\) Now suppose (12) holds for all stopping times \(T \in \mathfrak{S}.\) Fix an arbitrary stopping time \(S \in \mathfrak{S}\) and a set \(A \in \mathcal{F}_S\), and write (12) for \(T = S_A\) to get
Similarly, requiring (11) to hold for all predictable stopping times is equivalent to requiring \[\begin{aligned} \mathbb{E}\left( X_T \mathbf{1}_{\{T < \infty\}} \right) = \mathbb{E}\left(X^p_T \mathbf{1}_{\{T < \infty\}}\right)\end{aligned}\] to hold for all predictable stopping times.
Operators \(^o\) and \(^p\) are linear, i.e., for bounded, measurable processes \(X,Y\) and \(a,b \in \mathbb R\), \[\begin{aligned} (aX + bY)^o = a X^o + b Y^o \quad \text{and} \quad (aX + bY)^p = a X^p + b Y^p,\end{aligned}\] up to indistinguishability.
[of Theorem 22] Uniqueness is immediate from Theorem 21, so let's focus on existence, first for optional projection and then for predictable projection.
We will employ the monotone class theorem for functions. We need a simple class of processes whose optional projections are easy to exhibit. To this end, consider the processes of the form \[\begin{aligned} X_t(\omega) = \mathbf{1}_B(\omega) \, \mathbf{1}_{[u,v)}(t), \quad 0 \le u < v < \infty, \; B \in \mathcal{F}.\end{aligned}\]
Let \(M\) be an RCLL modification of the martingale \(t \mapsto \mathbb{E}\left(\mathbf{1}_B \mid \mathcal{F}_t\right)\), and take as candidate the optional process \(X^o_t := \mathbf{1}_{[u,v)}(t) \, M_t.\) With these choices and an arbitrary stopping time \(T\), the left-hand side of (12) becomes \(\mathbb{P}\{B \cap \{u \le T < v\}\}\), whereas optional stopping theorem (Karatzas and Shreve, 1998a), (Le Gall, 2016) shows that its right-hand side is \[\begin{aligned} \mathbb{E}\left(M_T \, \mathbf{1}_{\{u \le T < v\}}\right) = \mathbb{P}\left\{B \cap \{u \le T < v\}\right\}.\end{aligned}\]
Finally, use linearity and monotone class arguments to establish existence for arbitrary bounded, measurable \(X.\)
Similar ideas work in the predictable case. We first consider processes of the form \[\begin{aligned} X_t(\omega) = \mathbf{1}_B(\omega) \, \mathbf{1}_{(u,v]}(t), \quad 0 \le u < v < \infty, \; B \in \mathcal{F}.\end{aligned}\]
We end our discussion with a result on time change. We call it a “time change” because given an adapted increasing process \(A\) with right-continuous paths and \(A_0 = 0,\) we imagine a clock which runs according to \(A\) in the sense that at time \(t\) this clock shows time \(A_t.\)
Theorem 23: Suppose \(A\) is an adapted increasing process with right-continuous paths and \(A_0 = 0.\)
For any two bounded, measurable processes \(X\) and \(Y\) that satisfy \[\begin{aligned} \mathbb{E} \left(X_T \mathbf{1}_{\{T < \infty\}}\right) = \mathbb{E} \left(Y_T \mathbf{1}_{\{T < \infty\}}\right), \quad \forall \; T \in \mathfrak{S},\end{aligned}\] we have \[\begin{aligned} \mathbb{E} \int_0^T X_t \,\mathrm{d}A_t = \mathbb{E} \int_0^T Y_t \,\mathrm{d}A_t, \quad \forall \; T \in \mathfrak{S}.\end{aligned}\]
For any non-negative, RCLL and uniformly integrable martingale \(M,\) we have \[\begin{aligned} \mathbb{E} \int_0^T M_t \,\mathrm{d}A_t = \mathbb{E}\left(M_\infty A_T\right), \quad \forall \; T \in \mathfrak{S}.\end{aligned}\]
Only the first part requires any real effort.
Introduce the time change \[\begin{aligned} C_s := \inf\{t \ge 0 \,:\, A_t > s\}, \quad s \ge 0.\end{aligned}\]
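Taking the time change to be the right-continuous inverse \(C_s := \inf\{t \ge 0 : A_t > s\}\) (a standard choice in such proofs), the pathwise change-of-variables identity \(\int_0^\infty X_t \,\mathrm{d}A_t = \int_0^{A_\infty} X_{C_s} \,\mathrm{d}s\) that drives the argument can be sanity-checked on a pure-jump example (jump data, integrand, and names are all mine):

```python
# A pure-jump, right-continuous increasing process A with A_0 = 0,
# encoded by its (time, jump size) pairs.
jumps = [(0.5, 1.0), (1.2, 0.7), (2.0, 1.3)]

def A(t):
    return sum(size for (u, size) in jumps if u <= t)

def C(s):
    # right-continuous inverse time change: C_s = inf{t >= 0 : A_t > s}
    for (u, _) in jumps:
        if A(u) > s:
            return u
    return float('inf')

def X(t):
    return t * t + 1.0  # some measurable integrand

# Stieltjes integral against dA: a sum over the jumps of A
stieltjes = sum(X(u) * size for (u, size) in jumps)

# Time-changed Lebesgue integral over [0, A_infinity): midpoint Riemann sum
A_inf = A(float('inf'))
N = 100000
h = A_inf / N
lebesgue = sum(X(C((k + 0.5) * h)) * h for k in range(N))
```

The two numbers agree up to discretization error, because \(s \mapsto X(C_s)\) is piecewise constant, spending exactly \(\Delta A_{t_i}\) units of \(s\)-time at the value \(X(t_i)\) for each jump time \(t_i.\)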
We also check that the properties
The claim (15) now follows directly from our assumptions, in the case \(T = \infty.\) For general \(T \in \mathfrak S\) we apply the above result to the increasing, right-continuous and adapted process
This follows immediately from the first part by letting \[\begin{aligned} X_t := M_t \quad \text{and} \quad Y_t := M_\infty, \quad t \in [0, \infty).\end{aligned}\]
Dellacherie, Claude. Capacités et processus stochastiques, Springer-Verlag, 1972.
Dellacherie, Claude and Meyer, Paul-André. Probabilities and Potential, North-Holland Publishing Company, 1978.
He, Sheng-wu and Wang, Jia-gang and Yan, Jia-an. Semimartingale Theory and Stochastic Calculus, CRC Press, 1992.
Karatzas, Ioannis and Shreve, Steven. Brownian Motion and Stochastic Calculus, Graduate Texts in Mathematics Volume 113, Springer-Verlag New York, 1998.
Karatzas, Ioannis and Shreve, Steven. Methods of Mathematical Finance, Probability Theory and Stochastic Modelling Volume 39, Springer-Verlag New York, 1998.
Le Gall, Jean-François. Brownian Motion, Martingales, and Stochastic Calculus, Graduate Texts in Mathematics Volume 274, Springer International Publishing, 2016.
Lowther, George. Almost Sure blog.
Meyer, Paul-André. Probabilités et potentiel, Hermann, 1966.