This is part 2 of a sequence of blog posts on the general theory of processes. For part 1, see here. I will be building upon the content presented there.
“Mathematics exists solely for the honour of the human mind.” — Jacobi in a letter to Legendre after the death of Fourier. Fourier had the opinion that the principal aim of mathematics was public utility and explanation of natural phenomena.
We previously discussed Choquet's theory of capacities and its applications in measure theory. The goal of this blog post is to discuss the debut, section and projection theorems in stochastic processes. These theorems form the core of the “general theory of processes”, developed primarily by the Strasbourg school of probability under the tutelage of Paul-André Meyer. At the risk of being overly simplistic, general theory of processes is the study of filtrations and stopping times—martingales will be absent in our discussion.
Unlike part 1 where there were no prerequisites other than basic measure theory, this part assumes a good amount of familiarity with stochastic processes, at the level of (Karatzas and Shreve, 1998a), (Le Gall, 2016). Without that the material here will feel unmotivated and difficult. On the other hand, the theorems proved here are skipped even in advanced courses in stochastic processes as they aren't the most useful for applications, but rather are necessary to fill in gaps if a rigorous treatment is warranted. Nevertheless the theory presented is extremely profound and beautiful, and reading it will be an honour for the mind, if nothing else.
Recall that a subset \(N \subseteq E\) is called \(\mu\)-negligible if there exists \(B \in \mathscr{E}\) with \(N \subseteq B\) and \(\mu(B) = 0.\) Thus the measure space \((E, \mathscr{E}, \mu)\) is complete if and only if every \(\mu\)-negligible subset of \(E\) belongs to \(\mathscr E.\)
Recall that an outer measure on \(E\) is a function \(\mu^* \colon \mathfrak{P}(E) \to [0, \infty]\) such that
\(\mu^*(\varnothing) = 0\),
if \(A \subseteq B \subseteq E\), then \(\mu^*(A) \le \mu^*(B)\), and
if \(\{A_n\}_{n \in \mathbb N}\) is a sequence of subsets of \(E\), then \[\begin{aligned} \mu^*\Big( \bigcup_{n \in \mathbb N} A_n \Big) \le \sum_{n \in \mathbb N} \mu^*(A_n).\end{aligned}\]
Also recall that a subset \(B \subseteq E\) is called \(\mu^*\)-measurable if \[\begin{aligned} \mu^*(C) = \mu^*(C \cap B) + \mu^*\left(C \cap B^\mathsf{c}\right) \quad \text{for all } C \subseteq E.\end{aligned}\]
It is a simple exercise to show that if \(B \subseteq E\) is such that either \(\mu^*(B) = 0\) or \(\mu^*(B^\mathsf{c}) = 0\), then \(B\) is \(\mu^*\)-measurable. Finally, recall that if \(\mathscr{M}_{\mu^*}\) denotes the collection of all \(\mu^*\)-measurable subsets of \(E\), then \(\mathscr{M}_{\mu^*}\) is a \(\sigma\)-algebra, and the restriction of \(\mu^*\) to \(\mathscr{M}_{\mu^*}\) is a measure on \(\mathscr{M}_{\mu^*}.\) Therefore, it follows that the measure space \((E, \mathscr{M}_{\mu^*}, \mu^*)\) is complete. In particular, the Lebesgue measure on the \(\sigma\)-algebra of Lebesgue subsets of \(\mathbb R\) is complete. It can be shown that the restriction of Lebesgue measure to the \(\sigma\)-algebra of Borel subsets of \(\mathbb R\) is not complete. We next discuss a result that allows us to complete any measure space.
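To make the Carathéodory condition concrete, here is a tiny finite sanity check (the four-point space, the weights, and the names `mu_star` and `caratheodory_measurable` are all mine, chosen purely for illustration). We induce an outer measure from a measure on the \(\sigma\)-algebra generated by the partition \(\{0,1\}, \{2,3\}\), and recover exactly that \(\sigma\)-algebra as the measurable sets:

```python
from itertools import combinations

E = frozenset(range(4))

def powerset(s):
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

# The sigma-algebra generated by the partition {0,1} | {2,3} carries a measure
# with weights 1 and 2; the induced outer measure of A is the measure of the
# smallest covering set, i.e. the total weight of the blocks meeting A.
blocks = {frozenset({0, 1}): 1.0, frozenset({2, 3}): 2.0}

def mu_star(A):
    return sum(w for b, w in blocks.items() if A & b)

def caratheodory_measurable(B):
    # B is mu_star-measurable iff it splits every test set C additively
    return all(mu_star(C) == mu_star(C & B) + mu_star(C - B) for C in powerset(E))

M = [B for B in powerset(E) if caratheodory_measurable(B)]
```

On this toy example `M` consists precisely of the four sets \(\varnothing, \{0,1\}, \{2,3\}, E\); a set such as \(\{0\}\) fails the condition with the test set \(C = \{0,1\}\).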
Theorem 11: Let \((E, \mathscr{E}, \mu)\) be an arbitrary measure space. Define the collection \(\mathscr{E}_\mu\) to consist of all sets \(B \subseteq E\) for which there exist \(B_1, B_2 \in \mathscr E\) such that \[\begin{aligned} B_1 \subseteq B \subseteq B_2 \text{ and } \mu(B_2 \setminus B_1) = 0.\end{aligned}\]
Then \(\mathscr{E}_\mu\) is a \(\sigma\)-algebra on \(E\) that includes \(\mathscr E.\) Define the function \(\overline \mu \colon \mathscr{E}_\mu \to [0, \infty]\) by \[\begin{aligned} \overline \mu(B) := \mu(B_1), \quad B \in \mathscr{E}_\mu.\end{aligned}\] Then \(\overline \mu\) is well-defined, and \((E, \mathscr{E}_\mu, \overline \mu)\) is a complete measure space, called the completion of \((E, \mathscr{E}, \mu).\)
Let us start by showing that \(\mathscr{E}_\mu\) is a \(\sigma\)-algebra. That \(\mathscr{E}_\mu\) includes \(\mathscr{E}\) is clear by taking \(B_1 = B_2 = B\) for any \(B \in \mathscr{E}.\) This, in particular, means that \(\varnothing \in \mathscr{E}_\mu.\) Now let \(B \in \mathscr{E}_\mu\) with \(B_1, B_2 \in \mathscr{E}\) as in the theorem statement. (1) implies \[\begin{aligned} B_2^\mathsf{c} \subseteq B^\mathsf{c} \subseteq B_1^\mathsf{c} \quad \text{and} \quad \mu\left(B_1^\mathsf{c} \setminus B_2^\mathsf{c}\right) = \mu(B_2 \setminus B_1) = 0,\end{aligned}\] so \(B^\mathsf{c} \in \mathscr{E}_\mu.\) Similarly, if \(\{B^n\}_{n \in \mathbb N} \subseteq \mathscr{E}_\mu\) with sandwiching sets \(B^n_1, B^n_2\), then \(\bigcup_n B^n_1 \subseteq \bigcup_n B^n \subseteq \bigcup_n B^n_2\) and \(\mu\left(\bigcup_n B^n_2 \setminus \bigcup_n B^n_1\right) \le \sum_n \mu\left(B^n_2 \setminus B^n_1\right) = 0,\) so \(\mathscr{E}_\mu\) is closed under countable unions.
Next, we show that \(\overline \mu\) is well-defined. Using the notation in the statement of the theorem, it follows immediately that \(\mu(B_1) = \mu(B_2).\) Furthermore, if \(A \subseteq B\) is such that \(A \in \mathscr{E}\), then \(A \subseteq B_2\), so \[\begin{aligned} \mu(A) \le \mu(B_2) = \mu(B_1).\end{aligned}\] Applying this with \(A = B_1'\) for any other pair \(B_1', B_2' \in \mathscr{E}\) sandwiching \(B\) shows \(\mu(B_1') \le \mu(B_1)\), and by symmetry \(\mu(B_1') = \mu(B_1)\); hence the value \(\overline \mu(B)\) does not depend on the choice of the pair \(B_1, B_2.\)
Next, we show that \(\overline \mu\) is a measure on \(\mathscr E_\mu.\) \(\overline \mu\) is clearly an extension of \(\mu\): take \(B_1 = B_2 = B\) for any \(B \in \mathscr{E}.\) This, in particular, implies that \(\overline \mu(\varnothing) = 0.\) Non-negativity of the measure \(\mu\) implies the non-negativity of \(\overline \mu.\) Finally, we check countable additivity. Let \(\{B^n\}_{n \in \mathbb N}\) be a sequence of disjoint sets in \(\mathscr E_\mu\), and for each \(n \in \mathbb N\) let \(B^n_1, B^n_2 \in \mathscr{E}\) be sets satisfying (1). The disjointness of the sets \(\{B^n\}_{n \in \mathbb N}\) implies the disjointness of the sets \(\{B^n_1\}_{n \in \mathbb N}\), and so we get \[\begin{aligned} \overline \mu\Big( \bigcup_{n \in \mathbb N} B^n \Big) = \mu\Big( \bigcup_{n \in \mathbb N} B^n_1 \Big) = \sum_{n \in \mathbb N} \mu(B^n_1) = \sum_{n \in \mathbb N} \overline \mu(B^n).\end{aligned}\]
If we denote by \(\mathscr{N}\) the collection of all \(\mu\)-negligible sets for the measure space \((E, \mathscr{E}, \mu)\), then it is easy to see that \[\begin{aligned} \mathscr E_\mu = \{B \cup N \,:\, B \in \mathscr E \text{ and } N \in \mathscr N\},\end{aligned}\] and \(\overline \mu (B \cup N) = \mu(B).\) From this it follows that if the measure space \((E, \mathscr{E}, \mu)\) is complete, then it is its own completion.
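The completion construction of Theorem 11 can likewise be checked by brute force on a four-point toy space of my own making (`script_E` plays the role of \(\mathscr{E}\)): with \(\mu(\{c,d\}) = 0\), the completion adjoins every subset of the negligible block and nothing else.

```python
from itertools import combinations

E = frozenset('abcd')

def powerset(s):
    s = list(s)
    return [frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)]

# sigma-algebra generated by the partition {a,b} | {c,d}, with mu({c,d}) = 0,
# so every subset of {c,d} is mu-negligible.
script_E = [frozenset(), frozenset('ab'), frozenset('cd'), E]
mu = {frozenset(): 0.0, frozenset('ab'): 1.0, frozenset('cd'): 0.0, E: 1.0}

def in_completion(B):
    # Theorem 11 criterion: B1 ⊆ B ⊆ B2 with mu(B2 \ B1) = 0; here B1 ⊆ B2
    # are both measurable, so mu(B2 \ B1) = mu(B2) - mu(B1).
    return any(B1 <= B <= B2 and mu[B2] - mu[B1] == 0
               for B1 in script_E for B2 in script_E)

completion = [B for B in powerset(E) if in_completion(B)]
```

Here `completion` has the eight elements \(B \cup N\) with \(B \in \{\varnothing, \{a,b\}\}\) and \(N \subseteq \{c,d\}\), in line with the description of \(\mathscr{E}_\mu\) via negligible sets.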
A very nice property that holds if we assume that the measure space \((E, \mathscr{E}, \mu)\) is complete is the following: if \(f\) and \(g\) are real-valued functions defined on \(E\) such that \(f\) is measurable and \(f = g\) holds \(\mu\)-a.e., then \(g\) is also measurable. To see this, let \(B \in \mathcal{B}(\mathbb R)\) be any Borel set. Then \(\{f \in B\} \in \mathscr E\) and \[\begin{aligned} \{g \in B\} \,\triangle\, \{f \in B\} \subseteq \{f \neq g\},\end{aligned}\] so \(\{g \in B\}\) differs from \(\{f \in B\}\) by a \(\mu\)-negligible set; completeness then gives \(\{g \in B\} \in \mathscr{E}.\)
The above property can fail if the measure space is not complete. Similar problems can arise in the study of stochastic processes, and therefore completeness assumptions are made. For this whole blog post, we shall place ourselves on a complete probability space \((\Omega, \mathcal{F}, \mathbb{P})\) endowed with a filtration \(\mathbb{F} = \{\mathcal{F}_t\} _ {0 \le t < \infty}\), i.e., a family of sub-\(\sigma\)-algebras of \(\mathcal{F}\) which is increasing in the sense that \[\begin{aligned} \mathcal{F}_s \subseteq \mathcal{F}_t \quad \text{for } 0 \le s \le t < \infty.\end{aligned}\] We assume throughout that the filtration satisfies the usual conditions, namely:
Right-continuity, i.e., \(\mathcal{F}_t = \mathcal{F}_{t+} := \bigcap_{s > t} \mathcal{F}_s\) for all \(t \in [0, \infty)\), and
\(\mathcal{F}_0\) contains all the \(\mathbb{P}\)-negligible events in \(\mathcal{F}\).
The structure of the filtration implies that we can write \(\mathcal{F}_{t+}\) equivalently as \(\bigcap_{n \in \mathbb N} \mathcal{F}_{t + 1/n}.\)
Often we will need to start with the natural filtration \[\begin{aligned} \mathcal{F}_t^X := \sigma(X_s; \;0 \le s \le t), \quad t \ge 0,\end{aligned}\] generated by the process \(X = \{X_t\}_{t \ge 0},\) which is the smallest filtration with respect to which the process \(X\) is adapted, i.e., \(X_t\) is \(\mathcal{F}_t^X\)-measurable for each \(t \ge 0.\) And then we will need to augment it to make it satisfy the usual conditions. To that end, we define the minimal augmented filtration generated by \(X\) to be the smallest filtration that is right continuous and complete and with respect to which the process \(X\) is adapted. This can be constructed in the following three steps:
First, let \(\left\{\mathcal{F}_t^{X}\right\}_{t \ge 0}\) be the natural filtration as in (3).
Let \(\mathscr{N}\) be the collection of all \(\mathbb{P}\)-negligible sets for the complete probability space \((\Omega, \mathcal{F}, \mathbb{P})\) (we can always complete it using Theorem 11 if it isn’t). For each \(t \ge 0,\) let \[\begin{aligned} \left(\mathcal{F}^X_t\right)_{\mathbb{P}} = \left\{F \cup N \,:\, F \in \mathcal{F}_t^X \text{ and } N \in \mathscr N\right\}\end{aligned}\] be the completion of \(\mathcal{F}_t^X\) just like in (2).
Finally, for each \(t \ge 0,\) let \[\begin{aligned} \mathcal{F}_t := \left( \left(\mathcal{F}^X_t\right)_{\mathbb{P}} \right)_+ = \bigcap_{s > t} \left(\mathcal{F}^X_s\right)_{\mathbb{P}}\end{aligned}\] define the right-continuous filtration. To show right-continuity of \(\{\mathcal{F}_t\}_{t \ge 0},\) we need to show that \(\mathcal{F}_{t+} = \mathcal{F}_t\) for every \(t \ge 0\), and indeed \[\begin{aligned} \mathcal{F}_{t+} = \bigcap_{u > t} \, \bigcap_{s > u} \left(\mathcal{F}^X_s\right)_{\mathbb{P}} = \bigcap_{s > t} \left(\mathcal{F}^X_s\right)_{\mathbb{P}} = \mathcal{F}_t.\end{aligned}\]
These steps give a filtration \(\{\mathcal{F}_t\}_{t \ge 0},\) which is right continuous and complete and with respect to which the process \(X\) is adapted. It is clear that it is also the smallest such filtration. Does it matter if we do step 3 before step 2, i.e., is it true that \[\begin{aligned} \left( \left(\mathcal{F}^X_t\right)_{+} \right)_{\mathbb{P}} = \left( \left(\mathcal{F}^X_t\right)_{\mathbb{P}} \right)_{+}?\end{aligned}\]
For the other side,
For our purposes, a stochastic process is simply a collection of real-valued random variables \(X = \{X_t\}_{t \ge 0}\) on \((\Omega, \mathcal{F}).\) The sample paths of \(X\) are the mappings \([0, \infty) \ni t \mapsto X_t( \omega) \in \mathbb R\) obtained when fixing \(\omega \in \Omega.\)
We need terminology to talk about when two stochastic processes are the “same”.
Definition 19: Consider two stochastic processes \(X = \left\{ X_t\right\}_{t \ge 0}\) and \(Y = \left\{ Y_t\right\}_{t \ge 0}\) defined on the same probability space \((\Omega, \mathcal{F}, \mathbb{P}).\) We say
\(X\) and \(Y\) have the same finite-dimensional distributions if, for any integer \(n \ge 1\), real numbers \(0 \le t_1 < t_2 < \cdots < t_n < \infty,\) and \(A \in \mathcal{B}(\mathbb R^n)\), we have \[\begin{aligned} \mathbb{P}\left\{(X_{t_1}, \ldots, X_{t_n}) \in A\right\} = \mathbb{P}\left\{(Y_{t_1}, \ldots, Y_{t_n}) \in A\right\};\end{aligned}\]
\(Y\) is a modification or a version of \(X\) if, for every \(t \in [0, \infty)\), we have \(\mathbb{P}\{X_t = Y_t\} = 1;\)
\(X\) and \(Y\) are indistinguishable if almost all their sample paths agree, i.e., \[\begin{aligned} \mathbb{P}\left\{X_t = Y_t \;\; \forall \, t \in [0, \infty)\right\} = 1.\end{aligned}\]
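The classical example separating the last two notions: on \(([0,1], \mathcal{B}([0,1]), \mathrm{Leb})\), take \(X \equiv 0\) and \(Y_t(\omega) = \mathbf{1}_{\{\omega\}}(t).\) Then:

```latex
\mathbb{P}\{X_t = Y_t\} = \mathrm{Leb}\{\omega \in [0,1] : \omega \neq t\} = 1
  \quad \text{for every fixed } t \in [0, \infty),
\qquad \text{but} \qquad
\mathbb{P}\left\{X_t = Y_t \;\; \forall \, t \ge 0\right\} = \mathrm{Leb}(\varnothing) = 0,
```

so \(Y\) is a modification of \(X\), yet \(X\) and \(Y\) are not indistinguishable: every sample path of \(Y\) disagrees with the zero path at the single time \(t = \omega.\)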
We will need to make stronger measurability assumptions for the random variables \(\{X_t\}_{t \ge 0}\) than just assuming that each \(X_t\) is measurable.
Definition 20: A stochastic process \(X = \{X_t\}_{t \ge 0}\) is called
adapted, if \(X_t\) is \(\mathcal{F}_t\)-measurable for every \(t \in [0, \infty)\);
measurable, if the mapping \[\begin{aligned} [0, \infty) \times \Omega \ni (t,\omega) \mapsto X_t(\omega) \in \mathbb R\end{aligned}\] is \(\mathcal{B}([0, \infty)) \otimes \mathcal{F}\)-measurable, when \(\mathbb R\) is endowed with its Borel \(\sigma\)-algebra;
progressively measurable, if for every \(t \in [0, \infty)\) the mapping \[\begin{aligned} [0, t] \times \Omega \ni (s,\omega) \mapsto X_s(\omega) \in \mathbb R\end{aligned}\] is \(\mathcal{B}([0, t]) \otimes \mathcal{F}_t\)-measurable.
Recall that every progressively measurable process is both measurable and adapted; and every adapted process with right-continuous or with left-continuous paths, is progressively measurable (Karatzas and Shreve, 1998a), (Le Gall, 2016). On the other hand, we have the following result due to Meyer (1966).
Consider the space \(\mathbb{L}^0\) of equivalence classes of measurable functions \(f : \Omega \to \mathbb R\), and endow it with the topology of convergence in probability, for example with the pseudo-metrics \(\rho(f, g) = \mathbb E (1 \wedge |f-g|)\) or \(\rho(f,g) = \mathbb E \left( \frac{|f-g|}{1 + |f-g|} \right).\) Recall that if a sequence \(\{f_n\} _ {n \in \mathbb N} \subseteq \mathbb{L}^0\) satisfies \(\sum _ {n \in \mathbb N} \rho(f _ n, f _ {n+1}) < \infty,\) then it converges in probability “fast”, thus also almost surely.
Every process \(Y = \{Y_t\} _ {t \ge 0}\) can be thought of as a mapping from \([0, \infty)\) into \(\mathbb{L}^0\), which associates to each element in \([0, \infty)\) the equivalence class \(\mathcal{Y} _ t\) of \(Y_t.\) Consider now the collection \(\mathscr{H}\) of processes \(Y\) such that the mapping \([0, \infty) \ni t \mapsto \mathcal{Y}_t \in \mathbb{L}^0\) satisfies
it takes values in a separable subspace of \(\mathbb{L}^0\);
under it, the inverse image of every open ball of \(\mathbb{L}^0\) is a Borel subset of \([0, \infty)\); and
it is the uniform limit of a sequence of simple measurable functions with values in the space \(\mathbb{L}^0.\)
Then note that property 3 implies the other two, and that conversely, properties 1 and 2 together imply property 3, because a real-valued function is measurable if and only if it is the pointwise limit of a sequence of simple measurable functions.
Next, note that \(\mathscr{H}\) is a real vector space on account of property 3, and is closed under sequential pointwise convergence on account of properties 1 and 2. It is easy to see that \(\mathscr{H}\) contains all processes of the form \(Y(t,\omega) = K(t) \xi(\omega)\), where \(K\) is the indicator of an interval in \([0, \infty)\) and \(\xi \colon \Omega \to \mathbb R\) is a bounded measurable function. Thus, the monotone class theorem implies that \(\mathscr{H}\) contains all bounded measurable processes. Now, using a standard slicing argument, it is easily seen that every measurable process \(Y\) has properties 1, 2 and 3.
Consider now a measurable and adapted process \(X = \{X_t\} _ {t \ge 0}.\) For each \({n \in \mathbb N}\), there exists a process \(X^{(n)} = \{X^{(n)}_t\} _ {t \ge 0}\) which is simple and measurable (in the sense that there exist a partition \(\left\{A^{(n)} _ k\right\} _ {k \in \mathbb N}\) of \([0, \infty)\) into Borel sets, and a sequence of random variables \(\left\{H^{(n)} _ k\right\} _ {k \in \mathbb N}\), such that \(X^{(n)}_t = H^{(n)} _ k\) for \(t \in A^{(n)} _ k\)), and satisfies \[\begin{aligned} \rho\left(X_t, X^{(n)}_t\right) \le 2^{-(n+2)} \quad \text{for all } t \in [0, \infty).\end{aligned}\]
Next, we define a sequence of random variables \(\left\{G^{(n)} _ k\right\} _ {k \in \mathbb N}\) for each \({n \in \mathbb N}\) using the sequence \(\left\{H^{(n)} _ k\right\} _ {k \in \mathbb N}\) to get “enhanced” measurability properties. To this end, define \(s^{(n)} _ k = \inf A^{(n)} _ k\), and define \(G^{(n)} _ k = X(s^{(n)} _ k)\) if \(s^{(n)} _ k \in A^{(n)} _ k\); if, on the other hand, \(s^{(n)} _ k \notin A^{(n)} _ k\), we pick any \(\mathcal{F}_{s^{(n)} _ k}\)-measurable \(G^{(n)} _ k\) that satisfies \[\begin{aligned} \rho\left(G^{(n)} _ k, H^{(n)} _ k\right) \le 2^{-(n+2)}.\end{aligned}\]
We now define the process \(Y^{(n)} = \{Y^{(n)}_t\} _ {t \ge 0}\) by \(Y^{(n)}_t := G^{(n)} _ k, \, t \in A^{(n)} _ k\), and check that it is progressively measurable and satisfies \(\rho\left(X_t, Y^{(n)}_t\right) \le 2^{-(n+1)}\) for all \(t \in [0, \infty)\) and \({n \in \mathbb N}.\)
Finally, we construct a progressively measurable process \(Y\) by \[\begin{aligned} Y_t := \limsup_{n \to \infty} Y^{(n)}_t, \quad t \in [0, \infty).\end{aligned}\] Being a pointwise \(\limsup\) of progressively measurable processes, \(Y\) is progressively measurable; and since \(\rho\left(X_t, Y^{(n)}_t\right) \le 2^{-(n+1)}\) for every \(n\), the sequence \(\left\{Y^{(n)}_t\right\} _ {n \in \mathbb N}\) converges to \(X_t\) “fast”, hence also almost surely, so that \(Y_t = X_t\) a.s. for every \(t \in [0, \infty).\) Thus \(Y\) is a progressively measurable modification of \(X.\)
It can be verified that \(\mathscr{P}_\star\) is the collection of all sets \(A \in \mathcal{B}([0, \infty)) \otimes \mathcal{F}\) such that the process \(X_t(\omega) = \mathbf{1}_A(t, \omega)\) is progressively measurable. This characterization gives the useful result that a subset \(A\) of \([0, \infty) \times \Omega\) belongs to \(\mathscr{P}_\star\) if and only if, for every \(t \in [0, \infty)\), \(A \cap ([0,t] \times \Omega) \in \mathcal{B}([0,t]) \otimes \mathcal{F}_t.\)
Recall that a map \(T \colon \Omega \to [0, \infty]\) is called a stopping time of the filtration \(\mathbb{F}\) if \(\{T \le t\} \in \mathcal{F}_t\) for all \(t \in [0, \infty).\) Because the filtration \(\mathbb{F}\) is right-continuous, this is equivalent to \(\{T < t\} \in \mathcal{F}_t\) for all \(t \in [0, \infty)\), as can be seen by writing \[\begin{aligned} \{T < t\} = \bigcup_{n \in \mathbb N} \left\{T \le t - \tfrac1n\right\} \quad \text{and} \quad \{T \le t\} = \bigcap_{n \in \mathbb N} \left\{T < t + \tfrac1n\right\}.\end{aligned}\]
It is easily checked that if \(T\) and \(S\) are stopping times, then so are \[\begin{aligned} T \wedge S, \quad T \vee S, \quad \text{and} \quad T + S.\end{aligned}\]
Definition 23: For a stopping time \(T\), we define
\(\sigma\)-algebra of events prior to it by \[\begin{aligned} \mathcal{F}(T) := \left\{A \in \mathcal{F} \,:\, A \cap \{T \le t\} \in \mathcal{F}_t \text{ for all } t \in [0, \infty)\right\};\end{aligned}\]
\(\sigma\)-algebra of events strictly prior to it by \[\begin{aligned} \mathcal{F}(T-) := \sigma\Big( \mathcal{F}_0 \cup \left\{A \cap \{t < T\} \,:\, t \in [0, \infty), \; A \in \mathcal{F}_t\right\} \Big).\end{aligned}\]
The following intuitive properties are useful and their proofs can be found in any standard textbook (Karatzas and Shreve, 1998a), (Le Gall, 2016).
Lemma 6: Let \(T\) and \(S\) be any two stopping times. Then
\(\mathcal{F}(T-) \subseteq \mathcal{F}(T)\);
\(T\) is measurable with respect to both \(\mathcal{F}(T)\) and \(\mathcal{F}(T-)\);
\(\mathcal{F}(T) \cap \mathcal{F}(S) = \mathcal{F}(T \wedge S)\);
\(A \cap \{T \le S\} \in \mathcal{F}(S)\), for all \(A \in \mathcal{F}(T)\);
\(A \cap \{T < S\} \in \mathcal{F}(S-)\), for all \(A \in \mathcal{F}(T)\);
If the stopping times satisfy \(T \le S\), then \(\mathcal{F}(T) \subseteq \mathcal{F}(S)\) and \(\mathcal{F}(T-) \subseteq \mathcal{F}(S-)\);
If the stopping times satisfy \(T < S\) on the set \(\{0 < S < \infty\}\), then \(\mathcal{F}(T) \subseteq \mathcal{F}(S-)\);
If \(\{T_n\} _ {n \in \mathbb N}\) is an increasing sequence of stopping times and \(T = \lim _ {n \in \mathbb N} \uparrow T_n\), then \[\begin{aligned} \mathcal{F}(T-) = \sigma\Big( \bigcup _ {n \in \mathbb N} \mathcal{F}(T_n-) \Big);\end{aligned}\]
If \(Z\) is an integrable random variable, we have \[\begin{aligned} \mathbb{E}\left(Z \mid \mathcal{F}(T)\right) = \mathbb{E}\left(Z \mid \mathcal{F}(T \wedge S)\right), \quad \mathbb{P}\text{-a.s. on } \{T \le S\};\end{aligned}\]
If \(\{T_n\}_{n \in \mathbb N}\) is a sequence of stopping times and \(T:= \inf_n T_n\), then \(T\) is again a stopping time and \(\mathcal{F}(T) = \bigcap_{n \in \mathbb N} \mathcal{F}(T_n).\)
A very important result states that every stopping time is the decreasing limit of a sequence of discrete stopping times. More concretely, if \(T\) is a stopping time, and we define \[\begin{aligned} T_n := \sum_{k \in \mathbb N} \frac{k}{2^n} \, \mathbf{1} _ {\left\{(k-1)/2^n \,\le\, T \,<\, k/2^n\right\}} + \infty \cdot \mathbf{1} _ {\{T = \infty\}}\end{aligned}\] for \({n \in \mathbb N},\) then each \(T_n\) is a discrete stopping time and we have \(T_n \downarrow T.\) When can we approximate a stopping time with an increasing sequence of stopping times? Of course, being able to do that means that our stopping time is a very special one. We give a name to such stopping times.
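The dyadic approximation from above is easy to see in action numerically (using the convention \(T_n = k/2^n\) on \(\{(k-1)/2^n \le T < k/2^n\}\); the helper name `dyadic_approx` is mine):

```python
import math

def dyadic_approx(T, n):
    # n-th dyadic approximation of the value T: the right endpoint k/2^n of
    # the dyadic interval [(k-1)/2^n, k/2^n) containing T
    return (math.floor(T * 2 ** n) + 1) / 2 ** n

T = 1 / 3
approx = [dyadic_approx(T, n) for n in range(1, 21)]
# approx is a decreasing sequence of dyadic rationals squeezing down to T,
# with dyadic_approx(T, n) - T <= 2 ** -n at every level n.
```

Note that \(T_n > T\) strictly on \(\{T < \infty\}\) with this convention, which is exactly why the approximation is from above and cannot announce \(T\) from below.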
Note \(\{T \le t\} = \bigcap_{n \in \mathbb N} \{T_n \le t\} \in \mathcal{F}(t)\) and therefore if in the definition above we had not required \(T\) to be a stopping time, it would still be a stopping time. A canonical example of a predictable stopping time is the first time Brownian motion \(W\) hits or exceeds a certain level \(b.\) It is announced by the sequence of first times \(W\) hits or exceeds the levels \(b - 1/n\), for \({n \in \mathbb N}.\) In fact, we have (taken from (Almost Sure blog)):
Of course, in the above proof we cheated by assuming the Debut Theorem for stopping times; proving it is our next order of business. In the same flavor as Lemma 6, we have
Lemma 7: If \(T\) is a predictable stopping time and \(S\) an arbitrary stopping time, then
\(A \cap \{T \le S\} \in \mathcal{F}(S-)\) for all \(A \in \mathcal{F}(T-)\);
\(A \cap \{S = \infty\} \in \mathcal{F}(S-)\) for all \(A \in \mathcal{F}(\infty)\).
In particular, both the events \(\{T \le S\}\) and \(\{T = S\}\) belong to \(\mathcal{F}(S-).\)
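Coming back to the example of the Brownian hitting time announced by the levels \(b - 1/n\): a crude discretized simulation (a scaled random walk standing in for \(W\); all names here are mine, and grid hitting times only approximate the continuous-time ones) at least exhibits the monotonicity \(S_n \uparrow\) and \(S_n \le S\):

```python
import random

random.seed(2024)

dt = 1e-3
steps = 20000

# one discretized Brownian-like path: a Gaussian random walk with variance dt per step
path = [0.0]
for _ in range(steps):
    path.append(path[-1] + random.gauss(0.0, dt ** 0.5))

def hitting_time(level):
    # first grid time at which the path reaches or exceeds `level`; inf if never
    return next((i * dt for i, x in enumerate(path) if x >= level), float('inf'))

b = 1.0
S = hitting_time(b)
announcing = [hitting_time(b - 1.0 / n) for n in range(1, 100)]
```

Whatever the sampled path, the hitting times of the increasing levels \(b - 1/n\) are nondecreasing in \(n\) and never exceed the hitting time of \(b\) itself; that monotone structure is the content of the announcement.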
Definition 25: For a random time \(Z\) and an event \(A \in \mathcal{F}\), the restriction of \(Z\) to \(A\) is the random time \[\begin{aligned} Z_A := Z \, \mathbf{1}_A + \infty \cdot \mathbf{1}_{A^\mathsf{c}}.\end{aligned}\] Notice that we can write \(\{Z_A \le Z\} = A \cup \left(A^\mathsf{c} \cap \{Z = \infty\}\right)\) as a disjoint union. If \(T\) is a stopping time and \(A \in \mathcal{F}(T)\), then since \(\{T_A \le t\} = \{T \le t\} \cap A \in \mathcal{F}(t)\) for all \(t \in [0, \infty)\), \(T_A\) is also a stopping time.
Since \(A \in \mathscr{P}_\star \subseteq \mathcal{B}([0, \infty)) \otimes \mathcal{F}\), Measurable Debut theorem (Theorem 9) implies \(D_A\) is a random variable.
Just like in the proof of Theorem 9, for any real number \(t > 0\), the set \(\{D_A < t\}\) is the projection onto \(\Omega\) of the set \(A_t := A \cap ([0, t) \times \Omega).\) Recall that \(A \in \mathscr{P}_\star\) implies \(A_t \in \mathcal{B}([0,t]) \otimes \mathcal{F}(t).\) Therefore, Measurable Projection theorem (Theorem 7) implies \(\{D_A < t\} = \pi(A_t) \in \mathcal{F}(t).\) Right continuity of the filtration now implies \(D_A\) is an \(\mathbb{F}\)-stopping time.
The following important theorem now becomes an easy corollary.
Since all right-continuous or left-continuous adapted processes are progressively measurable, all optional or predictable processes are adapted. We need the concept of stochastic intervals to ease some notation and see some examples of optional and predictable sets.
Notice that \(\llbracket U \rrbracket = \llbracket U,U \rrbracket = \llbracket 0, U \rrbracket \setminus \llbracket 0, U \llbracket\), and that \(0_A\) is predictable with the announcing sequence \(\{0_A \wedge n\} _ {n \in \mathbb N}.\) Denote by \(\mathfrak{S}\) the collection of all stopping times and by \(\mathfrak{S}^{(\mathscr{P})}\) the collection of all predictable stopping times (\(\mathfrak{S}\) is S in fraktur font).
Lemma 8: We collect some useful properties of stochastic intervals in relation to optional and predictable \(\sigma\)-algebras.
If \(S,T \in \mathfrak{S}\) such that \(S \le T\), then the stochastic intervals \(\llbracket S,T \rrbracket, \llbracket S,T \llbracket, \rrbracket S,T \rrbracket, \rrbracket S,T \llbracket\) and the graphs \(\llbracket S \rrbracket, \llbracket T \rrbracket\) are optional.
If \(S,T \in \mathfrak{S}\) such that \(S \le T\), then the stochastic interval \(\rrbracket S,T \rrbracket\) is predictable. If \(S \in \mathfrak{S}^{(\mathscr{P})}\), then the stochastic interval \(\llbracket S, \infty \llbracket\) is predictable.
The \(\sigma\)-algebra \(\mathscr{O}\) of optional sets can be generated from stochastic intervals in multiple equivalent ways
Similarly, \(\sigma\)-algebra \(\mathscr{P}\) of predictable sets can be generated from stochastic intervals in multiple equivalent ways
I will only be proving some results. For a complete picture see (He, Wang and Yan, 1992), (Almost Sure blog).
It is easy to see that \(\mathbf{1}_{\llbracket S,T \llbracket}\) is right-continuous and adapted, hence \(\llbracket S,T \llbracket\) is optional. If \(T_n = S + 1/n\) for \({n \in \mathbb N}\), then \(\llbracket S \rrbracket = \bigcap _ {n \in \mathbb N} \llbracket S, T_n\llbracket\), and thus \(\llbracket S \rrbracket\) is also optional. Similarly, for \(\llbracket T \rrbracket.\) These facts along with simple manipulation of sets immediately imply \(\llbracket S,T \rrbracket, \rrbracket S,T \rrbracket, \rrbracket S,T \llbracket\) are also optional.
It is easy to see that \(\mathbf{1}_{\rrbracket S,T \rrbracket}\) is left-continuous and adapted, hence \(\rrbracket S,T \rrbracket\) is predictable.
It is a simple exercise to show
Recall the setting of section "Measurable Section" from part 1. We showed there that if \(A \in \mathcal{B}([0, \infty)) \otimes \mathcal{F},\) then we can define a measurable mapping \(T : \Omega \to [0, \infty]\) such that for all \(\omega \in \pi(A)\) we have \((T(\omega), \omega) \in A\), or stated more succinctly, \(\llbracket T \rrbracket \subseteq A.\) Now, what if we want something stronger, say, \(T\) should be a stopping time? As we will see, we will have to make stronger assumptions on the set \(A\) and even then if we want \(T\) to be a stopping time we will have to let go of the nice property that \(\{T < \infty\}= \pi(A).\)
But first, we need to prove a certain property of debuts which we will need in proving the aforementioned section theorem, and that's the goal of this section.
It isn't difficult to see that \(\mathscr{P}\) coincides with the mosaic generated by the paving on \([0, \infty) \times \Omega\) that consists of finite unions of stochastic intervals of the form \(\llbracket S,T \rrbracket\), where \(S,T \in \mathfrak{S}^{(\mathscr{P})}.\) Generalizing this, let us consider a collection \(\mathfrak{A}\) (\(\mathfrak{A}\) is A in fraktur font) of stopping times that contains \(0\) and \(\infty\), and which is closed under a.s. equality and under finitely many lattice operations (\(\wedge\) and \(\vee\)). We denote by \(\mathcal{J} := \{\llbracket S,T \llbracket\,:\,S,T \in \mathfrak{A}\}\), and by \(\mathcal{T}\) the collection of finite unions of elements of \(\mathcal{J}.\)
Simple observations like \(\llbracket S,T \llbracket^\mathsf{c} = \llbracket 0, S \llbracket \, \cup \, \llbracket T, \infty \llbracket\) or \(\llbracket S,T \llbracket \, \cap \, \llbracket U,V \llbracket = \llbracket S \vee U, (S \vee U) \vee (T \wedge V) \llbracket\) can be used to show that \(\mathcal{J}\) is a paving on \([0, \infty) \times \Omega\) and \(\mathcal{T}\) is an algebra on \([0, \infty) \times \Omega.\)
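These pointwise identities are easy to stress-test numerically, since for fixed \(\omega\) each stochastic interval is just an interval of reals (the helper `member` and the random sampling are mine):

```python
import random

random.seed(1)
INF = float('inf')

def member(t, s, u):
    # is t in the half-open interval [s, u) ?
    return s <= t < u

def check(trials=5000):
    for _ in range(trials):
        S, T = sorted(random.uniform(0, 2) for _ in range(2))
        U, V = sorted(random.uniform(0, 2) for _ in range(2))
        t = random.uniform(0, 3)
        # complement identity: [S,T)^c = [0,S) ∪ [T,∞)
        if (not member(t, S, T)) != (member(t, 0.0, S) or member(t, T, INF)):
            return False
        # intersection identity: [S,T) ∩ [U,V) = [S∨U, (S∨U)∨(T∧V))
        a = max(S, U)
        if (member(t, S, T) and member(t, U, V)) != member(t, a, max(a, min(T, V))):
            return False
    return True

identities_hold = check()
```

The second identity is written with the extra \(\vee\) precisely so that the interval degenerates to \(\llbracket a, a \llbracket = \varnothing\) when \(T \wedge V < S \vee U\), i.e., when the two intervals are disjoint.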
For more structure, let us impose the following properties on the collection \(\mathfrak{A}\):
For any pair \(S,T \in \mathfrak{A}\), the stopping time \(S _ {\{S < T\}}\) (recall the notation of Definition 25 and the observation right after it) belongs to \(\mathfrak{A}\);
For any increasing sequence \(\{S_n\} _ {n \in \mathbb N} \subseteq \mathfrak{A}\), the limit \(\lim _ {n \in \mathbb N} \uparrow S_n\) belongs to \(\mathfrak{A}.\)
Property 1 ensures that the debut of an element of \(\mathcal{T}\) belongs to \(\mathfrak{A}.\) This is because \(S_{\{S < T\}} = D_{\llbracket S,T \llbracket}\) and because the debut of a union of finitely many elements of \(\mathcal{J}\) is equal to the minimum of the debuts of those elements.
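The identity \(S_{\{S < T\}} = D_{\llbracket S,T \llbracket}\) invoked here is a pointwise computation:

```latex
D_{\llbracket S,T \llbracket}(\omega)
  = \inf\{t \ge 0 : S(\omega) \le t < T(\omega)\}
  = \begin{cases}
      S(\omega) & \text{if } S(\omega) < T(\omega),\\
      +\infty   & \text{if } S(\omega) \ge T(\omega)
    \end{cases}
  \;=\; S_{\{S < T\}}(\omega),
```

since the section \(\{t : S(\omega) \le t < T(\omega)\}\) is nonempty (with infimum \(S(\omega)\), which it contains) exactly when \(S(\omega) < T(\omega)\), and \(\inf \varnothing = +\infty.\)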
Property 2 ensures that the debut of an element of \(\mathcal{T} _ \delta\) (which, recall, equals \(\left\{\bigcap _ {n \in \mathbb N} A_n \,:\, A_n \in \mathcal{T} \text{ for all } {n \in \mathbb N}\right\}\)) belongs to \(\mathfrak{A}.\) This is the main result of this section, and is proved below as Theorem 16.
Coming out of the abstraction for a bit, we note that the collection \(\mathfrak{S}\) of all stopping times satisfies all the conditions imposed above on \(\mathfrak{A}.\) Similarly for the collection \(\mathfrak{S}^{(\mathscr{P})}\) of all predictable stopping times. The first claim is trivial to see. To show the second claim, we first prove the following lemma.
If \(T_A\) is a predictable stopping time, then since \(A = \{T_A \le T\} \setminus \left(A^\mathsf{c} \cap \{T = \infty\}\right)\) and by Lemma 7 both \(\{T_A \le T\}, \left(A^\mathsf{c} \cap \{T = \infty\}\right) \in \mathcal{F}(T-)\), we get \(A \in \mathcal{F}(T-).\)
Conversely, suppose \(T\) is a predictable stopping time. Consider the collection
Thus, it suffices to show that \(T_A\) is predictable for any \(A\) in a collection of sets that generates \(\mathcal{F}(T-).\) To this end, suppose \(\{T^n\} _ {n \in \mathbb N}\) announces \(T\), and fix an arbitrary \(n \in \mathbb N\) and an arbitrary \(A \in \mathcal{F}(T^n).\) For every integer \(m \ge n\), the restriction \(T _ A ^ m\) is a stopping time and so is \(R _ m := T ^ m _ A \wedge m.\) The sequence \(\{R_m\} _ {m \ge n}\) then announces \(T_A\), showing \(T_A\) is predictable; similarly for \(T _ {A^\mathsf{c}}.\) We conclude \(T_A, T _ {A^\mathsf{c}}\) are predictable for every \(A \in \bigcup _ {n \in \mathbb N} \mathcal{F}(T^n).\) But recall \[\begin{aligned} \mathcal{F}(T-) = \sigma\Big( \bigcup _ {n \in \mathbb N} \mathcal{F}(T^n) \Big),\end{aligned}\] which completes the proof.
Coming back to the second claim above, if \(S,T \in \mathfrak{S}^{(\mathscr{P})}\), then by Lemma 7, \(\{S < T\} \in \mathcal{F}(S-)\), and thus by Lemma 10, \(S _ {\{S < T\}} \in \mathfrak{S}^{(\mathscr{P})}.\) The rest of the conditions are trivial to check.
We first show that \(\llbracket D_B\rrbracket \subseteq B.\) If \((s, \omega) \in\llbracket D_B\rrbracket\), then \(D_B(\omega) = s\), which is the same as saying
To show that \(D_B \in \mathfrak{A}\), consider the collection
Since \(B \in \mathcal{T}_\delta\) we can find a decreasing sequence \(\{B_n\} _ {n \in \mathbb N} \subseteq \mathcal{T}\) with \(\bigcap _ {n \in \mathbb N} B_n = B.\) Define
The next result is Proposition 1.2.26 from (Karatzas and Shreve, 1998a), and is left unproven there.
Write
We claim that all elements of the sequence \(\left\{D_{A_n}^p\right\}_{p,n \in \mathbb N}\) are stopping times. This is sufficient to prove our desired result.
Start by noting that the process \(X\) is progressively measurable and the process \((t, \omega) \mapsto X(t-, \omega)\) is left-continuous and adapted, and thus predictable. Thus, the sets \(\{A_n\}_{n \in \mathbb N}\) are progressively measurable. Fix an arbitrary \(n \in \mathbb N\), and note that for \(p=1\), the mapping
We are now ready to come back to proving section theorems. We start by proving a general section theorem.
Fix a set \(A \in \mathscr H\) and \(\varepsilon > 0.\) Measurable Section Theorem (Theorem 10 in part 1) implies there exists a random variable \(Z : \Omega \to [0, \infty]\) with \(\llbracket Z \rrbracket \subseteq A\) and \(\pi(A) = \{Z < \infty\},\) where recall \(\pi \colon [0, \infty) \times \Omega \to \Omega\) is the canonical projection map. We denote by \(\nu\) the measure on the measurable space \(\left([0, \infty) \times \Omega, \mathscr{H}\right)\) defined by
By first taking \(f = \mathbf{1}_{A}\) and then taking \(f = \mathbf{1}_{[0, \infty) \times \Omega}\) in the equation above, we see that
Choquet’s capacitability theorem (Theorem 6 in part 1) applied to the paving \(\mathcal{T}\) and to the capacity
As mentioned in the beginning of the section “Properties of Debuts”, we didn't get the nice equality \[\begin{aligned} \llbracket T \rrbracket \subseteq A \text{ and } \{T < \infty\} = \pi(A)\end{aligned}\] as we got in Measurable Section theorem, but instead obtained an approximation (7). The condition \(\llbracket T_\varepsilon \rrbracket \subseteq A\) makes sure that \(\{T_\varepsilon < \infty\} \subseteq \pi(A)\), and the \(\mathbb{P}\left[\pi(A)\right] \le \mathbb{P}(T_\varepsilon < \infty) + \varepsilon\) part ensures that the measure of the difference \(\mathbb{P}\left[ \pi(A) \setminus \{T_\varepsilon < \infty\} \right]\) of these two events can be made as small as desired.
To see that it is not always possible to choose a stopping time \(T\) that satisfies (8) if \(A \in \mathscr{O}\), we will construct a filtration \(\{\mathcal{F}_t\}_{t \ge 0}\) and choose a set \(A \in \mathscr{O}\) that forces \(\pi(A) \setminus \{T < \infty\} \neq \varnothing\) for every stopping time \(T\) (the argument is from (Almost Sure blog)).
To this end, let \(\tau : \Omega \to (0, \infty)\) be a random variable such that \(\mathbb{P}\{\tau < t\} > 0\) for all \(t > 0.\) For example, let \(\tau\) be such that its distribution is uniform on \((0,1).\) Let \(\{\mathcal{F}_t\}_{t \ge 0}\) be the completed filtration such that \(\mathcal{F}_t\) is generated by \(\{\{\tau \le s\} \,:\, s \le t\}\), and thus \(\tau\) becomes a stopping time. Let \(A = \rrbracket 0, \tau \llbracket\,,\) which we know from Lemma 8 belongs to \(\mathscr{O}.\) Note that \(\mathbb{P}\{\pi(A)\} = 1\) by our construction.
It is easy to see that \(\mathcal{F}_t\) is trivial when restricted to \(\{\tau > t\}\), i.e., contains only sets of measure \(0\) or \(1.\) So, every \(\mathcal{F}_t\)-measurable random variable is a.s. constant on the event \(\{\tau > t\}.\) Therefore, any stopping time \(T\) is deterministic on the event \(\{T < \tau\}.\) So, if \(\llbracket T \rrbracket \subseteq A\), we have
Recall the measurable graph theorem (Theorem 8 in part 1) which implies that a map \(T : \Omega \to [0, \infty]\) is measurable if and only if its graph \(\llbracket T \rrbracket\) is measurable. We also have the following neat result:
The necessity follows from the characterization of \(\mathscr{O}\) and \(\mathscr{P}\) in Lemma 8.
In the optional case, sufficiency follows from Theorem 14 and the fact that \(\mathscr{O} \subseteq \mathscr{P}_\star.\)
For the sufficiency in the predictable case, suppose \(\llbracket T \rrbracket\) is predictable. Apply Predictable Section theorem (Theorem 18) for \(A = \llbracket T \rrbracket\) to construct a sequence \(\{T_n\} _ {n \in \mathbb N}\) of predictable stopping times such that
Suppose \(X\) and \(Y\) are measurable processes such that
Remarkably, we have similar results for optional and predictable processes.
To motivate optional and predictable projections, let us start with a fundamental problem in filtering theory: Assume an underlying complete probability space \((\Omega, \mathcal{F}, \mathbb{P}).\) There is an underlying signal \(X = \{X_t\} _ {t \ge 0}\) which is modelled as a stochastic process and which we are interested in studying. Our observation process has noise and therefore instead of observing \(X\) we observe a process \(Y = \{Y_t\} _ {t \ge 0}\) such that
We could look at \[\begin{aligned} Z_t := \mathbb{E} \left( X_t \mid \mathcal{F}_t \right),\end{aligned}\] as an estimate for \(X_t\) at each time \(t \ge 0.\) The process \(Z = \{Z_t\}_{t \ge 0}\) is, of course, adapted. However, since conditional expectation is defined only up to \(\mathbb{P}\)-a.s. equality, which version of \(Z\) should we choose? (9) does not fix the paths of the process \(Z\), which requires specifying its values at uncountably many times in \([0, \infty).\) We would be very lucky if it were possible for us to choose a version of \(Z\) such that (9) holds not only for all \(t \in [0, \infty)\) but also for all finite stopping times. And indeed this is possible! This is part of the statement of the optional projection theorem.
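In discrete toy models, the projection at deterministic times is just the familiar conditional expectation, computed atom by atom. A sketch (the six-point sample space, the refining partitions, and the hypothetical signal \(X\) are all of my own choosing, purely to illustrate the adaptedness of \(Z\)):

```python
from fractions import Fraction

# Six equally likely outcomes; the observer's information at times t = 0, 1, 2
# is described by refining partitions (the atoms of F_t).
omega = range(6)
partitions = {
    0: [[0, 1, 2, 3, 4, 5]],
    1: [[0, 1, 2], [3, 4, 5]],
    2: [[0, 1], [2], [3, 4], [5]],
}

def X(t, w):
    # a hypothetical hidden signal, not adapted to the observation filtration
    return (w + 1) * (t + 1)

def Z(t, w):
    # E[X_t | F_t](w): the average of X(t, .) over the atom containing w
    atom = next(a for a in partitions[t] if w in a)
    return Fraction(sum(X(t, v) for v in atom), len(atom))
```

By construction \(Z(t, \cdot)\) is constant on each atom of \(\mathcal{F}_t\), i.e., adapted, and it has the same expectation as \(X(t, \cdot)\) by the tower property; the whole subtlety in continuous time is gluing these fibre-wise definitions into one process with good paths.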
On the other hand, if we want an estimate of \(X\) based on observable data strictly before time \(t\), then our estimate would be \[\begin{aligned} \mathbb{E} \left( X_t \mid \mathcal{F}_{t-} \right), \quad \text{where } \mathcal{F}_{t-} := \sigma\Big( \bigcup_{s < t} \mathcal{F}_s \Big).\end{aligned}\]
Theorem 22 [Optional and Predictable Projections]: Let \(X\) be a bounded, measurable (though not necessarily adapted) process.
There is a unique, modulo indistinguishability, optional process \(X^o\), called optional projection of \(X\), that satisfies for all stopping times \(T \in \mathfrak{S}\) the identity
There is a unique, modulo indistinguishability, predictable process \(X^p\), called predictable projection of \(X\), that satisfies for all predictable stopping times \(T \in \mathfrak{S}^{(\mathscr{P})}\) the identity
Remarks:
Taking expectations in (10) we get \[\begin{aligned} \mathbb{E} \left(X_T \mathbf{1}_{\{T < \infty\}}\right) = \mathbb{E} \left(X_T^o \mathbf{1}_{\{T < \infty\}}\right)\end{aligned}\] for all stopping times \(T \in \mathfrak{S}.\) Now suppose (12) holds for all stopping times \(T \in \mathfrak{S}.\) Fix an arbitrary stopping time \(S \in \mathfrak{S}\) and a set \(A \in \mathcal{F}_S\), and write (12) for \(T = S_A\) to get
Similarly, requiring (11) to hold for all predictable stopping times is equivalent to requiring \[\begin{aligned} \mathbb{E}\left( X_T \mathbf{1}_{\{T < \infty\}} \right) = \mathbb{E}\left(X^p_T \mathbf{1}_{\{T < \infty\}}\right)\end{aligned}\] to hold for all predictable stopping times.
Operators \(^o\) and \(^p\) are linear, i.e., for bounded, measurable processes \(X,Y\) and \(a,b \in \mathbb R\), \[\begin{aligned} (aX + bY)^o = a X^o + b Y^o \quad \text{and} \quad (aX + bY)^p = a X^p + b Y^p,\end{aligned}\] up to indistinguishability.
[of Theorem 22] Uniqueness is immediate from Theorem 21, so let's focus on existence, first for optional projection and then for predictable projection.
We will employ the monotone class theorem for functions. We need a simple class of processes whose optional projections are easy to exhibit. To this end, consider the processes of the form \[\begin{aligned} X_t(\omega) = \mathbf{1}_B(\omega) \, \mathbf{1}_{[u,v)}(t), \quad 0 \le u < v < \infty, \; B \in \mathcal{F}.\end{aligned}\]
Let \(M\) be an RCLL modification of the martingale \(t \mapsto \mathbb{E}\left(\mathbf{1}_B \mid \mathcal{F}_t\right)\), and take as candidate the optional process \(X^o_t := \mathbf{1}_{[u,v)}(t) \, M_t.\) With these choices and an arbitrary stopping time \(T\), the left-hand side of (12) becomes \(\mathbb{P}\{B \cap \{u \le T < v\}\}\), whereas optional stopping theorem (Karatzas and Shreve, 1998a), (Le Gall, 2016) shows that its right-hand side is \[\begin{aligned} \mathbb{E}\left(M_T \, \mathbf{1}_{\{u \le T < v\}}\right) = \mathbb{P}\left\{B \cap \{u \le T < v\}\right\}.\end{aligned}\]
Finally, use linearity and monotone class arguments to establish existence for arbitrary bounded, measurable \(X.\)
Similar ideas work in the predictable case. We first consider processes of the form \[\begin{aligned} X_t(\omega) = \mathbf{1}_B(\omega) \, \mathbf{1}_{(u,v]}(t), \quad 0 \le u < v < \infty, \; B \in \mathcal{F}.\end{aligned}\]
We end our discussion with a result on time change. We call it a “time change” because given an adapted increasing process \(A\) with right-continuous paths and \(A_0 = 0,\) we imagine a clock which runs according to \(A\) in the sense that at time \(t\) this clock shows time \(A_t.\)
Theorem 23: Suppose \(A\) is an adapted increasing process with right-continuous paths and \(A_0 = 0.\)
For any two bounded, measurable processes \(X\) and \(Y\) that satisfy \[\begin{aligned} \mathbb{E} \left(X_T \mathbf{1}_{\{T < \infty\}}\right) = \mathbb{E} \left(Y_T \mathbf{1}_{\{T < \infty\}}\right), \quad \forall \; T \in \mathfrak{S},\end{aligned}\] we have \[\begin{aligned} \mathbb{E} \int_0^T X_t \,\mathrm{d}A_t = \mathbb{E} \int_0^T Y_t \,\mathrm{d}A_t, \quad \forall \; T \in \mathfrak{S}.\end{aligned}\]
For any non-negative, RCLL and uniformly integrable martingale \(M,\) we have \[\begin{aligned} \mathbb{E} \int_0^T M_t \,\mathrm{d}A_t = \mathbb{E}\left(M_\infty A_T\right), \quad \forall \; T \in \mathfrak{S}.\end{aligned}\]
Only the first part requires any real effort.
Introduce the time change \[\begin{aligned} C_s := \inf\{t \ge 0 \,:\, A_t > s\}, \quad s \ge 0.\end{aligned}\]
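Taking the time change to be the right-continuous inverse \(C_s := \inf\{t \ge 0 : A_t > s\}\) (a standard choice in such proofs), the pathwise change-of-variables identity \(\int_0^\infty X_t \,\mathrm{d}A_t = \int_0^{A_\infty} X_{C_s} \,\mathrm{d}s\) that drives the argument can be sanity-checked on a pure-jump example (jump data, integrand, and names are all mine):

```python
# A pure-jump, right-continuous increasing process A with A_0 = 0,
# encoded by its (time, jump size) pairs.
jumps = [(0.5, 1.0), (1.2, 0.7), (2.0, 1.3)]

def A(t):
    return sum(size for (u, size) in jumps if u <= t)

def C(s):
    # right-continuous inverse time change: C_s = inf{t >= 0 : A_t > s}
    for (u, _) in jumps:
        if A(u) > s:
            return u
    return float('inf')

def X(t):
    return t * t + 1.0  # some measurable integrand

# Stieltjes integral against dA: a sum over the jumps of A
stieltjes = sum(X(u) * size for (u, size) in jumps)

# Time-changed Lebesgue integral over [0, A_infinity): midpoint Riemann sum
A_inf = A(float('inf'))
N = 100000
h = A_inf / N
lebesgue = sum(X(C((k + 0.5) * h)) * h for k in range(N))
```

The two numbers agree up to discretization error, because \(s \mapsto X(C_s)\) is piecewise constant, spending exactly \(\Delta A_{t_i}\) units of \(s\)-time at the value \(X(t_i)\) for each jump time \(t_i.\)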
We also check that the properties
The claim (15) now follows directly from our assumptions, in the case \(T = \infty.\) For general \(T \in \mathfrak S\) we apply the above result to the increasing, right-continuous and adapted process
This follows immediately from the first part by letting \[\begin{aligned} X_t := M_t \quad \text{and} \quad Y_t := M_\infty, \quad t \in [0, \infty).\end{aligned}\]
Dellacherie, Claude. Capacités et processus stochastiques, Springer-Verlag, 1972.
Dellacherie, Claude and Meyer, Paul-André. Probabilities and Potential, North-Holland Publishing Company, 1978.
He, Sheng-wu and Wang, Jia-gang and Yan, Jia-an. Semimartingale Theory and Stochastic Calculus, CRC Press, 1992.
Karatzas, Ioannis and Shreve, Steven. Brownian Motion and Stochastic Calculus, Graduate Texts in Mathematics Volume 113, Springer-Verlag New York, 1998.
Karatzas, Ioannis and Shreve, Steven. Methods of Mathematical Finance, Probability Theory and Stochastic Modelling Volume 39, Springer-Verlag New York, 1998.
Le Gall, Jean-François. Brownian Motion, Martingales, and Stochastic Calculus, Graduate Texts in Mathematics Volume 274, Springer International Publishing, 2016.
Lowther, George. Almost Sure blog.
Meyer, Paul-André. Probabilités et potentiel, Hermann, 1966.