Summer 2020

Uniform integrability

Motivation

  • Convergence in probability is easy to establish, e.g.
    • WLLN for independent RVs
    • Ergodic theorem for dependent RVs (discussed last semester in recursive TAVC)
    • Dominated convergence theorem
  • On the other hand, convergence in \(\mathcal{L}^p\)-norm is harder to establish
  • Uniform integrability is the extra condition that is both necessary and sufficient to link the two

An “absolute continuity” property

  • Lemma 13.1.1
    • Suppose that \(X \in \mathcal{L}^1 = \mathcal{L}^1 (\Omega, \mathcal{F}, \mathbb{P})\)
    • Then, given \(\epsilon > 0\), \(\exists \delta>0\) s.t. for \(F \in \mathcal{F}\), \(P(F)<\delta \implies E(|X|;F) < \epsilon\)
  • Proof
    • If the conclusion is false, then, for some \(\epsilon_0 > 0\), we can find a sequence \(\{F_n\}\) of elements of \(\mathcal{F}\) s.t. \[ P(F_n) < 2^{-n}, \quad E(|X|;F_n) \ge \epsilon_0 \]
      • This constructs events whose probabilities shrink geometrically while the integrals of \(|X|\) over them stay bounded away from \(0\)
    • Let \(H := \limsup F_n\). Since \(\sum_n P(F_n) < \sum_n 2^{-n} < \infty\), BC1 (the first Borel–Cantelli lemma) shows that \(P(H) = 0\)
    • Yet the reverse Fatou lemma shows that \(E(|X|;H) \ge \limsup_{n \rightarrow \infty} E(|X|;F_n) \ge \epsilon_0\)
    • A contradiction arises, since \(P(H) = 0 \implies E(|X|;H) = 0\)

An “absolute continuity” property

  • Corollary 13.1.2
    • Suppose that \(X \in \mathcal{L}^1\) and that \(\epsilon > 0\)
    • Then \(\exists K \in [0,\infty)\) such that \(E(|X|;|X|>K) < \epsilon\)
  • Proof
    • Let \(\delta\) be as in lemma 13.1.1
    • Since \(KP(|X|>K) \le E(|X|)\) (Markov’s inequality), we can choose \(K\) such that \(P(|X|>K) < \delta\)
    • Application of lemma 13.1.1 with \(F = \{|X|>K\}\) then yields the result (a numerical sketch follows)
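
A minimal Monte Carlo sketch of the corollary, assuming NumPy; the Lomax(2) distribution is an arbitrary choice of an unbounded but integrable \(X\):

```python
import numpy as np

# Corollary 13.1.2: for integrable X, the tail expectation E(|X|; |X|>K)
# can be made arbitrarily small by taking K large.
rng = np.random.default_rng(0)
X = rng.pareto(2.0, size=1_000_000)   # Lomax(2): E|X| = 1, unbounded support

for K in [1, 10, 100, 1000]:
    tail = np.mean(X * (X > K))       # estimates E(|X|; |X|>K)
    print(f"K={K:5d}  E(|X|; |X|>K) ~ {tail:.4f}")
```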

UI family

  • A class \(\mathcal{C}\) of RVs is called uniformly integrable (UI) if given \(\epsilon > 0\), \[ \exists K \in [0, \infty) \textrm{ s.t. } E(|X|;|X|>K) < \epsilon, \forall X \in \mathcal{C} \]
  • For such a class \(\mathcal{C}\), we have (with \(K_1\) relating to \(\epsilon = 1\)) for every \(X \in \mathcal{C}\), \[ \begin{aligned} E(|X|) &= E(|X|;|X| > K_1) +E(|X|;|X| \le K_1) \\ &\le 1 +K_1 \end{aligned} \]
    • The first term is less than \(1\) by the definition of UI applied with \(\epsilon = 1\)
    • The second term is at most \(K_1\), since \(|X| \le K_1\) on the event \(\{|X| \le K_1\}\)
  • This means that a UI family is bounded in \(\mathcal{L}^1\) but the converse is not true
    • Counterexample: Take \((\Omega, \mathcal{F}, \mathbb{P}) = ([0,1], \mathcal{B}[0,1], \textrm{Leb})\)
    • Let \(E_n = \left( 0, \frac{1}{n} \right)\) and \(X_n = nI_{E_n}\)
    • Then \(E(|X_n|)=1, \forall n\) so that \(\{X_n\}\) is bounded in \(\mathcal{L}^1\)
    • However, for any \(K>0\), we have for \(n>K\), \(E(|X_n|;|X_n|>K)=nP(E_n)=1\)
    • This means \(\{X_n\}\) is not UI. Here \(X_n(\omega) \rightarrow 0\) for every \(\omega\), yet \(E(X_n) = 1 \nrightarrow 0\) (see the numerical sketch below)
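
A numerical sketch of the counterexample, assuming NumPy, estimating both \(E|X_n|\) and the tail expectation:

```python
import numpy as np

# X_n = n * I_{(0,1/n)} on ([0,1], Leb): bounded in L^1 but not UI.
rng = np.random.default_rng(1)
U = rng.uniform(size=1_000_000)       # omega sampled from Leb on [0,1]
K = 5.0                               # any fixed truncation level

for n in [10, 100, 1000]:
    X_n = n * (U < 1.0 / n)
    print(f"n={n:5d}  E|X_n| ~ {X_n.mean():.3f}  "
          f"E(|X_n|; |X_n|>K) ~ {np.mean(X_n * (X_n > K)):.3f}")
# E|X_n| stays near 1, and for n > K the tail term also stays near 1:
# no single K works for the whole family.
```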

Two sufficient conditions for the UI property

  • First condition: boundedness in \(\mathcal{L}^p\) where \(p>1\)
    • Suppose that \(\mathcal{C}\) is a class of RVs bounded in \(\mathcal{L}^p\) for some \(p>1\)
    • Thus, for some \(A \in [0,\infty)\), \(E(|X|^p) < A, \forall X \in \mathcal{C}\)
    • Then \(\mathcal{C}\) is UI
  • Proof
    • If \(v \ge K > 0\), then \(v^{1-p} \le K^{1-p} \implies v \le K^{1-p} v^p\)
    • Hence, for \(K > 0\) and \(X \in \mathcal{C}\), we have \[ E(|X|; |X|>K) \le K^{1-p} E(|X|^p; |X|>K) \le K^{1-p} A \]
    • The result follows: given \(\epsilon > 0\), choose \(K\) large enough that \(K^{1-p} A < \epsilon\) (possible since \(1-p < 0\))
  • Idea
    • Boundedness in \(\mathcal{L}^p\) for some \(p>1\) implies boundedness in \(\mathcal{L}^1\), which every UI family has
    • The extra power \(p>1\) does more: it makes the tail expectations \(E(|X|;|X|>K)\) decay uniformly at rate \(K^{1-p}\), which is precisely the UI property (see the sketch below)
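
A sketch of this condition in action, assuming NumPy; the family \(X_n = \sqrt{n}\, I_{(0,1/n)}\) is an illustrative choice with \(E(X_n^2) = 1\), i.e. \(A = 1\) and \(p = 2\):

```python
import numpy as np

# L^2-bounded family: E(X_n^2) = 1 for every n, so the proof's bound gives
# E(|X_n|; |X_n|>K) <= K^{1-p} A = 1/K, uniformly in n.
rng = np.random.default_rng(2)
U = rng.uniform(size=1_000_000)

for K in [1.0, 4.0, 16.0]:
    tails = []
    for n in [10, 100, 1000, 10_000]:
        X_n = np.sqrt(n) * (U < 1.0 / n)
        tails.append(np.mean(X_n * (X_n > K)))
    print(f"K={K:4.0f}  sup_n E(|X_n|; |X_n|>K) ~ {max(tails):.4f}  "
          f"(bound 1/K = {1/K:.4f})")
```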

Two sufficient conditions for the UI property

  • Second condition: dominated by an integrable non-negative variable
    • Suppose that \(\mathcal{C}\) is a class of RVs which is dominated by an integrable non-negative variable \(Y\): \[ |X(\omega)| \le Y(\omega), \forall \omega \in \Omega, \forall X \in \mathcal{C}, \textrm{ and } E(Y) < \infty \]
    • Then \(\mathcal{C}\) is UI
  • Proof
    • Given \(\epsilon > 0\), apply corollary 13.1.2 to \(Y\) to get \(K\) with \(E(Y; Y>K) < \epsilon\)
    • Then for every \(X \in \mathcal{C}\), \(|X| \le Y\) forces \(\{|X|>K\} \subseteq \{Y>K\}\), so \[ E(|X|; |X|>K) \le E(Y; Y>K) < \epsilon \]
  • Remark
    • It is precisely this property which makes the dominated convergence theorem work on our \((\Omega, \mathcal{F}, \mathbb{P})\)
    • Theorem 13.7.1 below extends the dominated convergence theorem from dominated families to arbitrary UI families (a numerical sketch follows)
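
A sketch of the domination argument, assuming NumPy; the dominated family \(X_t = \sin(t)\, Y\) with \(Y \sim \textrm{Exp}(1)\) is an illustrative choice:

```python
import numpy as np

# Every member of a dominated family inherits the single tail bound of Y:
# E(|X_t|; |X_t|>K) <= E(Y; Y>K), which corollary 13.1.2 makes small.
rng = np.random.default_rng(3)
Y = rng.exponential(1.0, size=1_000_000)   # integrable dominating variable

for K in [2.0, 5.0, 10.0]:
    dom_tail = np.mean(Y * (Y > K))        # E(Y; Y>K)
    fam_tail = max(
        np.mean(np.abs(np.sin(t)) * Y * (np.abs(np.sin(t)) * Y > K))
        for t in np.linspace(0.1, 3.0, 8)  # a few members of the family
    )
    print(f"K={K:4.1f}  sup_t E(|X_t|;|X_t|>K) ~ {fam_tail:.4f}  "
          f"<= E(Y;Y>K) ~ {dom_tail:.4f}")
```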

UI property of conditional expectation

  • Theorem 13.4.1
    • Let \(X \in \mathcal{L}^1\). Then the class \(\left\{ E(X|\mathcal{G}): \mathcal{G} \textrm{ a sub-}\sigma\textrm{-algebra of } \mathcal{F} \right\}\) is uniformly integrable
    • Formally, the definition of the class \(\mathcal{C}\) is \(Y \in \mathcal{C}\) if and only if \(Y\) is a version of \(E(X|\mathcal{G})\) for some sub-\(\sigma\)-algebra \(\mathcal{G}\) of \(\mathcal{F}\)
  • Proof
    • Let \(\epsilon > 0\) be given
    • By lemma 13.1.1, we can choose \(\delta > 0\) such that, for \(F \in \mathcal{F}\), \(P(F) < \delta \implies E(|X|;F) < \epsilon\)
    • Choose \(K\) so that \(K^{-1}E(|X|) < \delta\)
    • Now let \(\mathcal{G}\) be a sub-\(\sigma\)-algebra of \(\mathcal{F}\) and let \(Y\) be any version of \(E(X|\mathcal{G})\)
    • By Jensen’s inequality, \(|Y| \le E(|X||\mathcal{G})\) a.s. (the absolute value function is convex)
    • Hence \(E(|Y|) \le E(|X|)\) by the tower property, and \(K P(|Y|>K) \le E(|Y|) \le E(|X|)\)
    • By the choice of \(K\), the last inequality gives \(P(|Y|>K) < \delta\)
    • But \(\{|Y| > K \} \in \mathcal{G}\), so \(E(|Y|; |Y| > K) \le E(|X|; |Y| > K) < \epsilon\), which completes the proof (a finite-partition sketch follows)
      • The first inequality holds because \(|Y| \le E(|X||\mathcal{G})\) and \(\{|Y|>K\}\) is a \(\mathcal{G}\)-measurable event (defining property of conditional expectation); the second holds by lemma 13.1.1, since \(P(|Y|>K) < \delta\)
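
A finite-partition sketch of theorem 13.4.1, assuming NumPy: on \(([0,1], \textrm{Leb})\) with \(\mathcal{G}_m\) generated by \(m\) equal cells, \(E(X|\mathcal{G}_m)\) is the block average of \(X\), and the tails shrink uniformly over \(m\). The integrand \(X = U^{-1/2}\) is an arbitrary unbounded but integrable choice:

```python
import numpy as np

rng = np.random.default_rng(4)
U = rng.uniform(size=1_000_000)
X = 1.0 / np.sqrt(U)          # E(X) = 2, but X is unbounded near 0
K = 50.0

for m in [2, 32, 512, 8192]:
    cell = np.floor(U * m).astype(int)                # cell index of omega
    avg = (np.bincount(cell, weights=X, minlength=m)
           / np.bincount(cell, minlength=m))          # cell means of X
    Y = avg[cell]                                     # Y = E(X | G_m)(omega)
    print(f"m={m:5d}  E(|Y|; |Y|>K) ~ {np.mean(Y * (Y > K)):.4f}")
# The tails stay below roughly E(|X|; |X|>K) for every partition:
# one K works for all the conditional expectations at once.
```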

Convergence of random variables

Convergence in probability

  • Definition
    • Let \(\{X_n\}\) be a sequence of RVs and \(X\) be a RV
    • We say that \(X_n \stackrel{p}{\rightarrow} X\) if for every \(\epsilon > 0\) \[ \lim_{n \rightarrow \infty} P(|X_n -X| > \epsilon) = 0 \]
  • Lemma 13.5.1: almost sure convergence implies convergence in probability
    • \(X_n \stackrel{a.s.}{\rightarrow} X \implies X_n \stackrel{p}{\rightarrow} X\)
  • Proof
    • Suppose that \(X_n \stackrel{a.s.}{\rightarrow} X\) and that \(\epsilon > 0\)
    • Then, since \(X_n \stackrel{a.s.}{\rightarrow} X\), the event \(\{|X_n-X| > \epsilon \textrm{ i.o.}\}\) is null, and the reverse Fatou lemma for sets gives \[ \begin{aligned} 0 &= P(|X_n-X| > \epsilon, \textrm{ i.o.}) = P\left( \limsup \{ |X_n-X| > \epsilon \} \right) \\ &\ge \limsup P(|X_n-X| > \epsilon) \end{aligned} \]
    • The result follows from the non-negativity of probability and the sandwich theorem

Bounded convergence theorem

  • Let \(\{X_n\}\) be a sequence of RVs and \(X\) be a RV
  • Suppose that \(X_n \stackrel{p}{\rightarrow} X\) and that for some \(K \in [0,\infty)\), we have \(|X_n(\omega)| \le K, \forall n, \forall \omega\)
  • Then \(E(|X_n-X|) \rightarrow 0\)

  • Proof
    • Let’s check that \(P(|X| \le K) = 1\). By assumption, for \(k \in \mathbb{N}\), \[ P(|X| > K +k^{-1}) \le P(|X-X_n| > k^{-1}), \forall n \]
    • Letting \(n \rightarrow \infty\) and using \(X_n \stackrel{p}{\rightarrow} X\), we get \(P(|X| > K +k^{-1}) = 0\)
    • Hence \(P(|X|>K) = P\left( \cup_k \big\{ |X| > K +k^{-1} \big\} \right) = 0\)
    • Now let \(\epsilon > 0\) be given
    • Choose \(n_0\) such that \(P\left( |X_n-X| > \frac{1}{3} \epsilon \right) < \frac{\epsilon}{3K}\) when \(n \ge n_0\)
    • Then, for \(n \ge n_0\), \[ \begin{aligned} E(|X_n-X|) &= E\left( |X_n-X|; |X_n-X| > \frac{1}{3} \epsilon \right) +E\left( |X_n-X|; |X_n-X| \le \frac{1}{3} \epsilon \right) \\ &\le 2K P\left( |X_n-X| > \frac{1}{3} \epsilon \right) +\frac{1}{3} \epsilon \le \epsilon \end{aligned} \]
  • Remark
    • The proof uses only the probabilities \(P(|X_n-X| > \epsilon)\), never the pathwise behaviour of \(X_n\); in this sense convergence in probability is the natural hypothesis here (a numerical sketch follows)
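
A quick numerical sketch of the theorem, assuming NumPy; the perturbation that is large with probability \(1/n\) is an illustrative choice:

```python
import numpy as np

# X_n -> X in probability, all variables bounded by K, so E|X_n - X| -> 0:
# the deviation event is rare and the deviation size is capped at 2K.
rng = np.random.default_rng(5)
K = 2.0
X = np.clip(rng.normal(size=1_000_000), -K, K)

for n in [1, 10, 100, 1000]:
    jump = 2 * K * (rng.uniform(size=X.size) < 1.0 / n)  # big jump, prob 1/n
    X_n = np.clip(X + jump, -K, K)
    print(f"n={n:5d}  P(|X_n-X|>0.1) ~ {np.mean(np.abs(X_n - X) > 0.1):.4f}  "
          f"E|X_n-X| ~ {np.mean(np.abs(X_n - X)):.4f}")
```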

A necessary and sufficient condition for \(\mathcal{L}^1\) convergence

  • Theorem 13.7.1
    • Let \(\{X_n\}\) be a sequence in \(\mathcal{L}^1\) and let \(X \in \mathcal{L}^1\)
    • Then \(X_n \stackrel{\mathcal{L}^1}{\rightarrow} X\), equivalently \(E(|X_n-X|) \rightarrow 0\), if and only if \(X_n \stackrel{p}{\rightarrow} X\) and \(\{X_n\}\) is UI
  • Remarks
    • The “if” part is more useful since it improves dominated convergence theorem
      • This can be seen from §13.3: domination by an integrable variable (the second sufficient condition) already implies UI
    • The “only if” part is less surprising
      • Convergence in \(\mathcal{L}^p, p \ge 1\) implies convergence in probability

  • Proof of “if” part
    • Suppose that \(X_n \stackrel{p}{\rightarrow} X\) and \(\{X_n\}\) is UI. For \(K \in [0, \infty)\), define \(\varphi_K: \mathbb{R} \rightarrow [-K,K]\) by \[ \varphi_K(x) := \left\{ \begin{array}{ll} K &, x > K \\ x &, |x| \le K \\ -K &, x < -K \end{array} \right. \]
    • Let \(\epsilon > 0\) be given. By the UI property of \(\{X_n\}\) and corollary 13.1.2, choose \(K\) so that \[ E\big[ |\varphi_K(X_n) -X_n| \big] < \frac{\epsilon}{3}, \forall n; \quad E\big[ |\varphi_K(X) -X| \big] < \frac{\epsilon}{3} \]
      • This works since \(|\varphi_K(x) -x| \le |x| I_{\{|x|>K\}}\), so both expectations are bounded by tail expectations
    • Note that \(|\varphi_K(x) -\varphi_K(y)| \le |x-y|\), so \(P(|\varphi_K(X_n) -\varphi_K(X)| > \epsilon) \le P(|X_n-X| > \epsilon) \rightarrow 0\), i.e. \(\varphi_K(X_n) \stackrel{p}{\rightarrow} \varphi_K(X)\)
    • Applying bounded convergence theorem, we can choose \(n_0\) such that, for \(n \ge n_0\), \[ E\big[ |\varphi_K(X_n) -\varphi_K(X)| \big] < \frac{\epsilon}{3} \]
    • Minkowski’s inequality (the \(\mathcal{L}^1\) triangle inequality) shows that, for \(n \ge n_0\), \[ E\big( |X_n-X| \big) \le E\big[ |X_n -\varphi_K(X_n)| \big] +E\big[ |\varphi_K(X_n) -\varphi_K(X)| \big] +E\big[ |\varphi_K(X) -X| \big] < \epsilon \] (a small sketch of \(\varphi_K\) follows)
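
The truncation \(\varphi_K\) is exactly NumPy’s clip, which gives a one-line check of the Lipschitz property used above (a sketch, assuming NumPy):

```python
import numpy as np

# phi_K(x) = np.clip(x, -K, K); verify |phi_K(x) - phi_K(y)| <= |x - y|,
# the 1-Lipschitz property that transports convergence in probability
# through phi_K.
rng = np.random.default_rng(6)
x = rng.normal(scale=5.0, size=100_000)
y = rng.normal(scale=5.0, size=100_000)
K = 3.0

lhs = np.abs(np.clip(x, -K, K) - np.clip(y, -K, K))
print(bool(np.all(lhs <= np.abs(x - y) + 1e-12)))   # True
```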

  • Proof of “only if” part
    • Suppose that \(X_n \rightarrow X\) in \(\mathcal{L}^1\). Let \(\epsilon > 0\) be given
    • Choose \(N\) such that \(n \ge N \implies E(|X_n-X|) < \frac{\epsilon}{2}\)
    • By lemma 13.1.1, we can choose \(\delta > 0\) such that whenever \(P(F) < \delta\), we have \[ E(|X_n|;F) < \epsilon, 1 \le n \le N; \quad E(|X|;F) < \frac{\epsilon}{2} \]
      • Apply the lemma to each of \(X_1, \ldots, X_N\) (with \(\epsilon\)) and to \(X\) (with \(\frac{\epsilon}{2}\)), then take the minimum of the finitely many resulting \(\delta\)'s
    • Since \(\{X_n\}\) is bounded in \(\mathcal{L}^1\) (because \(E(|X_n|) \le E(|X|) +E(|X_n-X|)\) and the latter is bounded), we can choose \(K\) such that \(K^{-1} \sup_r E(|X_r|) < \delta\)
    • Then for \(n \ge N\), we have \(P(|X_n| > K) < \delta\) (by Markov’s inequality) and \[ E(|X_n|; |X_n|>K) \le E(|X|; |X_n|>K) +E(|X-X_n|) < \epsilon \]
      • The first term is less than \(\frac{\epsilon}{2}\) by the choice of \(\delta\), the second less than \(\frac{\epsilon}{2}\) by the choice of \(N\)
    • For \(n \le N\), we have \(P(|X_n| > K) < \delta\) and \(E(|X_n|; |X_n|>K) < \epsilon\) by choice of \(\delta\)
    • Hence \(\{X_n\}\) is a UI family
    • Since \(\epsilon P(|X_n-X| > \epsilon) \le E(|X_n-X|) \rightarrow 0\), we have \(X_n \stackrel{p}{\rightarrow} X\)

Concluding remarks

Comments

  • UI allows us to upgrade the weaker convergence in probability to the stronger \(\mathcal{L}^1\) convergence
    • This is appealing because more standard devices exist for establishing convergence in probability
  • UI appears naturally for conditional expectations, which are central to the martingale property
    • Thus UI martingales are studied in the next chapter