Measure Theory Crash Course

1. Motivation: The Problem of “Measuring” Sets

In mathematics, especially in analysis and probability theory, we often want to assign a “size” or “measure” to subsets of a given set. For example:

  • Length: What is the length of an interval on the real line?
  • Area: What is the area of a region in the plane?
  • Probability: What is the probability of a certain event occurring?

However, not all subsets can be measured in a simple, intuitive way. For example:

  • The set of all rational numbers between 0 and 1 is infinite, but we intuitively feel it should have “size zero” in terms of length.
  • There are even more exotic sets that defy simple measurement.

This brings us to the core problem: How can we systematically define a notion of “measure” that works for a wide variety of sets while avoiding paradoxes?

  1. Intuition: Playing with Simple Examples Let’s start by thinking about some simple cases:
  • Intervals on the real line: It’s straightforward to define the length of an interval $[a,b]$ as $b - a$. This is intuitive and works well. Similarly, we can define the area of a rectangle as $a \times b$, the volume of a rectangular prism as $a \times b \times c$, and so on.
  • Finite sets: A finite set of points, like ${1, 2, 3}$, might be thought of as having zero length (in terms of intervals), but it can still be measured in a different context, like counting the number of elements.
  • Countable sets: Consider the set of all integers $\mathbb{Z}$ or the rationals $\mathbb{Q}$. These sets are infinite but still “small” in a certain sense—maybe they should have measure zero.
  • The Cantor Set: This set is a classic example of a set that has zero density but is uncountably infinite. It is constructed by removing the middle third of the interval $[0, 1]$ and then removing the middle third of the remaining interval. There is a great visualization on its Wikipedia page. Intuitively, we probably feel that it should have “size zero” in terms of length.

From these examples, we see that measuring sets involves more than just straightforward length or area. We need a system that can handle everything from simple intervals to more complicated constructions.

Idea: We need a general framework that can:

  • Assign a “size” to basic sets.
  • Combine these basic sets using operations like union, intersection, and complement.
  • Still produce a consistent and meaningful measure.

3. Formalization: Measure Spaces

Now, let’s start formalizing the ideas. We’ll introduce key concepts step by step.

3.1 Sigma-Algebras (σ-Algebras)

To handle complex sets, we define a special collection of sets that can be measured consistently:

  • A σ-algebra $\mathcal{F}$ on a set $X$ is a collection of subsets of $X$ that includes $X$ itself and is closed under complements and countable unions:
    1. $X \in \mathcal{F}$
    2. If $A \in \mathcal{F}$, then $A^c \in \mathcal{F}$
    3. If $A_1, A_2, \dots \in \mathcal{F}$, then $\bigcup_{i=1}^{\infty} A_i \in \mathcal{F}$

Intuitive Motivation:

Why do we need a σ-algebra? When we measure things, we often need to combine sets (through union or intersection) or consider what’s “left over” (complement). To ensure that these operations don’t lead us out of the realm of measurable sets, we need a collection that’s closed under these operations. A σ-algebra is this safe collection.

Probably the most natural example arises in probability theory. Indeed, we often have a set of outcomes $\Omega$ and a probability measure $\mathbb{P}$ on $\Omega$. Then, we often want to consider events in $\Omega$, which are “combinations” of outcomes. For example, landing 1 on a die is an outcome, and so is landing 6 on a die. We could then consider the event of rolling 1 or 6 on a die (union), or if we have multiple dice, we can also consider rolling 1 and 6 (intersection). In addition, the event of rolling an even number is the complement of the event of rolling an odd number.

In this example, the σ-algebra $\mathcal{F}$ is the collection of all events. It’s closed under unions, intersections, and complements, and it’s very helpful for defining probabilities formally.

Examples:

  • Trivial σ-algebra: The simplest σ-algebra on any set $X$ is ${\emptyset, X}$. This is not very interesting but serves as the most basic example.

  • Power set: The power set of $X$ (the set of all subsets of $X$) is a σ-algebra. It’s the largest possible σ-algebra on $X$.

  • Borel σ-algebra: On the real line $\mathbb{R}$, the Borel σ-algebra is generated by all open intervals. This σ-algebra includes all sets that can be formed from open intervals through countable unions, intersections, and complements, and it’s crucial for defining measures on the real line.

3.2 Measurable Spaces

  • A measurable space is a pair $(X, \mathcal{F})$, where $X$ is a set and $\mathcal{F}$ is a σ-algebra on $X$. This is the foundational structure where we can define a measure.

Intuitive Motivation:

To apply the concept of measure, we need a structured environment. The measurable space provides this by pairing a set with a σ-algebra, ensuring we only deal with subsets that are “measurable.”

Example:

Consider $X = \mathbb{R}$ and $\mathcal{F}$ as the Borel σ-algebra $\mathcal{B}(\mathbb{R})$. Then, $(\mathbb{R}, \mathcal{B}(\mathbb{R}))$ is a measurable space. This setup allows us to define measures like the Lebesgue measure on $\mathbb{R}$.

3.3 Measure

  • A measure $\mu$ on a measurable space $(X, \mathcal{F})$ is a function $\mu: \mathcal{F} \to [0, \infty]$ that assigns a non-negative extended real number to each set in $\mathcal{F}$, satisfying:
    1. Non-negativity: $\mu(A) \geq 0$ for all $A \in \mathcal{F}$.
    2. Null empty set: $\mu(\emptyset) = 0$.
    3. Countable additivity (σ-additivity): If ${A_i}$ is a countable collection of disjoint sets in $\mathcal{F}$, then $$ \mu\left(\bigcup_{i=1}^{\infty} A_i\right) = \sum_{i=1}^{\infty} \mu(A_i). $$

Intuitive Motivation:

We want to extend the idea of “size” to more complex sets. A measure is a function that assigns this size, ensuring that the size of the whole is the sum of its disjoint parts. It generalizes concepts like length, area, and probability.

Examples:

  • Lebesgue Measure: On $\mathbb{R}$, the Lebesgue measure $\lambda$ assigns the length to intervals, and more generally to any Borel set. For instance, $\lambda([0, 1]) = 1$.

  • Counting Measure: On a finite or countable set $X$, the counting measure $\mu$ assigns to each subset $A$ the number of elements in $A$. For example, if $X = {a, b, c}$, then $\mu({a, c}) = 2$.

4. Critical Theorems and Properties

4.1 Lebesgue Measure

  • The Lebesgue measure is a specific measure on $\mathbb{R}$ (or $\mathbb{R}^n$) that generalizes the notion of length, area, and volume. It is defined on the Borel σ-algebra $\mathcal{B}(\mathbb{R})$, which is the σ-algebra generated by the open sets of $\mathbb{R}$.

Intuitive Motivation:

The Lebesgue measure extends the familiar concept of length (or area in higher dimensions) to a much wider class of sets than just intervals. This allows us to measure complicated sets that can be constructed from simpler sets.

Example:

  • The interval $[0, 1]$ has a Lebesgue measure of $1$.
  • The set of all rational numbers in $[0, 1]$, which is countable, has a Lebesgue measure of $0$, illustrating that not all infinite sets have positive measure.

4.2 Measurable Functions

  • A function $f: X \to \mathbb{R}$ is measurable if, for every Borel set $B \subseteq \mathbb{R}$, the preimage $f^{-1}(B)$ is in $\mathcal{F}$. This ensures that we can apply the measure $\mu$ to the values of the function, leading to the definition of integrals.

Intuitive Motivation:

Measurable functions are those that “respect” the structure of the measurable space. They map measurable sets (sets in $\mathcal{F}$) to measurable outcomes in a way that allows us to meaningfully integrate or assign probabilities.

Example:

  • The indicator function $f: \mathbb{R} \to {0, 1}$ defined by $f(x) = 1$ if $x \in [0, 1]$ and $f(x) = 0$ otherwise is measurable. The preimage of ${1}$ under $f$ is $[0, 1]$, a Borel set.

4.3 Integration with Respect to a Measure

  • The Lebesgue integral generalizes the Riemann integral. For a non-negative measurable function $f: X \to [0, \infty]$, the Lebesgue integral is defined as:

    $$ \int_X f \, d\mu = \sup \left\{ \int_X \phi \, d\mu : \phi \text{ is a simple function and } 0 \leq \phi \leq f \right\}. $$
    • Simple Functions: A simple function takes on a finite number of values and is easier to integrate.
    • Monotone Convergence Theorem: If $f_n \uparrow f$, then $\lim_{n \to \infty} \int f_n d\mu = \int f d\mu$.
    • Dominated Convergence Theorem: If $f_n \to f$ and $|f_n| \leq g$, where $g$ is integrable, then $\lim_{n \to \infty} \int f_n d\mu = \int f d\mu$.

Intuitive Motivation:

The Lebesgue integral extends the Riemann integral, allowing us to integrate functions over more complex sets and under conditions where the Riemann integral would fail. It’s especially useful for functions that are discontinuous or where the domain has a complicated structure.

Example:

  • Integrating the indicator function $f(x) = 1$ over the interval $[0, 1]$ with respect to the Lebesgue measure gives:

    $$ \int_{[0,1]} 1 \, d\lambda = \lambda([0, 1]) = 1. $$
    • Simple Functions: Simple functions approximate more complex functions and are easier to work with. For example, the function $f(x) = x$ on $[0, 1]$ can be approximated by simple functions like $\phi(x) = \frac{k}{n}$ on intervals $\left[\frac{k}{n}, \frac{k+1}{n}\right)$.

6. Development: Key Theorems and Applications

6.1 Monotone Convergence Theorem

  • If you have a sequence of non-negative measurable functions ${f_n}$ such that $f_n \uparrow f$, then

    $$ \lim_{n \to \infty} \int f_n \, d\mu = \int f \, d\mu. $$

    This theorem allows us to swap the limit and the integral under certain conditions, which is crucial for working with infinite sums and series.

Intuitive Motivation:

When dealing with sequences of functions that increase to a limit, it’s often important to know that the integral of the limit function is the limit of the integrals. This theorem ensures that when functions grow monotonically, their integrals behave nicely.

Example:

  • Suppose $f_n(x) = \frac{nx}{1 + nx}$ for $x \in [0, 1]$. Then $f_n(x) \uparrow 1$ as $n \to \infty$. The Monotone Convergence Theorem ensures that: $$ \lim_{n \to \infty} \int_0^1 \frac{nx}{1 + nx} \, dx = \int_0^1 1 \, dx = 1. $$

6.2 Fatou’s Lemma

  • For a sequence of non-negative measurable functions ${f_n}$, Fatou’s Lemma states that:

    $$ \int \liminf_{n \to \infty} f_n \, d\mu \leq \liminf_{n \to \infty} \int f_n \, d\mu. $$

    This lemma is a foundational result that provides a lower bound for the integral of the limit inferior of a sequence of functions.

Intuitive Motivation:

Fatou’s Lemma provides a safety net when dealing with limits of sequences of functions. Even when the sequence doesn’t converge nicely, this lemma assures us that the integral of the “liminf” (the smallest values the functions approach) is not greater than the “liminf” of the integrals.

Example:

  • Consider $f_n(x) = \frac{1}{n} \chi_{[0, n]}(x)$ where $\chi_{[0, n]}(x)$ is the indicator function on $[0, n]$. Then $\liminf_{n \to \infty} f_n(x) = 0$ for all $x \in \mathbb{R}$. Fatou’s Lemma tells us that: $$ \int \liminf_{n \to \infty} f_n \, d\mu \leq \liminf_{n \to \infty} \int f_n \, d\mu = 0. $$

6.3 Dominated Convergence Theorem

  • The Dominated Convergence Theorem (DCT) is one of the most powerful results in measure theory. It states that if $f_n \to f$ almost everywhere, and $|f_n| \leq g$ where $g$ is integrable, then

    $$ \lim_{n \to \infty} \int f_n \, d\mu = \int f \, d\mu. $$

    This theorem allows the interchange of limits and integrals when the functions are bounded by an integrable function.

Intuitive Motivation:

The DCT is crucial because it tells us that under reasonable conditions (specifically, if the functions are dominated by an integrable function), we can safely take limits inside the integral. This is especially useful in probability and analysis where we deal with sequences of random variables or functions.

Example:

  • Let $f_n(x) = \frac{\sin(nx)}{n}$ on $[0, 2\pi]$. These functions converge pointwise to $f(x) = 0$, and are dominated by the integrable function $g(x) = \frac{1}{n}$. The DCT ensures that: $$ \lim_{n \to \infty} \int_0^{2\pi} \frac{\sin(nx)}{n} \, dx = \int_0^{2\pi} 0 \, dx = 0. $$

7. Conclusion: Why Measure Theory Matters

Measure theory provides the rigorous foundation for much of modern analysis, probability, and many areas of applied mathematics. By formalizing the concept of “size” or “measure,” it allows mathematicians to extend ideas like integration, differentiation, and probability to more complex and abstract settings. The key theorems and properties of measure theory are essential tools for understanding and solving problems in these areas.

Understanding measure theory enables one to:

  • Define and work with integrals in a broader sense than Riemann integration allows.
  • Handle infinite processes (like sums and limits) in a rigorous way.
  • Apply these concepts to real-world problems in probability, statistics, and beyond.