
Sub-Gaussian concentration for reversible Markov chains with spectral gap

Setup.
Let $(X_i)_{i\ge1}$ be a stationary, $\pi$-reversible Markov chain on a measurable space $\mathcal X$, with spectral gap $\gamma\in(0,1]$.
Let $f:\mathcal X\to\mathbb R$ be square-integrable with respect to $\pi$ and assume the sign condition $f\le 0$ $\pi$-a.s.

Define the empirical average \begin{equation} f_n := \frac1n\sum_{i=1}^n f(X_i), \end{equation} and denote the first two (non-centered) moments \begin{equation} \mu_1 := \int f\,d\pi, \qquad \mu_2 := \int f^2\,d\pi. \end{equation} For convenience, let \begin{equation} v := 2\mu_2 - \mu_1^2. \end{equation}
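For concreteness in what follows, here is a minimal numerical sketch of this setup on a two-state chain; the chain, the choice of $f$, and all variable names are illustrative assumptions, not part of the question.

```python
import numpy as np

# Hypothetical two-state reversible chain: states {0, 1},
# transitions P[0, 1] = p and P[1, 0] = q (all numbers are assumptions).
p, q = 0.3, 0.2
P = np.array([[1 - p, p],
              [q, 1 - q]])
pi = np.array([q, p]) / (p + q)   # stationary distribution; detailed balance holds
gamma = p + q                     # spectral gap: eigenvalues of P are 1 and 1 - (p + q)

f = np.array([-1.0, -0.2])        # any f <= 0 works here

mu1 = pi @ f                      # mu_1 = ∫ f dπ
mu2 = pi @ f ** 2                 # mu_2 = ∫ f² dπ
v = 2 * mu2 - mu1 ** 2
```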


Goal

I would like to obtain a sharp sub-Gaussian concentration bound for the right tail of $f_n$, i.e. \begin{equation} \mathbb P\!\left(f_n \ge \mu_1 + t\right) \le C\exp\!\left(-\frac{n t^2}{C'\sigma^2}\right), \end{equation} where $\sigma^2$ should depend explicitly on $\mu_2$ and the spectral gap $\gamma$, and interpolate between:

  • the i.i.d. case ($\gamma=1$), and
  • the poor mixing limit ($\gamma\to0$).

What I have so far

Let \begin{equation} \phi_n(\theta) := \mathbb E\!\left[\exp\!\left(\theta\sum_{i=1}^n f(X_i)\right)\right], \qquad \theta\ge0. \end{equation}

I derived the following upper bounds: \begin{equation} \phi_1(\theta) \le 1 + \mu_1\theta + \tfrac12\mu_2\theta^2, \end{equation} and, for general $n$, \begin{equation} \phi_n(\theta) \le \phi_1(\theta)^n + \phi_1(\theta)^{n-1}\,\mathrm{Var}\!\bigl(e^{\theta f(X_1)}\bigr) \left[\left(1+\frac{1-\gamma}{\phi_1(\theta)}\right)^{n-1} - 1\right]. \end{equation}

For small $\theta>0$, $\mathrm{Var}\!\bigl(e^{\theta f(X_1)}\bigr) = \theta^2\,\mathrm{Var}_\pi(f) + O(\theta^3)$, and for $\theta\in[0,-1/\mu_1]$, \begin{equation} \mathrm{Var}\!\bigl(e^{\theta f(X_1)}\bigr) \le \theta^2 v \quad\text{with}\quad v = 2\mu_2 - \mu_1^2, \end{equation} using $e^x\le 1+x+\tfrac{x^2}{2}$ for $x\le0$ on $\mathbb E\,e^{2\theta f}$ and $e^x\ge 1+x$ on $\mathbb E\,e^{\theta f}$.
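On a finite chain the MGF can be computed exactly via transfer matrices, $\phi_n(\theta)=\pi^\top D_\theta (P D_\theta)^{n-1}\mathbf 1$ with $D_\theta=\operatorname{diag}(e^{\theta f})$, which allows a numerical sanity check of the bounds above. A minimal sketch, continuing the toy chain from the setup (the helper names are mine):

```python
def phi_exact(theta, n):
    """Exact MGF E[exp(theta * sum_{i=1}^n f(X_i))] via transfer matrices."""
    D = np.diag(np.exp(theta * f))
    M = np.linalg.matrix_power(P @ D, n - 1)
    return pi @ D @ M @ np.ones(len(f))

def phi1_upper(theta):
    """Quadratic bound on phi_1; valid for theta >= 0 because f <= 0."""
    return 1 + mu1 * theta + 0.5 * mu2 * theta ** 2

def var_exp(theta):
    """Exact Var(e^{theta f(X_1)}) under pi."""
    return pi @ np.exp(2 * theta * f) - (pi @ np.exp(theta * f)) ** 2

def phi_n_upper(theta, n):
    """Right-hand side of the MGF bound derived in the question."""
    a = phi1_upper(theta)
    return a ** n + a ** (n - 1) * var_exp(theta) * ((1 + (1 - gamma) / a) ** (n - 1) - 1)

for theta in [0.05, 0.1, 0.2]:
    for n in [2, 5, 10]:
        print(theta, n, phi_exact(theta, n) <= phi_n_upper(theta, n))
```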

From the bound on $\phi_1$ (equivalently, from the i.i.d. case $\gamma=1$, where $\phi_n=\phi_1^n$), one recovers the standard sub-Gaussian bound via the Chernoff method:

\begin{equation} \mathbb P(f(X_1)\ge\mu_1+t) \le \exp\!\left(-\frac{t^2}{2\mu_2}\right). \end{equation}
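For completeness, the Chernoff step reads \begin{equation} \mathbb P\bigl(f(X_1)\ge\mu_1+t\bigr) \le \inf_{\theta\ge0} e^{-\theta(\mu_1+t)}\,\phi_1(\theta) \le \inf_{\theta\ge0} \exp\!\left(-\theta t + \tfrac12\mu_2\theta^2\right) = \exp\!\left(-\frac{t^2}{2\mu_2}\right), \end{equation} using $1+\mu_1\theta+\tfrac12\mu_2\theta^2 \le \exp(\mu_1\theta+\tfrac12\mu_2\theta^2)$ and the minimiser $\theta^\star=t/\mu_2$. In the i.i.d. case the same computation with $\phi_n=\phi_1^n$ gives $\mathbb P(f_n\ge\mu_1+t)\le\exp\bigl(-nt^2/(2\mu_2)\bigr)$, the benchmark that the Markov-chain bound should recover at $\gamma=1$.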


Case $n=2$

When $n=2$, the general bound simplifies to \begin{equation} \phi_2(\theta) \le \bigl(1+\mu_1\theta + \tfrac12\mu_2\theta^2\bigr)^2 + (1-\gamma)\theta^2 v. \end{equation}

Using a simple algebraic lemma:

Lemma.
If $a<0$, $b,c\ge0$, $X\ge0$, and $1+2aX\ge0$, then
$(1+aX+(b+c)X^2)^2 \ge (1+aX+bX^2)^2 + cX^2.$

Proof: the difference of the two squares factors as $cX^2\bigl[2+2aX+(2b+c)X^2\bigr] \ge cX^2$, since $1+2aX\ge0$. $\square$

Applying this with $a=\mu_1$, $b=\mu_2/2$, $c=(1-\gamma)v$, and $\theta\in[0,-1/(2\mu_1)]$ (recall $f\le0\Rightarrow \mu_1\le0$) gives \begin{equation} \phi_2(\theta) \le \bigl(1+\mu_1\theta + \theta^2(\tfrac{\mu_2}{2}+(1-\gamma)v)\bigr)^2 \le \exp\!\bigl(2\mu_1\theta + \theta^2\sigma_2^2\bigr), \end{equation} where \begin{equation} \sigma_2^2 = \mu_2 + 2(1-\gamma)v = \mu_2 + 2(1-\gamma)(2\mu_2 - \mu_1^2). \end{equation}

Hence, by the Chernoff (exponential Markov) inequality, \begin{equation} \mathbb P\bigl(f(X_1)+f(X_2)\ge 2\mu_1+2t\bigr) \le \inf_{\theta\in[0,-1/(2\mu_1)]} \exp(-2t\theta + \theta^2\sigma_2^2), \end{equation} yielding a sub-Gaussian tail with variance proxy $\sigma_2^2$: the unconstrained minimiser is $\theta^\star=t/\sigma_2^2$, so for $t\le-\sigma_2^2/(2\mu_1)$ the right-hand side equals $\exp(-t^2/\sigma_2^2)$.
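Continuing the toy-chain sketch above (again, purely illustrative), the $n=2$ MGF bound can be checked numerically over the admissible range of $\theta$:

```python
# Check phi_2(theta) <= exp(2*mu1*theta + theta**2 * sigma2_sq)
# on [0, -1/(2*mu1)], using the exact transfer-matrix MGF from above.
theta_max = -1 / (2 * mu1)
sigma2_sq = mu2 + 2 * (1 - gamma) * v

for theta in np.linspace(0.01, theta_max, 5):
    lhs = phi_exact(theta, 2)
    rhs = np.exp(2 * mu1 * theta + theta ** 2 * sigma2_sq)
    print(f"theta={theta:.3f}  phi_2={lhs:.6f}  bound={rhs:.6f}  ok={lhs <= rhs}")
```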


Question

How can this analysis be extended to general $n$? While the first step using the algebraic lemma remains valid for any $n\ge 2$, the bracketed factor in the MGF bound involves $\phi_1(\theta)$, so the effective variance term is itself a function of $\theta$, which makes the overall minimisation problem intractable.

  1. Is the bound \begin{equation} \phi_n(\theta)\le \phi_1(\theta)^n + \phi_1(\theta)^{n-1}\,\mathrm{Var}\!\bigl(e^{\theta f(X_1)}\bigr) \left[\left(1+\frac{1-\gamma}{\phi_1(\theta)}\right)^{n-1}-1\right] \end{equation} sharp enough (for small $\theta$) to imply a sub-Gaussian bound of the form \begin{equation} \phi_n(\theta)\le \exp\!\bigl(n\mu_1\theta + n\theta^2 \tilde\sigma_n^2\bigr), \end{equation} leading to $\mathbb P(f_n\ge\mu_1+t)\le\exp(-n t^2/(C\tilde\sigma_n^2))$? If so, what would be an explicit expression for $\tilde\sigma_n^2$? (A small numerical probe of this is sketched after this list.)

  2. If not, what would be a better approach?
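Not an answer, but here is a small empirical probe of question 1, continuing the toy-chain sketch above (the chain, the $\theta$-grid, and all names are illustrative assumptions; one example cannot settle the question either way). For each $n$ it reads off the smallest $s$ such that the right-hand side of the bound is dominated by $\exp(n\mu_1\theta + n\theta^2 s)$ over a small-$\theta$ grid, i.e. the best variance proxy the bound can deliver:

```python
# Smallest s with phi_n_upper(theta, n) <= exp(n*mu1*theta + n*theta**2 * s)
# over a grid of small theta -- an empirical candidate for sigma_n^2.
thetas = np.linspace(1e-3, 0.2, 200)

for n in [2, 5, 10, 50]:
    log_gap = np.array([np.log(phi_n_upper(th, n)) - n * mu1 * th for th in thetas])
    s_needed = np.max(log_gap / (n * thetas ** 2))
    print(f"n={n:3d}  empirical variance proxy ~ {s_needed:.4f}")
```

If the printed proxy stabilises in $n$, that is weak numerical evidence for a bound of the conjectured form; if it grows with $n$, the MGF bound above is probably too lossy in this regime.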

Any references or ideas showing how to bridge from this MGF bound to a clean sub-Gaussian concentration would be very helpful.

  • Comment: IIRC, the typical way to show sub-Gaussian concentration for Markov chains is a modified log-Sobolev inequality (the term "Herbst argument" is a keyword here). See for example the introduction of this (arbitrary) paper for some pointers. (Commented Oct 24 at 16:11)
