$\begingroup$

For $T : \mathbb{R}^n \to \mathcal{P}(\mathbb{R}^n)$ maximally monotone, the proximal point algorithm with step size $c>0$, $$ x^{k+1} = (I + c T)^{-1} x^k, $$ converges linearly with rate $\kappa = \frac{1}{1 + c \sigma}$ if $T$ is strongly monotone with parameter $\sigma > 0$.
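
As a quick sanity check of this rate, here is a minimal sketch for one hypothetical set-valued operator, $T = \sigma I + \partial\|\cdot\|_1$, whose resolvent is a scaled soft-thresholding and whose unique zero is $x^* = 0$. The operator, the values of `sigma` and `c`, and the starting point are assumptions chosen for illustration only.

```python
import numpy as np

sigma, c = 0.5, 2.0                  # assumed strong monotonicity parameter and step size
kappa = 1.0 / (1.0 + c * sigma)      # rate 1 / (1 + c * sigma) stated above

def resolvent(x):
    """(I + c T)^{-1} x for the assumed T = sigma * I + subdifferential of the l1 norm."""
    shrunk = np.sign(x) * np.maximum(np.abs(x) - c, 0.0)  # soft-thresholding at level c
    return shrunk / (1.0 + c * sigma)

rng = np.random.default_rng(0)
x = 100.0 * rng.standard_normal(5)   # arbitrary starting point
ratios = []
for _ in range(100):
    if np.linalg.norm(x) == 0.0:     # reached the zero x* = 0 of T
        break
    x_next = resolvent(x)
    # contraction factor of the distance to x* = 0 in this step
    ratios.append(np.linalg.norm(x_next) / np.linalg.norm(x))
    x = x_next

print(f"kappa = {kappa:.3f}, worst observed ratio = {max(ratios):.3f}")
```

Every per-step ratio stays below $\kappa$ here; this particular example even terminates after finitely many steps, so $\kappa$ is only an upper bound on the contraction.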

I'm interested in analyzing the linear convergence rate in the case of matrix-valued step sizes, i.e., with $C \succ 0$, $$ x^{k+1} = (I + C T)^{-1} x^k. $$ I could only prove a bound depending on $\lambda_{\text{min}}(C)$, while in practice I observe numerically that the convergence rate depends on the whole spectrum of $C$.
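
The gap between a $\lambda_{\text{min}}(C)$ bound and the observed behavior can be reproduced in a minimal sketch for the special case of a linear operator $T(x) = Ax$ with $A$ symmetric positive definite. In that case the iteration matrix is $(I + CA)^{-1}$ and its spectral radius is $1/(1+\lambda_{\text{min}}(CA))$. The matrices `A`, `C1`, `C2` and the helper `compare` are hypothetical choices for illustration; the two step-size matrices share the same smallest eigenvalue but differ in the rest of their spectrum.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
B = rng.standard_normal((n, n))
A = B @ B.T + 0.1 * np.eye(n)            # assumed linear operator T(x) = A x
sigma = np.linalg.eigvalsh(A).min()      # strong monotonicity parameter of T

def compare(C, iters=50):
    """Return (observed C-norm ratio, spectral rate, lambda_min(C) bound)."""
    mu_min = np.linalg.eigvals(C @ A).real.min()   # eigenvalues of C A are real and positive
    spectral = 1.0 / (1.0 + mu_min)                # spectral radius of (I + C A)^{-1}
    bound = 1.0 / (1.0 + np.linalg.eigvalsh(C).min() * sigma)

    def c_norm(z):
        return np.sqrt(z @ np.linalg.solve(C, z))  # ||z||_C with <x, y>_C = <C^{-1} x, y>

    x = np.ones(n)
    observed = None
    for _ in range(iters):
        x_next = np.linalg.solve(np.eye(n) + C @ A, x)   # x^{k+1} = (I + C A)^{-1} x^k
        observed = c_norm(x_next) / c_norm(x)            # last per-step contraction
        x = x_next
    return observed, spectral, bound

C1 = np.eye(n)
C2 = np.diag([1.0, 10.0, 10.0, 10.0])    # same lambda_min as C1, different spectrum
results = {name: compare(C) for name, C in [("C1", C1), ("C2", C2)]}
for name, (observed, spectral, bound) in results.items():
    print(f"{name}: observed={observed:.4f} spectral={spectral:.4f} bound={bound:.4f}")
```

Both matrices yield the same $\lambda_{\text{min}}(C)$ bound, while the spectral rate for `C2` is never worse than for `C1`. This only illustrates the linear case and says nothing definitive about general set-valued $T$.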

This seems like such a basic algorithm that I am surprised I could not find classic literature (e.g. by Rockafellar) on the topic.

Background: many proximal algorithms for solving problems of the form $$ \min_x \max_y~G(x) - F(y) + \langle Kx,y \rangle $$ such as Douglas-Rachford, ADMM or Chambolle-Pock fit the above setting of proximal point algorithms given a special choice of $C$. In case $G$ and $F$ are both strongly convex, $T$ is strongly monotone and my goal is to connect the linear convergence rate to the choice of metric/step size.

$\endgroup$

1 Answer

$\begingroup$

I am not aware of results on the linear rate of this variant of the proximal point method. Let me note that convergence is usually shown via the following observation: since $C$ is a bijection, you may view the iteration $x^{k+1} = (I+CT)^{-1}x^k$ as a preconditioned proximal point method that solves the preconditioned inclusion $0\in CTx$.

Define $S = CT$ and observe that $S$ is a monotone operator on the same Hilbert space, but equipped with the inner product $\langle x,y\rangle_C = \langle C^{-1} x,y\rangle$ (which is the natural inner product for the preconditioned problem). Hence, you immediately get convergence of the method and a rate, but with respect to the $C$-norm $\|x\|_C = \sqrt{\langle x,x\rangle_C}$ (also called the energy norm in the context of preconditioners).
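
To make the rate explicit, here is a sketch under the assumptions that $C$ is symmetric positive definite and $T$ is $\sigma$-strongly monotone in the Euclidean inner product; $x^*$ denotes the unique zero of $T$ (notation introduced here). For $u \in Tx$ and $v \in Ty$, $$ \langle Cu - Cv, x-y\rangle_C = \langle u - v, x-y\rangle \ge \sigma \|x-y\|^2 \ge \sigma \lambda_{\text{min}}(C) \|x-y\|_C^2, $$ where the last step uses $\|z\|_C^2 = \langle C^{-1} z, z\rangle \le \|z\|^2 / \lambda_{\text{min}}(C)$. So $S$ is strongly monotone with parameter $\sigma \lambda_{\text{min}}(C)$ with respect to $\langle \cdot,\cdot\rangle_C$, and the unit-step rate from the question gives $$ \|x^{k+1} - x^*\|_C \le \frac{1}{1 + \sigma \lambda_{\text{min}}(C)} \|x^k - x^*\|_C. $$ This is a worst-case estimate: it only sees the smallest eigenvalue of $C$, because the second inequality above discards all information about how $T$ and $C$ interact.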

$\endgroup$
  • 1
    $\begingroup$ I was aware of this observation -- my goal is to "compare" preconditioners, i.e., argue that $C_1$ is a better preconditioner than $C_2$. Do you see any possibility for such an argument? I.e., why would one $C$-norm be better than another? $\endgroup$ Commented Jun 22, 2017 at 19:10
