
Given tall matrices $A$ and $Y$ and the following overdetermined linear system in the square matrix $X$,

$$AX=Y$$

is there an explicit formula for the least-squares solution if $X$ is constrained to be symmetric?

  • Presumably, given the symmetry constraint, $X$ and $Y$ are square matrices, in which case $X=A^+Y$ is not the least-squares solution (of the vec'd problem or, equivalently, minimizing the Frobenius norm of $X - Y$). Please clarify. – Commented Sep 4, 2019 at 22:44
  • @MarkL.Stone $X$ is square, but $A$ and $Y$ are tall. – Commented Sep 4, 2019 at 22:53
  • Sorry, I shouldn't have said $Y$ is square. But my point about $X=A^+Y$ not being the solution in the absence of the symmetry requirement still stands. That formula assumes $X$ and $Y$ are column vectors. – Commented Sep 4, 2019 at 23:17
  • @MarkL.Stone I see. That part isn't essential to the question, so I just removed it. – Commented Sep 4, 2019 at 23:39
  • I don't know of any explicit formula (but I'm not saying one doesn't exist). It can, however, be readily formulated and numerically solved as a convex quadratic program (QP) or second-order cone problem (SOCP); see the sketch below. – Commented Sep 4, 2019 at 23:48
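
To make the QP/SOCP comment concrete, here is a minimal CVXPY sketch (mine, not from the thread); the dimensions and random data are illustrative assumptions.

```python
# Minimal sketch, assuming NumPy and CVXPY are available:
# minimize ||A X - Y||_F subject to X = X^T, as suggested in the comment above.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
m, n = 10, 4                        # A and Y are tall (m > n); X is n-by-n
A = rng.standard_normal((m, n))
Y = rng.standard_normal((m, n))

X = cp.Variable((n, n), symmetric=True)   # symmetry enforced by the variable
problem = cp.Problem(cp.Minimize(cp.norm(A @ X - Y, "fro")))
problem.solve()

print("optimal objective:", problem.value)
```

Declaring the variable `symmetric=True` avoids an explicit `X == X.T` constraint; either formulation works.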

3 Answers


I assume that $A$ is one-to-one (has full column rank), so that $H:=A^TA$ is positive definite. Minimizing the squared Frobenius norm $\|AX-Y\|_F^2$ (the least-squares objective) among symmetric matrices $X$ yields the optimality condition that $$\langle AS,AX-Y\rangle=0$$ for every symmetric $S$. This amounts to saying that $A^T(AX-Y)$ is skew-symmetric. In other words, $X$ is the solution of the Lyapunov equation $$HX+XH=A^TY+Y^TA=:K.$$ The explicit formula is $$X=\int_0^\infty e^{-tH}Ke^{-tH}\,dt.$$
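
A quick numerical sanity check of this answer (a sketch of mine, assuming SciPy's convention that `solve_continuous_lyapunov(a, q)` solves $AX + XA^H = Q$): solve the Lyapunov equation directly and verify the optimality condition.

```python
# Sketch: solve H X + X H = K with SciPy, then check that the residual
# A^T (A X - Y) is skew-symmetric, as the answer requires.
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

rng = np.random.default_rng(0)
m, n = 10, 4
A = rng.standard_normal((m, n))     # full column rank with probability 1
Y = rng.standard_normal((m, n))

H = A.T @ A                         # positive definite for full-column-rank A
K = A.T @ Y + Y.T @ A
X = solve_continuous_lyapunov(H, K)

R = A.T @ (A @ X - Y)
print(np.allclose(R, -R.T))         # True: the residual is skew-symmetric
print(np.allclose(X, X.T))          # True: the solution is symmetric
```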

  • Is (numerical) evaluation of that integral more difficult than numerically solving the convex QP or SOCP of minimizing the Frobenius norm (or its square) subject to the constraint $X = X^T$? – Commented Sep 7, 2019 at 13:05
  • @MarkL.Stone Why compute the integral? Isn't the Lyapunov equation a system of linear equations? Why not use Gaussian elimination? – Commented Sep 7, 2019 at 14:18
  • Prof. Serre, is it still a Lyapunov equation without information on the positive definiteness of the (symmetric) matrix $K$? – Commented Sep 7, 2019 at 14:22
  • @RodrigodeAzevedo My bad. You are correct. – Commented Sep 7, 2019 at 15:12
  • @RodrigodeAzevedo $K$ does not need to be positive, but $H$ does. Actually, the semi-definite case (for $H$) reduces to the positive definite one by using the orthogonal projection $\Pi$ onto the subspace $(\ker A)^\bot$. Then you must replace $Y$ by $\Pi Y$. – Commented Sep 7, 2019 at 16:31

Complementing Denis Serre's answer, and rephrasing the original problem slightly: given tall matrices $\rm A$ and $\rm B$ (with $\rm B$ playing the role of $Y$ above), we have the following quadratic program in the square matrix $\rm X$

$$\begin{array}{ll} \text{minimize} & \| \mathrm A \mathrm X - \mathrm B \|_{\text F}^2\\ \text{subject to} & \mathrm X = \mathrm X^\top\end{array}$$

We define the Lagrangian

$$\mathcal L (\mathrm X, \Lambda) := \| \mathrm A \mathrm X - \mathrm B \|_{\text F}^2 + \langle \Lambda, \mathrm X - \mathrm X^\top \rangle$$

Differentiating the Lagrangian with respect to $\mathrm X$ and $\Lambda$ and finding where the derivatives vanish, we obtain the following system of linear matrix equations

$$\begin{aligned} 2 \mathrm A^\top \left( \mathrm A \mathrm X - \mathrm B \right) + \Lambda - \Lambda^\top &= \mathrm O\\ \mathrm X - \mathrm X^\top &= \mathrm O \end{aligned}$$

which can be rewritten as follows

$$\begin{aligned} \mathrm A^\top \left( \mathrm A \mathrm X - \mathrm B \right) &= -\frac12 \left( \Lambda - \Lambda^\top \right)\\ \mathrm X &= \mathrm X^\top\end{aligned}$$

From the 1st matrix equation, we conclude that matrix $\mathrm A^\top \left( \mathrm A \mathrm X - \mathrm B \right)$ is skew-symmetric, i.e.,

$$\left( \mathrm A^\top \left( \mathrm A \mathrm X - \mathrm B \right) \right)^\top = -\mathrm A^\top \left( \mathrm A \mathrm X - \mathrm B \right)$$

which can be rewritten as the following Lyapunov-like linear matrix equation in symmetric matrix $\rm X$

$$\boxed{ \mathrm X \left( \mathrm A^\top \mathrm A \right) + \left( \mathrm A^\top \mathrm A \right) \mathrm X = \mathrm A^\top \mathrm B + \mathrm B^\top \mathrm A }$$

Half-vectorizing both sides of the matrix equation above, we obtain a system of linear equations in $\mbox{vech}(\mathrm X)$, the $n(n+1)/2$ distinct entries of the symmetric matrix $\rm X$

$$\left( \mathrm A^\top \mathrm A \oplus \mathrm A^\top \mathrm A \right) \mathrm D \, \mbox{vech} (\mathrm X) = \mathrm D \, \mbox{vech} \left( \mathrm A^\top \mathrm B + \mathrm B^\top \mathrm A \right)$$

where $\oplus$ denotes the Kronecker sum and $\rm D$ is a (tall) duplication matrix. Assuming $\rm A$ has full column rank (so that the Kronecker sum is invertible), the least-squares solution can be written as follows

$$\hat{\mathrm X} := \mbox{vech}^{-1} \left( \mathrm L \left( \mathrm A^\top \mathrm A \oplus \mathrm A^\top \mathrm A \right)^{-1} \mathrm D \, \mbox{vech} \left( \mathrm A^\top \mathrm B + \mathrm B^\top \mathrm A \right) \right)$$

where $\rm L$ is a (fat) elimination matrix satisfying $\mathrm L \mathrm D = \mathrm I$.
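
To make the half-vectorization route explicit, here is a self-contained sketch (mine; `duplication_matrix` and `elimination_matrix` are hand-rolled helpers, not NumPy built-ins, using the column-major vec and lower-triangle vech conventions):

```python
# Sketch: solve the boxed equation via vech, a duplication matrix D and an
# elimination matrix L, following the formulas above.
import numpy as np

def duplication_matrix(n):
    """D such that vec(X) = D @ vech(X) for symmetric X (column-major vec)."""
    D = np.zeros((n * n, n * (n + 1) // 2))
    k = 0
    for j in range(n):
        for i in range(j, n):
            D[j * n + i, k] = 1.0    # position of X[i, j] in vec(X)
            D[i * n + j, k] = 1.0    # mirror position of X[j, i]
            k += 1
    return D

def elimination_matrix(n):
    """L such that vech(X) = L @ vec(X); satisfies L @ D = I."""
    L = np.zeros((n * (n + 1) // 2, n * n))
    k = 0
    for j in range(n):
        for i in range(j, n):
            L[k, j * n + i] = 1.0
            k += 1
    return L

rng = np.random.default_rng(0)
m, n = 10, 4
A = rng.standard_normal((m, n))
B = rng.standard_normal((m, n))

H = A.T @ A
K = A.T @ B + B.T @ A
D, L = duplication_matrix(n), elimination_matrix(n)
ksum = np.kron(np.eye(n), H) + np.kron(H, np.eye(n))   # Kronecker sum H ⊕ H

# L (H ⊕ H)^{-1} D vech(K), then vech^{-1} via D and a column-major reshape.
vech_X = L @ np.linalg.solve(ksum, D @ (L @ K.flatten(order="F")))
X = (D @ vech_X).reshape((n, n), order="F")

print(np.allclose(H @ X + X @ H, K))                   # True
```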


Here is an alternative way to obtain Rodrigo's solution without resorting to Lagrange multipliers. Since any symmetric matrix can be expressed as $X = V+V^\top $, the least-squares problem takes the form

$$\min_V \|A(V+V^\top )-Y\|_F^2\\ = \min_V \Big[\mathrm{Tr}\Big((V+V^\top )A^\top A(V+V^\top )\Big) - 2\mathrm{Tr}\Big(Y^\top A(V+V^\top )\Big) + \mathrm{Tr}\Big(Y^\top Y\Big)\Big]$$

If $B$ is an arbitrary matrix of suitable size, the following formulas hold (I have no reference at hand, but they can easily be proven from first principles):

\begin{align} &\frac{\mathrm{d}}{\mathrm{d} V} \mathrm{Tr}\Big(B(V+V^\top )\Big) = B+B^\top ;\\ &\frac{\mathrm{d}}{\mathrm{d} V} \mathrm{Tr}\Big((V+V^\top )B(V+V^\top )\Big) = (V+V^\top )(B+B^\top ) + (B+B^\top )(V+V^\top ). \end{align}
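
Since no reference is given, here is a quick finite-difference sanity check of these two formulas (a sketch of mine, with random $B$ and $V$):

```python
# Sketch: verify both trace-derivative formulas by central differences.
import numpy as np

rng = np.random.default_rng(0)
n = 4
B = rng.standard_normal((n, n))
V = rng.standard_normal((n, n))

def num_grad(f, V, eps=1e-6):
    """Entrywise central-difference gradient of a scalar function of V."""
    G = np.zeros_like(V)
    for i in range(n):
        for j in range(n):
            E = np.zeros_like(V)
            E[i, j] = eps
            G[i, j] = (f(V + E) - f(V - E)) / (2 * eps)
    return G

Z = lambda W: W + W.T                   # the symmetric matrix V + V^T
f1 = lambda W: np.trace(B @ Z(W))
f2 = lambda W: np.trace(Z(W) @ B @ Z(W))

g1 = B + B.T                            # claimed gradient of f1
g2 = Z(V) @ (B + B.T) + (B + B.T) @ Z(V)  # claimed gradient of f2

print(np.allclose(num_grad(f1, V), g1, atol=1e-5))   # True
print(np.allclose(num_grad(f2, V), g2, atol=1e-5))   # True
```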

By using these formulas, we obtain the optimality condition

$$2(V+V^\top )A^\top A + 2A^\top A(V+V^\top ) -2Y^\top A -2A^\top Y = 0.$$

Remembering that $V+V^\top = X$, we re-obtain the Rodrigo-Lyapunov equation

$$\boxed{XA^\top A + A^\top AX = Y^\top A + A^\top Y}$$

I have used a similar approach based on bespoke matrix differentiation formulas here and here.

