Singular Value Decomposition (as an optimization)

$\def\half{\frac{1}{2}} \def\bm#1{\boldsymbol{#1}} \def\R{\mathbb{R}} \def\norm#1{\left\lVert #1 \right\rVert} \def\X{\bm{X}} \def\U{\bm{U}} \def\V{\bm{V}} \def\L{\bm{L}} \def\M{\bm{M}} \def\I{\bm{I}} \def\u{\bm{u}} \def\v{\bm{v}} \def\y{\bm{y}} \def\Q{\bm{Q}} \def\D{\bm{D}} \def\S{\bm{S}} \def\x{\bm{x}} \def\vec{\operatorname{vec}} \def\tr{\operatorname{tr}} \def\d{\operatorname{d}} \def\Xv{\widehat{\X\v}} \def\H{\operatorname{H}} \def\It{{\bm I}_2}$ The setting for the svd is the following optimization problem: $$ \sup_{\norm{u}=1,\norm{v}=1} \u^\top \X \v $$ where $\u \in \R^{m}$, $\v \in \R^{n}$ and $\X \in \R^{m \times n}$. This can be extended to the following optimization using matrices: $$ \begin{equation} \sup_{\U,\V} \qty[ \tr\qty(\U^\top \X \V ) -\frac{1}{2} \tr\qty(\L^\top\qty(\U^\top\U-\I_m)) -\frac{1}{2} \tr\qty(\M^\top\qty(\V^\top\V-\I_n)) ] \equiv \sup_{\U,\V} \phi(\U,\X,\V,\L,\M) \end{equation} $$ where $\U, \L, \I_m \in \R^{m \times m}$, $\V,\M,\I_n \in \R^{n \times n}$.

The differential of $\phi$ w.r.t. $\U$ is $$ \begin{align} \d\phi & = \tr\qty((\d\U^\top) \X \V ) - \frac{1}{2} \tr\qty(\L^\top\U^\top \d\U) - \frac{1}{2} \tr\qty(\L^\top(\d\U^\top) \U) \nonumber\\ & = \tr\qty(\V^\top \X^\top \d\U ) - \frac{1}{2} \tr\qty(\L^\top\U^\top \d\U) - \frac{1}{2} \tr\qty(\U^\top(\d\U) \L) \nonumber\\ & = \tr\qty(\V^\top \X^\top \d\U ) - \frac{1}{2} \tr\qty(\L^\top\U^\top \d\U) - \frac{1}{2} \tr\qty(\L \U^\top\d\U ) \nonumber\\ & = \tr\qty(\V^\top \X^\top \d\U ) - \tr\qty(\frac{1}{2}\qty(\L+\L^\top)\U^\top \d\U) \nonumber\\ \pdv{\phi}{\U^\top} & = \V^\top \X^\top - \frac{1}{2}\qty(\L+\L^\top)\U^\top. \label{condU} \end{align} $$ Using the same procedure we can show that $$ \begin{align} \d\phi & = \tr\qty(\U^\top \X \d\V ) - \tr\qty(\frac{1}{2}\qty(\M+\M^\top)\V^\top \d\V) \nonumber\\ \pdv{\phi}{\V^\top} & = \U^\top \X - \frac{1}{2}\qty(\M+\M^\top)\V^\top. \label{condV} \end{align} $$

Setting (\ref{condU}), (\ref{condV}) to zero we obtain the optimisation conditions: $$ \begin{align} \pdv{\phi}{\U^\top}=0 & \Rightarrow \V^\top \X^\top = \frac{1}{2}\qty(\L+\L^\top)\U^\top, \label{VU} \\ \pdv{\phi}{\V^\top}=0 & \Rightarrow \U^\top \X = \frac{1}{2}\qty(\M+\M^\top)\V^\top. \label{UV} \end{align} $$

Next, right multiply (\ref{VU}) with $\U$ to get $$ \V^\top \X^\top \U= \frac{1}{2}\qty(\L+\L^\top)\U^\top\U=\frac{1}{2}\qty(\L+\L^\top) $$ and, right multiply (\ref{UV}) with $\V$ to get $$ \U^\top \X \V= \frac{1}{2}\qty(\M+\M^\top)\V^\top\V= \frac{1}{2}\qty(\M+\M^\top). $$ If we transpose both sides of the last equation we obtain $$ \V^\top \X^\top \U=\frac{1}{2}\qty(\M+\M^\top) $$ and so $\frac{1}{2}\qty(\M+\M^\top)=\frac{1}{2}\qty(\L+\L^\top)=\S$ (a symmetric matrix). So (\ref{VU}) and (\ref{UV}) can be rewritten as $$ \begin{align*} \V^\top \X^\top & = \S\U^\top, \\ \U^\top \X & = \S\V^\top. \end{align*} $$ Taking the transpose of both sides of these equations we obtain the familiar form for the svd: $$ \begin{align} \X\V & = \U\S, \label{standardVU}\\ \X^\top \U & = \V\S. \label{standardUV} \end{align} $$ (\ref{standardVU}) and (\ref{standardUV}) generate the following eigenequations: $$ \begin{align} \X^\top\X\V & = \V\S^2, \label{eigen1}\\ \X\X^\top \U & = \U\S^2. \label{eigen2} \end{align} $$ Since $\X^\top\X$, $\X\X^\top$ are symmetric matrices their eigenvectors form an orthonormal basis and $\V^\top\V=\V\V^\top=\I_n$, $\U^\top\U=\U\U^\top=\I_m$ as required. It is also the case that both matrices are positive semi-definite since, for example, for any $\x \in \R^n$ $$ \x^\top \X^\top \X \x = (\X \x)^\top (\X \x) \geq 0. $$ Using $\V\V^\top=\I_n$ we can rewrite (\ref{standardVU}) as $$ \X = \U\S\V^\top $$ which is the full svd of $\X$.

Comments

Popular posts from this blog

CSAT 1

Poor man's Daletskii-Krein

PAT 2022