linear algebra – Help understanding matrix math in whitening transformation proofs

I’m looking at a couple of short articles about whitening transformations.


In both articles, there comes a step where given a centered data matrix $X$

we compute its covariance

$$\Sigma = XX^T$$

and come up with a matrix $W$ that satisfies

$$WW^T = \Sigma^{-1}$$

The idea now is that if we transform our data $X$ into $Y = WX$ we can show that

$$\operatorname{cov}(Y) = WX (WX)^T$$
$$= WXX^TW^T$$
$$= W\Sigma W^T$$


All of this seems reasonable so far, but both authors in the referenced articles make the following leap:
They claim you can reduce the above to $I$. In one of the articles, some of the work is sort of shown:

It is stated that $W\Sigma W^T = WW^T\Sigma$, which would then obviously reduce to $I$.

Why is it OK to swap the order of $W^T$ and $\Sigma$ in the above expression?
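
One way to probe the swap numerically is a minimal NumPy sketch like the one below. It rests on assumptions that are not from the articles: the data is random and made up for illustration, and $W$ is taken to be the symmetric inverse square root $\Sigma^{-1/2}$ built from an eigendecomposition, which is only one of many matrices satisfying $WW^T = \Sigma^{-1}$.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical centered data: 3 features (rows) by 500 samples (columns).
X = rng.normal(size=(3, 500))
X = X - X.mean(axis=1, keepdims=True)

# Covariance as written in the question (no 1/n factor).
Sigma = X @ X.T

# Assumed choice of W: the symmetric inverse square root of Sigma,
# built from the eigendecomposition Sigma = U diag(s) U^T.
s, U = np.linalg.eigh(Sigma)
W = U @ np.diag(s ** -0.5) @ U.T

# Does W satisfy W W^T = Sigma^{-1}?
satisfies_condition = np.allclose(W @ W.T, np.linalg.inv(Sigma))
# Does W^T commute with Sigma, i.e. is the swap valid for this W?
commutes = np.allclose(W.T @ Sigma, Sigma @ W.T)
# Is the transformed covariance the identity?
whitened = np.allclose(W @ Sigma @ W.T, np.eye(3))

print(satisfies_condition, commutes, whitened)
```

For this particular symmetric $W$, the swap should check out, because $W$ is built from the same eigenvectors as $\Sigma$ and so commutes with it. That does not by itself show the swap is valid for a non-symmetric $W$, where $W^T$ and $\Sigma$ need not commute.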