The Cholesky decomposition is a useful tool which can be used to transform a positive-definite matrix into the identity matrix. My interest in this stems from just such a use: to normalise an optimisation problem, I am required to remove the complexity from my data matrix.
A Cholesky decomposition of a positive-definite matrix Q of dimension $n \times n$ has the form,
$Q = LL^*$
where $L^*$ is the conjugate transpose of L and L is a lower triangular matrix with real, positive diagonal entries.
This decomposition can be achieved easily in Python with both the numpy.linalg and scipy.linalg packages as,
import numpy as np

L = np.linalg.cholesky(Q)
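The scipy equivalent is almost identical, though note that scipy.linalg.cholesky returns the upper triangular factor by default, so the lower=True flag is needed to obtain L as defined above,
from scipy import linalg
L = linalg.cholesky(Q, lower=True)  # lower=True returns the lower triangular factor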
Transforming Q into the identity matrix then follows from,
$Q' = L^{-1} Q (L^{-1})^{T}$,
and in Python we can write this as,
A = np.linalg.inv(L)
Q_ = A @ Q @ A.T
For low order n this is a simple task and reliably returns the identity matrix. However, depending on the scale of the entries in Q, I have found that the accuracy of the returned identity matrix can vary, with the result Q_, hereafter referred to as Q', showing slight deviations from the target matrix.
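A quick way to quantify these deviations (my own check, not part of the snippet above) is to compare Q' against the identity matrix element-wise,
print(np.abs(Q_ - np.eye(Q_.shape[0])).max())  # largest deviation from the identity
print(np.allclose(Q_, np.eye(Q_.shape[0])))    # True if Q' matches I within numpy's default tolerances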
I therefore wanted to gain a better understanding of the Cholesky decomposition and to write my own code to perform this task. I began by inspecting the decomposition further. For a three-by-three Q matrix, the decomposition looks like,
$Q = LL^* =
\begin{pmatrix}
L_{11} & 0 & 0 \\
L_{21} & L_{22} & 0 \\
L_{31} & L_{32} & L_{33}
\end{pmatrix}
\begin{pmatrix}
L_{11} & L_{21} & L_{31} \\
0 & L_{22} & L_{32} \\
0 & 0 & L_{33}
\end{pmatrix}$
and expanding this out results in,
$Q =
\begin{pmatrix}
L_{11}^2 & L_{21}L_{11} & L_{31}L_{11} \\
L_{21}L_{11} & L_{21}^2 + L_{22}^2 & L_{31}L_{21} + L_{32}L_{22}\\
L_{31}L_{11} & L_{31}L_{21} + L_{32}L_{22}& L_{31}^2+L_{32}^2 + L_{33}^2\\
\end{pmatrix}$
which, by equating with the elements of Q, leads to,
$L =
\begin{pmatrix}
\sqrt{Q_{11}} & 0 & 0 \\
Q_{21}/L_{11} & \sqrt{Q_{22} - L_{21}^2} & 0 \\
Q_{31}/L_{11} & (Q_{32} - L_{31}L_{21})/L_{22} & \sqrt{Q_{33} - L_{31}^2 - L_{32}^2} \\
\end{pmatrix}$.
This can be generalised into the following formulas for the elements of L,
$L_{j, j} = \sqrt{Q_{j,j} - \sum_{k=1}^{j-1}L_{j,k}^2}$,
$L_{i, j} = \frac{1}{L_{j,j}} \bigg(Q_{i, j} - \sum_{k=1}^{j-1}L_{i,k}L_{j,k}\bigg)$ for $i>j$.
These recurrence relations can be coded in Python as,
L = np.zeros(Q.shape)
for i in range(L.shape[0]):
    for j in range(L.shape[1]):
        if i == j:
            # diagonal entries: L_jj = sqrt(Q_jj - sum_k L_jk^2)
            L[i, j] = np.sqrt(Q[i, j] - np.sum([pow(L[j, k], 2) for k in range(j)]))
        elif i > j:
            # off-diagonal entries: L_ij = (Q_ij - sum_k L_ik * L_jk) / L_jj
            L[i, j] = (1/L[j, j])*(Q[i, j] - np.sum([L[i, k]*L[j, k] for k in range(j)]))
where L has the same shape as Q by definition. This code effectively completes the same job as numpy's and scipy's linalg.cholesky and gives the decomposition of Q.
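As a sanity check (an illustrative snippet, not taken from the linked repository), the two factors can be compared directly,
L_numpy = np.linalg.cholesky(Q)
print(np.allclose(L, L_numpy))    # True if my L agrees with numpy's within tolerance
print(np.abs(L - L_numpy).max())  # largest element-wise difference between the two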
I proceeded to compare how accurately my code returns the identity matrix against numpy's Cholesky decomposition. The code for this is available at https://github.com/htjb/Blog/tree/master/Cholesky_decomposition.
To build my Q matrix and ensure it is positive-definite, before I decompose it and transform it to the identity matrix, I use the following code,
x = np.linspace(40, 120, 100)
x_ = (x - x.min())/(x.max() - x.min())  # normalise x to the interval [0, 1]
N = 9
phi = np.empty([len(x), N])
for i in range(len(x)):
    for l in range(N):
        phi[i, l] = x_[i]**(l)  # polynomial basis functions
Q = phi.T @ phi  # symmetric and, for this basis, positive-definite
where N defines the dimension of Q and x is arbitrarily chosen. phi is effectively a matrix of basis functions, and this was just a convenient choice given the larger application of the Cholesky decomposition mentioned at the beginning of this post.
x here is normalised to the interval [0, 1]. I have found that this type of normalisation gives the most consistent results and is the most likely to return the identity matrix exactly for any N and any range of x, as opposed to something like $x/x_{max}$.
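For comparison, the $x/x_{max}$ alternative mentioned above would be a one-line change,
x_ = x/x.max()  # maps x into (x.min()/x.max(), 1] rather than [0, 1]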
The graphs below show the elements of Q' as produced with L from both my code and numpy's linalg library. The results show that both approaches return L to a sufficient degree of accuracy to transform Q into the identity matrix for N = 4, N = 9 and N = 11.
Beyond N = 11 we begin to encounter machine-precision issues that cause deviations in Q' from the identity matrix for both my code and numpy's. However, the deviations usually appear in the elements of Q' with $i \geq 10$ and are only minor.
The machine-precision issues can also result in Q matrices that 'are not positive-definite', in the sense that elements of Q may be sufficiently far below the machine precision limit, usually around $10^{-7}$, that we find Q has negative eigenvalues. These issues can be corrected by adding $10^{-7} \times I$ to Q, where $I$ is the identity matrix; however, this is ill-advised as it can significantly skew the values of Q.
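If this correction is needed despite that caveat, it is a one-line change before the decomposition,
Q = Q + 1e-7 * np.eye(Q.shape[0])  # nudge the eigenvalues positive; use with caution, as this perturbs Q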
For practical purposes, it is unlikely that we would need to go to a high enough N to encounter these issues. For example, in my case N represents the order of a polynomial fit, and so N = 11 is already high enough to be overfitting my data.
This is a fun little project to have a go at, and it definitely leads to a better understanding of the Cholesky decomposition and its relationship to the identity matrix.
Thanks for reading!