Table of Contents
Overview#
The matrix expression \(ABA^T\) appears frequently in linear algebra, statistics, and machine learning. Understanding its properties and applications is essential for working with covariance matrices, transformations, and decompositions.
Basic Form#
$$ C = ABA^T $$Where:
- \(A\): Transformation matrix
- \(B\): Original matrix
- \(C\): Transformed result
Key Properties#
Symmetry Preservation#
If \(B\) is symmetric (\(B = B^T\)), then \(ABA^T\) is also symmetric:
$$ (ABA^T)^T = (A^T)^T B^T A^T = AB^T A^T = ABA^T $$Positive Semi-Definiteness#
If \(B\) is positive semi-definite, so is \(ABA^T\):
For any vector \(\mathbf{x}\):
$$ \mathbf{x}^T (ABA^T) \mathbf{x} = (A^T\mathbf{x})^T B (A^T\mathbf{x}) \geq 0 $$Application 1: Symmetric Matrix Generation#
When \(A\) and \(B\) satisfy certain conditions, \(ABA^T\) generates symmetric matrices useful in optimization algorithms.
Example#
Given arbitrary matrix \(M\):
$$ B = M^T M \quad \text{(always symmetric)} $$$$ C = A(M^TM)A^T $$Result is guaranteed symmetric.
Application 2: Transformation Stability#
Rotation and Reflection#
When \(A\) is orthogonal (\(AA^T = I\)):
$$ ABA^T $$Rotates/reflects the “shape” defined by \(B\).
Example: Rotating Covariance#
Original covariance:
$$ \Sigma = \begin{pmatrix} \sigma_x^2 & 0 \\ 0 & \sigma_y^2 \end{pmatrix} $$Rotated by angle \(\theta\):
$$ R = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} $$$$ \Sigma' = R\Sigma R^T $$Application 3: Eigenvalue Decomposition and PCA#
Connection to SVD#
For matrix \(X\) with SVD:
$$ X = U\Sigma V^T $$The covariance matrix:
$$ X^TX = V\Sigma^2 V^T = V\Lambda V^T $$This is the \(ABA^T\) form with:
- \(A = V\)
- \(B = \Lambda\) (diagonal eigenvalues)
PCA Interpretation#
Principal components:
$$ \Sigma = Q\Lambda Q^T $$Where:
- \(Q\): Eigenvectors (principal directions)
- \(\Lambda\): Eigenvalues (variances)
Application 4: Graph Theory#
Adjacency and Laplacian#
Given incidence matrix \(A\) and weight matrix \(W\):
$$ L = AW A^T $$This gives the weighted Laplacian matrix:
- Diagonal: Node degrees
- Off-diagonal: Connection strengths
Application 5: Normalization and Scaling#
Whitening Transform#
To decorrelate data with covariance \(\Sigma\):
- Decompose: \(\Sigma = Q\Lambda Q^T\)
- Whitening matrix: \(W = \Lambda^{-1/2}Q^T\)
- Whitened covariance: \(W\Sigma W^T = I\)
Neural Network Normalization#
Batch normalization involves similar transformations to standardize activations.
Concrete Examples#
Example 1: Rotation and Scaling#
Rotation matrix (45°):
$$ A = \begin{pmatrix} \frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \end{pmatrix} $$Scale matrix:
$$ B = \begin{pmatrix} 4 & 0 \\ 0 & 1 \end{pmatrix} $$Result \(ABA^T\) is rotated ellipse covariance.
Example 2: SVD Application#
For data matrix \(X\):
- \(A = U\) (left singular vectors)
- \(B = \Sigma\) (singular values)
- \(A^T = V^T\) (right singular vectors)
Used for dimensionality reduction, extracting principal components.
Example 3: Neural Network Layers#
Weight transformation with normalization:
$$ W_{normalized} = \gamma \cdot \frac{W - \mu}{\sqrt{\sigma^2 + \epsilon}} + \beta $$Internally uses covariance-like operations.
Summary#
| Application | \(A\) | \(B\) | Purpose |
|---|---|---|---|
| Rotation | Rotation matrix | Covariance | Rotate distribution |
| PCA | Eigenvectors | Eigenvalues | Extract features |
| SVD | Singular vectors | Singular values | Decomposition |
| Whitening | Decorrelation | Original cov | Normalize data |
| Graphs | Incidence | Weights | Laplacian matrix |