Table of Contents

Overview
#

The matrix expression \(ABA^T\) appears frequently in linear algebra, statistics, and machine learning. Understanding its properties and applications is essential for working with covariance matrices, transformations, and decompositions.

Basic Form
#

$$ C = ABA^T $$

Where:

  • \(A\): Transformation matrix
  • \(B\): Original matrix
  • \(C\): Transformed result

Key Properties
#

Symmetry Preservation
#

If \(B\) is symmetric (\(B = B^T\)), then \(ABA^T\) is also symmetric:

$$ (ABA^T)^T = (A^T)^T B^T A^T = AB^T A^T = ABA^T $$

Positive Semi-Definiteness
#

If \(B\) is positive semi-definite, so is \(ABA^T\):

For any vector \(\mathbf{x}\):

$$ \mathbf{x}^T (ABA^T) \mathbf{x} = (A^T\mathbf{x})^T B (A^T\mathbf{x}) \geq 0 $$

Application 1: Symmetric Matrix Generation
#

When \(A\) and \(B\) satisfy certain conditions, \(ABA^T\) generates symmetric matrices useful in optimization algorithms.

Example
#

Given arbitrary matrix \(M\):

$$ B = M^T M \quad \text{(always symmetric)} $$$$ C = A(M^TM)A^T $$

Result is guaranteed symmetric.

Application 2: Transformation Stability
#

Rotation and Reflection
#

When \(A\) is orthogonal (\(AA^T = I\)):

$$ ABA^T $$

Rotates/reflects the “shape” defined by \(B\).

Example: Rotating Covariance
#

Original covariance:

$$ \Sigma = \begin{pmatrix} \sigma_x^2 & 0 \\ 0 & \sigma_y^2 \end{pmatrix} $$

Rotated by angle \(\theta\):

$$ R = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} $$$$ \Sigma' = R\Sigma R^T $$

Application 3: Eigenvalue Decomposition and PCA
#

Connection to SVD
#

For matrix \(X\) with SVD:

$$ X = U\Sigma V^T $$

The covariance matrix:

$$ X^TX = V\Sigma^2 V^T = V\Lambda V^T $$

This is the \(ABA^T\) form with:

  • \(A = V\)
  • \(B = \Lambda\) (diagonal eigenvalues)

PCA Interpretation
#

Principal components:

$$ \Sigma = Q\Lambda Q^T $$

Where:

  • \(Q\): Eigenvectors (principal directions)
  • \(\Lambda\): Eigenvalues (variances)

Application 4: Graph Theory
#

Adjacency and Laplacian
#

Given incidence matrix \(A\) and weight matrix \(W\):

$$ L = AW A^T $$

This gives the weighted Laplacian matrix:

  • Diagonal: Node degrees
  • Off-diagonal: Connection strengths

Application 5: Normalization and Scaling
#

Whitening Transform
#

To decorrelate data with covariance \(\Sigma\):

  1. Decompose: \(\Sigma = Q\Lambda Q^T\)
  2. Whitening matrix: \(W = \Lambda^{-1/2}Q^T\)
  3. Whitened covariance: \(W\Sigma W^T = I\)

Neural Network Normalization
#

Batch normalization involves similar transformations to standardize activations.

Concrete Examples
#

Example 1: Rotation and Scaling
#

Rotation matrix (45°):

$$ A = \begin{pmatrix} \frac{\sqrt{2}}{2} & -\frac{\sqrt{2}}{2} \\ \frac{\sqrt{2}}{2} & \frac{\sqrt{2}}{2} \end{pmatrix} $$

Scale matrix:

$$ B = \begin{pmatrix} 4 & 0 \\ 0 & 1 \end{pmatrix} $$

Result \(ABA^T\) is rotated ellipse covariance.

Example 2: SVD Application
#

For data matrix \(X\):

  • \(A = U\) (left singular vectors)
  • \(B = \Sigma\) (singular values)
  • \(A^T = V^T\) (right singular vectors)

Used for dimensionality reduction, extracting principal components.

Example 3: Neural Network Layers
#

Weight transformation with normalization:

$$ W_{normalized} = \gamma \cdot \frac{W - \mu}{\sqrt{\sigma^2 + \epsilon}} + \beta $$

Internally uses covariance-like operations.

Summary
#

Application\(A\)\(B\)Purpose
RotationRotation matrixCovarianceRotate distribution
PCAEigenvectorsEigenvaluesExtract features
SVDSingular vectorsSingular valuesDecomposition
WhiteningDecorrelationOriginal covNormalize data
GraphsIncidenceWeightsLaplacian matrix