Overview

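Depthwise convolution is the core building block of depthwise separable convolution, a factorization that efficient neural network architectures such as MobileNet-V2 use to reduce the computation and parameter count of standard convolution layers.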

Standard Convolution vs Depthwise Separable Convolution

Standard Convolution

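A standard convolution filters and combines all input channels in a single step: each of the \(C_{out}\) output channels is produced by its own \(K \times K \times C_{in}\) filter, which spans every input channel.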

  • Input: \(H \times W \times C_{in}\)
  • Kernel: \(K \times K \times C_{in} \times C_{out}\)
  • Operations: \(H \times W \times K^2 \times C_{in} \times C_{out}\) multiply-adds (assuming stride 1 and an output of the same spatial size as the input)
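
To make the operation count concrete, here is a minimal pure-Python sketch of a standard convolution that counts its own multiply-adds. It is an illustration, not an optimized implementation: the function name `standard_conv`, the nested-list layout, stride 1, zero padding, and an odd kernel size are all assumptions made for this example.

```python
def standard_conv(x, weight):
    """Naive stride-1, zero-padded convolution (assumes odd K).

    x:      nested list indexed [row][col][c_in], shape H x W x C_in
    weight: nested list indexed [ky][kx][c_in][c_out], shape K x K x C_in x C_out
    Returns (output of shape H x W x C_out, number of multiply-adds).
    """
    h, w, c_in = len(x), len(x[0]), len(x[0][0])
    k, c_out = len(weight), len(weight[0][0][0])
    pad = k // 2
    out = [[[0.0] * c_out for _ in range(w)] for _ in range(h)]
    macs = 0
    for i in range(h):
        for j in range(w):
            for o in range(c_out):
                acc = 0.0
                for ky in range(k):
                    for kx in range(k):
                        yy, xx = i + ky - pad, j + kx - pad
                        inside = 0 <= yy < h and 0 <= xx < w
                        for c in range(c_in):
                            # Taps outside the image read zero padding,
                            # but still count as one multiply-add each.
                            v = x[yy][xx][c] if inside else 0.0
                            acc += v * weight[ky][kx][c][o]
                            macs += 1
                out[i][j][o] = acc
    return out, macs


# Hypothetical toy sizes: H=4, W=5, C_in=2, C_out=3, K=3, all values 1.0.
x = [[[1.0] * 2 for _ in range(5)] for _ in range(4)]
weight = [[[[1.0] * 3 for _ in range(2)] for _ in range(3)] for _ in range(3)]
out, macs = standard_conv(x, weight)
print(macs)  # 1080 = 4 * 5 * 3**2 * 2 * 3
```

Every output value sums over all \(K^2 \times C_{in}\) weights of its filter, which is where the \(C_{in} \times C_{out}\) factor in the cost comes from.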

Depthwise Separable Convolution

A depthwise separable convolution splits the operation into two steps:

1. Depthwise Convolution

  • Applies a separate kernel to each input channel, with no mixing between channels
  • Kernel: \(K \times K \times 1\) per channel
  • Operations: \(H \times W \times K^2 \times C_{in}\)

2. Pointwise Convolution (1×1 Conv)

  • Combines information across channels at each spatial position, mapping \(C_{in}\) channels to \(C_{out}\)
  • Kernel: \(1 \times 1 \times C_{in} \times C_{out}\)
  • Operations: \(H \times W \times C_{in} \times C_{out}\)
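
The two steps can be sketched in pure Python as follows. This is a minimal illustration under stated assumptions (stride 1, zero padding, odd kernel size, nested-list tensors); the function names `depthwise_conv`, `pointwise_conv`, and `depthwise_separable_conv` are hypothetical and do not correspond to any library API.

```python
def depthwise_conv(x, dw_weight):
    """Step 1: filter each channel with its own K x K kernel.

    x:         [row][col][c], shape H x W x C_in
    dw_weight: [ky][kx][c],   shape K x K x C_in (one K x K x 1 kernel per channel)
    Returns (output of shape H x W x C_in, number of multiply-adds).
    """
    h, w, c_in = len(x), len(x[0]), len(x[0][0])
    k = len(dw_weight)
    pad = k // 2
    out = [[[0.0] * c_in for _ in range(w)] for _ in range(h)]
    macs = 0
    for i in range(h):
        for j in range(w):
            for c in range(c_in):
                acc = 0.0
                for ky in range(k):
                    for kx in range(k):
                        yy, xx = i + ky - pad, j + kx - pad
                        # Zero padding outside the image; the tap is still counted.
                        v = x[yy][xx][c] if 0 <= yy < h and 0 <= xx < w else 0.0
                        acc += v * dw_weight[ky][kx][c]
                        macs += 1
                out[i][j][c] = acc
    return out, macs


def pointwise_conv(x, pw_weight):
    """Step 2: 1 x 1 convolution that mixes channels at each pixel.

    pw_weight: [c_in][c_out], shape C_in x C_out
    Returns (output of shape H x W x C_out, number of multiply-adds).
    """
    h, w, c_in = len(x), len(x[0]), len(x[0][0])
    c_out = len(pw_weight[0])
    out = [[[0.0] * c_out for _ in range(w)] for _ in range(h)]
    macs = 0
    for i in range(h):
        for j in range(w):
            for o in range(c_out):
                acc = 0.0
                for c in range(c_in):
                    acc += x[i][j][c] * pw_weight[c][o]
                    macs += 1
                out[i][j][o] = acc
    return out, macs


def depthwise_separable_conv(x, dw_weight, pw_weight):
    """Depthwise step followed by pointwise step; returns (output, total multiply-adds)."""
    mid, dw_macs = depthwise_conv(x, dw_weight)
    out, pw_macs = pointwise_conv(mid, pw_weight)
    return out, dw_macs + pw_macs


# Hypothetical toy sizes: H=4, W=5, C_in=2, C_out=3, K=3, all values 1.0.
x = [[[1.0] * 2 for _ in range(5)] for _ in range(4)]
dw_weight = [[[1.0] * 2 for _ in range(3)] for _ in range(3)]
pw_weight = [[1.0] * 3 for _ in range(2)]
out, total_macs = depthwise_separable_conv(x, dw_weight, pw_weight)
print(total_macs)  # 480 = 4*5*9*2 (depthwise) + 4*5*2*3 (pointwise)
```

The depthwise loop has no sum over channels and the pointwise loop has no sum over kernel positions, so the two costs add instead of multiplying.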

Computational Cost Comparison

$$ \text{Reduction Ratio} = \frac{1}{C_{out}} + \frac{1}{K^2} $$
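
This ratio is the cost of the depthwise separable convolution divided by the cost of the standard convolution. Using the operation counts listed earlier, and assuming both layers share the same \(H\), \(W\), \(K\), \(C_{in}\), and \(C_{out}\), the derivation can be written as:

$$ \frac{H \times W \times K^2 \times C_{in} + H \times W \times C_{in} \times C_{out}}{H \times W \times K^2 \times C_{in} \times C_{out}} = \frac{1}{C_{out}} + \frac{1}{K^2} $$

The spatial size and \(C_{in}\) cancel, so the saving depends only on the kernel size and the number of output channels.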

For typical values (\(K=3\), \(C_{out}=256\)):

$$ \frac{1}{256} + \frac{1}{9} \approx 0.115 $$

The separable version therefore needs about 11.5% of the multiply-adds of a standard convolution, which is roughly 8.7x fewer operations.
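
The arithmetic can be checked with a short Python sketch. The helper names `reduction_ratio` and `op_counts`, and the sizes \(H = W = 56\) and \(C_{in} = 128\), are hypothetical choices for illustration; the ratio does not depend on them.

```python
import math


def reduction_ratio(k, c_out):
    """Separable cost divided by standard cost: 1/C_out + 1/K^2."""
    return 1 / c_out + 1 / (k * k)


def op_counts(h, w, k, c_in, c_out):
    """Multiply-add counts (stride 1, same output size) for both variants."""
    standard = h * w * k * k * c_in * c_out
    depthwise = h * w * k * k * c_in
    pointwise = h * w * c_in * c_out
    return standard, depthwise + pointwise


ratio = reduction_ratio(k=3, c_out=256)
print(f"{ratio:.3f}")       # 0.115
print(f"{1 / ratio:.1f}x")  # 8.7x

# Cross-check the closed form against explicit operation counts.
standard, separable = op_counts(h=56, w=56, k=3, c_in=128, c_out=256)
assert math.isclose(separable / standard, ratio)
```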

Key Benefits

| Benefit | Description |
|---|---|
| Reduced Computation | Significantly fewer multiply-add operations |
| Smaller Model Size | Fewer parameters to store |
| Edge Deployment | Enables deployment on embedded systems |
| Mobile Optimization | Core technique in MobileNet series |
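
The model-size benefit can be illustrated by counting weights from the kernel shapes given earlier. This is a rough sketch that ignores biases and normalization parameters; the function names and the choice \(C_{in} = 128\) are assumptions made for this example.

```python
def standard_params(k, c_in, c_out):
    """Weights in a K x K x C_in x C_out standard convolution kernel."""
    return k * k * c_in * c_out


def separable_params(k, c_in, c_out):
    """Weights in the depthwise kernels plus the 1 x 1 pointwise kernel."""
    depthwise = k * k * c_in  # one K x K x 1 kernel per input channel
    pointwise = c_in * c_out  # 1 x 1 x C_in x C_out
    return depthwise + pointwise


k, c_in, c_out = 3, 128, 256  # c_in = 128 is a hypothetical choice
std = standard_params(k, c_in, c_out)
sep = separable_params(k, c_in, c_out)
print(std, sep, round(sep / std, 3))  # 294912 33920 0.115
```

The parameter ratio equals the operation ratio, \(\frac{1}{C_{out}} + \frac{1}{K^2}\), because both counts differ only by the shared \(H \times W\) factor.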

Applications

  • MobileNet-V1, V2, V3
  • EfficientNet
  • Edge AI / IoT devices
  • Real-time mobile applications