Segmentation types
| Type | Description | Example |
|---|---|---|
| Semantic | Each pixel gets a class label; no distinction between instances | All cars are “car” |
| Instance | Separate mask per object instance | Car #1, Car #2 |
| Panoptic | Combines semantic + instance segmentation | Background classes + counted objects |
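The three label formats can be illustrated with tiny arrays. This is a minimal sketch with assumed shapes and an assumed panoptic encoding (`class_id * 1000 + instance_id`); real datasets use their own conventions.

```python
import numpy as np

# Semantic mask: one class id per pixel; both cars share the label 1.
semantic = np.array([
    [0, 1, 0, 1],
    [0, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
])

# Instance masks: one binary mask per object instance.
car_1 = (semantic == 1) & (np.arange(4)[None, :] < 2)   # left car
car_2 = (semantic == 1) & (np.arange(4)[None, :] >= 2)  # right car
instances = np.stack([car_1, car_2])  # shape (2, 4, 4)

# Panoptic: class id plus instance id per pixel (assumed encoding).
panoptic = semantic * 1000 + car_1 * 1 + car_2 * 2
```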
UNet architecture
UNet follows an encoder-decoder design with skip connections that copy feature maps from the encoder directly to the corresponding decoder level.
Encoder path (contracting)
The encoder applies repeated blocks of:
- Two convolutions + ReLU
- Max pooling (stride 2) — halves spatial dimensions, doubles channels
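One encoder step can be sketched in PyTorch as follows; the channel sizes (64 → 128) and input resolution are assumptions for illustration.

```python
import torch
import torch.nn as nn

def double_conv(in_ch, out_ch):
    # Two 3x3 convolutions with ReLU; padding=1 keeps the spatial size.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, kernel_size=3, padding=1),
        nn.ReLU(inplace=True),
    )

block = double_conv(64, 128)
pool = nn.MaxPool2d(kernel_size=2, stride=2)  # halves H and W

x = torch.randn(1, 64, 64, 64)
skip = block(x)   # (1, 128, 64, 64) — saved for the skip connection
down = pool(skip) # (1, 128, 32, 32) — fed to the next encoder level
```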
Bottleneck
The deepest layer captures the most abstract, high-level features without further spatial pooling.
Decoder path (expanding)
The decoder mirrors the encoder:
- Transposed convolution (upsampling) — doubles spatial dimensions, halves channels
- Concatenation with the skip connection from the corresponding encoder level
- Two convolutions + ReLU
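The three decoder steps above can be sketched like this; the channel sizes (256 from below, 128 from the skip) are assumed for illustration.

```python
import torch
import torch.nn as nn

up = nn.ConvTranspose2d(256, 128, kernel_size=2, stride=2)  # doubles H and W
conv = nn.Sequential(
    nn.Conv2d(256, 128, kernel_size=3, padding=1),  # 256 = 128 upsampled + 128 skip
    nn.ReLU(inplace=True),
    nn.Conv2d(128, 128, kernel_size=3, padding=1),
    nn.ReLU(inplace=True),
)

bottom = torch.randn(1, 256, 16, 16)  # features from the level below
skip = torch.randn(1, 128, 32, 32)    # features from the matching encoder level

x = up(bottom)                   # (1, 128, 32, 32)
x = torch.cat([x, skip], dim=1)  # (1, 256, 32, 32)
out = conv(x)                    # (1, 128, 32, 32)
```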
Output layer
A 1×1 convolution maps the 64-channel feature maps to num_classes channels, followed by softmax (or sigmoid for binary segmentation).
PyTorch implementation
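A compact UNet sketch is given below. The depth (three levels), channel widths, and the use of padded convolutions (so no cropping of skip connections is needed) are assumptions; the original paper uses unpadded convolutions and crops the encoder features before concatenation.

```python
import torch
import torch.nn as nn

def double_conv(in_ch, out_ch):
    # Two 3x3 convs + ReLU; padding=1 preserves spatial size.
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
        nn.Conv2d(out_ch, out_ch, 3, padding=1), nn.ReLU(inplace=True),
    )

class UNet(nn.Module):
    def __init__(self, in_channels=3, num_classes=2):
        super().__init__()
        self.enc1 = double_conv(in_channels, 64)
        self.enc2 = double_conv(64, 128)
        self.enc3 = double_conv(128, 256)
        self.pool = nn.MaxPool2d(2)
        self.bottleneck = double_conv(256, 512)
        self.up3 = nn.ConvTranspose2d(512, 256, 2, stride=2)
        self.dec3 = double_conv(512, 256)
        self.up2 = nn.ConvTranspose2d(256, 128, 2, stride=2)
        self.dec2 = double_conv(256, 128)
        self.up1 = nn.ConvTranspose2d(128, 64, 2, stride=2)
        self.dec1 = double_conv(128, 64)
        self.head = nn.Conv2d(64, num_classes, 1)  # 1x1 conv to class logits

    def forward(self, x):
        s1 = self.enc1(x)
        s2 = self.enc2(self.pool(s1))
        s3 = self.enc3(self.pool(s2))
        b = self.bottleneck(self.pool(s3))
        d3 = self.dec3(torch.cat([self.up3(b), s3], dim=1))
        d2 = self.dec2(torch.cat([self.up2(d3), s2], dim=1))
        d1 = self.dec1(torch.cat([self.up1(d2), s1], dim=1))
        return self.head(d1)  # raw logits; apply softmax/sigmoid outside

model = UNet(in_channels=3, num_classes=2)
logits = model(torch.randn(1, 3, 64, 64))  # (1, 2, 64, 64)
```

Returning raw logits keeps the model compatible with `nn.CrossEntropyLoss`, which applies log-softmax internally.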
Loss functions for segmentation
Cross-entropy loss
The standard choice for multi-class segmentation, applied pixel-wise:

$$\mathcal{L}_{CE} = -\frac{1}{N}\sum_{i=1}^{N}\sum_{c=1}^{C} y_{i,c}\,\log \hat{y}_{i,c}$$

It can be weighted per class to handle class imbalance (background vs. small objects).
Dice loss
Dice loss directly optimizes the Dice coefficient (F1 score at pixel level), which is more robust to class imbalance:

$$\mathcal{L}_{Dice} = 1 - \frac{2\sum_{i} p_i g_i}{\sum_{i} p_i + \sum_{i} g_i}$$

where $p_i$ are predicted probabilities and $g_i$ are ground truth labels.
Training and evaluation
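The two losses above can be combined in a single training step. This is a minimal sketch: the class weights, tensor sizes, and the soft-Dice formulation over the foreground channel are assumptions, and random tensors stand in for a real model and dataset.

```python
import torch
import torch.nn as nn

def dice_loss(logits, targets, eps=1.0):
    # Soft Dice over the foreground channel of a 2-class problem.
    probs = torch.softmax(logits, dim=1)[:, 1]  # (N, H, W)
    targets = targets.float()
    inter = (probs * targets).sum(dim=(1, 2))
    union = probs.sum(dim=(1, 2)) + targets.sum(dim=(1, 2))
    return (1 - (2 * inter + eps) / (union + eps)).mean()

# Weighted cross-entropy: up-weight the rare foreground class (weights assumed).
ce = nn.CrossEntropyLoss(weight=torch.tensor([0.2, 0.8]))

logits = torch.randn(4, 2, 32, 32, requires_grad=True)  # stand-in for model output
targets = torch.randint(0, 2, (4, 32, 32))              # ground-truth class ids

loss = ce(logits, targets) + dice_loss(logits, targets)
loss.backward()  # gradients flow through both terms
```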
Resources
UNet Segmentation Examples
Colab notebook with UNet segmentation on real datasets.
Exercise E08: Segmentation with UNet
Hands-on exercise: train UNet for image segmentation.
Original UNet Paper
Ronneberger et al. (2015) — the original UNet paper for biomedical image segmentation.
Video: UNet, GAN & Anomaly Detection
Recorded lecture covering UNet, GANs, and anomaly detection.
