When a 3D scene is observed from multiple cameras (or from a single moving camera), geometric constraints link the corresponding image points across views. Exploiting these constraints is the basis of stereo vision, structure from motion, and multi-view 3D reconstruction.

Epipolar geometry

Consider two cameras with centres $\mathbf{C}_1$ and $\mathbf{C}_2$ observing a 3D point $\mathbf{M}$. The three points $\mathbf{C}_1$, $\mathbf{C}_2$, and $\mathbf{M}$ define the epipolar plane. Its intersections with the two image planes are the epipolar lines $\ell_1$ and $\ell_2$. Key insight: given a point $\mathbf{m}_1$ in image 1, its corresponding point $\mathbf{m}_2$ in image 2 must lie on the epipolar line $\ell_2 = F\,\mathbf{m}_1$, where $F$ is the fundamental matrix defined below. This reduces the correspondence search from 2D to 1D.

Epipoles

The epipole $\mathbf{e}_i$ is the projection of the other camera's centre into image $i$:
  • $\mathbf{e}_2 = P_2\,\mathbf{C}_1$: projection of $\mathbf{C}_1$ into camera 2
  • $\mathbf{e}_1 = P_1\,\mathbf{C}_2$: projection of $\mathbf{C}_2$ into camera 1
All epipolar lines in image 2 pass through $\mathbf{e}_2$, and all epipolar lines in image 1 pass through $\mathbf{e}_1$.
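As a concrete check, the epipoles can be computed for the two cameras defined in the MATLAB example further down. The sketch below uses Python with NumPy (an assumption; the page's own code is MATLAB), and the helper names `camera_centre` and `dehomogenise` are illustrative, not part of the course material. It relies on the fact that the camera centre is the right null vector of the $3\times 4$ projection matrix.

```python
import numpy as np

def camera_centre(P):
    """Homogeneous camera centre: the right null vector of the 3x4 matrix P."""
    _, _, Vt = np.linalg.svd(P)
    return Vt[-1]

def dehomogenise(x):
    """Convert homogeneous coordinates to Cartesian ones."""
    return x[:-1] / x[-1]

# Camera matrices built from the same factors as the MATLAB example
Pf = np.array([[700, 0, 0, 0], [0, 700, 0, 0], [0, 0, 1, 0]], dtype=float)
H1 = np.array([[0, 0, 1, 0], [1, 0, 0, -350], [0, 1, 0, 0], [0, 0, 0, 1]], dtype=float)
K1 = np.array([[1, 0, 0], [0, 1, 350], [0, 0, 1]], dtype=float)
H2 = np.array([[0, 1, 0, -350], [0, 0, 1, 0], [1, 0, 0, 0], [0, 0, 0, 1]], dtype=float)
K2 = np.array([[1, 0, 350], [0, 1, 0], [0, 0, 1]], dtype=float)
A = K1 @ Pf @ H1   # camera 1
B = K2 @ Pf @ H2   # camera 2

C1 = camera_centre(A)
C2 = camera_centre(B)

e1 = dehomogenise(A @ C2)   # camera-2 centre seen in image 1
e2 = dehomogenise(B @ C1)   # camera-1 centre seen in image 2

print("C1 =", dehomogenise(C1), " C2 =", dehomogenise(C2))
print("e1 =", e1, " e2 =", e2)
```

For these two cameras the centres work out to approximately $(350, 0, 0)$ and $(0, 350, 0)$, and the epipoles to approximately $(0, -350)$ in image 1 and $(-350, 0)$ in image 2.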

Fundamental matrix $F$

The fundamental matrix $F$ is the $3\times 3$ matrix that encodes the epipolar geometry between two uncalibrated cameras:

$$\mathbf{m}_2^\top F\,\mathbf{m}_1 = 0 \quad \forall \text{ corresponding pairs } (\mathbf{m}_1, \mathbf{m}_2)$$

Properties:
  • $F$ has rank 2 (it is singular) and 7 degrees of freedom: 9 entries, minus 1 for the overall scale, minus 1 for the constraint $\det F = 0$.
  • Epipolar lines: $\ell_2 = F\,\mathbf{m}_1$ and $\ell_1 = F^\top\,\mathbf{m}_2$.
  • Epipoles satisfy $F\,\mathbf{e}_1 = \mathbf{0}$ and $F^\top\,\mathbf{e}_2 = \mathbf{0}$.
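The properties above can be checked numerically. The sketch below is Python with NumPy (an assumption; the page's own code is MATLAB) and builds $F$ from two known projection matrices with the textbook formula $F = [\mathbf{e}_2]_\times P_2 P_1^{+}$, where $P_1^{+}$ is the pseudo-inverse. That formula and the helper names are not taken from this page; the cameras and the 3D point are those of the MATLAB example further down.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x, so that skew(v) @ w == np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def null_vector(M):
    """Right singular vector associated with the smallest singular value."""
    return np.linalg.svd(M)[2][-1]

def fundamental_from_cameras(P1, P2):
    """F = [e2]_x P2 pinv(P1), scaled to unit Frobenius norm."""
    e2 = P2 @ null_vector(P1)          # epipole in image 2
    F = skew(e2) @ P2 @ np.linalg.pinv(P1)
    return F / np.linalg.norm(F)

# Cameras from the MATLAB example
Pf = np.array([[700, 0, 0, 0], [0, 700, 0, 0], [0, 0, 1, 0]], dtype=float)
H1 = np.array([[0, 0, 1, 0], [1, 0, 0, -350], [0, 1, 0, 0], [0, 0, 0, 1]], dtype=float)
K1 = np.array([[1, 0, 0], [0, 1, 350], [0, 0, 1]], dtype=float)
H2 = np.array([[0, 1, 0, -350], [0, 0, 1, 0], [1, 0, 0, 0], [0, 0, 0, 1]], dtype=float)
K2 = np.array([[1, 0, 350], [0, 1, 0], [0, 0, 1]], dtype=float)
A = K1 @ Pf @ H1
B = K2 @ Pf @ H2

F = fundamental_from_cameras(A, B)

# Project the example point into both images (homogeneous, last entry 1)
M = np.array([350.0, 500.0, 200.0, 1.0])
m1 = A @ M
m1 = m1 / m1[2]
m2 = B @ M
m2 = m2 / m2[2]

# Epipolar line of m1 in image 2, scaled so m2 @ line2 is a pixel distance
line2 = F @ m1
line2 = line2 / np.linalg.norm(line2[:2])
distance = abs(m2 @ line2)

e1 = null_vector(F)      # right null vector: epipole in image 1
e2 = null_vector(F.T)    # left null vector: epipole in image 2

print("distance of m2 to its epipolar line (px):", distance)
print("det(F):", np.linalg.det(F))
print("|F e1|:", np.linalg.norm(F @ e1), " |F^T e2|:", np.linalg.norm(F.T @ e2))
```

All printed quantities should be zero up to floating-point error.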

8-point algorithm

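Each correspondence $\mathbf{m}_1^{(i)} \leftrightarrow \mathbf{m}_2^{(i)}$ gives one linear equation in the 9 entries of $F$. With 8 or more correspondences, the equations stack into a homogeneous system:

$$A\,\mathbf{f} = \mathbf{0}, \qquad \mathbf{f} = \text{vec}(F)$$

Solve via SVD: $\mathbf{f}$ is the right singular vector of $A$ associated with the smallest singular value. Then enforce rank 2 by zeroing the smallest singular value of the resulting $3\times 3$ matrix. In practice the image points are usually normalised first (centroid at the origin, mean distance $\sqrt{2}$ from it), which greatly improves the conditioning of $A$.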
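A minimal sketch of the normalised 8-point algorithm follows, in Python with NumPy (an assumption; the page's own code is MATLAB). The function names, the random seed, and the synthetic 3D points are hypothetical; the two cameras are those of the MATLAB example further down. Here $\text{vec}(F)$ is taken row by row, so each row of the design matrix is the Kronecker product of $\mathbf{m}_2$ and $\mathbf{m}_1$.

```python
import numpy as np

def normalise_points(pts):
    """Hartley normalisation: centroid to the origin, mean distance sqrt(2)."""
    centroid = pts.mean(axis=0)
    scale = np.sqrt(2) / np.linalg.norm(pts - centroid, axis=1).mean()
    T = np.array([[scale, 0.0, -scale * centroid[0]],
                  [0.0, scale, -scale * centroid[1]],
                  [0.0, 0.0, 1.0]])
    homog = np.column_stack([pts, np.ones(len(pts))])
    return (T @ homog.T).T, T

def eight_point(pts1, pts2):
    """Estimate F from N >= 8 correspondences given as (N, 2) pixel arrays."""
    x1, T1 = normalise_points(pts1)
    x2, T2 = normalise_points(pts2)
    # One row per correspondence: kron(m2, m1) . vec(F) = m2^T F m1 = 0
    design = np.stack([np.kron(b, a) for a, b in zip(x1, x2)])
    F = np.linalg.svd(design)[2][-1].reshape(3, 3)
    # Enforce rank 2 by zeroing the smallest singular value
    U, S, Vt = np.linalg.svd(F)
    F = U @ np.diag([S[0], S[1], 0.0]) @ Vt
    # Undo the normalisation and fix the arbitrary scale
    F = T2.T @ F @ T1
    return F / np.linalg.norm(F)

def project(P, Xh):
    """Project homogeneous 3D points (N, 4) to pixel coordinates (N, 2)."""
    x = (P @ Xh.T).T
    return x[:, :2] / x[:, 2:]

# Cameras from the MATLAB example
Pf = np.array([[700, 0, 0, 0], [0, 700, 0, 0], [0, 0, 1, 0]], dtype=float)
H1 = np.array([[0, 0, 1, 0], [1, 0, 0, -350], [0, 1, 0, 0], [0, 0, 0, 1]], dtype=float)
K1 = np.array([[1, 0, 0], [0, 1, 350], [0, 0, 1]], dtype=float)
H2 = np.array([[0, 1, 0, -350], [0, 0, 1, 0], [1, 0, 0, 0], [0, 0, 0, 1]], dtype=float)
K2 = np.array([[1, 0, 350], [0, 1, 0], [0, 0, 1]], dtype=float)
A = K1 @ Pf @ H1
B = K2 @ Pf @ H2

# Hypothetical noise-free scene: 20 random points in front of both cameras
rng = np.random.default_rng(0)
X = rng.uniform([200, 200, -300], [800, 800, 300], size=(20, 3))
Xh = np.column_stack([X, np.ones(len(X))])
pts1 = project(A, Xh)
pts2 = project(B, Xh)

F = eight_point(pts1, pts2)

# Algebraic residuals m2^T F m1 for every correspondence
h1 = np.column_stack([pts1, np.ones(len(pts1))])
h2 = np.column_stack([pts2, np.ones(len(pts2))])
residuals = np.einsum("ni,ij,nj->n", h2, F, h1)
print("max |m2^T F m1|:", np.abs(residuals).max())
```

With noise-free correspondences the residuals vanish up to rounding. With real, noisy matches the linear estimate is typically only a starting point, and outliers need a robust wrapper such as RANSAC.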

Essential matrix $E$

For calibrated cameras (intrinsic matrices $K_1$, $K_2$ known), the essential matrix $E$ relates normalised image coordinates:

$$\hat{\mathbf{m}}_2^\top E\,\hat{\mathbf{m}}_1 = 0, \qquad \hat{\mathbf{m}}_i = K_i^{-1}\mathbf{m}_i$$

$E$ and $F$ are related by:

$$E = K_2^\top F\,K_1$$

$E$ has 5 degrees of freedom (3 for rotation, 2 for translation direction) and satisfies

$$E E^\top E = \frac{1}{2}\,\text{trace}(E E^\top)\,E.$$
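The relations above can be verified on a small synthetic setup. The sketch below is Python with NumPy (an assumption; the page's own code is MATLAB). It uses the standard construction $E = [\mathbf{t}]_\times R$ for a second camera with pose $\mathbf{x}_2 = R\,\mathbf{x}_1 + \mathbf{t}$, which is not stated on this page, and the rotation, translation, intrinsics, and 3D point are all hypothetical values chosen for illustration.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x, so that skew(v) @ w == np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

# Hypothetical relative pose: 10 degree rotation about the y axis plus a translation
theta = np.deg2rad(10.0)
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([1.0, 0.0, 0.2])
E = skew(t) @ R

# Hypothetical intrinsics, shared by both cameras
K1 = np.array([[700.0, 0.0, 350.0], [0.0, 700.0, 350.0], [0.0, 0.0, 1.0]])
K2 = K1.copy()

# A 3D point in camera-1 coordinates and its two homogeneous pixel projections
X = np.array([0.3, -0.2, 4.0])
m1 = K1 @ X
m2 = K2 @ (R @ X + t)

# Normalised coordinates m_hat = K^{-1} m satisfy the essential constraint
m1_hat = np.linalg.solve(K1, m1)
m2_hat = np.linalg.solve(K2, m2)
residual_E = m2_hat @ E @ m1_hat

# E = K2^T F K1  <=>  F = K2^{-T} E K1^{-1}; F acts on pixel coordinates
F = np.linalg.inv(K2).T @ E @ np.linalg.inv(K1)
residual_F = m2 @ F @ m1

# Cubic constraint E E^T E - 0.5 trace(E E^T) E = 0
cubic = E @ E.T @ E - 0.5 * np.trace(E @ E.T) * E
singular_values = np.linalg.svd(E, compute_uv=False)

print("m2_hat^T E m1_hat:", residual_E)
print("m2^T F m1:", residual_F)
print("max |cubic constraint|:", np.abs(cubic).max())
print("singular values of E:", singular_values)
```

The two non-zero singular values of $E$ come out equal and the third is zero, which is the property the cubic constraint encodes.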

Triangulation and 3D reconstruction

Given the projection matrices $P_1$, $P_2$ and a correspondence $\mathbf{m}_1 \leftrightarrow \mathbf{m}_2$, the 3D point $\mathbf{M}$ is recovered by triangulation:

$$\mathbf{m}_1 \sim P_1\mathbf{M} \qquad \text{and} \qquad \mathbf{m}_2 \sim P_2\mathbf{M}$$

Here $\sim$ denotes equality up to scale. Eliminating the unknown scale factors gives a homogeneous linear system in $\mathbf{M}$, solvable via SVD (DLT). Due to noise, the rays back-projected through $\mathbf{m}_1$ and $\mathbf{m}_2$ generally do not intersect exactly; the optimal $\mathbf{M}$ minimises the sum of squared reprojection errors.
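A minimal DLT triangulation sketch follows, in Python with NumPy (an assumption; the page's own code is MATLAB). The function name `triangulate_dlt` is illustrative. The cameras are those of the MATLAB example further down, and the image points $(280, 350)$ and $(650, 400)$ are the projections of the example point $\mathbf{M} = (350, 500, 200)$ in those cameras.

```python
import numpy as np

def triangulate_dlt(P1, P2, m1, m2):
    """Linear triangulation of one point from two views.

    Each view contributes two equations, u * p3 - p1 = 0 and v * p3 - p2 = 0,
    where p1, p2, p3 are the rows of the projection matrix.
    """
    rows = np.stack([m1[0] * P1[2] - P1[0],
                     m1[1] * P1[2] - P1[1],
                     m2[0] * P2[2] - P2[0],
                     m2[1] * P2[2] - P2[1]])
    Xh = np.linalg.svd(rows)[2][-1]    # null vector of the 4x4 system
    return Xh[:3] / Xh[3]

def reprojection_error(P, X, m):
    """Pixel distance between the projection of X and the observation m."""
    x = P @ np.append(X, 1.0)
    return np.linalg.norm(x[:2] / x[2] - m)

# Cameras from the MATLAB example
Pf = np.array([[700, 0, 0, 0], [0, 700, 0, 0], [0, 0, 1, 0]], dtype=float)
H1 = np.array([[0, 0, 1, 0], [1, 0, 0, -350], [0, 1, 0, 0], [0, 0, 0, 1]], dtype=float)
K1 = np.array([[1, 0, 0], [0, 1, 350], [0, 0, 1]], dtype=float)
H2 = np.array([[0, 1, 0, -350], [0, 0, 1, 0], [1, 0, 0, 0], [0, 0, 0, 1]], dtype=float)
K2 = np.array([[1, 0, 350], [0, 1, 0], [0, 0, 1]], dtype=float)
A = K1 @ Pf @ H1
B = K2 @ Pf @ H2

m1 = np.array([280.0, 350.0])   # observation in camera 1
m2 = np.array([650.0, 400.0])   # observation in camera 2

M_est = triangulate_dlt(A, B, m1, m2)
err1 = reprojection_error(A, M_est, m1)
err2 = reprojection_error(B, M_est, m2)

print("triangulated point:", M_est)
print("reprojection errors (px):", err1, err2)
```

With exact observations the estimate coincides with the original point up to rounding. With noisy observations the DLT solution minimises an algebraic error, so it is commonly used to initialise a nonlinear refinement of the reprojection error.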

Trifocal geometry

With three views and projection matrices $P_1$, $P_2$, $P_3$, a point visible in views 1 and 2 can be located in view 3 using two fundamental matrices:

$$\ell_{13} = F_{13}\,\mathbf{m}_1, \qquad \ell_{23} = F_{23}\,\mathbf{m}_2, \qquad \mathbf{m}_3 = \ell_{13} \times \ell_{23}$$

Here $\ell_{13}$ and $\ell_{23}$ are the epipolar lines in view 3 induced by $\mathbf{m}_1$ and $\mathbf{m}_2$. The point in the third view is their intersection, computed as the cross product of the two line vectors.
This makes 3D localisation from three views straightforward: click a point in views 1 and 2, compute the two epipolar lines in view 3, and intersect them to get the predicted location, without explicit triangulation. The transfer breaks down when the two lines coincide, which happens for 3D points on the plane through the three camera centres.
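The transfer can be sketched as follows, in Python with NumPy (an assumption; the page's own code is MATLAB). Cameras 1 and 2 and the 3D point are those of the MATLAB example further down. The third camera (`H3`, `K3`) is hypothetical and was chosen so that the point does not lie on the plane through the three centres. The fundamental matrices are built from the projection matrices with the textbook formula $F = [\mathbf{e}]_\times P' P^{+}$, which is not taken from this page.

```python
import numpy as np

def skew(v):
    """Cross-product matrix [v]_x, so that skew(v) @ w == np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def fundamental_from_cameras(P_src, P_dst):
    """F mapping points of the source view to epipolar lines of the target view."""
    centre = np.linalg.svd(P_src)[2][-1]          # null vector of P_src
    epipole = P_dst @ centre
    F = skew(epipole) @ P_dst @ np.linalg.pinv(P_src)
    return F / np.linalg.norm(F)

def to_pixel(P, Xh):
    """Project a homogeneous 3D point and return (u, v, 1)."""
    x = P @ Xh
    return x / x[2]

Pf = np.array([[700, 0, 0, 0], [0, 700, 0, 0], [0, 0, 1, 0]], dtype=float)
H1 = np.array([[0, 0, 1, 0], [1, 0, 0, -350], [0, 1, 0, 0], [0, 0, 0, 1]], dtype=float)
K1 = np.array([[1, 0, 0], [0, 1, 350], [0, 0, 1]], dtype=float)
H2 = np.array([[0, 1, 0, -350], [0, 0, 1, 0], [1, 0, 0, 0], [0, 0, 0, 1]], dtype=float)
K2 = np.array([[1, 0, 350], [0, 1, 0], [0, 0, 1]], dtype=float)
# Hypothetical third camera, centred at (0, -200, 300)
H3 = np.array([[0, 0, 1, -300], [1, 0, 0, 0], [0, 1, 0, 200], [0, 0, 0, 1]], dtype=float)
K3 = np.array([[1, 0, 350], [0, 1, 350], [0, 0, 1]], dtype=float)

P1 = K1 @ Pf @ H1
P2 = K2 @ Pf @ H2
P3 = K3 @ Pf @ H3

F13 = fundamental_from_cameras(P1, P3)
F23 = fundamental_from_cameras(P2, P3)

M = np.array([350.0, 500.0, 200.0, 1.0])
m1 = to_pixel(P1, M)
m2 = to_pixel(P2, M)

# Intersect the two epipolar lines in view 3
line13 = F13 @ m1
line23 = F23 @ m2
m3 = np.cross(line13, line23)
m3 = m3[:2] / m3[2]

m3_true = to_pixel(P3, M)[:2]    # direct projection, for comparison only
print("transferred point:", m3)
print("direct projection:", m3_true)
```

The direct projection is computed only to check the result; the transfer itself uses nothing but `m1`, `m2`, and the two fundamental matrices.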

MATLAB code examples

% Two-camera epipolar geometry visualisation
% Camera 1 (left): projection matrix A = K1 * P1 * H1
H1 = [0 0 1 0; 1 0 0 -350; 0 1 0 0; 0 0 0 1];
P1 = [700 0 0 0; 0 700 0 0; 0 0 1 0];
K1 = [1 0 0; 0 1 350; 0 0 1];
A  = K1 * P1 * H1;

% Camera 2 (right): projection matrix B = K2 * P2 * H2
H2 = [0 1 0 -350; 0 0 1 0; 1 0 0 0; 0 0 0 1];
P2 = [700 0 0 0; 0 700 0 0; 0 0 1 0];
K2 = [1 0 350; 0 1 0; 0 0 1];
B  = K2 * P2 * H2;

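% Project a 3D point M into both cameras
M  = [350, 500, 200, 1]';
w1 = A*M;  u1 = w1(1)/w1(3);  v1 = w1(2)/w1(3);
w2 = B*M;  u2 = w2(1)/w2(3);  v2 = w2(2)/w2(3);

fprintf('Point in cam1: (%.1f, %.1f)\n', u1, v1);
fprintf('Point in cam2: (%.1f, %.1f)\n', u2, v2);

% Epipolar line of (u1, v1) in camera 2.
% One standard construction (an assumption here): F = [e2]_x * B * pinv(A)
e2  = B * null(A);                 % epipole: camera-1 centre seen by camera 2
e2x = [0 -e2(3) e2(2); e2(3) 0 -e2(1); -e2(2) e2(1) 0];
F   = e2x * B * pinv(A);
l2  = F * [u1; v1; 1];
l2  = l2 / norm(l2(1:2));          % scale so that [u v 1]*l2 is a distance in pixels
fprintf('Epipolar line in cam2: %.4f*u + %.4f*v + %.4f = 0\n', l2);
fprintf('Distance of cam2 point to that line: %.2e px\n', abs([u2 v2 1] * l2));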

Python resources

Epipolar geometry (Colab)

Interactive notebook: fundamental matrix estimation, epipolar line visualisation, and stereo matching.

3D reconstruction (Colab)

Reconstruct 3D point clouds from stereo image pairs using triangulation.

Trifocal geometry (Colab)

Transfer points across three views using the trifocal tensor and fundamental matrices.

Video lectures

Lecture: Epipolar geometry (2021)

Recorded class on the epipolar constraint, fundamental matrix, and the 8-point algorithm.

Lecture: Trifocal geometry and multiple views (2021)

Recorded class on trifocal geometry, multi-view applications, and 3D reconstruction.

Concepts at a glance

The fundamental matrix $F$ works with pixel coordinates and does not require knowledge of the camera intrinsics. The essential matrix $E$ works with normalised image coordinates, which can only be computed when the intrinsics are known; it has only 5 DOF versus 7 for $F$. If $K_1$ and $K_2$ are known, use $E$; otherwise use $F$.
The epipolar constraint $\mathbf{m}_2^\top F\,\mathbf{m}_1 = 0$ holds for every corresponding pair. Every epipolar line $F\,\mathbf{m}_1$ in image 2 passes through $\mathbf{e}_2$, so $\mathbf{e}_2^\top F\,\mathbf{m}_1 = 0$ for all $\mathbf{m}_1$, which forces $F^\top\,\mathbf{e}_2 = \mathbf{0}$; by the same argument, $F\,\mathbf{e}_1 = \mathbf{0}$. The epipoles are therefore the right and left null vectors of $F$, and a $3\times 3$ matrix with a non-trivial null space is singular, which is why $F$ has rank 2.
Accuracy depends on the baseline (distance between cameras) and the image noise. A wider baseline gives better depth resolution but increases the chance of occlusion. Noise in the correspondences translates directly into 3D error: for a disparity error $\delta d$, the depth error grows as $\delta Z \approx Z^2\,\delta d / (f \cdot b)$, where $f$ is the focal length in pixels and $b$ is the baseline.
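A short numerical illustration of the depth-error formula follows, in Python (an assumption; the page's own code is MATLAB). It assumes a rectified stereo pair, where depth and disparity are related by $Z = f\,b / d$. The baseline of 0.1 m and the disparity error of 0.5 px are hypothetical values; the focal length of 700 px matches the MATLAB example.

```python
def depth_from_disparity(f_px, baseline, disparity_px):
    """Depth of a point in a rectified stereo pair: Z = f * b / d."""
    return f_px * baseline / disparity_px

def depth_error(f_px, baseline, depth, disparity_error_px):
    """First-order depth uncertainty: dZ ~= Z^2 * dd / (f * b)."""
    return depth ** 2 * disparity_error_px / (f_px * baseline)

f_px = 700.0       # focal length in pixels
baseline = 0.1     # hypothetical baseline in metres
dd = 0.5           # hypothetical disparity error in pixels

for depth in (1.0, 2.0, 5.0, 10.0):
    disparity = f_px * baseline / depth
    approx = depth_error(f_px, baseline, depth, dd)
    # Exact change in depth when the disparity is underestimated by dd
    exact = depth_from_disparity(f_px, baseline, disparity - dd) - depth
    print(f"Z = {depth:5.1f} m  d = {disparity:6.2f} px  "
          f"first-order dZ = {approx:.4f} m  exact dZ = {exact:.4f} m")
```

The quadratic growth with $Z$ is visible directly: doubling the depth quadruples the first-order error, while doubling the baseline halves it.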
The trifocal tensor generalises the fundamental matrix to three views. It is a $3\times 3\times 3$ array that encodes all point and line transfer relationships across three views simultaneously. For point transfer using pairs of fundamental matrices (as described in the trifocal geometry section above) the full tensor is not needed, but for line transfer and other constraints it provides a more complete model.