Multiple View Geometry

Structure From Motion

Structure from motion (SfM) is the process of estimating the 3D structure of a scene, together with the camera poses, from a set of 2D images. It is a fundamental problem in computer vision and photogrammetry. The core steps of SfM are:

Feature detection: identify points of interest in each image that can be tracked across the image series;

Feature matching: find scene points seen by multiple cameras;

Reconstruction: robustly estimate camera poses and triangulate the matched points;

Refinement (bundle adjustment): jointly refine the camera poses and the scene structure.
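
The reconstruction and refinement steps can be illustrated with a minimal two-view sketch. It assumes simplified pinhole cameras with identity rotation, a shared focal length, and the principal point at the image origin. The midpoint method used here is only one simple way to triangulate, and the helper names are hypothetical. The reprojection error at the end is the kind of quantity that bundle adjustment minimizes over all cameras and points.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def project(point, cam_center, focal):
    """Pinhole projection for a camera with identity rotation looking down +Z."""
    x, y, z = (p - c for p, c in zip(point, cam_center))
    return (focal * x / z, focal * y / z)

def ray_from_pixel(pixel, focal):
    """Unit direction of the viewing ray through an image point."""
    d = (pixel[0], pixel[1], focal)
    norm = math.sqrt(dot(d, d))
    return tuple(c / norm for c in d)

def triangulate_midpoint(c1, d1, c2, d2):
    """Midpoint of the shortest segment between rays c1 + s*d1 and c2 + t*d2."""
    w = tuple(a - b for a, b in zip(c1, c2))
    a, b, c = dot(d1, d1), dot(d1, d2), dot(d2, d2)
    d, e = dot(d1, w), dot(d2, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12:
        raise ValueError("rays are (nearly) parallel; cannot triangulate")
    s = (b * e - c * d) / denom
    t = (a * e - b * d) / denom
    p1 = tuple(o + s * v for o, v in zip(c1, d1))
    p2 = tuple(o + t * v for o, v in zip(c2, d2))
    return tuple((u + v) / 2.0 for u, v in zip(p1, p2))

def reprojection_error(point, observations, focal):
    """Sum of squared pixel residuals over (camera_center, observed_pixel) pairs."""
    total = 0.0
    for cam_center, observed in observations:
        u, v = project(point, cam_center, focal)
        total += (u - observed[0]) ** 2 + (v - observed[1]) ** 2
    return total
```

With noise-free observations the two rays intersect, so the triangulated point matches the true point and the reprojection error is close to zero. With noisy matches the rays miss each other, which is why a refinement step follows.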

COLMAP

COLMAP is a robust and efficient pipeline for SfM. It builds the reconstruction incrementally, combining local bundle adjustment with global optimization. Details of the algorithm can be found here.

FlowMap

FlowMap is a method for dense 3D reconstruction from a set of images. It is based on optical flow and depth map estimation. Details of the algorithm can be found here.
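
To make the link between optical flow and depth concrete, the sketch below computes the flow that a known depth and relative camera pose would induce at one pixel. This is a generic geometric relation under a pinhole model, not FlowMap's implementation; the function names and the intrinsics (focal, cx, cy) are assumptions made for illustration.

```python
def backproject(pixel, depth, focal, cx, cy):
    """Lift a pixel with known depth to a 3D point in the first camera's frame."""
    u, v = pixel
    return ((u - cx) * depth / focal, (v - cy) * depth / focal, depth)

def transform(point, R, t):
    """Apply a rigid motion (3x3 rotation R, translation t) to a 3D point."""
    return tuple(sum(R[i][j] * point[j] for j in range(3)) + t[i] for i in range(3))

def project(point, focal, cx, cy):
    """Pinhole projection of a 3D point in camera coordinates to pixel coordinates."""
    x, y, z = point
    return (focal * x / z + cx, focal * y / z + cy)

def induced_flow(pixel, depth, R, t, focal, cx, cy):
    """Flow vector a pixel would undergo given its depth and the relative pose."""
    point_cam1 = backproject(pixel, depth, focal, cx, cy)
    point_cam2 = transform(point_cam1, R, t)
    u2, v2 = project(point_cam2, focal, cx, cy)
    return (u2 - pixel[0], v2 - pixel[1])

IDENTITY = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
# Pure sideways motion: nearer points move more in the image (parallax).
near_flow = induced_flow((320.0, 240.0), 2.0, IDENTITY, (-0.1, 0.0, 0.0), 500.0, 320.0, 240.0)
far_flow = induced_flow((320.0, 240.0), 4.0, IDENTITY, (-0.1, 0.0, 0.0), 500.0, 320.0, 240.0)
```

Comparing such an induced flow with the flow measured between two frames gives a signal for how well a candidate depth and pose explain the observed motion.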

Neural Radiance Fields

Neural Radiance Fields (NeRF) is a method for synthesizing novel views of complex scenes. It is based on learning a continuous 5D implicit representation of the scene (3D position plus 2D viewing direction). It involves the following steps:

Take photos of the scene;

Use COLMAP (SfM) to calculate the camera poses, i.e. detect and match features across the images;

Volumetric formulation of NeRF:

Select one pixel and its corresponding ray;

Trace the straight ray through this pixel and sample points along it;

Construct two MLPs: one for the density, i.e. the chance that the ray hits a particle at a given point, and one for the color;

Accumulate the contributions along the ray to get the rendered pixel.

Details of the algorithm can be found here.
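
The accumulation step can be sketched with the standard volume rendering quadrature: each sample gets an opacity alpha = 1 - exp(-sigma * delta), weighted by the transmittance accumulated in front of it. In this minimal sketch, `toy_field` is a hypothetical stand-in for the two MLPs (a uniformly dense red sphere), and the samples are evenly spaced, which is a simplifying assumption.

```python
import math

def toy_field(point):
    """Hypothetical stand-in for the density and color MLPs: a red unit sphere."""
    inside = sum(c * c for c in point) < 1.0
    density = 5.0 if inside else 0.0
    return density, (1.0, 0.0, 0.0)

def render_ray(origin, direction, near, far, n_samples, field):
    """Accumulate color along one ray; returns (rgb, opacity)."""
    delta = (far - near) / n_samples
    transmittance = 1.0  # fraction of light that has not been absorbed so far
    color = [0.0, 0.0, 0.0]
    for i in range(n_samples):
        t = near + (i + 0.5) * delta  # midpoint of the i-th interval
        point = tuple(o + t * d for o, d in zip(origin, direction))
        sigma, rgb = field(point)
        alpha = 1.0 - math.exp(-sigma * delta)  # chance of hitting a particle here
        weight = transmittance * alpha
        for k in range(3):
            color[k] += weight * rgb[k]
        transmittance *= 1.0 - alpha
    return tuple(color), 1.0 - transmittance

# One ray through the sphere's center and one that misses it entirely.
hit_rgb, hit_opacity = render_ray((0.0, 0.0, -3.0), (0.0, 0.0, 1.0), 0.0, 6.0, 600, toy_field)
miss_rgb, miss_opacity = render_ray((2.0, 0.0, -3.0), (0.0, 0.0, 1.0), 0.0, 6.0, 600, toy_field)
```

In an actual NeRF the field is learned, and the rendered pixel is compared with the photo's pixel to train the MLPs; because every operation above is differentiable, gradients can flow back through the accumulation.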

A Framework Unifying All 3D Vision Tasks

DUSt3R

DUSt3R is a method for dense 3D reconstruction from a set of images. It is based on depth map estimation and surface reconstruction. The input to the model is multiple camera views (without known intrinsics), and the output is the corresponding point maps. Details of the algorithm can be found here.
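
To illustrate what a point map is, the sketch below builds one from a depth map (one 3D point per pixel, in the camera frame) and then shows that a focal length can be recovered from the point map alone. This is not DUSt3R's network or its estimator: the function names are hypothetical, the principal point is assumed known, and the plain least-squares fit is an assumption of this sketch.

```python
def depth_to_pointmap(depth, focal, cx, cy):
    """Build an H x W grid of (X, Y, Z) points from a depth map and intrinsics."""
    return [
        [((u - cx) * z / focal, (v - cy) * z / focal, z) for u, z in enumerate(row)]
        for v, row in enumerate(depth)
    ]

def estimate_focal(pointmap, cx, cy):
    """Least-squares focal length such that (u - cx, v - cy) ~ f * (X/Z, Y/Z)."""
    num = 0.0
    den = 0.0
    for v, row in enumerate(pointmap):
        for u, (x, y, z) in enumerate(row):
            a, b = x / z, y / z
            num += (u - cx) * a + (v - cy) * b
            den += a * a + b * b
    return num / den

# A small synthetic depth map (4 rows, 6 columns) with varying depths.
depth = [[2.0 + 0.1 * u + 0.05 * v for u in range(6)] for v in range(4)]
pointmap = depth_to_pointmap(depth, focal=300.0, cx=2.5, cy=1.5)
recovered_focal = estimate_focal(pointmap, cx=2.5, cy=1.5)
```

Because each pixel's 3D point lies on that pixel's viewing ray, the point map implicitly encodes the camera intrinsics, which is why a model that predicts point maps does not need them as input.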
