
Measurements
Computer Vision–Based 3D Measurement and Spatial Quantification from Laparoscopic Videos

Accurate spatial measurements during laparoscopic surgery are essential for surgical planning, intra-operative decision making, and objective documentation of outcomes. Surgeons frequently estimate distances, areas, and volumes (e.g., defect size, resection margins, anatomical spacing) based on visual judgment, which can be subjective and inconsistent.
The aim of this project is to develop a computer vision system that extracts 3D measurements from laparoscopic video. Using monocular laparoscopic footage, students will design, implement, and evaluate vision-based methods that reconstruct scene geometry and compute clinically relevant spatial measurements from video data.
Mentor Details:
Prof. Yoav Mintz
Problem Statement
Laparoscopic videos present significant challenges for reliable 3D measurement:
Monocular video with limited or unknown scale
Moving camera with changing zoom and orientation
Non-rigid, deformable anatomy
Specular reflections and texture-poor surfaces
Partial occlusion by surgical instruments
Inferring accurate 3D structure and scale from such data is non-trivial. The problem is to design a vision-based system that can robustly estimate depth and spatial dimensions from laparoscopic video frames or sequences under real surgical conditions.
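The scale limitation listed above can be illustrated with a minimal sketch, assuming an ideal pinhole camera with no lens distortion. The intrinsics (`FOCAL_PX`, `CX`, `CY`) and the point coordinates are hypothetical values chosen for illustration, not parameters of any real laparoscope. A 20 mm structure at 50 mm from the camera and a 40 mm structure at 100 mm project to exactly the same pixels, so a single monocular frame cannot distinguish them.

```python
def project(point, focal_px, cx, cy):
    """Pinhole projection of a camera-frame 3D point (X, Y, Z) to pixel coordinates."""
    x, y, z = point
    return (focal_px * x / z + cx, focal_px * y / z + cy)

# Hypothetical intrinsics, for illustration only.
FOCAL_PX, CX, CY = 800.0, 320.0, 240.0

# Two endpoints of a structure 20 mm wide, 50 mm from the camera.
near_a, near_b = (-10.0, 0.0, 50.0), (10.0, 0.0, 50.0)

# The same scene scaled by 2: 40 mm wide, 100 mm from the camera.
far_a = tuple(2 * c for c in near_a)
far_b = tuple(2 * c for c in near_b)

pixels_near = [project(p, FOCAL_PX, CX, CY) for p in (near_a, near_b)]
pixels_far = [project(p, FOCAL_PX, CX, CY) for p in (far_a, far_b)]

# Both scenes land on identical pixels, so image evidence alone cannot fix the scale.
print(pixels_near == pixels_far)  # True
```

This is why metric measurement needs an external reference, such as an object of known size in the field of view.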
Project Objectives
Students will aim to:
Analyze laparoscopic video data and understand camera geometry constraints
Develop a computer vision pipeline for depth estimation and 3D reconstruction
Apply deep learning–based models (e.g., monocular depth networks, neural SLAM)
Leverage temporal consistency to stabilize measurements across frames
Quantitatively evaluate 3D measurement accuracy
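One simple way to use temporal consistency, sketched here as an assumption rather than a prescribed method, is a sliding median over per-frame measurements. It suppresses isolated outliers, for example a frame where a specular highlight corrupts the depth estimate. The function name `rolling_median`, the window size, and the sample values are hypothetical.

```python
from collections import deque
from statistics import median

def rolling_median(values, window=5):
    """Smooth a per-frame measurement sequence with a causal sliding median."""
    recent = deque(maxlen=window)  # keeps only the last `window` measurements
    smoothed = []
    for value in values:
        recent.append(value)
        smoothed.append(median(recent))
    return smoothed

# Hypothetical per-frame estimates of one distance (mm); frame 4 is an outlier.
per_frame_mm = [20.1, 19.8, 20.3, 35.0, 20.0, 19.9, 20.2]
stable_mm = rolling_median(per_frame_mm, window=5)
print(stable_mm)
```

A median is only one option. A Kalman filter or a learned temporal model may be more appropriate when the anatomy deforms between frames, because the true measurement then changes over time.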
Technical Scope
The project may include one or more of the following components:
Monocular or stereo depth estimation from laparoscopic video
Camera calibration and scale recovery using known instruments or markers
Structure-from-motion or SLAM-based 3D reconstruction
Geometric measurement of distances, areas, or volumes
Temporal filtering and uncertainty estimation
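Scale recovery and distance measurement from the list above might be combined as in the following sketch. It assumes a calibrated pinhole camera, a depth map in arbitrary (relative) units, and an instrument shaft of known 5 mm diameter visible in the frame. All pixel coordinates, depths, intrinsics, and function names are hypothetical illustrations.

```python
import math

def backproject(u, v, depth, focal_px, cx, cy):
    """Lift pixel (u, v) with depth along the optical axis to a camera-frame 3D point."""
    return ((u - cx) * depth / focal_px, (v - cy) * depth / focal_px, depth)

def scale_from_reference(point_a, point_b, known_length_mm):
    """Millimetres per reconstruction unit, from two 3D points spanning a known length."""
    return known_length_mm / math.dist(point_a, point_b)

# Hypothetical intrinsics, for illustration only.
FOCAL_PX, CX, CY = 800.0, 320.0, 240.0

# Opposite edges of an instrument shaft, assumed to be 5 mm in diameter,
# at relative depth 0.8.
tool_a = backproject(295, 240, 0.8, FOCAL_PX, CX, CY)
tool_b = backproject(345, 240, 0.8, FOCAL_PX, CX, CY)
mm_per_unit = scale_from_reference(tool_a, tool_b, known_length_mm=5.0)

# Two endpoints of a defect on the tissue surface, at relative depth 1.0.
defect_a = backproject(240, 240, 1.0, FOCAL_PX, CX, CY)
defect_b = backproject(400, 240, 1.0, FOCAL_PX, CX, CY)
defect_mm = math.dist(defect_a, defect_b) * mm_per_unit

print(round(mm_per_unit, 3), round(defect_mm, 3))
```

The sketch treats the relative depth map as consistent up to one global scale factor. Many monocular depth networks predict depth up to an unknown scale and shift, in which case a single reference length is not enough and additional constraints are needed.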
Required Knowledge and Prerequisites
Core Requirements
Familiarity with fundamental computer vision concepts
Experience with convolutional neural networks (CNNs)
Basic understanding of deep learning frameworks (e.g., PyTorch, TensorFlow)
Ability to work with image and video datasets
Recommended Background
Camera geometry and projective transformations
Depth estimation and 3D reconstruction
Video-based modeling and tracking
Evaluation metrics for regression and spatial accuracy
No prior surgical knowledge is required; relevant procedural context will be provided.
Project Difficulty and Expected Level
Vision complexity: High (monocular depth, deformable scenes)
Modeling complexity: High
Domain knowledge: Low
This project is well suited to teams of 2–4 students.
Expected Outcomes
A working computer vision prototype for 3D measurement from laparoscopic video
Quantitative evaluation against reference measurements or phantoms
Analysis of accuracy, robustness, and failure modes
Well-documented codebase and a technical report
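The quantitative evaluation listed above could report, for example, mean absolute error, root-mean-square error, and mean absolute relative error against reference measurements. The sketch below is one possible formulation; the function name `measurement_errors` and the sample values are hypothetical and do not come from any dataset.

```python
import math

def measurement_errors(estimated_mm, reference_mm):
    """Summarize errors of estimated lengths against reference lengths (both in mm)."""
    if not estimated_mm or len(estimated_mm) != len(reference_mm):
        raise ValueError("inputs must be non-empty and of equal length")
    diffs = [est - ref for est, ref in zip(estimated_mm, reference_mm)]
    n = len(diffs)
    return {
        "mae_mm": sum(abs(d) for d in diffs) / n,
        "rmse_mm": math.sqrt(sum(d * d for d in diffs) / n),
        # Relative error makes short and long reference lengths comparable.
        "mean_abs_rel": sum(abs(d) / ref for d, ref in zip(diffs, reference_mm)) / n,
    }

# Hypothetical estimates versus phantom reference lengths (mm).
errors = measurement_errors([19.0, 31.0, 52.0], [20.0, 30.0, 50.0])
print(errors)
```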