
Identification of the Esophageal Mucosa in Heller Myotomy Procedures
Computer Vision–Based Identification of Esophageal Mucosa in Heller Myotomy Videos

Heller myotomy is a minimally invasive surgical procedure used to treat esophageal achalasia. A critical step in the operation is accurately identifying the esophageal mucosa while cutting the surrounding muscular layers. Failure to correctly identify the mucosa can result in perforation and serious complications.
The aim of this project is to develop a computer vision solution that operates on surgical videos of Heller myotomy procedures to assist in identifying the esophageal mucosa. Using laparoscopic or robotic surgical video data, students will design, implement, and evaluate vision-based methods to detect, segment, or highlight the mucosal layer, either in real time or through offline video analysis.
This project focuses on applying core computer vision techniques—such as image segmentation, object detection, temporal modeling, and deep learning—to a real-world, safety-critical medical application.
Mentor Details:
Prof. Yoav Mintz
Problem Statement
Surgical videos present unique challenges for computer vision systems:
Variable lighting and reflections
Occlusions by surgical instruments
Motion blur and camera movement
High inter-patient anatomical variability
In Heller myotomy, the visual differences between muscular layers and mucosa can be subtle, making automated identification particularly challenging. The problem is to design a vision-based system that can reliably distinguish the esophageal mucosa from surrounding tissue in surgical video frames or sequences.
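To make the reflection challenge listed above concrete, the following is a minimal sketch of a common heuristic for flagging specular highlights: pixels that are very bright and nearly colorless. The function name specular_mask, the two thresholds, and the toy frame are illustrative assumptions, not part of the project specification or of any provided dataset. A real pipeline would likely need tuned thresholds or a learned approach.

```python
import numpy as np

def specular_mask(frame_rgb, value_thresh=0.9, saturation_thresh=0.2):
    """Return a boolean (H, W) mask of likely specular highlights.

    frame_rgb: uint8 array of shape (H, W, 3).
    Both thresholds are assumed, untuned values chosen for illustration.
    """
    rgb = frame_rgb.astype(np.float32) / 255.0
    v = rgb.max(axis=-1)        # HSV value (brightness)
    c_min = rgb.min(axis=-1)
    # HSV saturation; set to 0 for pure black pixels.
    s = np.where(v > 0, (v - c_min) / np.maximum(v, 1e-6), 0.0)
    # Highlights are bright (high value) and nearly white (low saturation).
    return (v >= value_thresh) & (s <= saturation_thresh)

# Toy frame: uniform reddish "tissue" with a single near-white glare pixel.
frame = np.full((4, 4, 3), (150, 60, 60), dtype=np.uint8)
frame[1, 2] = (250, 245, 240)
mask = specular_mask(frame)
```

Such a mask could be used to exclude glare pixels from training or evaluation, although this simple rule can also flag pale tissue and should be checked against real frames.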
Project Objectives
Students will aim to:
Analyze surgical video data and identify relevant visual cues
Develop a computer vision pipeline for mucosa identification
Apply deep learning–based models (e.g., CNNs, Transformers) to image or video data
Explore temporal consistency across video frames
Evaluate performance using appropriate computer vision metrics
Technical Scope
The project may include one or more of the following tasks:
Image or video segmentation of esophageal layers
Object detection or region proposal for mucosal areas
Temporal modeling using optical flow, 3D CNNs, or recurrent architectures
Self-supervised or weakly supervised learning (if annotations are limited)
Model robustness analysis under varying lighting and motion conditions
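As one simple baseline for temporal consistency, per-frame probability maps from any segmentation model can be smoothed over time before thresholding. The sketch below uses an exponential moving average; the function name smooth_probabilities, the value of alpha, and the toy sequence are assumptions for illustration. It does not compensate for camera motion, so methods such as optical flow alignment, 3D CNNs, or recurrent models listed above may be needed in practice.

```python
import numpy as np

def smooth_probabilities(prob_maps, alpha=0.5):
    """Exponentially smooth a sequence of per-frame probability maps.

    prob_maps: iterable of float arrays of shape (H, W), values in [0, 1].
    alpha: weight of the current frame (assumed value; 1.0 disables smoothing).
    Returns an array of shape (T, H, W).
    """
    smoothed = []
    state = None
    for p in prob_maps:
        p = np.asarray(p, dtype=np.float32)
        # The first frame initializes the state; later frames are blended in.
        state = p if state is None else alpha * p + (1.0 - alpha) * state
        smoothed.append(state)
    return np.stack(smoothed)

# Toy sequence: pixel (0, 0) flickers from 0.9 to 0.2 and back to 0.9.
frames = np.full((3, 2, 2), 0.1, dtype=np.float32)
frames[:, 0, 0] = [0.9, 0.2, 0.9]

smoothed = smooth_probabilities(frames, alpha=0.5)
raw_mask = frames >= 0.5         # the pixel drops out in the middle frame
smooth_mask = smoothed >= 0.5    # the pixel stays on in all three frames
```

The trade-off is latency: a smaller alpha suppresses more flicker but makes the output lag behind genuine changes in the scene.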
Required Knowledge and Prerequisites
Core Requirements
Familiarity with fundamental computer vision concepts
Experience with convolutional neural networks (CNNs)
Basic understanding of deep learning frameworks (e.g., PyTorch, TensorFlow)
Ability to work with image and video datasets
Recommended Background
Image segmentation and detection architectures (e.g., U-Net, Mask R-CNN)
Video processing techniques
Model evaluation metrics (IoU, precision, recall, F1-score)
No prior medical or surgical knowledge is required; necessary clinical background will be provided.
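For reference, the evaluation metrics named under Recommended Background (IoU, precision, recall, F1-score) can be computed from a predicted and a ground-truth binary mask as in the sketch below. The function name segmentation_metrics and the convention of returning 0.0 when a denominator is empty are assumptions made for this example; other conventions exist.

```python
import numpy as np

def segmentation_metrics(pred, target):
    """Compute pixel-wise IoU, precision, recall, and F1 for binary masks.

    pred, target: array-likes of the same shape; nonzero means "mucosa".
    Ratios with an empty denominator are reported as 0.0 (assumed convention).
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)

    tp = int(np.logical_and(pred, target).sum())    # true positives
    fp = int(np.logical_and(pred, ~target).sum())   # false positives
    fn = int(np.logical_and(~pred, target).sum())   # false negatives

    def ratio(num, den):
        return num / den if den > 0 else 0.0

    return {
        "iou": ratio(tp, tp + fp + fn),
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "f1": ratio(2 * tp, 2 * tp + fp + fn),
    }

# Toy masks: one pixel correct, one missed, one falsely predicted.
target = np.array([[1, 1, 0],
                   [0, 0, 0]])
pred = np.array([[1, 0, 1],
                 [0, 0, 0]])
metrics = segmentation_metrics(pred, target)
# tp = 1, fp = 1, fn = 1, so IoU = 1/3 and precision = recall = F1 = 0.5
```

On video data these values are typically computed per frame and then averaged, and reporting them per patient can help expose inter-patient variability.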
Project Difficulty and Expected Level
Vision complexity: High (real-world, noisy, non-curated video data)
Modeling complexity: Moderate to high, depending on chosen approach
Domain knowledge: Low (medical expertise not required)
This project is well-suited for:
Teams of 2–4 students
Expected Outcomes
A working computer vision prototype for mucosa identification
Quantitative evaluation of model performance on surgical videos
Clear discussion of failure cases and limitations
Well-documented code and a technical report
Educational Value
This project exposes students to:
Real-world video-based computer vision challenges
Safety-critical applications of AI
Dataset bias, annotation limitations, and robustness concerns
Translating abstract vision techniques into applied systems