Objective: Reliable automatic left ventricle (LV) segmentation from echocardiograms is challenging because annotations in the dataset are inherently sparse: clinicians typically annotate only two specific frames for diagnostic purposes. We address this challenge by introducing simplified LV segmentation (SimLVSeg), a novel paradigm that enables video-based networks to segment the LV consistently from sparsely annotated echocardiogram videos.
Methods: SimLVSeg consists of two training stages: (i) self-supervised pre-training with temporal masking, which pre-trains a video segmentation network to capture the cyclic patterns of echocardiograms from largely unannotated frames, and (ii) weakly supervised learning tailored for LV segmentation from sparse annotations.
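The two stages can be illustrated with a minimal NumPy sketch. It is not the SimLVSeg implementation: the function names (`mask_frames`, `reconstruction_loss`, `sparse_dice_loss`), the 50% masking ratio, zero-filling of masked frames, the mean-squared-error reconstruction objective, and the soft Dice loss are all assumptions chosen for illustration. The only ideas taken from the text above are that whole frames are hidden along the time axis during pre-training, and that supervision in the second stage comes only from the few annotated frames.

```python
import numpy as np


def mask_frames(clip, mask_ratio=0.5, rng=None):
    """Hide a random subset of whole frames in a clip of shape (T, H, W).

    Assumption: masked frames are zero-filled; the ratio is illustrative.
    Returns the masked clip and a boolean vector marking masked frames.
    """
    rng = np.random.default_rng() if rng is None else rng
    num_frames = clip.shape[0]
    num_masked = int(round(mask_ratio * num_frames))
    masked_idx = rng.choice(num_frames, size=num_masked, replace=False)
    is_masked = np.zeros(num_frames, dtype=bool)
    is_masked[masked_idx] = True
    masked_clip = clip.copy()
    masked_clip[is_masked] = 0.0
    return masked_clip, is_masked


def reconstruction_loss(pred, target, is_masked):
    """Mean squared error computed on the masked frames only (assumed objective)."""
    diff = (pred[is_masked] - target[is_masked]) ** 2
    return float(diff.mean())


def sparse_dice_loss(prob, labels, annotated, eps=1e-6):
    """Soft Dice loss averaged over annotated frames only.

    prob, labels: arrays of shape (T, H, W); annotated: boolean vector of
    length T. Predictions on unannotated frames do not affect the loss.
    """
    num_annotated = int(annotated.sum())
    if num_annotated == 0:
        raise ValueError("at least one annotated frame is required")
    p = prob[annotated].reshape(num_annotated, -1)
    g = labels[annotated].reshape(num_annotated, -1)
    intersection = (p * g).sum(axis=1)
    dice = (2.0 * intersection + eps) / (p.sum(axis=1) + g.sum(axis=1) + eps)
    return float(1.0 - dice.mean())
```

In this sketch a network would receive the output of `mask_frames` and be trained to minimise `reconstruction_loss` in stage (i), then be fine-tuned with `sparse_dice_loss` in stage (ii), where `annotated` marks the few frames for which clinicians provided labels.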