Unsupervised anomaly detection in retinal optical coherence tomography (OCT) images trains a model solely on anomaly-free samples and detects anomalies at inference, which reduces the cost of collecting large-scale annotated anomalous data. However, retinal OCT images vary considerably in shape, thickness, and orientation, and lesions often have reflectance signals similar to those of normal tissue, which makes anomaly localization highly challenging. Existing methods address these challenges by flattening retinal layers, normalizing thickness, or leveraging reflectance priors, but their reliance on complex pre- and post-processing introduces uncertainty and limits end-to-end clinical applicability. To overcome these issues, we propose a novel semantic augmentation variational autoencoder (SeAugVAE) for unsupervised anomaly detection in retinal OCT images. Specifically, to capture the anatomical variability of normal retinas and thereby enhance anomaly sensitivity, we introduce a self-supervised semantic data augmentation strategy that enforces dual distribution consistency in both image and feature spaces during VAE training. For precise anomaly localization, we construct structural-semantic anomaly attention maps at inference that detect anomalies from both local and global perspectives, and we combine them into anomaly score maps that serve as the metric for localizing anomalous regions. Extensive experiments on multiple public and private Cirrus and Spectralis OCT datasets demonstrate the effectiveness of SeAugVAE in pixel-wise unsupervised anomaly detection across multiple retinal diseases. Our code is available at https://github.com/xyzhou1121/SeAugVAE.
Zhou et al. studied this question.
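The abstract states that a local (structural) map and a global (semantic) map are combined into an anomaly score map, but it does not say how. The sketch below is one possible illustration, not the authors' implementation. It assumes that the structural map is the per-pixel absolute reconstruction error, that the semantic map is supplied as a precomputed array, and that the two are fused by an element-wise product after min-max normalization. The function names `min_max_normalize`, `anomaly_score_map`, and `localize`, and the threshold of 0.5, are hypothetical.

```python
import numpy as np


def min_max_normalize(m, eps=1e-8):
    """Rescale an array to approximately [0, 1]; eps avoids division by zero."""
    m = np.asarray(m, dtype=np.float64)
    return (m - m.min()) / (m.max() - m.min() + eps)


def anomaly_score_map(original, reconstruction, semantic_map):
    """Fuse a local and a global anomaly map into one score map.

    Assumption: the local (structural) map is the absolute reconstruction
    error, and fusion is an element-wise product. The paper may use a
    different definition for either step.
    """
    original = np.asarray(original, dtype=np.float64)
    reconstruction = np.asarray(reconstruction, dtype=np.float64)
    structural = min_max_normalize(np.abs(original - reconstruction))
    semantic = min_max_normalize(semantic_map)
    return structural * semantic


def localize(score_map, threshold=0.5):
    """Binary anomaly mask: True where the score reaches the threshold."""
    return np.asarray(score_map) >= threshold


# Toy example: a 2x2 "lesion" that the reconstruction fails to reproduce.
image = np.zeros((8, 8))
image[2:4, 2:4] = 1.0
recon = np.zeros((8, 8))            # a model trained on normals omits the lesion
semantic = np.full((8, 8), 0.1)
semantic[2:4, 2:4] = 0.9            # hypothetical global attention on the same region

scores = anomaly_score_map(image, recon, semantic)
mask = localize(scores)
```

A product keeps only pixels that both maps flag, which suppresses false positives that appear in a single map; a weighted sum would be the natural alternative if recall matters more than precision.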