Abstract

Purpose: Preserving neurovascular bundles (NVB) during robot-assisted radical prostatectomy (RARP) is vital for reducing postoperative complications such as urinary incontinence and erectile dysfunction. Building on our previous work in ensemble-based NVB classification, we propose a hybrid self-supervised teacher–student model (Hybrid T–S model) that leverages multi-task learning to predict NVB preservation in prostatectomy videos.

Methods: Our approach uses a self-supervised framework (DINO) as an online self-distillation objective on multi-crop views to learn robust embeddings in a limited-data setting, rather than as a stand-alone large-scale pretraining stage. A teacher encoder, which is an exponential moving average (EMA) of the student encoder, and a reconstruction decoder are trained jointly with a classification head in a single end-to-end framework. The model is evaluated on single frames from videos of patients who underwent RARP.

Results: In our experimental evaluation, the Hybrid T–S model outperforms previous NVB classification methods, which highlights the benefit of combining self-supervised learning with multi-task objectives in this surgical context. Under fivefold cross-validation, it achieved an average accuracy of 86.55%, precision of 83.93%, recall of 90.73%, F1-score of 87%, and AUROC of 88.35%.

Conclusion: Self-distillation, classification, and reconstruction provide complementary training signals that together improve the prediction of NVB preservation. Our Hybrid T–S model can assist surgeons in real decision-making and improve patient recovery.
Moraga et al. (Thu,) studied this question.
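The training scheme summarized in the Methods (an EMA teacher, a self-distillation objective, and classification and reconstruction terms optimized together) can be illustrated with a small forward-only sketch. This is not the authors' implementation: the function names, the momentum and temperature values, the centering term, and the equal loss weights are all assumptions chosen for illustration, and plain Python lists stand in for network weights and outputs.

```python
import math


def ema_update(teacher, student, momentum=0.996):
    """Move each teacher weight toward the student weight (EMA).

    The momentum value is an illustrative assumption, not taken from the paper.
    """
    return [momentum * t + (1.0 - momentum) * s for t, s in zip(teacher, student)]


def softmax(logits, temperature):
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def log_softmax(logits, temperature):
    """Temperature-scaled log-softmax computed with the log-sum-exp trick."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    lse = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - lse for x in scaled]


def distillation_loss(student_logits, teacher_logits, center,
                      t_student=0.1, t_teacher=0.04):
    """Cross-entropy between the teacher and student distributions.

    The teacher output is centered and sharpened with a lower temperature.
    In a real training loop the teacher side carries no gradient.
    Temperatures here are assumed values.
    """
    p_teacher = softmax([t - c for t, c in zip(teacher_logits, center)], t_teacher)
    log_p_student = log_softmax(student_logits, t_student)
    return -sum(pt * lps for pt, lps in zip(p_teacher, log_p_student))


def binary_cross_entropy(logit, label):
    """Numerically stable binary cross-entropy on a raw logit (label is 0 or 1)."""
    return max(logit, 0.0) - logit * label + math.log1p(math.exp(-abs(logit)))


def mse(reconstruction, target):
    """Mean squared error between a reconstruction and its target."""
    return sum((r - x) ** 2 for r, x in zip(reconstruction, target)) / len(target)


def hybrid_loss(l_distill, l_cls, l_recon, w_distill=1.0, w_cls=1.0, w_recon=1.0):
    """Weighted sum of the three objectives; the weights are hypothetical."""
    return w_distill * l_distill + w_cls * l_cls + w_recon * l_recon
```

In an actual training step, the student and the heads would be updated by backpropagating `hybrid_loss`, after which `ema_update` would refresh the teacher from the new student weights. How the paper balances the three terms is not stated in the abstract.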