January 25, 2025 · Open Access

Surgical video workflow analysis via visual-language learning

Abstract

Surgical video workflow analysis has advanced rapidly in computer-assisted surgery through the use of deep learning models, with the aim of improving surgical scene analysis and decision-making. However, previous research has primarily focused on coarse-grained analysis of surgical videos, e.g., phase recognition, instrument recognition, and triplet recognition that only considers relationships within surgical triplets. To provide a more comprehensive, fine-grained analysis of surgical videos, this work focuses on accurately identifying triplets from surgical videos. Specifically, we propose a vision-language deep learning framework that incorporates intra- and inter-triplet modeling, termed I²TM, to explore the relationships among triplets and leverage the model's understanding of the entire surgical process, thereby enhancing the accuracy and robustness of recognition. We also develop a new surgical triplet semantic enhancer (TSE) to establish semantic relationships, both within and between triplets, across the visual and textual modalities. Extensive experimental results on surgical video benchmark datasets demonstrate that our approach captures finer semantics and achieves effective surgical video understanding and analysis, with potential for widespread medical applications.

Cite this study

Li et al. (January 25, 2025). Surgical video workflow analysis via visual-language learning.

www.synapsesocial.com/papers/69dbe1d6eb8801008ea3c196 — DOI: https://doi.org/10.1038/s44401-024-00010-3

Also consider

Synapse has enriched 5 closely related papers on similar clinical questions. Consider them for comparative context:

Rethinking ImageNet Pre-Training · 2019 · 1,008 citations
Spectral imaging enables contrast agent–free real-time ischemia monitoring in laparoscopic surgery · 2023 · 37 citations
Autonomous robotic laparoscopic surgery for intestinal anastomosis · 2022 · 341 citations
Nineteen-year trends in incidence and indications for laparoscopic cholecystectomy: the NY State experience · 2016 · 62 citations
Recognition of Instrument-Tissue Interactions in Endoscopic Videos via Action Triplets

Authors

Pengpeng Li

Xiangbo Shu

Chun-Mei Feng

Institutions

Agency for Science, Technology and Research

Harbin Institute of Technology

Nanjing Medical University
