Abstract

Background
Clinical notes are the most abundant data type within electronic health records; however, their highly unstructured format presents significant challenges for supervised natural language processing (NLP) methods. The NLP community is increasingly adapting large language models to analyze clinical notes, achieving strong performance and generalizability with minimal task-specific fine-tuning. We conducted a scoping review of NLP methods applied to clinical notes prior to widespread adoption of generative artificial intelligence (AI) to establish a pre–large language model methodological baseline, showcase potential clinical utility, and highlight key challenges and limitations of extractive, supervised techniques that generative AI approaches may need to overcome.

Objective
This review aimed (1) to characterize the clinical notes used, (2) to identify NLP techniques used to analyze these notes, (3) to determine the clinical applications of NLP in cancer research and patient care, and (4) to highlight challenges and limitations of traditional pregenerative AI methods.

Methods
We systematically searched MEDLINE, Embase, Scopus, and Web of Science for English-language studies published from January 1, 2014, to March 8, 2024. Retrieved references were imported into Covidence, a web-based platform that streamlines management of reviews. Two authors (ABK and HRAE) independently screened studies for eligibility and extracted data using a predefined data extraction template.

Results
A total of 226 studies were included in the review. Research using NLP to derive insights from clinical notes grew significantly, from 4 studies in 2014 to 43 in 2023. NLP methods have evolved from predominantly rule-based and ontology-driven approaches (2014-2017) to hybrid approaches that combine these with deep neural models such as Bidirectional Encoder Representations from Transformers (2018-2024). Most studies (161/226, 71.2%) developed their systems using small, single-institution datasets. Supervised learning approaches with manually annotated corpora were predominant (181/226, 80.1%). Most studies (174/226, 77%) focused on information extraction, with a few applying the extracted data to downstream tasks such as diagnostic and prognostic classification. Clinical domain pretrained models outperformed general domain pretrained models in the majority (11/16, 68.8%) of studies that evaluated multiple model types. In total, 25 studies compared their NLP-based systems with current practice in their respective clinical settings and reported potential benefits, including improved data coverage and completeness, faster information extraction, and improved classification or prediction accuracy. No studies evaluated the utility or impact of their systems in real-world clinical practice. The most common challenges reported by authors were restricted access to clinical notes (n=39) and limited data (n=18).

Conclusions
The application of NLP to clinical notes in oncology has expanded, but most studies focus on information extraction rather than downstream clinical tasks. Oncology NLP has the potential to advance cancer research and patient care, but barriers remain to robust evaluation and clinical deployment of promising tools. Emerging generative AI approaches will need to overcome these challenges to deliver real-world impact.
Kayira et al. studied this question.