Abstract

Objective
To evaluate CigStopper, a machine learning algorithm designed to predict billing eligibility for tobacco cessation counseling (CPT 99406/99407), addressing persistent underbilling and documentation gaps in health systems.

Materials and Methods
We trained CigStopper on a 40,000-note corpus comprising real-world de-identified clinical notes, synthetically generated notes, and a blended dataset. Notes were categorized by billing eligibility and smoking documentation. Random Forest models were trained and evaluated using both flat multiclass and hierarchical classification approaches. Performance was assessed on a 20% holdout set of real notes using standard metrics (accuracy, precision, recall, F1).

Results
Models trained on real or blended datasets achieved high performance for billing eligibility (F1 ≥ 0.97) and 99406 prediction (F1 ≥ 0.90). Prediction for intensive counseling (99407) remained limited (F1 ≤ 0.56). Synthetic-only training resulted in overfitting, with poor generalization to real-world data. Hierarchical classification improved eligibility detection and CPT code prediction compared with flat multiclass models.

Discussion
Findings demonstrate that blended datasets mitigate class imbalance and improve generalizability, while hierarchical architectures enhance performance on billing tasks. Persistent gaps in 99407 prediction were related to low training volume, likely reflecting documentation and coding culture rather than model limitations, underscoring systemic issues in clinical note content.

Conclusion
CigStopper demonstrates feasibility as a scalable NLP-based billing validation tool. By automating tobacco cessation CPT coding, the algorithm can improve data integrity, reduce missed reimbursement, and support health systems in aligning clinical care with financial and population health priorities.
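The hierarchical approach named in Materials and Methods can be illustrated with a minimal sketch: a first stage decides billing eligibility, and a second stage assigns a CPT code only to eligible notes. Everything below is an assumption for illustration, not the authors' implementation. The label names, the function names, and the keyword-based stand-in classifiers are hypothetical; in the study, each stage would be a trained Random Forest model. The per-class F1 helper uses the standard definition (harmonic mean of precision and recall).

```python
from typing import Callable, Sequence

# Hypothetical label names; the abstract does not list the exact classes.
NOT_ELIGIBLE = "not_eligible"
CPT_99406 = "99406"
CPT_99407 = "99407"


def hierarchical_predict(
    note: str,
    is_eligible: Callable[[str], bool],
    pick_code: Callable[[str], str],
) -> str:
    """Stage 1 gates on eligibility; stage 2 runs only for eligible notes."""
    if not is_eligible(note):
        return NOT_ELIGIBLE
    return pick_code(note)


def per_class_f1(y_true: Sequence[str], y_pred: Sequence[str], label: str) -> float:
    """F1 for one class, treating `label` as the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == label and p == label for t, p in pairs)
    fp = sum(t != label and p == label for t, p in pairs)
    fn = sum(t == label and p != label for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Toy stand-ins for the two trained models (keyword rules, illustration only).
def toy_is_eligible(note: str) -> bool:
    return "counseled" in note.lower()


def toy_pick_code(note: str) -> str:
    return CPT_99407 if "intensive" in note.lower() else CPT_99406


notes = [
    "Patient counseled on quitting; intensive session documented.",
    "Patient counseled briefly on smoking cessation.",
    "No tobacco use discussed.",
]
predictions = [hierarchical_predict(n, toy_is_eligible, toy_pick_code) for n in notes]
```

A flat multiclass model would instead map each note directly to one of the three labels in a single step. The two-stage layout lets the eligibility decision be learned from all notes while the CPT decision is learned only from eligible ones, which is one plausible reason a hierarchy can help when 99407 examples are scarce.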
Derek J Baughman
Layth Qassem
Lina Sulieman
JAMIA Open
Icahn School of Medicine at Mount Sinai
Vanderbilt University Medical Center
Baughman et al. (Fri,) studied this question.
www.synapsesocial.com/papers/69b79ea18166e15b153ac2f2 — DOI: https://doi.org/10.1093/jamiaopen/ooag026