What question did this study set out to answer?

To establish a leakage-aware evaluation protocol for predicting employee attrition using imbalanced data.

March 25, 2026 · Open Access

Leakage-Free Evaluation for Employee Attrition Prediction on Tabular Data

Key Points

Proposed a reproducible evaluation protocol for employee attrition prediction.
Applied SMOTE only within the training portion of each fold during stratified 5-fold cross-validation, leaving the test set untouched.
Applied one-hot encoding consistently on the dataset.
Evaluated Logistic Regression, Random Forest, ExtraTrees, LightGBM, and XGBoost using imbalance-aware metrics (minority-class F1, Average Precision, and ROC-AUC).
XGBoost achieved the highest mean Average Precision of 0.556 ± 0.056 in cross-validation.
Logistic Regression attained the highest mean F1 score of 0.439 ± 0.048.
LightGBM showed the best mean ROC-AUC of 0.791 ± 0.026.
On the held-out test set, XGBoost achieved a precision of 0.65 and a recall of 0.45 at a fixed threshold of 0.5.
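
The key points mix two kinds of metric: threshold-dependent ones (precision, recall, F1 at a cut-off of 0.5) and a ranking metric (Average Precision). The following is a minimal, dependency-free sketch of how they differ. The function names and the toy labels and scores are illustrative assumptions, not the study's code or data, and the Average Precision formula assumes no tied scores.

```python
def precision_recall_at(y_true, scores, threshold=0.5):
    """Precision and recall for the positive class at a fixed cut-off."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for t, p in zip(y_true, pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


def average_precision(y_true, scores):
    """Mean of the precision values at the rank of each true positive.

    This is threshold-free: it only depends on how the scores order the
    samples (ties are assumed not to occur in this sketch).
    """
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    hits, total = 0, 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


# Hypothetical toy example: 3 leavers among 8 employees.
y_true = [1, 0, 1, 0, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.6, 0.55, 0.4, 0.3, 0.2, 0.1]

precision, recall = precision_recall_at(y_true, scores, threshold=0.5)
f1 = f1_score(precision, recall)
ap = average_precision(y_true, scores)
print(precision, recall, f1, ap)
```

Changing the threshold alters precision, recall, and F1 but leaves Average Precision unchanged, which is one reason a model can lead on F1 while another leads on AP.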

Abstract

In the context of employee attrition prediction using imbalanced tabular data, we propose a reproducible, leakage-aware evaluation protocol and validate it on the IBM HR Attrition dataset. We perform the train/test split prior to any rebalancing; SMOTE (Synthetic Minority Over-sampling Technique) is applied exclusively within the training portion of each fold in stratified 5-fold cross-validation, while the test set remains untouched. One-Hot Encoding is performed consistently using pd.get_dummies. We benchmark Logistic Regression, Random Forest, ExtraTrees, LightGBM, and XGBoost using imbalance-aware metrics: F1 for the minority class, PR-AUC reported as Average Precision (AP), and ROC-AUC reported both in cross-validation and on the held-out test set. XGBoost attains the best mean AP in cross-validation (0.556 ± 0.056). Logistic Regression achieves the highest mean F1 (0.439 ± 0.048), while LightGBM yields the best mean ROC-AUC (0.791 ± 0.026). On the test set, XGBoost achieves a precision value of 0.65 and a recall value of 0.45 at a fixed threshold of 0.5. Overall, the results highlight a trade-off between stable minority-class detection (Logistic Regression) and stronger risk ranking performance (boosting models) under class imbalance.
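
The ordering described above (split first, oversample only the training portion of each fold) can be illustrated with a small sketch. It is a dependency-free approximation written for this summary, not the authors' implementation: `stratified_folds`, `smote_like`, and `leakage_free_folds` are hypothetical helpers, the interpolation step only mimics the core SMOTE idea, label 1 is assumed to be the minority class, and the data are random toy values. In practice, library implementations such as imbalanced-learn's `SMOTE` and scikit-learn's `StratifiedKFold` would typically fill these roles.

```python
import random


def stratified_folds(labels, k=5, seed=0):
    """Split sample indices into k folds that preserve the class ratio."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        for pos, i in enumerate(idx):
            folds[pos % k].append(i)
    return folds


def smote_like(X_min, n_new, k=5, seed=0):
    """Create n_new synthetic rows by interpolating between a minority row
    and one of its k nearest minority neighbours (the core SMOTE idea).
    Assumes X_min holds at least two rows."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(X_min)
        neighbours = sorted(
            (p for p in X_min if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, other)])
    return synthetic


def leakage_free_folds(X, y, k=5, seed=0):
    """Yield (X_train, y_train, X_val, y_val) per fold.

    Only the training part is rebalanced; validation rows stay original.
    """
    folds = stratified_folds(y, k, seed)
    for f, val_idx in enumerate(folds):
        train_idx = [i for g, fold in enumerate(folds) if g != f for i in fold]
        X_tr = [X[i] for i in train_idx]
        y_tr = [y[i] for i in train_idx]
        minority = [x for x, label in zip(X_tr, y_tr) if label == 1]
        n_new = (len(y_tr) - len(minority)) - len(minority)
        new_rows = smote_like(minority, n_new, seed=seed + f)
        yield (X_tr + new_rows, y_tr + [1] * len(new_rows),
               [X[i] for i in val_idx], [y[i] for i in val_idx])


# Toy imbalanced data (assumed): 20 positives and 80 negatives, 2 features.
data_rng = random.Random(42)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(100)]
y = [1] * 20 + [0] * 80

for X_tr, y_tr, X_val, y_val in leakage_free_folds(X, y):
    # Training part is balanced; validation part keeps the original ratio.
    print(len(y_tr), sum(y_tr), len(y_val), sum(y_val))
```

Oversampling before splitting would place synthetic points derived from validation rows into the training data, which tends to inflate cross-validation scores; keeping the resampling inside the fold loop avoids that. A held-out test split, as in the protocol, would be carved off before this loop and never resampled.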


Authors

Ana Maria Căvescu

Alina Nirvana Popescu

Journals

Information

Institutions

Universitatea Națională de Știință și Tehnologie Politehnica București
