• Mutual information-based imputation improves missing-data estimation accuracy
• A proactive workflow supports process diagnosis and yield management
• Outlier imputation strengthens model accuracy and robustness
• Reliable diagnosis enabled from real-world incomplete semiconductor data

Machine learning (ML) plays a critical role in semiconductor process monitoring by enabling manufacturers to manage the complexities of mass production, including ensuring yield and reducing turnaround time (TAT). However, measurement sampling to reduce cost and TAT introduces many missing values that degrade ML model performance. Conventional imputation methods rarely capture the unique interdependencies in semiconductor electrical parameter measurement (EPM) data, thereby limiting accuracy. This study proposes stepwise and selective missing-value imputation approaches for ML-based process monitoring and diagnosis of dynamic random-access memory (DRAM) peripheral devices. These methods utilize mutual information between parameters for informed selection and sequencing of parameter imputation, focusing on missForest (MF) and multivariate imputation by chained equations (MICE). Compared with universal imputation, stepwise-selective MF and selective MICE reduce the normalized mean absolute error by 22.90% and 2.51%, respectively, and improve both prediction and classification accuracy. Furthermore, this study validates a novel application to correcting DRAM device EPM outliers arising from measurement errors. Imputing outliers as missing values improves the model R² score by 4.02% compared with the standard practice of sample removal, enhancing model robustness, particularly in data-scarce DRAM device scenarios. The approaches are verified in the latest 1Y-nm node DRAM test vehicles with baseline and split tests, which introduce high-k metal gates with a minimum gate length of 1A-nm node for further node scaling. Overall, this study presents a practical solution for managing nanoscale variabilities and improving productivity in mass production, contributing to a robust ML-based monitoring framework for future DRAM device manufacturing.
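The idea of using mutual information to select predictors and to order the imputation can be illustrated with a minimal sketch. This is not the study's implementation: a histogram-based mutual information estimate and an ordinary least-squares fit stand in for missForest and MICE, and the function names (`mutual_information`, `mi_guided_impute`) and the parameters `top_k` and `bins` are hypothetical choices made for illustration only.

```python
import numpy as np


def mutual_information(x, y, bins=8):
    """Histogram estimate of the mutual information (in nats) of two 1-D arrays."""
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))


def mi_guided_impute(X, top_k=2, bins=8):
    """Impute NaNs column by column, guided by pairwise mutual information.

    Selective: each target column uses only its top_k highest-MI columns.
    Stepwise: columns with the strongest MI links are imputed first, and
    their filled values are reused when later columns are imputed.
    Assumption: a linear least-squares model replaces missForest / MICE.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    missing = np.isnan(X)

    # Pairwise MI, estimated on rows where both parameters were measured.
    mi = np.zeros((p, p))
    for i in range(p):
        for j in range(i + 1, p):
            both = ~missing[:, i] & ~missing[:, j]
            if both.sum() > bins:
                mi[i, j] = mi[j, i] = mutual_information(X[both, i], X[both, j], bins)

    # Start from a column-mean fill so every predictor is usable.
    filled = X.copy()
    col_means = np.nanmean(X, axis=0)
    for j in range(p):
        filled[missing[:, j], j] = col_means[j]

    # Sequence the targets: strongest MI link first.
    targets = [j for j in range(p) if missing[:, j].any()]
    targets.sort(key=lambda j: -mi[j].max())

    for j in targets:
        ranked = np.argsort(mi[j])[::-1][:top_k]
        predictors = [int(k) for k in ranked if k != j and mi[j, k] > 0]
        if not predictors:
            continue  # no informative predictor: keep the mean fill
        observed = ~missing[:, j]
        A = np.column_stack([filled[:, predictors], np.ones(n)])
        coef, *_ = np.linalg.lstsq(A[observed], X[observed, j], rcond=None)
        filled[~observed, j] = A[~observed] @ coef

    return filled, mi
```

Under the same assumptions, the outlier treatment described above would amount to setting flagged outlier entries to `np.nan` before calling `mi_guided_impute`, rather than dropping the affected samples.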
Kang et al. studied this question.