Designing descriptors for multiple defects in two-dimensional materials is challenging due to the diverse local atomic environments created by different defect types and arrangements. Existing physics-informed descriptors struggle to distinguish distinct defect configurations with identical composition, while deep learning models, though powerful, require large data sets and are less interpretable. In this work, we address this limitation by engineering chemical descriptors and constructing structural features from nearest-neighbor distributions provided by the classical force-field-inspired descriptors (CFID). We show that our engineering method, combined with defect-aware structural features derived from the Hellinger distance, improves data point discrimination in high-dimensional feature space while reducing the number of features by 50%, even when the full distribution features are excluded. In predicting formation energy per defect site, this extended feature set balances reliance on a few dominant features, enhancing model interpretation and generalization at the cost of a marginal 10% increase in prediction error compared to baseline descriptors. This generalization capability is empirically validated on an external out-of-distribution data set of bulk hBN defects, where our model exhibits lower uncertainty and superior stability within the applicable physical domain (formation energy Ef between -1 and 5 eV). However, predicting a highly complex and nonlinear target, such as the HOMO-LUMO gap, remains challenging, as none of our extensions outperform the baseline. This physics-informed approach offers an interpretable and computationally efficient alternative to deep learning models, providing new insights into defect representations in 2D materials and serving as a tool for the high-throughput prescreening of stable defect candidates prior to expensive first-principles calculations.
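To make the Hellinger-distance idea concrete, the following is a minimal sketch of how two binned nearest-neighbor distributions could be compared. It is not the authors' implementation: the function names, the bin counts, and the choice of a pristine reference histogram are all assumptions for illustration, and the abstract does not specify how the distributions are binned or which pairs are compared. The distance used is the standard discrete form, H(P, Q) = sqrt(0.5 * sum_i (sqrt(p_i) - sqrt(q_i))^2), which lies between 0 (identical distributions) and 1 (no overlap).

```python
import math


def normalize(hist):
    """Turn non-negative bin counts into a probability distribution."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must have positive total weight")
    return [h / total for h in hist]


def hellinger(p_hist, q_hist):
    """Hellinger distance between two histograms defined on the same bins."""
    if len(p_hist) != len(q_hist):
        raise ValueError("histograms must share the same binning")
    p = normalize(p_hist)
    q = normalize(q_hist)
    squared = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(squared / 2.0)


# Hypothetical bin counts of nearest-neighbor distances (illustrative only,
# not taken from the paper or from any CFID output).
pristine = [0, 12, 30, 12, 0]
defect_a = [2, 10, 24, 14, 4]
defect_b = [6, 14, 20, 8, 6]

# One scalar per structure: its distance from the pristine reference.
feature_a = hellinger(pristine, defect_a)
feature_b = hellinger(pristine, defect_b)
print(feature_a, feature_b)
```

Under these assumptions, each defect structure is summarized by a single scalar per distribution rather than by the full histogram, which is one plausible way a distance-based feature could shrink the feature count while still separating configurations that share the same composition.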
Cheewawut Na Talang
Aniwat Kesorn
Chanaprom Cholsuk
Journal of Chemical Information and Modeling
Technical University of Munich
Friedrich Schiller University Jena
Mahidol University
Talang et al. studied this question.
www.synapsesocial.com/papers/699010df2ccff479cfe571d3 — DOI: https://doi.org/10.1021/acs.jcim.5c02100