Machine learning (ML)-based surrogate models have made substantial progress in predicting structural responses under environmental loading. However, their practical application is hampered by large and redundant training datasets. To address this issue, a training sample reduction strategy is proposed for data-driven surrogate modeling of wind-induced structural responses. First, autoencoders were constructed to extract latent features from wind speed and direction data after wavelet approximation, and statistical features were employed as an alternative method of feature extraction. Subsequently, an algorithm was proposed to determine the optimal sample size under a specified confidence coefficient and error limit. Finally, the training samples were clustered, and stratified sampling was applied to generate the reduced training dataset. The proposed strategy was validated using monitoring data from a real transmission tower, where a long short-term memory (LSTM) neural network was employed to establish the surrogate model for wind-induced response prediction. Specifically, the prediction performance of LSTM models trained on the reduced datasets was compared, and the effects of the training sample reduction strategy were discussed in detail. The results demonstrate that the strategy reduces the size of the training dataset by removing redundant samples, with minimal or no loss of model performance.
Zhang et al. (Wed,) studied this question.
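
The sample-size and stratified-sampling steps summarized in the abstract can be illustrated with a minimal sketch. The abstract does not specify the sample-size algorithm, so Cochran's formula with a finite-population correction is used here purely as an assumed stand-in. The function names `required_sample_size` and `stratified_sample`, the worst-case proportion `p = 0.5`, and the proportional allocation rule are likewise illustrative assumptions rather than the authors' method. Cluster labels are taken as given, for example from any clustering of the extracted features.

```python
import math
import random
from collections import defaultdict
from statistics import NormalDist


def required_sample_size(population, confidence=0.95, error_limit=0.05, p=0.5):
    """Assumed stand-in: Cochran's formula with finite-population correction."""
    # Two-sided critical value for the chosen confidence coefficient.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size for an infinite population.
    n0 = z * z * p * (1 - p) / error_limit ** 2
    # Correct for the finite number of available training samples.
    n = n0 / (1 + (n0 - 1) / population)
    return min(population, math.ceil(n))


def stratified_sample(labels, n_total, seed=0):
    """Pick about n_total sample indices, allocated in proportion to cluster size."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for idx, label in enumerate(labels):
        strata[label].append(idx)
    chosen = []
    for label in sorted(strata):
        members = strata[label]
        # Proportional allocation, keeping at least one sample per cluster.
        k = max(1, round(n_total * len(members) / len(labels)))
        k = min(k, len(members))
        chosen.extend(rng.sample(members, k))
    return sorted(chosen)


# Hypothetical usage: 10,000 samples already assigned to three clusters.
labels = [0] * 6000 + [1] * 3000 + [2] * 1000
n = required_sample_size(len(labels), confidence=0.95, error_limit=0.05)
reduced_indices = stratified_sample(labels, n)
```

Because each cluster's allocation is rounded separately, the reduced dataset may differ from the target size by a few samples. A tighter error limit or a higher confidence coefficient increases the required size.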