This article compares four unsupervised anomaly detection algorithms: the Autoencoder (AE), the LSTM-Autoencoder (LSTM-AE), the One-Class SVM (OCSVM), and the Isolation Forest (IF). The analysis uses SCADA telemetry from an urban wind turbine, a dataset marked by extreme inverted class imbalance: operational anomalies make up 75.7% of the records. The AE model, trained exclusively on the rare normal state, achieved the best overall performance (AUC 0.9667) and classified both classes with balanced, high effectiveness (Recall Normal ≈ 95%, Recall Anomaly ≈ 88.5%; Macro F1-Score 0.8962). In contrast, the IF model, despite strong discriminative ability (AUC 0.8616), failed entirely to recognize the normal class (Recall Normal 0.00) at the optimal F1-score threshold. This degradation followed directly from having to apply a classification threshold imposed by the statistical fraction of the anomaly-dominated dataset. These results empirically demonstrate the methodological superiority of the reconstruction-based approach (AE), which constructs a stable decision boundary independent of the statistically dominant class. The study provides quantitative guidelines for selecting and calibrating algorithms in Prognostics and Health Management (PHM) diagnostic systems where states deviating from the operational norm constitute the majority.
Lukasz Pawlik studied this question.
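The role of the decision threshold described in the abstract can be illustrated with a minimal sketch. It is not the study's pipeline: the integer anomaly scores are invented toy values standing in for reconstruction errors, and the helper names (`threshold_from_normal`, `recall_and_macro_f1`) and the 95th-percentile rule are assumptions made for illustration only. The sketch calibrates a threshold on normal-only scores, then compares it with a degenerate rule that flags every record as anomalous, the kind of outcome that yields Recall Normal 0.00 on an anomaly-dominated dataset.

```python
def threshold_from_normal(normal_scores, keep_percent=95):
    """Pick the score below which roughly keep_percent of normal-only training scores fall."""
    ordered = sorted(normal_scores)
    idx = min(len(ordered) - 1, (keep_percent * len(ordered)) // 100)
    return ordered[idx]


def recall_and_macro_f1(y_true, y_pred):
    """Return (recall_normal, recall_anomaly, macro_f1) for labels 0 = normal, 1 = anomaly."""
    recalls, f1s = [], []
    for cls in (0, 1):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p != cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        recall = tp / (tp + fn) if tp + fn else 0.0
        precision = tp / (tp + fp) if tp + fp else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        recalls.append(recall)
        f1s.append(f1)
    return recalls[0], recalls[1], sum(f1s) / 2


# Toy scores (hypothetical): higher means "more anomalous".
train_normal_scores = list(range(10, 30))   # normal-only calibration data
test_normal_scores = list(range(10, 34))    # 24 normal test records
test_anomaly_scores = list(range(30, 106))  # 76 anomalous test records (the majority)

scores = test_normal_scores + test_anomaly_scores
y_true = [0] * len(test_normal_scores) + [1] * len(test_anomaly_scores)

# Rule 1: threshold calibrated on the normal state only, independent of class proportions.
threshold = threshold_from_normal(train_normal_scores)
calibrated = recall_and_macro_f1(y_true, [1 if s > threshold else 0 for s in scores])

# Rule 2: degenerate rule that labels every record as an anomaly.
flag_all = recall_and_macro_f1(y_true, [1] * len(scores))

print("calibrated on normal:", calibrated)
print("flag everything:     ", flag_all)
```

In this toy setting the flag-everything rule still reaches perfect anomaly recall because anomalies dominate, while its normal recall and Macro F1 collapse. This is why the abstract reports per-class recall and Macro F1 rather than a single-class score.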