Purpose: This study proposes the LLM-based Hyperparameter Tuning for Time Series (LHTT) framework, an automated time series forecasting system built on large language models (LLMs), and validates its effectiveness on Seoul air quality (PM2.5) data. The research aims to use generative AI to automate the entire process, from model selection through hyperparameter optimization to result analysis.

Methods: Five time series forecasting models (Exponential Smoothing, ARIMA, Prophet, LSTM, Transformer) were implemented and compared using Seoul PM2.5 data from May 1 to May 31, 2025 (18,500 observations from 25 monitoring stations). Gemma3:27B was used for automated hyperparameter tuning through iterative feedback loops. Performance was evaluated using MSE, RMSE, MAE, R², and MAPE.

Results: Among the baseline models, the Transformer performed best, with RMSE 4.08, R² 0.79, and MAPE 9.53%. After LLM-based tuning, however, the LSTM model achieved the best RMSE and R² overall (RMSE 3.70, R² 0.82), although its MAPE of 9.71% remained slightly above the Transformer baseline's 9.53%. Statistical models also improved sharply after LLM tuning, with Exponential Smoothing achieving an 87.85% reduction in RMSE.

Conclusion: The proposed LHTT framework demonstrates significant potential for improving time series forecasting accuracy while minimizing expert intervention. The automated system generated comprehensive analysis reports and achieved prediction accuracy practical enough for environmental policy applications, supporting the feasibility of end-to-end automation in data science workflows.
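To make the evaluation metrics and the iterative feedback loop concrete, the following is a minimal sketch and not the authors' implementation. The names `forecast_metrics`, `tune`, `evaluate`, and `propose` are hypothetical. `propose` stands in for a call to an LLM such as Gemma3:27B that receives the tuning history and returns new hyperparameters. The abstract does not specify how zero-valued observations are handled in MAPE, so skipping them here is an assumption.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Params = Dict[str, float]
History = List[Tuple[Params, float]]


def forecast_metrics(y_true: Sequence[float], y_pred: Sequence[float]) -> Dict[str, float]:
    """Compute MSE, RMSE, MAE, R^2, and MAPE (in percent) for one forecast."""
    n = len(y_true)
    if n == 0 or n != len(y_pred):
        raise ValueError("y_true and y_pred must be non-empty and of equal length")

    errors = [t - p for t, p in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / n
    mae = sum(abs(e) for e in errors) / n

    # R^2 = 1 - SS_res / SS_tot; undefined when the observed series is constant.
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - (mse * n) / ss_tot if ss_tot > 0 else float("nan")

    # Assumption: observations equal to zero are skipped to avoid division by zero.
    ratios = [abs(e / t) for t, e in zip(y_true, errors) if t != 0]
    mape = 100.0 * sum(ratios) / len(ratios) if ratios else float("nan")

    return {"MSE": mse, "RMSE": math.sqrt(mse), "MAE": mae, "R2": r2, "MAPE": mape}


def tune(
    evaluate: Callable[[Params], float],
    propose: Callable[[History], Params],
    initial: Params,
    rounds: int = 5,
) -> Tuple[Params, float]:
    """Iterative feedback loop: score, feed the history back, ask for new parameters.

    evaluate(params) returns a validation RMSE (lower is better).
    propose(history) is a hypothetical stand-in for the LLM call.
    """
    history: History = [(initial, evaluate(initial))]
    for _ in range(rounds):
        candidate = propose(history)
        history.append((candidate, evaluate(candidate)))
    # Return the best (params, score) pair seen across all rounds.
    return min(history, key=lambda item: item[1])


if __name__ == "__main__":
    print(forecast_metrics([10.0, 20.0, 30.0], [12.0, 18.0, 33.0]))

    # Toy stand-ins: a quadratic "validation error" and a rule-based proposer.
    best = tune(
        evaluate=lambda p: (p["alpha"] - 0.3) ** 2,
        propose=lambda hist: {"alpha": hist[-1][0]["alpha"] - 0.1},
        initial={"alpha": 0.6},
    )
    print(best)
```

In a real pipeline, `evaluate` would fit one of the five models with the candidate hyperparameters and return its validation RMSE, while `propose` would format the history into a prompt and parse the model's reply. Keeping both as plain callables lets the loop be tested without an LLM.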
Lee et al. (Tue,) studied this question.