Abstract Objectives: This work targets accurate and precise short-term estimation of nonlinear, dynamic cloud workloads to support non-disruptive resource management, scalability, and a cost-effective environment. Methods: The proposed method uses separate training and testing models to capture the dataset pattern and validate future estimates. The training model uses an optimized contribution rate parameter in place of the learning rate hyperparameter of the boosting method (XGBoost), whereas the testing model applies statistical operations to the residuals of the individually trained models, using an optimized window size. To assess its strength and performance, the approach is compared with baseline methods. Findings: To evaluate accuracy and reliability, the proposed technique and the baseline methods were trained on 80% of the GWA-BITBRAIN cloud workload dataset, to capture its nonlinear behavior, and tested on the remaining 20%. In the training phase, the proposed technique reduced RMSE by 84%, 81%, 76%, 74%, 70%, and 30% relative to the respective baseline methods, and it also achieved the lowest MAE. This lower model error demonstrates the model's ability to capture the complex nonlinear relationships in the dataset. In the testing phase, the proposed method achieved 62%, 61%, 27%, 41%, 22%, and 16% lower estimation RMSE than the respective baseline methods, indicating accurate CPU usage predictions and reliable, robust short-term workload estimation. The method also achieved a 6% lower RMSE than previously presented work.
Novelty: The contribution rate strongly influences the prediction of each weak learner, in line with the percentage of residuals at each iteration. The statistical operations of the testing model focus on possible future values drawn from the optimized window. Keywords: Regression Tree, Resource Estimation, Ensemble method XGBoost, Cloud Workload, Non-Linear Time Series Data
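The evaluation protocol described in the Findings (an 80/20 split, RMSE and MAE, and percentage reductions in error against a baseline) can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the chronological split, the synthetic CPU usage values, and both stand-in forecasters are assumptions made only for illustration.

```python
import math

def chronological_split(series, train_fraction=0.8):
    """Split a time series into leading train and trailing test parts (assumed split style)."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))

def mae(actual, predicted):
    """Mean absolute error between two equal-length sequences."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def percent_reduction(baseline_error, proposed_error):
    """Percentage by which proposed_error is lower than baseline_error."""
    return 100.0 * (baseline_error - proposed_error) / baseline_error

# Synthetic CPU usage values (hypothetical, not taken from GWA-BITBRAIN).
cpu_usage = [10.0, 12.0, 11.0, 15.0, 14.0, 18.0, 17.0, 20.0, 19.0, 22.0]
train, test = chronological_split(cpu_usage)

# Stand-in baseline: repeat the last training value for every test point.
baseline_pred = [train[-1]] * len(test)
# Stand-in "proposed" forecaster: hypothetical predictions that are off by 1.0.
proposed_pred = [value + 1.0 for value in test]

baseline_rmse = rmse(test, baseline_pred)
proposed_rmse = rmse(test, proposed_pred)
print(f"RMSE reduction: {percent_reduction(baseline_rmse, proposed_rmse):.1f}%")
print(f"MAE baseline={mae(test, baseline_pred):.2f}, proposed={mae(test, proposed_pred):.2f}")
```

A reported figure such as "84% less RMSE" corresponds to `percent_reduction` evaluated with one baseline's RMSE and the proposed method's RMSE on the same data partition.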
Prajapati et al. (Sat,) studied this question.