Many researchers have proposed refactoring models that apply threshold values to source code metrics. However, this approach is not universally accepted in industry because each organisation sets its own thresholds. It is therefore desirable to develop an automated model that can determine a threshold acceptable to most organisations. This paper develops a Software Refactoring Model (SRM) that predicts class-level refactoring from an abstract syntax tree (AST) rather than from source code metrics. An AST is generated for each data set (Antlr4, Mct, Titan, Junit) using the Checkstyle 9.0.1 tool. Because the data sets are highly imbalanced, we apply the SMOTE data sampling technique to balance the data, which is then tokenized. After tokenization, words are collected through different tree traversal techniques. In total, we consider 500 words for each project. An LSTM then takes the word sequences in groups built up in increments of 50 words, with padding, to predict which classes need refactoring. We measure the performance of the refactoring prediction model using the area under the curve (AUC), F-score, and accuracy. We also perform a comparative analysis by applying LSTMs with different numbers of layers and other frequently used classifiers, and we evaluate the proposed model using both AST and source code metrics. In our proposed refactoring model, the three-layer LSTM (LSTM3) performed best among the LSTM architectures (accuracy = 96.24%, AUC = 0.58), but Bernoulli Naive Bayes (BNB) performed best among all classifiers (AUC = 0.87). Balancing the data set with SMOTE further improved discrimination ability, increasing the AUC to 0.98 (median = 1.00), up from 0.78 before balancing. Sequence length also affected performance: shorter inputs of 150 words produced the best results, with a mean accuracy of 97.15% and a mean AUC of 0.61.
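The traversal and padding steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `Node` class, the three traversal functions, the `<pad>` token, the toy tree, and the reading of "groups" as growing prefixes of the word sequence are all hypothetical. The node labels only imitate the style of parser token names.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical AST node: a label (the 'word') plus ordered children."""
    label: str
    children: List["Node"] = field(default_factory=list)


def preorder(node: Node) -> List[str]:
    """Visit the node first, then each subtree from left to right."""
    words = [node.label]
    for child in node.children:
        words.extend(preorder(child))
    return words


def postorder(node: Node) -> List[str]:
    """Visit each subtree from left to right, then the node itself."""
    words: List[str] = []
    for child in node.children:
        words.extend(postorder(child))
    words.append(node.label)
    return words


def level_order(root: Node) -> List[str]:
    """Visit nodes breadth-first, one depth level at a time."""
    words: List[str] = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        words.append(node.label)
        queue.extend(node.children)
    return words


def pad_or_truncate(words: List[str], length: int, pad: str = "<pad>") -> List[str]:
    """Cut the sequence to `length` words, or right-pad it up to `length`."""
    return words[:length] + [pad] * max(0, length - len(words))


def prefix_groups(words: List[str], step: int, total: int) -> List[List[str]]:
    """Assumed grouping: the first step, 2*step, ..., total words, each padded to `total`."""
    fixed = pad_or_truncate(words, total)
    return [pad_or_truncate(fixed[:n], total) for n in range(step, total + 1, step)]


# Toy tree, invented for illustration only.
tree = Node("CLASS_DEF", [
    Node("MODIFIERS", [Node("LITERAL_PUBLIC")]),
    Node("IDENT"),
    Node("OBJBLOCK", [Node("METHOD_DEF")]),
])

print(preorder(tree))
print(postorder(tree))
print(level_order(tree))
print(len(prefix_groups(preorder(tree), step=2, total=6)))
```

With `step=50` and `total=500`, `prefix_groups` would yield ten fixed-length inputs per class, which matches the ten groups mentioned in the abstract only under the prefix interpretation assumed here.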
In comparative trials, Bernoulli Naive Bayes (BNB) consistently outperformed the other classifiers, including LSTM, while AST-based models outperformed models built on object-oriented metrics (accuracy = 94.66%, AUC = 0.94). Our experimental results suggest that the initial 150 words achieve a mean AUC rank of 4.47, the best among all ten groups, for predicting the classes that need refactoring. Based on AUC, BNB also performs better than other well-known classifiers, including LSTM. We further observe that a larger number of LSTM layers yields significant results for software refactoring prediction. Finally, comparing the results obtained with AST and with object-oriented metrics (OOM), we observe that AST obtains better results than OOM.
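The gap between high accuracy and modest AUC reported above is typical of imbalanced data, and a small worked example may help show why the two metrics can diverge. The sketch below computes accuracy, F-score, and AUC from scratch. The toy labels and scores are invented for illustration and are not the paper's data, and the function names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f_score(y_true, y_pred):
    """F1: harmonic mean of precision and recall for the positive class."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def auc(y_true, scores):
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented toy data: 95 classes that need no refactoring, 5 that do.
y_true = [0] * 95 + [1] * 5
# The model scores almost everything low and separates only one positive.
scores = [0.1] * 95 + [0.1, 0.1, 0.1, 0.1, 0.9]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

print(f"accuracy = {accuracy(y_true, y_pred):.2f}")
print(f"F-score  = {f_score(y_true, y_pred):.2f}")
print(f"AUC      = {auc(y_true, scores):.2f}")
```

On this toy data the accuracy is 0.96 while the AUC is only 0.60 and the F-score about 0.33: predicting the majority class almost everywhere is rewarded by accuracy but not by the ranking-based AUC. This is one plausible reason to compare classifiers by AUC on imbalanced data, as the abstract does.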
Panigrahi et al. studied this question.
www.synapsesocial.com/papers/69df2abce4eeef8a2a6afc55 — DOI: https://doi.org/10.1007/s10586-026-05965-6
Rasmita Panigrahi
Sanjay Misra
Lov Kumar
Cluster Computing
National Institute of Technology Kurukshetra
Institute for Energy Technology
GIET University