CircRNAs have attracted increasing attention in recent years because they play important roles in many biological processes. Determining the functions of circRNAs is therefore essential. Because the subcellular localization of a circRNA is considered to be related to its function, it is necessary to determine where circRNAs localize. Traditional biochemical experiments for determining circRNA subcellular localizations are expensive and time-consuming, so computational models offer an alternative. In this study, a new computational model, named CircLoc, was designed to predict the subcellular localizations of circRNAs. The model employed circRNA sequences and networks, from which circRNA features were extracted through traditional methods (e.g., k-mer), a large language model (RNAErnie), and network representation learning algorithms (e.g., node2vec and graph attention auto-encoder). All features were processed by a self-attention layer and fed into a fully connected layer to make predictions. The model was evaluated by ten-fold cross-validation, yielding an average AUC of 0.7856 and an average AUPR of 0.4055. This performance was better than that of models based on traditional multi-label classification algorithms and of the miRNA subcellular localization prediction models. Ablation tests were also used to justify the design of CircLoc. CircLoc was effective in predicting circRNA subcellular localizations and can be a potentially useful tool in circRNA research.
Chen et al. (Mon,) studied this question.
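The abstract names k-mer features as one of the traditional sequence descriptors. As an illustration only, the following minimal sketch computes normalized k-mer frequencies for an RNA sequence. The function name `kmer_frequencies`, the default `k=3`, the conversion of T to U, and the linear (non-wrapping) sliding window are all assumptions made for this example; the abstract does not specify how CircLoc computes its k-mer features.

```python
from itertools import product


def kmer_frequencies(sequence, k=3, alphabet="ACGU"):
    """Return normalized k-mer frequencies in lexicographic k-mer order.

    Hypothetical helper for illustration; not the CircLoc implementation.
    """
    # Assumption: treat DNA-style input as RNA by mapping T to U.
    seq = sequence.upper().replace("T", "U")

    # Enumerate all len(alphabet)**k possible k-mers in a fixed order.
    kmers = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = dict.fromkeys(kmers, 0)

    total = 0
    # Assumption: slide a linear window; the back-splice junction of the
    # circular molecule is not wrapped around here.
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # skip windows with ambiguous bases such as N
            counts[window] += 1
            total += 1

    if total == 0:
        return [0.0] * len(kmers)
    return [counts[kmer] / total for kmer in kmers]


features = kmer_frequencies("ACGUACGU", k=2)
```

With `k=2` the resulting vector has 16 entries, one per dinucleotide. In a pipeline like the one described, a vector of this kind would be one of several feature sets combined before the self-attention layer.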