This study introduces a novel encoding scheme for DNA/RNA sequences that integrates the Komlós and Hadamard transforms. Compared with traditional One-Hot encoding, the approach offers a more informative representation of omics data while significantly reducing computational complexity. Note, however, that the Komlós transform component provides fewer features and does not use sparse codes. By leveraging the inherent properties of these transforms, the method captures complex patterns in the data, leading to improved model accuracy and shorter training times. Combined with an image transformation, the encoding scheme is particularly efficient, achieving superior performance across various predictive tasks with significantly lower computational resource demands than One-Hot encoding. Our findings suggest that the scheme, particularly when integrated with Hilbert curve mapping or sequence-to-image analysis, holds significant promise for advancing DNA/RNA data analysis by offering a more efficient and effective approach to feature representation.
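To make the contrast with One-Hot encoding concrete, the following is a minimal sketch and not the authors' method. The abstract does not specify how the Komlós and Hadamard transforms are combined, so the sketch assumes the simplest possible Hadamard-based variant: each nucleotide is mapped to one row of a 4x4 Sylvester-Hadamard matrix, and the resulting codes are laid out on a grid along a Hilbert curve. The names `ALPHABET`, `hadamard_encode`, and `sequence_to_image`, as well as the base ordering, are hypothetical; the Komlós component is not modelled at all.

```python
# Illustrative sketch only: NOT the encoding described in the paper.
# Assumption: each base gets one row of a 4x4 Sylvester-Hadamard matrix.

ALPHABET = "ACGT"  # hypothetical base ordering


def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    h = [[1]]
    while len(h) < n:
        # New matrix is [[H, H], [H, -H]].
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


H4 = hadamard(4)


def one_hot(seq):
    """Sparse baseline: a single 1 per base, zeros elsewhere."""
    return [[1 if base == a else 0 for a in ALPHABET] for base in seq]


def hadamard_encode(seq):
    """Dense +/-1 code per base: the row of H4 indexed by the base."""
    return [H4[ALPHABET.index(base)] for base in seq]


def hilbert_d2xy(order, d):
    """Map index d along a Hilbert curve to (x, y) on a 2^order x 2^order grid."""
    x = y = 0
    t = d
    s = 1
    n = 1 << order
    while s < n:
        rx = 1 & (t // 2)
        ry = 1 & (t ^ rx)
        if ry == 0:
            if rx == 1:
                x, y = s - 1 - x, s - 1 - y
            x, y = y, x  # rotate the quadrant
        x += s * rx
        y += s * ry
        t //= 4
        s *= 2
    return x, y


def sequence_to_image(seq, order):
    """Place per-base codes on a grid so sequence neighbours stay spatial neighbours."""
    n = 1 << order
    image = [[[0, 0, 0, 0] for _ in range(n)] for _ in range(n)]
    for d, code in enumerate(hadamard_encode(seq)[: n * n]):
        x, y = hilbert_d2xy(order, d)
        image[y][x] = code
    return image


image = sequence_to_image("ACGTACGTACGTACGT", order=2)  # 4 x 4 grid, 4 channels
```

In this sketch the Hadamard rows are mutually orthogonal, like One-Hot vectors, but every entry is nonzero. The Hilbert curve layout keeps consecutive bases in adjacent grid cells, which is the usual motivation for using it in sequence-to-image pipelines.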
Kareem Kabbani
Samir B. Belhaouari
Michaël Aupetit
BMC Bioinformatics
University of Illinois Urbana-Champaign
Texas A&M University
Hamad bin Khalifa University
Kabbani et al. studied this question.
www.synapsesocial.com/papers/69df2c88e4eeef8a2a6b1a58 — DOI: https://doi.org/10.1186/s12859-026-06442-y