Current helminth genomes possess thousands of predicted fusion genes, encoding novel protein domain architectures that are unique to these species. To assess whether these predictions are genuine, we analyzed 20,313 two-domain proteins annotated in current helminth genomes, of which 10,297 are apparently unique to helminths, and used RNA-seq data from 20 helminth species to examine their plausibility as true fusion genes. For comparison, we analyzed a set of 400 high-confidence, evolutionarily conserved domain fusions that are present in both helminth and non-helminth species. Our analysis suggests that, in contrast to genuine fusion genes, the majority of helminth-specific fusion genes in the 20 species investigated are likely gene prediction artifacts, based on several criteria: (1) they show a lack of correlation between RNA-seq derived expression levels of the first and second “fused” domains, as well as the interdomain region; (2) they have significantly longer interdomain regions; (3) they show significantly less continuity of coverage in their interdomain regions, consistent with breakpoints in RNA-seq coverage; and (4) they are generally not supported in de novo transcriptome assemblies. Proteins containing novel domain combinations have been included in widely used sequence and protein databases, including WormBase ParaSite and InterPro, but the analyses presented here suggest that many helminth-specific domain fusion proteins are erroneously annotated. These findings emphasize the importance of using RNA-seq data to validate gene predictions in helminth genomes, especially those with unique structures not observed in other species. Given the increasing need to accurately identify helminth-specific proteins as therapeutic targets, the accuracy of proteome annotation in widely used genomic databases is essential.
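As an illustration of criteria (1) to (3), the following is a minimal Python sketch of how per-gene evidence might be summarized from per-base RNA-seq coverage. It is not the authors' pipeline: the function names, the half-open coordinate convention, and the `min_depth` threshold are all assumptions made for this example.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (NaN if either is constant)."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return float("nan")
    return cov / sqrt(vx * vy)


def mean_coverage(coverage, start, end):
    """Mean read depth over the half-open interval [start, end)."""
    region = coverage[start:end]
    return mean(region) if region else 0.0


def interdomain_continuity(coverage, start, end, min_depth=1):
    """Fraction of interdomain positions covered at min_depth or more.

    min_depth is a hypothetical threshold chosen for illustration only.
    """
    region = coverage[start:end]
    if not region:
        return 1.0  # adjacent domains: no gap in which coverage could break
    return sum(1 for depth in region if depth >= min_depth) / len(region)


def summarize_fusion(coverage, dom1, dom2):
    """Summarize one predicted two-domain gene.

    coverage: per-base read depth along the transcript (list of numbers).
    dom1, dom2: (start, end) half-open coordinates, with dom1 upstream of dom2.
    """
    gap_start, gap_end = dom1[1], dom2[0]
    return {
        "dom1_expr": mean_coverage(coverage, *dom1),
        "dom2_expr": mean_coverage(coverage, *dom2),
        "interdomain_length": gap_end - gap_start,
        "interdomain_continuity": interdomain_continuity(coverage, gap_start, gap_end),
    }


def domain_expression_correlation(summaries):
    """Criterion (1): correlate domain 1 and domain 2 expression across genes."""
    return pearson(
        [s["dom1_expr"] for s in summaries],
        [s["dom2_expr"] for s in summaries],
    )
```

Under these assumptions, a genuine fusion would be expected to show similar expression in both domains and unbroken coverage between them, whereas two separate genes merged by a gene predictor would tend to show a long interdomain region with a coverage breakpoint.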
Emma Collington
Andrew C. Doxey
Brendan J. McConkey
BMC Genomics
University of Waterloo
Collington et al. studied this question.
www.synapsesocial.com/papers/69a760cbc6e9836116a2ddf4 — DOI: https://doi.org/10.1186/s12864-026-12589-y