Sarcasm detection remains challenging in natural language processing because sarcastic expressions often convey meanings that contradict their literal wording. Transformer-based encoders such as RoBERTa capture contextual semantics effectively, but sparse linguistic signals common in sarcastic user-generated text, such as exaggerated punctuation, elongated words, capitalization, and sentiment contrast, may not remain explicitly accessible in the final sentence representation. To address this limitation, we propose HYSARD, a hybrid feature-fusion model that combines RoBERTa-based sentence embeddings with complementary linguistic features, including sentiment polarity, stylistic markers, syntactic patterns, and TF-IDF lexical cues. The fused feature space is refined through Random Forest-based feature selection to reduce redundancy and improve robustness, and SMOTE mitigates class imbalance during training. We evaluate HYSARD on the SemEval-2022 iSarcasmEval dataset and on the balanced Main and Political subsets of SARC 2.0. Results show strong and consistent performance across datasets, with an F1-score of 0.80 on iSarcasmEval, and error analysis on the held-out test set shows strong class-wise discrimination. An ablation study confirms that combining contextual embeddings with explicit linguistic cues improves sarcasm detection over reduced feature configurations. These findings indicate that hybrid feature fusion remains an effective and practical strategy for sarcasm detection in noisy social media text.
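To make the idea of fusing contextual embeddings with explicit stylistic cues concrete, the following is a minimal sketch, not the authors' implementation. The function names `stylistic_features` and `fuse` are hypothetical, the three cues are a simplified stand-in for the stylistic markers named above, and the embedding argument is a placeholder for a RoBERTa sentence embedding computed elsewhere. Sentiment, syntactic, and TF-IDF features, Random Forest selection, and SMOTE are omitted.

```python
import re


def stylistic_features(text):
    """Count three surface cues often associated with sarcastic text.

    Returns [punctuation_runs, elongated_words, all_caps_words].
    This cue set is an illustrative assumption, not the paper's feature list.
    """
    tokens = text.split()
    # Runs of two or more '!' or '?' characters, e.g. "!!!" or "?!"
    punct_runs = len(re.findall(r"[!?]{2,}", text))
    # Words with the same letter repeated three or more times, e.g. "sooo"
    elongated = sum(1 for t in tokens if re.search(r"([A-Za-z])\1{2,}", t))
    # Words whose letters are all upper-case (at least two letters)
    all_caps = 0
    for t in tokens:
        letters = re.sub(r"[^A-Za-z]", "", t)
        if len(letters) >= 2 and letters.isupper():
            all_caps += 1
    return [punct_runs, elongated, all_caps]


def fuse(embedding, text):
    """Concatenate a sentence embedding with the stylistic cue counts.

    `embedding` is assumed to be a sequence of floats produced by an
    encoder such as RoBERTa; here it is simply passed in by the caller.
    """
    return list(embedding) + stylistic_features(text)
```

For example, `fuse([0.1, 0.2], "WOW!!! Sooo GOOD?? really")` appends the counts for two punctuation runs, one elongated word, and two all-caps words to the placeholder embedding. Keeping these cues as explicit dimensions is what lets a downstream selector or classifier use them directly, rather than relying on the encoder to preserve them.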
Jabri et al. studied this question.
www.synapsesocial.com/papers/69fd7f65bfa21ec5bbf07ec4 — DOI: https://doi.org/10.3390/bdcc10050144
Ismail Jabri
Zine Eddine Louriga
Aziza El Ouaazizi
Big Data and Cognitive Computing
Sidi Mohamed Ben Abdellah University