What question did this study set out to answer?

The study asks whether scene graph generation can be improved by strengthening multi-modal feature fusion and by rebalancing the skewed, long-tailed distribution of predicates.

January 25, 2026 · Open Access

Enhancing scene graph generation via hybrid co-attention and predicate reweighting for long-tail robustness

Key Points

Developed a unified framework called ReBalance-HCA
Combined Hybrid Co-Attention Networks with Predicate Reweighting
Conducted experiments on the Visual Genome and OpenImages datasets
Achieved competitive mR@K scores compared to state-of-the-art methods
Demonstrated improved performance in handling long-tail distributions
Showed effective bias reduction in predictions

Abstract

Scene Graph Generation (SGG) aims to extract visual entities and their semantic relationships from images, providing a structured layout for scene understanding. Current models often suffer from insufficient multi-modal feature fusion and imbalanced predicate distributions, leading to biased predictions. To address these issues, we propose ReBalance-HCA, a unified framework that combines Hybrid Co-Attention Networks (HCA) with Predicate Reweighting (PR). HCA enhances intra-modal features and aligns cross-modal semantics, while PR dynamically adjusts the predicate distribution by modeling inter-predicate correlations. Extensive experiments on the Visual Genome and OpenImages datasets demonstrate that ReBalance-HCA achieves competitive mR@K scores compared to recent state-of-the-art methods in SGG sub-tasks. Our code and datasets are available at: https://github.com/LinusLing/ReBalance-HCA.
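
The abstract reports results as mR@K (mean recall at K) rather than plain R@K. The sketch below illustrates why that choice matters for long-tail predicates: R@K pools all ground-truth triplets, so frequent predicates dominate, while mR@K averages recall per predicate class. This is a simplified illustration, not the authors' evaluation code: the function name `recall_at_k`, the toy triplets, and the choice to pool counts over a single image are all assumptions made for the example, and real benchmarks aggregate over a whole dataset.

```python
from collections import defaultdict


def recall_at_k(gt_triplets, ranked_predictions, k):
    """Return (R@K, mR@K) for one set of ground-truth triplets.

    gt_triplets: iterable of (subject, predicate, object) tuples.
    ranked_predictions: list of such tuples, sorted by confidence.
    Hypothetical helper for illustration only.
    """
    top_k = set(ranked_predictions[:k])
    hits = defaultdict(int)    # correctly recovered triplets per predicate
    totals = defaultdict(int)  # ground-truth triplets per predicate

    for subj, pred, obj in gt_triplets:
        totals[pred] += 1
        if (subj, pred, obj) in top_k:
            hits[pred] += 1

    if not totals:
        return 0.0, 0.0

    # R@K: one pooled ratio, dominated by frequent predicates.
    recall = sum(hits.values()) / sum(totals.values())
    # mR@K: recall per predicate class, then an unweighted mean.
    mean_recall = sum(hits[p] / totals[p] for p in totals) / len(totals)
    return recall, mean_recall


# Toy example: "on" is frequent, "riding" is rare.
gt = [
    ("cup", "on", "table"),
    ("book", "on", "table"),
    ("lamp", "on", "desk"),
    ("man", "riding", "horse"),
]
# A biased model predicts the frequent predicate "on" everywhere.
preds = [
    ("cup", "on", "table"),
    ("book", "on", "table"),
    ("lamp", "on", "desk"),
    ("man", "on", "horse"),
]

r, mr = recall_at_k(gt, preds, k=4)
print(f"R@4 = {r:.2f}, mR@4 = {mr:.2f}")
```

In this toy case the biased model recovers every "on" triplet and misses the single "riding" triplet, so pooled recall stays high while mean recall drops, which is the kind of gap that mR@K is meant to expose.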
