This paper presents the solution proposed by our team, Gut-Instincts, for the GutBrainIE task, which introduces a named-entity recognition (NER) subtask and three relation extraction (RE) subtasks on biomedical articles related to the gut-brain axis. To handle the domain-specific terminology involved in these tasks, we rely on transformer-based models pretrained on biomedical text. For NER, we extend these models with one of three classification heads: (1) a dense layer, (2) a dense layer followed by a conditional random field (CRF), or (3) a bidirectional long short-term memory (BiLSTM) layer followed by a CRF. For RE, we introduce negative samples and experiment with different ratios of negative to positive samples. For all subtasks, we use model ensembling to reduce variability and improve robustness. Furthermore, since the provided data varies in quality, we use weighted training, which lets the models use all available data while ensuring that high-quality data has a stronger influence during optimization. Our experimental results suggest that a large ratio of negative to positive samples, model ensembling, and weighted training improve performance in the NER and RE subtasks. In the GutBrainIE task, we placed second in the NER subtask (6.1) with a micro-averaged F1 score of 0.8382, and first in all three RE subtasks (6.2.1, 6.2.2, and 6.2.3) with micro-averaged F1 scores of 0.6864, 0.6866, and 0.4635, respectively.
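The three training-side ideas named above (negative sampling at a chosen ratio, quality-weighted training, and ensembling) can be illustrated with a minimal, dependency-free sketch. Everything in it is an assumption made for illustration rather than our actual implementation: the function names, the per-example weights (for instance 1.0 for high-quality and 0.5 for lower-quality annotations), and the use of per-token majority voting as the ensembling rule are all hypothetical.

```python
import math
import random
from collections import Counter


def sample_negatives(positives, candidates, ratio, seed=0):
    """Keep all positive entity pairs and draw up to ratio * len(positives)
    negatives from the candidate pairs that are not annotated as relations.
    (Hypothetical helper; `ratio` is the negative-to-positive ratio.)"""
    positive_set = set(positives)
    negatives = [pair for pair in candidates if pair not in positive_set]
    rng = random.Random(seed)
    k = min(len(negatives), int(ratio * len(positives)))
    return list(positives), rng.sample(negatives, k)


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of per-example negative log-likelihoods.
    A larger weight gives an example more influence on the loss.
    (The weight values are assumed, not taken from the paper.)"""
    total = sum(w * -math.log(p[y]) for p, y, w in zip(probs, labels, weights))
    return total / sum(weights)


def majority_vote(predictions):
    """Per-token majority vote over the label sequences of several models.
    (One possible ensembling rule, assumed here for illustration.)"""
    return [Counter(column).most_common(1)[0][0] for column in zip(*predictions)]
```

For example, `sample_negatives(positives, candidates, ratio=2)` keeps every positive pair and adds at most two negatives per positive, while `weighted_cross_entropy` with weights `[1.0, 0.5]` lets the first example count twice as much as the second.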