Neural abstractive summarization is increasingly used to generate concise summaries from large volumes of text across many domains. Establishing that a summary is factually faithful to its source text is especially important in high-stakes domains such as scientific research, healthcare, and law. Existing approaches rely heavily on domain-specific, reference-based evaluation metrics that do not capture factual inaccuracies, which limits their use across domains. To address these challenges, we introduce Cross-Domain Faithfulness Evaluation using Reference-Free Metrics (CDFE-RFM), which combines model-based natural language inference, entity-level precision and recall, and question-answering consistency tests to evaluate the factual faithfulness of summaries without human references. The method also employs data adaptation techniques, making it applicable to content with disparate structure and terminology. It can be applied in multi-domain settings where critical content must be accurately represented and retained in a concise summary (e.g., medical reports, legal case summaries, or scientific abstracts). Experimental results show that CDFE-RFM significantly improves the detection of factual inaccuracies compared with traditional reference-based metrics, suggesting that it can provide a more reliable, less domain-bound method for evaluating faithfulness in abstractive summarization.
Dewangan et al. (Thu,) studied this question.
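As an illustration of the entity-level precision and recall component named in the abstract, the following is a minimal sketch and not the authors' implementation. The `extract_entities` helper is a hypothetical stand-in for a real named-entity recognizer (it treats capitalized words and numbers as entities), and `combined_score` assumes a simple weighted average of the three signals, since the abstract does not specify how they are combined. The NLI and question-answering scores are assumed to come from external models and are passed in as plain numbers.

```python
import re


def extract_entities(text):
    """Hypothetical stand-in for an NER model (assumption, not from the paper):
    treats capitalized words and numbers as entities, lowercased for matching."""
    pattern = r"\b(?:[A-Z][a-zA-Z]+|\d+(?:\.\d+)?)\b"
    return {match.lower() for match in re.findall(pattern, text)}


def entity_precision_recall(source, summary):
    """Reference-free entity overlap between a source text and its summary.

    Precision: fraction of summary entities that also appear in the source
    (low precision suggests unsupported, possibly hallucinated entities).
    Recall: fraction of source entities that the summary retains.
    """
    source_entities = extract_entities(source)
    summary_entities = extract_entities(summary)
    shared = source_entities & summary_entities
    # With no entities on one side, there is nothing to contradict or to miss.
    precision = len(shared) / len(summary_entities) if summary_entities else 1.0
    recall = len(shared) / len(source_entities) if source_entities else 1.0
    return precision, recall


def combined_score(nli_score, entity_precision, qa_score, weights=(1.0, 1.0, 1.0)):
    """Assumed aggregation: weighted average of three scores in [0, 1].
    The weighting scheme is illustrative only."""
    scores = (nli_score, entity_precision, qa_score)
    return sum(w * s for w, s in zip(weights, scores)) / sum(weights)


source = "Aspirin reduced fever in 120 patients at Mercy Hospital."
summary = "Aspirin reduced fever in 150 patients at Mercy Hospital."
precision, recall = entity_precision_recall(source, summary)
print(precision, recall)
```

In this example the summary changes the patient count from 120 to 150, so one of its four entities is unsupported by the source and entity precision drops below 1. No human-written reference summary is needed for the comparison.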