We present a Retrieval-Augmented Generation (RAG) question-answering system for nuclear energy science communication and characterize how retrieval quality is reflected in its generated responses. The system introduces a dual-similarity analysis that jointly measures (i) question-to-context (Q→C) and (ii) answer-to-context (A→C) semantic consistency, which serve as a “retrieval-side semantic alignment signal” and a “post-generation semantic alignment indicator,” respectively. Built with LangChain, FAISS retrieval, and a large language model, our pipeline separates offline indexing from online inference and is grounded in authoritative documents from Taiwan's Nuclear Safety Commission. We evaluate two settings: (a) in-domain prompts derived from the corpus and (b) out-of-domain, randomly generated nuclear energy questions. Under the present setup, generated answers are, on average, more semantically similar to the retrieved contexts than the original questions are, and the overall association between the retrieval-side and answer-side signals is stronger in the in-domain setting. Out-of-domain questions show weaker but still observable answer-to-context alignment, contingent on how much they overlap with the corpus. These findings suggest that combining RAG with dual-similarity analysis offers a practical, audit-oriented approach to educational Q&A, and we discuss potential improvements in handling versioned regulations, re-ranking, and abstention strategies. By combining the two, this study aims to promote nuclear energy knowledge, and its research flowchart can be applied to many other fields of scientific knowledge.
Chiang et al. studied this question.
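The dual-similarity analysis described above can be illustrated with a minimal sketch. This is not the authors' implementation: the toy bag-of-words `embed` function stands in for whatever embedding model the pipeline actually uses, averaging over the retrieved contexts is an assumed aggregation rule, and the names `embed`, `cosine`, and `dual_similarity` as well as the example texts are hypothetical.

```python
import math
import re
from collections import Counter


def embed(text: str) -> Counter:
    """Toy bag-of-words vector (assumption: a stand-in for a real embedding model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(u: Counter, v: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(u[t] * v[t] for t in u.keys() & v.keys())
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def dual_similarity(question: str, answer: str, contexts: list[str]) -> tuple[float, float]:
    """Return (Q->C, A->C) similarity, each averaged over the retrieved contexts.

    Averaging is an assumption made for this sketch; a maximum or a
    rank-weighted mean would be equally plausible choices.
    """
    if not contexts:
        raise ValueError("at least one retrieved context is required")
    ctx_vecs = [embed(c) for c in contexts]
    q_vec, a_vec = embed(question), embed(answer)
    q2c = sum(cosine(q_vec, c) for c in ctx_vecs) / len(ctx_vecs)
    a2c = sum(cosine(a_vec, c) for c in ctx_vecs) / len(ctx_vecs)
    return q2c, a2c


# Hypothetical example texts, for illustration only.
contexts = [
    "The regulator sets an annual dose limit for radiation workers.",
    "Licensees must monitor and record the dose received by each worker.",
]
question = "Is there a dose limit for radiation workers?"
answer = "Yes, the regulator sets an annual dose limit for radiation workers."

q2c, a2c = dual_similarity(question, answer, contexts)
print(f"Q->C = {q2c:.3f}, A->C = {a2c:.3f}")
```

In a real pipeline the contexts would come from the FAISS retriever and the vectors from the same embedding model used for indexing, so that the Q→C and A→C scores are measured in the same space and remain comparable.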