As a key application of technology-enhanced learning, Intelligent Tutoring Systems have long been constrained by two bottlenecks: manual knowledge base construction that is costly and dependent on experts, and difficulty adapting to unstructured teaching resources. Meanwhile, generative large language models face their own challenges in educational question answering, including factual inaccuracies and insufficient logical reasoning. To address these issues, this study proposes a framework for an Intelligent Tutoring System based on the automatic construction of multimodal knowledge graphs and Retrieval-Augmented Generation (RAG). The system integrates FFmpeg, Whisper, OCR, and layout analysis into a pipeline that extracts and constructs knowledge graphs fully automatically from both course videos and textbook PDFs. This process integrates auditory information from videos with visual and textual knowledge from textbooks. Building on this foundation, the framework combines graph retrieval and vector retrieval strategies, using the RAG mechanism to drive large language models to generate accurate and explainable answers. Experimental results show positive outcomes for knowledge graph construction, the average accuracy and relevance of intelligent QA responses, overall user satisfaction, and system performance. Beyond automation, the core innovation is a cross-modal fusion mechanism that aligns and integrates knowledge from auditory explanations and visual-textual textbook content, creating a unified, instructionally structured knowledge graph. This study thus provides a feasible and innovative path from multimodal resources to intelligent services for Intelligent Tutoring Systems, with significant practical implications for advancing personalized learning.
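The combination of graph retrieval and vector retrieval described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the triples, text chunks, function names, and the bag-of-words cosine similarity (standing in for a learned embedding model) are all assumptions made for illustration. The sketch only shows how facts from a knowledge graph and passages from a similarity search might be merged into one grounded prompt for a language model.

```python
import math
import re
from collections import Counter

# Hypothetical toy knowledge graph as (head, relation, tail) triples.
TRIPLES = [
    ("binary search", "requires", "sorted array"),
    ("binary search", "has_complexity", "O(log n)"),
    ("sorted array", "produced_by", "merge sort"),
]

# Hypothetical text chunks, e.g. transcript segments or textbook passages.
CHUNKS = [
    "Binary search halves the search interval at every step.",
    "Merge sort splits the list, sorts each half, and merges them.",
    "A hash table maps keys to buckets using a hash function.",
]


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def vector_retrieve(question, chunks, k=2):
    """Return the k chunks most similar to the question.

    Assumption: bag-of-words vectors replace a real embedding model.
    """
    q = Counter(tokenize(question))
    ranked = sorted(
        chunks,
        key=lambda c: cosine(q, Counter(tokenize(c))),
        reverse=True,
    )
    return ranked[:k]


def graph_retrieve(question, triples):
    """Return triples whose head or tail entity is mentioned in the question."""
    q = question.lower()
    return [t for t in triples if t[0] in q or t[2] in q]


def build_prompt(question):
    """Merge graph facts and retrieved passages into one grounded prompt."""
    facts = graph_retrieve(question, TRIPLES)
    passages = vector_retrieve(question, CHUNKS)
    lines = ["Answer using only the context below.", "Graph facts:"]
    lines += [f"- {h} {r} {t}" for h, r, t in facts]
    lines.append("Passages:")
    lines += [f"- {p}" for p in passages]
    lines.append(f"Question: {question}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(build_prompt("What is the complexity of binary search?"))
```

In this sketch the graph side contributes explicit, traceable facts while the vector side contributes free-text passages, which is one plausible reading of how the two strategies could complement each other; the prompt would then be passed to whichever language model the system uses.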
Chao Deng
Guangdong University of Technology
Bo Yuan
Guangdong University of Technology
Frontiers in Computer Science
SHILAP Revista de lepidopterología
Deng et al. studied this question.
synapsesocial.com/papers/69a285aa0a974eb0d3c00a81 — DOI: https://doi.org/10.3389/fcomp.2026.1777749