ABSTRACT This study investigates the impact of multimodal resources, specifically dynamic visual stimuli such as animated video clips, on comprehension among young English as a foreign language (EFL) learners. Using eye‐tracking technology, the research examines how 32 elementary school students in South Korea allocate their attention while engaging with multimodal resources in two modes: audiovisual (AV) and audiovisual with text. The findings reveal that participants exhibit a preference for visual elements over textual content, particularly in video‐only conditions, suggesting that dynamic visuals affect learners’ attention within multimodal input and facilitate deeper processing of language concepts. Although no significant differences in test scores were observed between the text‐supported and video‐only conditions, eye‐tracking data indicate distinct patterns of attention allocation, highlighting the critical role of visual cues in directing learner focus. Additionally, qualitative insights from stimulated recall interviews reveal varied strategies employed by learners, with many benefiting from a combination of visual and textual inputs. This research underscores the necessity of integrating multimodal resources in EFL instruction to cater to diverse learning preferences and enhance engagement. Ultimately, the study highlights the need for instructional materials that account for individual differences and for the specific way text is integrated (such as subtitles or animated boxes) to optimize cognitive processing and learning outcomes in young learners.
Suh Keong Kwon studied this question.