March 3, 2026 · Open Access

Small Models, Big Questions: Retrieval-Augmented Generation in a German Tourism Context

Key Points

The system generates contextually relevant responses using a retrieval-augmented generation framework with BERT and small language models.
Evaluation metrics such as cosine similarity and precision@k reveal limitations in semantic alignment and generation accuracy.
Smaller models provide speed and memory efficiency but struggle with longer prompts and structured contexts.
Challenges in retrieval quality and corpus design emphasize the need for refined prompting strategies in domain-specific applications.

Abstract

This thesis investigates the usability and limitations of implementing a Retrieval-Augmented Generation (RAG) framework using small- and medium-sized language models in the domain of German tourism. By integrating domain-specific retrieval using the BERT-based model TourBERT with small- and medium-sized generative language models such as LeoLM, Zephyr, and Gemma 2B, the system aims to generate semantically grounded and contextually relevant responses based on data from the German Tourism Knowledge Graph (GTKG). A structured corpus of around 350 tourism-related Points of Interest (POIs) was used to test the framework with the different language models. The retrieval part of the system was evaluated with cosine similarity and precision@k, and generative performance was assessed with BLEU, ROUGE, and BERTScore. The results show that although smaller models offer advantages in speed and memory efficiency, for this task they struggle with semantic alignment and generation accuracy, especially when handling long prompts and structured contexts. The study thereby highlights key challenges in applying a RAG framework with limited resources, and emphasizes the importance of high retrieval quality, a refined corpus design, and rigid model-specific prompting strategies for domain-specific applications.
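The two retrieval metrics named in the abstract can be illustrated with a minimal sketch. This is not the thesis's implementation: the three-dimensional vectors, the POI identifiers, the relevance labels, and the helper names `cosine_similarity`, `retrieve_top_k`, and `precision_at_k` are all hypothetical stand-ins chosen for illustration. In the described system the vectors would presumably come from TourBERT embeddings of the query and of the roughly 350 POI entries.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention used here: a zero vector matches nothing
    return dot / (norm_a * norm_b)


def retrieve_top_k(query_vec, corpus_vecs, k):
    """Rank corpus entries by cosine similarity to the query, keep the best k."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in corpus_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]


def precision_at_k(retrieved_ids, relevant_ids, k):
    """Fraction of the top-k retrieved entries that are labeled relevant."""
    if k <= 0:
        raise ValueError("k must be positive")
    hits = sum(1 for doc_id in retrieved_ids[:k] if doc_id in relevant_ids)
    return hits / k


# Hypothetical toy data: made-up embeddings, not real TourBERT output.
corpus = {
    "poi_castle": [0.9, 0.1, 0.0],
    "poi_museum": [0.7, 0.3, 0.1],
    "poi_beach": [0.0, 0.2, 0.9],
    "poi_trail": [0.1, 0.9, 0.2],
}
query = [1.0, 0.2, 0.0]
relevant = {"poi_castle", "poi_museum"}  # assumed gold labels for this query

top3 = retrieve_top_k(query, corpus, k=3)
print(top3)
print(precision_at_k(top3, relevant, k=3))
```

With these toy values the two relevant POIs rank first and second, so precision@3 is 2/3: the third slot is filled by a non-relevant entry. Precision@k in this form penalizes any k larger than the number of relevant entries, which is worth keeping in mind when reading such scores on a small corpus.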

Authors

Erik Magnusson
