Oslo Municipality seeks to automate the extraction of public opinion from large volumes of unstructured hearing documents to enhance data-driven decision-making in urban development. While the municipality’s "AI Factory" initiative has successfully implemented summarization tools, these fail to provide the structured, quantitative data required to measure the distribution of support and opposition across policy topics. This project investigates the feasibility of implementing Aspect-Based Sentiment Analysis (ABSA) to bridge this gap within the constraints of a low-resource, privacy-sensitive public sector environment.

Through a Structured Literature Review (N = 26), this study maps the transition in the state of the art from discriminative classifiers to Generative AI and structure-aware architectures. Concurrently, an Exploratory Data Analysis on a pilot dataset of municipal texts (N = 63) identifies a critical "Span Ambiguity Gap": while human annotators struggle to agree on exact text boundaries for bureaucratic phrases (κ = 0.21), they exhibit high reliability in identifying semantic categories (κ = 0.93).

These findings indicate that standard span-extraction pipelines are ill-suited to the bureaucratic register. Consequently, this report proposes a strategic pivot to Aspect Category Sentiment Analysis (ACSA). The study concludes with a proposal for a subsequent Master’s Thesis centered on a Teacher-Student framework, which uses Large Language Models (LLMs) to generate synthetic training data ("Inverse Generation"). This approach aims to enable the training of capable, privacy-compliant local models, addressing the "Cold Start" data scarcity problem while bypassing the inherent ambiguity of manual span annotation.
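The agreement figures quoted above contrast span-level and category-level reliability. As a minimal sketch of how such a contrast can be computed, the example below assumes two annotators and Cohen's kappa; the abstract does not state which agreement coefficient or span-matching scheme the study used, so the function name `cohen_kappa`, the token-level encoding of spans, and all label values are hypothetical illustrations rather than the study's method or data.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items given identical labels.
    p_o = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement from each annotator's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    p_e = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if p_e == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (p_o - p_e) / (1 - p_e)


# Hypothetical toy annotations (assumption: NOT the study's data).
# Category level: which policy topic does each of six sentences address?
cat_a = ["traffic", "housing", "traffic", "parks", "housing", "parks"]
cat_b = ["traffic", "housing", "traffic", "parks", "housing", "traffic"]

# Span level, token by token: 1 if the token lies inside a marked
# aspect span, 0 otherwise. The annotators' boundaries are shifted.
span_a = [0, 1, 1, 1, 0, 0, 1, 1, 0, 0]
span_b = [0, 0, 1, 1, 1, 0, 0, 1, 1, 0]

kappa_category = cohen_kappa(cat_a, cat_b)
kappa_span = cohen_kappa(span_a, span_b)
print(f"category kappa = {kappa_category:.2f}, span kappa = {kappa_span:.2f}")
```

In this toy case the annotators mostly agree on what a sentence is about while disagreeing on where the aspect phrase starts and ends, which is the pattern that motivates moving from span extraction to category-level labels.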
Sebastian Småland (Tue,) studied this question.