This paper lays the foundation for using Large Language Models (LLMs) in the Slovenian legal domain. We address data scarcity in low-resource languages by constructing the largest publicly available Slovene Legal Corpus, spanning over one billion tokens from legislative, judicial, and governmental texts. We introduce PravniBERT, a domain-specific Slovene legal language model, and evaluate it on contradiction-based legal article retrieval, achieving 83.6% accuracy@3. Our results demonstrate the feasibility of applying LLMs to complex legal reasoning in under-resourced settings and highlight the potential for transparent, domain-adapted legal AI in Slovenia.
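For readers unfamiliar with the accuracy@3 metric, the sketch below shows how accuracy@k is commonly computed for retrieval: a query counts as a hit when its correct article appears among the top k ranked candidates. This follows the standard definition and is an assumption about the setup; the paper's exact evaluation protocol may differ. The function name `accuracy_at_k`, the article identifiers, and the toy rankings are all hypothetical and are not taken from the paper.

```python
from typing import Sequence


def accuracy_at_k(
    ranked_ids: Sequence[Sequence[str]],
    gold_ids: Sequence[str],
    k: int = 3,
) -> float:
    """Fraction of queries whose gold article is among the top-k candidates."""
    if len(ranked_ids) != len(gold_ids):
        raise ValueError("one ranking per gold label is required")
    if not gold_ids:
        return 0.0
    # A query is a hit if its gold id occurs in the first k ranked ids.
    hits = sum(gold in ranking[:k] for ranking, gold in zip(ranked_ids, gold_ids))
    return hits / len(gold_ids)


# Hypothetical toy data: three queries, each with four ranked article ids.
rankings = [
    ["art_12", "art_7", "art_3", "art_44"],   # gold at rank 3: hit for k=3
    ["art_5", "art_9", "art_1", "art_2"],     # gold at rank 4: miss for k=3
    ["art_8", "art_30", "art_21", "art_6"],   # gold at rank 1: hit for k=3
]
gold = ["art_3", "art_2", "art_8"]

score = accuracy_at_k(rankings, gold, k=3)
print(f"accuracy@3 = {score:.3f}")
```

On this toy input two of the three queries are hits, so the score is 2/3. Under this definition, a reported value of 83.6% would mean the correct article was ranked in the top three for 83.6% of the evaluation queries.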
Miha Malenšek
Aleš Završník
Saša Krajnc
Malenšek et al. (Tue,) studied this question.