What question did this study set out to answer?

This study analyzes how Likert scale design affects judgment reliability when an LLM acts as a judge, in both Korean and English.

March 26, 2026

The Impact of Likert Scale Design on Judgment Reliability in Korean and English LLM-as-a-Judge

Key Points

The experiments use the NLG-Eval dataset.
The study compares ascending and descending score designs for their effect on reliability.
It compares the consistency of numeric and description-augmented scales against word-based scales.
It examines reliability patterns in both Korean and English.
The ascending design, in which a higher number means better quality, showed higher judgment reliability than the descending design.
Scales that are numeric or that include detailed descriptions showed greater consistency than word-based scales.
Overall judgment reliability was lower in Korean than in English, but reliability varied with scale design in a similar way in both languages.
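
The scale variables listed above (score direction and scale style) can be illustrated with a small prompt builder. This is a minimal sketch under stated assumptions: the function name `build_scale`, the five quality words, the 5-point range, and the prompt wording are all hypothetical and are not taken from the study.

```python
# Hypothetical sketch: the labels, wording, and 5-point range are assumptions,
# not the prompts used in the study.
QUALITY_WORDS = ["very poor", "poor", "fair", "good", "excellent"]  # worst -> best


def build_scale(order: str = "ascending", style: str = "numeric") -> str:
    """Return the scale section of a judge prompt.

    order: "ascending" (higher number = better) or "descending" (lower = better)
    style: "numeric", "word", or "numeric+description"
    """
    if order not in ("ascending", "descending"):
        raise ValueError(f"unknown order: {order}")
    if style not in ("numeric", "word", "numeric+description"):
        raise ValueError(f"unknown style: {style}")

    n = len(QUALITY_WORDS)
    # For a descending scale the best label comes first (score 1 = best).
    words = QUALITY_WORDS if order == "ascending" else QUALITY_WORDS[::-1]

    if style == "numeric":
        best, worst = (n, 1) if order == "ascending" else (1, n)
        return (
            f"Rate the answer from 1 to {n}, "
            f"where {best} is the best quality and {worst} is the worst."
        )
    if style == "word":
        # Word-based scale: no numbers, only labels (order affects listing only).
        return "Rate the answer with one of: " + ", ".join(words) + "."
    # Numeric scale with a description attached to each score.
    lines = [f"{i}: {w}" for i, w in enumerate(words, start=1)]
    return "Rate the answer on this scale:\n" + "\n".join(lines)
```

Calling `build_scale("descending", "numeric+description")` produces a scale whose first entry is `1: excellent`, which is the kind of descending design the key points report as less reliable.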

Abstract

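This study analyzes how Likert scale design affects judgment reliability in the "LLM-as-a-Judge" paradigm, in which a large language model (LLM) directly scores the quality of an answer (direct scoring). Because prior work has concentrated on pairwise comparison in English, the study empirically examines how reliability changes with the scale variables in the evaluation prompt, in both Korean and English settings. Experiments on the NLG-Eval dataset show three results. First, an ascending design, in which a higher number means better quality, yields higher judgment reliability than a descending design. Second, scales that are numeric or that are combined with detailed descriptions show higher consistency than word-based scales. Third, overall reliability in the Korean setting is lower than in English, but the way reliability changes with Likert scale design is similar in the two languages. Based on these results, the study derives and presents practical guidelines for designing evaluation prompts for robust LLM-as-a-Judge systems.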
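
The abstract does not state which reliability metric the authors use. As an illustration only, the sketch below assumes one simple proxy: the fraction of items on which repeated judge runs return the same score. The function names `to_ascending` and `repeat_agreement`, and the 5-point default, are hypothetical. The first helper flips descending scores so that both designs can be compared on a common "higher is better" orientation.

```python
def to_ascending(score: int, order: str, n_points: int = 5) -> int:
    """Map a raw score onto a common 'higher is better' orientation.

    Assumption: scores are integers in 1..n_points.
    """
    if not 1 <= score <= n_points:
        raise ValueError(f"score {score} outside 1..{n_points}")
    if order not in ("ascending", "descending"):
        raise ValueError(f"unknown order: {order}")
    # On a descending scale 1 is best, so 1 maps to n_points, 2 to n_points - 1, ...
    return score if order == "ascending" else n_points + 1 - score


def repeat_agreement(runs: list[list[int]]) -> float:
    """Fraction of items on which every repeated run gave the same score.

    runs[r][i] is the score that run r assigned to item i.
    This is an illustrative proxy, not the metric reported in the paper.
    """
    if not runs or not runs[0]:
        raise ValueError("need at least one run with at least one item")
    n_items = len(runs[0])
    if any(len(run) != n_items for run in runs):
        raise ValueError("all runs must score the same items")
    # zip(*runs) groups the scores per item across runs.
    same = sum(1 for item_scores in zip(*runs) if len(set(item_scores)) == 1)
    return same / n_items
```

For two runs over three items, `repeat_agreement([[5, 3, 1], [5, 2, 1]])` counts agreement on the first and third items only. Exact-match agreement is strict; a chance-corrected or correlation-based statistic may be more appropriate depending on the evaluation goal.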

Authors

You-Won Jang

Woo-Suk Choi

Minsu Lee

Journals

KIISE Transactions on Computing Practices
