Near-Deterministic Qualitative Grading with LLMs Using Binary Path Scoring
Disclaimer: This article introduces a technique developed by the author and does not constitute an explanation, interpretation or commentary on a peer-reviewed paper.
If you tell an LLM to grade an article on a scale of 1-10, you will get a different answer almost every time (looking at you, Gemini🫵).
Evaluating creative or perceptual qualities — such as how captivating an article is, how realistic an image appears, or how impactful a speech feels — has always been difficult.