Chapter 12: Evaluation, Reward Hacking & Alignment Methodology

Medium · Problem 2 / 4

Implement a pairwise LLM-as-judge harness with position-swap debiasing (run both orders, average).

Implement the function/class skeleton in the editor. Any correct approach is accepted.

solution.py

def score_from_verdict(verdict, a_is_first):
    raise NotImplementedError

def pairwise_judge(prompt, resp_a, resp_b, judge_fn):
    raise NotImplementedError

def win_rate(prompts, resp_as, resp_bs, judge_fn):
    raise NotImplementedError
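
One possible way to fill in the skeleton is sketched below. The problem statement does not fix the judge's interface, so the sketch makes two assumptions: `judge_fn(prompt, first, second)` is called with the two responses in presentation order, and it returns one of the strings "first", "second", or "tie". Scores are expressed from response A's point of view (1.0 win, 0.5 tie, 0.0 loss), and `always_first` is a hypothetical stand-in judge used only to illustrate the effect of swapping positions.

```python
def score_from_verdict(verdict, a_is_first):
    """Map a judge verdict to a score for response A (1.0 win, 0.5 tie, 0.0 loss)."""
    # Assumed verdict vocabulary: "first", "second", or "tie" (case-insensitive).
    v = verdict.strip().lower()
    if v == "tie":
        return 0.5
    if v not in ("first", "second"):
        raise ValueError(f"unrecognized verdict: {verdict!r}")
    first_won = v == "first"
    # A wins when the winning slot is the slot A occupied.
    return 1.0 if first_won == a_is_first else 0.0


def pairwise_judge(prompt, resp_a, resp_b, judge_fn):
    """Score A against B, averaging over both presentation orders."""
    # Order 1: A is shown first.
    score_ab = score_from_verdict(judge_fn(prompt, resp_a, resp_b), a_is_first=True)
    # Order 2: positions swapped, B is shown first.
    score_ba = score_from_verdict(judge_fn(prompt, resp_b, resp_a), a_is_first=False)
    return (score_ab + score_ba) / 2.0


def win_rate(prompts, resp_as, resp_bs, judge_fn):
    """Mean debiased score of A over all prompts."""
    if not (len(prompts) == len(resp_as) == len(resp_bs)):
        raise ValueError("prompts, resp_as and resp_bs must have equal length")
    if len(prompts) == 0:
        raise ValueError("at least one prompt is required")
    scores = [
        pairwise_judge(p, a, b, judge_fn)
        for p, a, b in zip(prompts, resp_as, resp_bs)
    ]
    return sum(scores) / len(scores)


def always_first(prompt, first, second):
    """Hypothetical judge with pure position bias: always prefers slot one."""
    return "first"


print(pairwise_judge("q", "answer A", "answer B", always_first))  # 0.5
```

With a judge that always picks the first slot, A wins one order and loses the other, so the averaged score is 0.5 and the position bias cancels. A judge with a real preference gives the same verdict in both orders, and the average stays at 1.0 or 0.0.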
