Evaluation, Reward Hacking & Alignment Methodology · Problem 4 of 4
Implement an n-gram/embedding contamination detector that flags eval items overlapping a training corpus and reports a contamination-adjusted score.
Implement the function/class skeleton in the editor. Any correct approach is accepted.
import numpy as np
def shingles(text, k=8):
raise NotImplementedError
def build_training_index(train_docs, k=8):
raise NotImplementedError
def ngram_overlap(item_text, train_index, k=8):
raise NotImplementedError
def embed_max_sim(item_vec, train_vecs):
raise NotImplementedError
def contamination_report(eval_items, correct, train_docs, k=8, ngram_thr=0.5, item_vecs=None, train_vecs=None, emb_thr=0.95):
raise NotImplementedErrorReady when you are
Submit your solution and a structured review appears here — verdict, score, and concrete feedback. Any correct approach passes.
Implement an n-gram/embedding contamination detector that flags eval items overlapping a training corpus and reports a contamination-adjusted score.
Implement the function/class skeleton in the editor. Any correct approach is accepted.