Medium

Implement best-of-n selection given a reward/verifier over sampled completions

Reasoning & Test-Time Compute · Problem 2 of 4

Chapter 11 · Reasoning & Test-Time Compute

Implement best-of-n selection: given n sampled completions and a reward model or verifier that scores each one, return the highest-scoring completion.

Implement the function/class skeleton in the editor. Any correct approach is accepted.
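One possible shape for a solution is sketched below, not as the expected skeleton but as a minimal illustration. It assumes the completions have already been sampled and that the reward model or verifier is exposed as a callable taking (prompt, completion) and returning a float. The names best_of_n, Selection, score_fn, and toy_verifier are hypothetical and do not come from the problem statement.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Selection:
    """Result of best-of-n: which candidate won and with what score."""
    index: int
    completion: str
    score: float


def best_of_n(
    prompt: str,
    completions: Sequence[str],
    score_fn: Callable[[str, str], float],  # assumed reward/verifier interface
) -> Selection:
    """Score every sampled completion and return the highest-scoring one.

    Ties are broken in favor of the earliest completion.
    """
    if not completions:
        raise ValueError("best_of_n needs at least one completion")
    # One scorer call per candidate, so cost grows linearly with n.
    scores = [score_fn(prompt, c) for c in completions]
    # max() returns the first maximal index, which gives stable tie-breaking.
    best = max(range(len(scores)), key=scores.__getitem__)
    return Selection(index=best, completion=completions[best], score=scores[best])


def toy_verifier(prompt: str, completion: str) -> float:
    """Hypothetical binary verifier: 1.0 if the completion ends in "4"."""
    return 1.0 if completion.strip().endswith("4") else 0.0


candidates = ["The answer is 5", "The answer is 4", "It is 4"]
picked = best_of_n("What is 2 + 2?", candidates, toy_verifier)
print(picked.index, picked.completion)  # 1 The answer is 4
```

With a binary verifier, several candidates often tie at the top score. Here the earliest one wins, but a real solution might instead break ties by majority vote over final answers or by a secondary reward score.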
