Reasoning & Test-Time Compute · Problem 3 of 4
Implement beam search over reasoning steps: expand partial chains of thought (CoTs) and prune them using a process reward model (PRM) score.
Implement the function/class skeleton in the editor. Any correct approach is accepted.
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Beam:
    score: float
    steps: list = field(compare=False, default_factory=list)
    done: bool = field(compare=False, default=False)


def beam_search_reasoning(prompt, expand_fn, prm_fn, is_terminal, beam_width=4, n_expand=4, max_depth=8):
    raise NotImplementedError
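One possible way to fill in the skeleton is sketched below. It is not a reference solution. The problem statement does not specify the callback contracts or the return value, so the following are assumptions made only for illustration: expand_fn(prompt, steps, n) returns up to n candidate next steps, prm_fn(prompt, steps) returns a float where higher is better, is_terminal(steps) returns True once a chain has reached an answer, and the function returns the single best Beam.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Beam:
    score: float
    steps: list = field(compare=False, default_factory=list)
    done: bool = field(compare=False, default=False)


def beam_search_reasoning(prompt, expand_fn, prm_fn, is_terminal,
                          beam_width=4, n_expand=4, max_depth=8):
    # ASSUMED callback contracts (not given by the problem statement):
    #   expand_fn(prompt, steps, n) -> list of up to n candidate next steps
    #   prm_fn(prompt, steps)       -> float score of the partial chain, higher is better
    #   is_terminal(steps)          -> True once the chain has reached an answer
    beams = [Beam(score=0.0, steps=[])]

    for _ in range(max_depth):
        if all(b.done for b in beams):
            break

        candidates = []
        for beam in beams:
            if beam.done:
                # Finished chains stay in the pool and compete unchanged.
                candidates.append(beam)
                continue
            # Expand: propose next steps and score each extended chain with the PRM.
            for step in expand_fn(prompt, beam.steps, n_expand):
                steps = beam.steps + [step]
                candidates.append(
                    Beam(score=prm_fn(prompt, steps), steps=steps, done=is_terminal(steps))
                )

        if not candidates:
            # No beam could be extended; keep the previous beams.
            break

        # Prune: keep the beam_width highest-scoring chains.
        # Beam orders by score only, because the other fields use compare=False.
        beams = heapq.nlargest(beam_width, candidates)

    # ASSUMPTION: return the single best chain found (finished or not).
    return max(beams)
```

In this sketch a chain's score is the PRM score of its full prefix rather than a sum or product of per-step scores. A cumulative aggregate would be an equally valid choice, since the exercise accepts any correct approach.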