Optimization & Training Dynamics · Problem 5 of 5
Implement a mixed-precision loop (bf16 compute, fp32 master weights) with loss scaling on a toy MLP, including a deliberate overflow and recovery.
Implement the function/class skeleton in the editor. Any correct approach is accepted.
import torch

def forward_backward(master, x, y, loss_scale):
    raise NotImplementedError

def is_finite(grads):
    raise NotImplementedError
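
One possible way to fill in the skeleton is sketched below; it is not a reference solution. Everything beyond the two given signatures is an assumption made for illustration: the `train` driver, the two-layer MLP layout of `master` as `[w1, b1, w2, b2]`, the MSE loss, plain SGD, and the halve-on-overflow / double-after-`growth_interval` scaling policy. Because bf16 shares fp32's exponent range, overflow is hard to hit naturally, so the initial scale of `2.0 ** 127` is chosen artificially large to force non-finite gradients on the first steps.

```python
import torch


def forward_backward(master, x, y, loss_scale):
    """bf16 forward/backward; returns (fp32 loss value, unscaled fp32 grads)."""
    # Assumed layout: master = [w1, b1, w2, b2], all fp32.
    # Cast to bf16 leaf copies so compute and gradients live in bf16.
    params = [p.detach().to(torch.bfloat16).requires_grad_(True) for p in master]
    w1, b1, w2, b2 = params
    h = torch.relu(x.to(torch.bfloat16) @ w1 + b1)
    pred = h @ w2 + b2
    # Reduce the loss in fp32, then scale it before backpropagating.
    loss = torch.mean((pred.float() - y.float()) ** 2)
    (loss * loss_scale).backward()
    # Unscale in fp32; inf/nan gradients stay non-finite after division.
    grads = [p.grad.float() / loss_scale for p in params]
    return loss.item(), grads


def is_finite(grads):
    """True only if every gradient tensor is free of inf and nan."""
    return all(torch.isfinite(g).all().item() for g in grads)


def train(steps=40, lr=0.005, init_scale=2.0 ** 127, growth_interval=10):
    """Hypothetical driver: toy regression with dynamic loss scaling."""
    torch.manual_seed(0)
    x = torch.randn(8, 4)
    # Large target offset keeps early residuals big, which helps force overflow.
    y = x.sum(dim=1, keepdim=True) + 25.0
    master = [0.1 * torch.randn(4, 8), torch.zeros(8),
              0.1 * torch.randn(8, 1), torch.zeros(1)]
    scale, good_steps, skipped, losses = init_scale, 0, 0, []
    for _ in range(steps):
        loss, grads = forward_backward(master, x, y, scale)
        losses.append(loss)
        if not is_finite(grads):
            # Overflow: skip the update and back off the scale.
            scale /= 2.0
            good_steps = 0
            skipped += 1
            continue
        with torch.no_grad():
            for p, g in zip(master, grads):
                p -= lr * g  # SGD step on the fp32 master weights
        good_steps += 1
        if good_steps % growth_interval == 0:
            # Cautiously grow the scale again, capped at the initial value.
            scale = min(scale * 2.0, init_scale)
    return master, losses, scale, skipped


if __name__ == "__main__":
    master, losses, scale, skipped = train()
    print(f"skipped={skipped} final_scale={scale:.3e} "
          f"first_loss={losses[0]:.3f} last_loss={losses[-1]:.3f}")
```

Skipping the update on overflow, rather than clipping the gradients, keeps the fp32 master weights untouched by any inf or nan values, so recovery only requires lowering the scale and retrying on the next step.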