Infrastructure, Distributed Training & Scaling · Problem 2 of 4
Implement toy data-parallel SGD with manual gradient all-reduce (torch.distributed or a simulated ring).
Implement the function/class skeleton in the editor. Any correct approach is accepted.
import numpy as np

def ring_all_reduce(buffers):
    raise NotImplementedError

def dp_sgd_step(weights, grads, lr, N):
    raise NotImplementedError
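One possible way to fill in the skeleton with a simulated ring is sketched here. The problem statement does not pin down the argument semantics, so the following are assumptions: `buffers` and `grads` are lists holding one equally shaped NumPy array per simulated worker, `weights` is a single array shared by all replicas, `N` is the number of workers, and `dp_sgd_step` averages the summed gradients and returns the updated weights. The ring is simulated in a single process with a reduce-scatter phase followed by an all-gather phase; no real communication takes place.

```python
import numpy as np


def ring_all_reduce(buffers):
    """Simulated ring all-reduce (sum). Returns one reduced array per worker.

    Assumption: `buffers` is a list of equally shaped arrays, one per worker.
    """
    n = len(buffers)
    if n == 0:
        return []
    shape = np.asarray(buffers[0]).shape
    # Each worker gets a private flat copy of its buffer, split into n chunks.
    chunks = [
        np.array_split(np.asarray(b, dtype=np.float64).ravel().copy(), n)
        for b in buffers
    ]

    # Phase 1, reduce-scatter: at step s, worker r sends chunk (r - s) % n
    # to worker (r + 1) % n, which adds it to its own copy of that chunk.
    for step in range(n - 1):
        # Snapshot outgoing messages so that all sends are "simultaneous".
        sent = [chunks[r][(r - step) % n].copy() for r in range(n)]
        for r in range(n):
            src = (r - 1) % n
            idx = (src - step) % n
            chunks[r][idx] += sent[src]
    # Worker r now holds the fully summed chunk (r + 1) % n.

    # Phase 2, all-gather: at step s, worker r forwards chunk (r + 1 - s) % n
    # to worker (r + 1) % n, which overwrites its stale copy.
    for step in range(n - 1):
        sent = [chunks[r][(r + 1 - step) % n].copy() for r in range(n)]
        for r in range(n):
            src = (r - 1) % n
            idx = (src + 1 - step) % n
            chunks[r][idx] = sent[src]

    return [np.concatenate(c).reshape(shape) for c in chunks]


def dp_sgd_step(weights, grads, lr, N):
    """One data-parallel SGD step: all-reduce, average over N, then update.

    Assumption: `grads` holds one gradient per worker and all replicas
    share the same `weights`, so a single updated array is returned.
    """
    if len(grads) != N:
        raise ValueError("expected one gradient per worker")
    summed = ring_all_reduce(grads)
    avg_grad = summed[0] / N  # every worker holds the same sum
    return np.asarray(weights, dtype=np.float64) - lr * avg_grad
```

Under these assumptions, the result of `dp_sgd_step` should match a plain SGD step taken with the mean of the per-worker gradients, which gives a simple correctness check against `np.mean(grads, axis=0)`.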