Medium

Implement toy data-parallel SGD with manual gradient all-reduce (torch.distributed or a simulated ring)

Infrastructure, Distributed Training & Scaling · Problem 2 of 4


Chapter 06 · Infrastructure, Distributed Training & Scaling


Implement toy data-parallel SGD with manual gradient all-reduce (torch.distributed or a simulated ring).
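
The problem statement does not spell out the update rule, so the following is one common reading rather than the required one. It assumes each of the N workers holds a local gradient g_i computed on its own data shard, that the all-reduce produces the sum of these gradients on every worker, and that the step uses their mean. The symbols g_i, \bar{g}, and \eta (standing for the `lr` argument) are notation introduced here, not taken from the problem page.

```latex
\bar{g} = \frac{1}{N} \sum_{i=1}^{N} g_i,
\qquad
w \leftarrow w - \eta \, \bar{g}
```

Because every worker receives the same \bar{g} and starts from the same w, all replicas stay identical after the step.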

Implement the function/class skeleton in the editor. Any correct approach is accepted.


solution.py (Python)


import numpy as np

def ring_all_reduce(buffers):
    raise NotImplementedError

def dp_sgd_step(weights, grads, lr, N):
    raise NotImplementedError
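
One possible way to fill in the skeleton is sketched below, using a simulated ring in NumPy rather than torch.distributed. The skeleton does not document its arguments, so this sketch assumes that `buffers` and `grads` are lists holding one same-shaped array per worker, that `weights` is a single array shared by all replicas, that `N` is the number of workers, and that gradients are averaged (not just summed) before the update. The ring runs a reduce-scatter phase followed by an all-gather phase, each taking N - 1 steps.

```python
import numpy as np

def ring_all_reduce(buffers):
    """Simulated ring all-reduce (sum). Returns one summed array per worker.

    Assumption: `buffers` is a list of same-shaped arrays, one per worker.
    """
    n = len(buffers)
    shape = np.asarray(buffers[0]).shape
    # Each worker flattens its buffer and splits it into n chunks
    # (array_split tolerates sizes that are not divisible by n).
    chunks = [
        list(np.array_split(np.asarray(b, dtype=np.float64).ravel().copy(), n))
        for b in buffers
    ]

    # Phase 1, reduce-scatter: at each step worker r sends chunk (r - step) % n
    # to its right neighbour, which adds it to its own copy of that chunk.
    for step in range(n - 1):
        # Snapshot all messages first so the sends behave as if simultaneous.
        sends = [(r, (r - step) % n) for r in range(n)]
        payloads = [chunks[r][c].copy() for r, c in sends]
        for (r, c), payload in zip(sends, payloads):
            dst = (r + 1) % n
            chunks[dst][c] = chunks[dst][c] + payload
    # Worker r now holds the fully reduced chunk (r + 1) % n.

    # Phase 2, all-gather: circulate the reduced chunks around the ring,
    # overwriting stale partial sums on the receiver.
    for step in range(n - 1):
        sends = [(r, (r + 1 - step) % n) for r in range(n)]
        payloads = [chunks[r][c].copy() for r, c in sends]
        for (r, c), payload in zip(sends, payloads):
            chunks[(r + 1) % n][c] = payload

    return [np.concatenate(ch).reshape(shape) for ch in chunks]

def dp_sgd_step(weights, grads, lr, N):
    """One data-parallel SGD step with averaged gradients.

    Assumptions: `grads` has one gradient per worker, `weights` is the
    shared parameter array, and the updated weights are returned.
    """
    if len(grads) != N:
        raise ValueError("expected one gradient per worker")
    summed = ring_all_reduce(grads)   # every worker ends with the same sum
    avg_grad = summed[0] / N          # any replica's copy can be used
    return np.asarray(weights, dtype=np.float64) - lr * avg_grad
```

Under these assumptions, calling `dp_sgd_step(np.array([1.0, 2.0]), [np.array([1.0, 1.0]), np.array([3.0, 3.0])], 0.5, 2)` averages the two gradients to `[2.0, 2.0]` and returns weights `[0.0, 1.0]`. Snapshotting the payloads before applying them keeps the sequential loop faithful to a ring where all workers send at the same time.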
