Infrastructure, Distributed Training & Scaling · Problem 4 of 4
Implement tensor-parallel linear layers (column-parallel then row-parallel) with correct forward/backward collectives in PyTorch.
Implement the function/class skeleton in the editor. Any correct approach is accepted.
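For orientation, the identity that a column-parallel layer followed by a row-parallel layer relies on can be written out as below. This is a sketch under assumptions the problem statement does not spell out: $p$ ranks, an elementwise nonlinearity $\sigma$ between the two layers, no biases, and the symbols $X$, $A$, $B$, $Z_r$ are illustrative names rather than part of the skeleton.

```latex
% Column-parallel first weight, row-parallel second weight (p ranks).
\[
A = \begin{bmatrix} A_1 & A_2 & \cdots & A_p \end{bmatrix},
\qquad
B = \begin{bmatrix} B_1 \\ B_2 \\ \vdots \\ B_p \end{bmatrix}
\]
% Because \sigma acts elementwise, it commutes with the column split:
\[
Y = \sigma(XA)\,B = \sum_{r=1}^{p} \sigma(X A_r)\, B_r
\]
% Forward: each rank computes Z_r = \sigma(X A_r) B_r locally; one all-reduce (sum) yields Y.
% Backward: the gradient w.r.t. the replicated input X is also a sum over ranks,
% so it needs one all-reduce as well:
\[
\frac{\partial L}{\partial X} = \sum_{r=1}^{p} \frac{\partial L}{\partial (X A_r)}\, A_r^{\top}
\]
```

Read this way, the copy operation is an identity in the forward pass and an all-reduce in the backward pass, while the reduce operation is an all-reduce in the forward pass and an identity in the backward pass.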
import torch
import torch.distributed as dist
from torch.autograd import Function


class _CopyToModelParallel(Function):
    @staticmethod
    def forward(ctx, x):
        raise NotImplementedError

    @staticmethod
    def backward(ctx, grad):
        raise NotImplementedError


class _ReduceFromModelParallel(Function):
    @staticmethod
    def forward(ctx, x):
        raise NotImplementedError

    @staticmethod
    def backward(ctx, grad):
        raise NotImplementedError


class ColumnParallelLinear(torch.nn.Module):
    def __init__(self, d_in, d_out, world, rank):
        raise NotImplementedError

    def forward(self, x):
        raise NotImplementedError


class RowParallelLinear(torch.nn.Module):
    def __init__(self, d_in, d_out, world, rank):
        raise NotImplementedError

    def forward(self, x_shard):
        raise NotImplementedError


class ParallelFFN(torch.nn.Module):
    def __init__(self, d_model, d_ff, world, rank):
        raise NotImplementedError

    def forward(self, x):
        raise NotImplementedError
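Before wiring up real collectives, it can help to check the sharding arithmetic in a single process. The sketch below is not a solution to the skeleton: it simulates the ranks with a Python list, uses a plain sum where a real implementation would call `dist.all_reduce`, and assumes a ReLU between the two layers, no biases, and the `nn.Linear` weight layout `(out_features, in_features)`. The names `w1`, `w2`, `partials`, and the sizes are hypothetical.

```python
import torch

torch.manual_seed(0)
world, d_model, d_ff = 4, 8, 32  # hypothetical sizes; d_ff must divide evenly by world

x = torch.randn(5, d_model, dtype=torch.float64, requires_grad=True)
w1 = torch.randn(d_ff, d_model, dtype=torch.float64)  # first layer, (out, in)
w2 = torch.randn(d_model, d_ff, dtype=torch.float64)  # second layer, (out, in)

# Dense reference: the unsharded feed-forward block.
ref = torch.relu(x @ w1.t()) @ w2.t()
(ref_grad,) = torch.autograd.grad(ref.sum(), x)

# Column-parallel: each simulated rank owns a slice of the first layer's output features.
w1_shards = w1.chunk(world, dim=0)
# Row-parallel: each simulated rank owns the matching slice of the second layer's input features.
w2_shards = w2.chunk(world, dim=1)

# Each rank sees the full input and produces a full-sized partial output.
partials = [torch.relu(x @ a.t()) @ b.t() for a, b in zip(w1_shards, w2_shards)]

# Summing the partials stands in for the forward all-reduce after the row-parallel layer.
out = torch.stack(partials).sum(dim=0)

# Autograd accumulates every rank's contribution to x.grad here; across real
# processes that accumulation is the backward all-reduce on the replicated input.
(grad,) = torch.autograd.grad(out.sum(), x)

outputs_match = torch.allclose(out, ref)
grads_match = torch.allclose(grad, ref_grad)
print(outputs_match, grads_match)
```

If both checks hold, the remaining work in the skeleton is mostly bookkeeping: slicing the weights by `rank`, and placing the two all-reduces in the backward of the copy operation and the forward of the reduce operation.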