Mixture-Of-Experts · Problem 4 of 4
Implement expert-parallel dispatch/combine with a simulated all-to-all and verify outputs match a dense reference for the same routing.
Implement the function/class skeleton in the editor. Any correct approach is accepted.
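The "simulated all-to-all" in the task can be pictured without any real process group. The sketch below is one possible reading, not the reference solution: it assumes each simulated rank holds a Python list of outgoing buffers, one per destination, and that the exchange is a transpose of that list of lists. The name `simulated_all_to_all` is hypothetical and does not come from the problem statement.

```python
import torch


def simulated_all_to_all(send):
    """Hypothetical helper: exchange buffers between simulated ranks.

    send[src][dst] is what rank `src` would send to rank `dst`.
    The result satisfies recv[dst][src] is send[src][dst], which is the
    data movement a real all-to-all performs, done here in one process.
    """
    world_size = len(send)
    return [[send[src][dst] for src in range(world_size)]
            for dst in range(world_size)]


if __name__ == "__main__":
    # Three simulated ranks; each buffer is tagged with (src, dst).
    send = [[torch.tensor([src, dst]) for dst in range(3)] for src in range(3)]
    recv = simulated_all_to_all(send)
    # Rank 2 now holds the buffer that rank 0 addressed to it.
    print(recv[2][0])
```

Applying the same transpose a second time returns every buffer to the rank that sent it, which is what the combine step relies on.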
import torch
import torch.nn as nn
import torch.nn.functional as F


class EP_MoE(nn.Module):
    def __init__(self, d, d_ff, E, k, world_size):
        raise NotImplementedError

    def route(self, x):
        raise NotImplementedError

    def dense_reference(self, x, gates, idx):
        raise NotImplementedError

    def forward(self, x):
        raise NotImplementedError


def verify():
    raise NotImplementedError
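For orientation, here is a minimal sketch of one approach that fits the skeleton. It is not the reference solution, and several choices are assumptions that the problem does not specify: experts are two-layer GELU MLPs, the router is a bias-free linear layer with softmax top-k gates renormalised to sum to 1, `E` is divisible by `world_size`, tokens are sharded contiguously across simulated ranks, and the all-to-all is simulated as a list transpose inside one process. The default arguments of `verify` are also illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class EP_MoE(nn.Module):
    def __init__(self, d, d_ff, E, k, world_size):
        super().__init__()
        assert E % world_size == 0, "assumption: experts split evenly over ranks"
        self.E, self.k, self.world_size = E, k, world_size
        self.per_rank = E // world_size  # experts owned by each simulated rank
        self.router = nn.Linear(d, E, bias=False)
        self.experts = nn.ModuleList(
            [nn.Sequential(nn.Linear(d, d_ff), nn.GELU(), nn.Linear(d_ff, d))
             for _ in range(E)]
        )

    def route(self, x):
        # x: (T, d). Top-k softmax gates, renormalised over the chosen experts.
        probs = F.softmax(self.router(x), dim=-1)
        gates, idx = probs.topk(self.k, dim=-1)
        return gates / gates.sum(dim=-1, keepdim=True), idx

    def dense_reference(self, x, gates, idx):
        # Run every expert on every token, then gather the routed outputs.
        all_out = torch.stack([e(x) for e in self.experts], dim=1)  # (T, E, d)
        picked = all_out.gather(1, idx.unsqueeze(-1).expand(-1, -1, x.size(-1)))
        return (gates.unsqueeze(-1) * picked).sum(dim=1)

    def forward(self, x):
        T, W = x.size(0), self.world_size
        gates, idx = self.route(x)
        # Flatten the (token, expert) assignments: T * k entries.
        tok = torch.arange(T, device=x.device).repeat_interleave(self.k)
        exp, gate = idx.reshape(-1), gates.reshape(-1)
        # Assumption: tokens are sharded contiguously over the simulated ranks.
        src_rank = torch.empty(T, dtype=torch.long, device=x.device)
        for r, shard in enumerate(torch.arange(T, device=x.device).tensor_split(W)):
            src_rank[shard] = r
        src_of, dst_of = src_rank[tok], exp // self.per_rank
        # Dispatch: slots[s][r] lists the assignments rank s sends to rank r.
        slots = [[((src_of == s) & (dst_of == r)).nonzero(as_tuple=True)[0]
                  for r in range(W)] for s in range(W)]
        send = [[(x[tok[sl]], exp[sl]) for sl in row] for row in slots]
        recv = [[send[s][r] for s in range(W)] for r in range(W)]  # all-to-all
        # Each rank applies only the experts it owns to what it received.
        back = []
        for r in range(W):
            row = []
            for xs, es in recv[r]:
                out = torch.zeros_like(xs)
                for e in range(r * self.per_rank, (r + 1) * self.per_rank):
                    m = es == e
                    if m.any():
                        out[m] = self.experts[e](xs[m])
                row.append(out)
            back.append(row)
        ret = [[back[r][s] for r in range(W)] for s in range(W)]  # all-to-all back
        # Combine: gate-weighted scatter-add into each token's output row.
        y = torch.zeros_like(x)
        for s in range(W):
            for r in range(W):
                sl = slots[s][r]
                y.index_add_(0, tok[sl], gate[sl].unsqueeze(-1) * ret[s][r])
        return y


def verify(T=32, d=16, d_ff=32, E=8, k=2, world_size=4, atol=1e-5):
    torch.manual_seed(0)
    model = EP_MoE(d, d_ff, E, k, world_size).eval()
    x = torch.randn(T, d)
    with torch.no_grad():
        gates, idx = model.route(x)  # same routing for both paths
        ref = model.dense_reference(x, gates, idx)
        out = model(x)
    return torch.allclose(out, ref, atol=atol)
```

The comparison uses a tolerance rather than exact equality because the expert-parallel path runs each expert on a subset of tokens, so floating-point results can differ slightly from the dense path. A real deployment would replace the two list transposes with a collective such as `torch.distributed.all_to_all_single` and would typically add capacity limits, neither of which this sketch attempts.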