Medium

Implement a full sparse MoE FFN with capacity, token dropping, and gate-weighted combination

Chapter 07: Mixture-Of-Experts

Medium · Problem 2 / 4

Implement a full sparse MoE FFN with capacity, token dropping, and gate-weighted combination.

Implement the function/class skeleton in the editor. Any correct approach is accepted.
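
One possible approach, sketched in NumPy under several assumptions the problem statement does not fix: the function name `moe_ffn`, its parameter names, a ReLU two-layer expert, softmax gating renormalised over the top-k experts, a capacity of `ceil(capacity_factor * T * top_k / E)`, and a priority rule in which first choices are served before second choices and earlier tokens before later ones. Assignments beyond an expert's capacity are dropped and contribute zero to the output.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def moe_ffn(x, w_gate, w1, w2, top_k=2, capacity_factor=1.25):
    """Sparse MoE FFN sketch (hypothetical signature).

    x:      (T, d) token activations
    w_gate: (d, E) router weights
    w1:     (E, d, h) first-layer weights, one matrix per expert
    w2:     (E, h, d) second-layer weights, one matrix per expert
    Returns (out, dropped): out is (T, d); dropped is a (T, top_k) bool mask.
    """
    T, _ = x.shape
    E = w_gate.shape[1]
    # Assumed capacity convention: an equal share of all T * top_k assignments,
    # scaled by capacity_factor.
    capacity = max(1, int(np.ceil(capacity_factor * T * top_k / E)))

    # Router: softmax over experts, then keep the top_k experts per token.
    probs = softmax(x @ w_gate)                                # (T, E)
    topk_idx = np.argsort(-probs, axis=-1)[:, :top_k]          # (T, top_k)
    topk_p = np.take_along_axis(probs, topk_idx, axis=-1)      # (T, top_k)
    gates = topk_p / topk_p.sum(axis=-1, keepdims=True)        # assumed renormalisation

    out = np.zeros_like(x)
    dropped = np.zeros((T, top_k), dtype=bool)

    for e in range(E):
        # np.nonzero scans row-major, so on the (top_k, T) view the assignments
        # come out slot-major, then in token order: that is the priority order.
        slot, tok = np.nonzero(topk_idx.T == e)
        keep_tok, keep_slot = tok[:capacity], slot[:capacity]
        # Everything past the capacity limit is dropped for this expert.
        dropped[tok[capacity:], slot[capacity:]] = True
        if keep_tok.size == 0:
            continue
        # Expert FFN on the kept tokens only.
        hidden = np.maximum(x[keep_tok] @ w1[e], 0.0)          # ReLU
        y = hidden @ w2[e]
        # Gate-weighted combination, scattered back to token positions.
        np.add.at(out, keep_tok, gates[keep_tok, keep_slot][:, None] * y)

    return out, dropped
```

Gate weights are not renormalised after dropping in this sketch, so a token that loses one of its experts receives a smaller total weight, and a token that loses all of them produces a zero vector (in a transformer block the residual connection would carry it through unchanged). Other conventions, such as a different capacity formula or random rather than positional priority, are equally valid answers to the problem.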

Hints