Hard

Mixture-Of-Experts · Problem 3 of 4

Chapter 07: Mixture-Of-Experts

Implement the load-balancing auxiliary (aux) loss and a training step that demonstrates it equalizes expert usage on synthetic data.

Implement the function/class skeleton in the editor. Any correct approach is accepted.
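
As one possible starting point, the sketch below assumes the Switch-Transformer-style formulation, loss = N * sum_i f_i * P_i, where f_i is the fraction of tokens whose top-1 expert is i and P_i is the mean router probability for expert i. The problem statement does not fix a formula, so this choice is an assumption. The names `router_probs`, `load_balancing_loss`, `train_step`, and `demo` are hypothetical, as are the linear router with a bias, the hand-derived gradient, and the synthetic setup (Gaussian tokens, a router bias that initially favors expert 0). It uses plain Python lists so it runs without dependencies; a real solution would more likely use a tensor library with autograd.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over one list of logits."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def router_probs(xs, W, b):
    """Per-token routing probabilities, softmax(x @ W + b). W is dim x n_experts."""
    n_experts = len(b)
    return [
        softmax([sum(xd * wd[j] for xd, wd in zip(x, W)) + b[j]
                 for j in range(n_experts)])
        for x in xs
    ]


def load_balancing_loss(probs):
    """Return (loss, f, P) with loss = N * sum_i f_i * P_i (assumed formulation).

    f_i: fraction of tokens whose top-1 expert is i (not differentiable).
    P_i: mean router probability assigned to expert i.
    """
    n_tokens, n_experts = len(probs), len(probs[0])
    f = [0.0] * n_experts
    for p in probs:
        f[p.index(max(p))] += 1.0 / n_tokens
    P = [sum(p[j] for p in probs) / n_tokens for j in range(n_experts)]
    loss = n_experts * sum(fi * Pi for fi, Pi in zip(f, P))
    return loss, f, P


def train_step(xs, W, b, lr):
    """One full-batch gradient step on the aux loss alone; updates W, b in place.

    Returns the loss and usage fractions measured before the update.
    """
    probs = router_probs(xs, W, b)  # computed once, before any update
    loss, f, _ = load_balancing_loss(probs)
    n_tokens, n_experts = len(xs), len(b)
    scale = n_experts / n_tokens
    for x, p in zip(xs, probs):
        s = sum(fi * pi for fi, pi in zip(f, p))
        for j in range(n_experts):
            # d loss / d logit[t][j] via the softmax Jacobian, with f held fixed.
            g = scale * p[j] * (f[j] - s)
            b[j] -= lr * g
            for d, xd in enumerate(x):
                W[d][j] -= lr * g * xd
    return loss, f


def demo(seed=0, n_tokens=256, dim=4, n_experts=4, steps=200, lr=1.0):
    """Train a deliberately skewed router and report usage before and after."""
    rng = random.Random(seed)
    xs = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_tokens)]
    W = [[rng.gauss(0.0, 1.0) for _ in range(n_experts)] for _ in range(dim)]
    b = [4.0] + [0.0] * (n_experts - 1)  # hypothetical skew toward expert 0
    history = [train_step(xs, W, b, lr) for _ in range(steps)]
    final_loss, final_f, _ = load_balancing_loss(router_probs(xs, W, b))
    return history[0], (final_loss, final_f)


if __name__ == "__main__":
    (loss0, f0), (loss1, f1) = demo()
    print("before:", round(loss0, 3), [round(v, 3) for v in f0])
    print("after: ", round(loss1, 3), [round(v, 3) for v in f1])
```

Under this formulation the loss equals 1 when routing probabilities are uniform and N when every token goes to a single expert with probability 1, which gives two quick sanity checks. Only P_i carries gradient, because f_i comes from an argmax; the update therefore lowers the logits of overloaded experts and raises those of underused ones. A full solution would normally add this term, scaled by a small coefficient, to the task loss rather than train on it alone.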
