Hard

Implement a single MLA layer (down-proj to latent, up-proj, decoupled RoPE)

Transformer Architecture Internals · Problem 6 of 7

Chapter 01: Transformer Architecture Internals

Implement a single Multi-head Latent Attention (MLA) layer with a down-projection to a compressed latent, an up-projection from that latent, and decoupled RoPE.

Implement the function/class skeleton in the editor. Any correct approach is accepted.
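
One possible shape for such a layer is sketched below in NumPy. It is a forward pass only, written under several assumptions that the problem statement does not fix: the input is a single unbatched sequence of shape (T, d_model), attention is causal, both the key/value path and the query path go through their own latent, and the RoPE key is shared across heads. The names MLALayer, rope, W_dkv, W_uk, W_uv, W_kr, W_dq, W_uq, W_qr and W_o are hypothetical, and normalization of the latents, biases, and incremental decoding are left out.

```python
import numpy as np

def rope(x, positions, base=10000.0):
    """Rotate consecutive feature pairs of x (..., T, d) by position-dependent angles."""
    d = x.shape[-1]                                   # must be even
    inv_freq = base ** (-np.arange(0, d, 2) / d)      # (d/2,)
    angles = positions[:, None] * inv_freq[None, :]   # (T, d/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[..., 0::2], x[..., 1::2]
    out = np.empty_like(x)
    out[..., 0::2] = x1 * cos - x2 * sin
    out[..., 1::2] = x1 * sin + x2 * cos
    return out

class MLALayer:
    """Illustrative MLA forward pass; all names and sizes here are assumptions."""

    def __init__(self, d_model, n_heads, d_latent_kv, d_latent_q,
                 d_nope, d_rope, d_v, seed=0):
        rng = np.random.default_rng(seed)
        init = lambda m, n: rng.standard_normal((m, n)) / np.sqrt(m)
        self.n_heads, self.d_nope, self.d_rope, self.d_v = n_heads, d_nope, d_rope, d_v
        self.W_dkv = init(d_model, d_latent_kv)           # down-proj to KV latent
        self.W_uk = init(d_latent_kv, n_heads * d_nope)   # up-proj: content keys
        self.W_uv = init(d_latent_kv, n_heads * d_v)      # up-proj: values
        self.W_kr = init(d_model, d_rope)                 # decoupled RoPE key (shared)
        self.W_dq = init(d_model, d_latent_q)             # down-proj to query latent
        self.W_uq = init(d_latent_q, n_heads * d_nope)    # up-proj: content queries
        self.W_qr = init(d_latent_q, n_heads * d_rope)    # decoupled RoPE queries
        self.W_o = init(n_heads * d_v, d_model)           # output projection

    def _heads(self, x, d):
        # (T, H*d) -> (H, T, d)
        return x.reshape(x.shape[0], self.n_heads, d).transpose(1, 0, 2)

    def forward(self, x):
        T = x.shape[0]
        pos = np.arange(T, dtype=np.float64)

        # Key/value path: compress once, then expand per head.
        c_kv = x @ self.W_dkv                                  # (T, d_latent_kv)
        k_nope = self._heads(c_kv @ self.W_uk, self.d_nope)    # (H, T, d_nope)
        v = self._heads(c_kv @ self.W_uv, self.d_v)            # (H, T, d_v)
        # Decoupled RoPE key: taken from x, not from the latent.
        k_rope = rope(x @ self.W_kr, pos)                      # (T, d_rope)

        # Query path: its own latent, then content and RoPE parts.
        c_q = x @ self.W_dq
        q_nope = self._heads(c_q @ self.W_uq, self.d_nope)     # (H, T, d_nope)
        q_rope = rope(self._heads(c_q @ self.W_qr, self.d_rope), pos)

        # Dot product of concatenated [nope; rope] vectors, written as a sum.
        scores = q_nope @ k_nope.transpose(0, 2, 1) + q_rope @ k_rope.T
        scores = scores / np.sqrt(self.d_nope + self.d_rope)
        future = np.triu(np.ones((T, T), dtype=bool), k=1)     # causal mask
        scores = np.where(future, -np.inf, scores)
        scores = scores - scores.max(axis=-1, keepdims=True)
        weights = np.exp(scores)
        weights = weights / weights.sum(axis=-1, keepdims=True)

        out = (weights @ v).transpose(1, 0, 2).reshape(T, -1)  # (T, H*d_v)
        # c_kv and k_rope are the only per-token tensors a decoder would cache.
        return out @ self.W_o, (c_kv, k_rope)

layer = MLALayer(d_model=32, n_heads=4, d_latent_kv=8, d_latent_q=12,
                 d_nope=8, d_rope=4, d_v=8)
x = np.random.default_rng(1).standard_normal((5, 32))
y, (c_kv, k_rope) = layer.forward(x)
print(y.shape, c_kv.shape, k_rope.shape)   # (5, 32) (5, 8) (5, 4)
```

In this sketch the positional part is kept out of the latent: RoPE is applied only to the small q_rope and k_rope vectors, so the content keys and values can be rebuilt from c_kv alone. That is why the returned cache holds d_latent_kv + d_rope numbers per token rather than full per-head keys and values.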
