Medium

Implement a KV cache + incremental single-token decode loop for a small transformer

Attention Efficiency & Long Context · Problem 2 of 5

Chapter 03: Attention Efficiency & Long Context

Implement a KV cache + incremental single-token decode loop for a small transformer.

Implement the function/class skeleton in the editor. Any correct approach is accepted.
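
As one possible starting point, here is a minimal sketch of the idea, not the expected solution. It assumes a toy model with a single attention head, a single layer, and no positional encoding, layer norm, or MLP, written in pure Python so it runs without dependencies. All names (`KVCache`, `TinyAttentionLM`, `step`, `forward_full`, `greedy_decode`) are hypothetical and not taken from the editor skeleton. The point it illustrates is that each decode step computes the key and value for the new token only, appends them to the cache, and attends over the cached history.

```python
import math
import random

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def softmax(scores):
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(q, keys, values):
    """Scaled dot-product attention of one query over the given keys/values."""
    scale = 1.0 / math.sqrt(len(q))
    weights = softmax([scale * sum(a * b for a, b in zip(q, k)) for k in keys])
    return [sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))]

class KVCache:
    """Holds one key and one value vector per position processed so far."""
    def __init__(self):
        self.keys, self.values = [], []

    def append(self, k, v):
        self.keys.append(k)
        self.values.append(v)

    def __len__(self):
        return len(self.keys)

class TinyAttentionLM:
    """Hypothetical toy model: embedding -> one attention head -> logits."""
    def __init__(self, vocab_size, dim, seed=0):
        rng = random.Random(seed)
        def rand(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                    for _ in range(rows)]
        self.embed = rand(vocab_size, dim)
        self.Wq, self.Wk, self.Wv = rand(dim, dim), rand(dim, dim), rand(dim, dim)
        self.Wout = rand(vocab_size, dim)

    def step(self, token, cache):
        """Process ONE new token, reusing cached keys/values for the past."""
        x = self.embed[token]
        cache.append(matvec(self.Wk, x), matvec(self.Wv, x))
        h = attend(matvec(self.Wq, x), cache.keys, cache.values)
        return matvec(self.Wout, h)

    def forward_full(self, tokens):
        """Reference path: recompute every key/value, return last-position logits."""
        xs = [self.embed[t] for t in tokens]
        keys = [matvec(self.Wk, x) for x in xs]
        values = [matvec(self.Wv, x) for x in xs]
        return matvec(self.Wout, attend(matvec(self.Wq, xs[-1]), keys, values))

def greedy_decode(model, prompt, max_new_tokens):
    """Greedy decoding; assumes a non-empty prompt."""
    cache = KVCache()
    logits = None
    for t in prompt:  # prefill: populate the cache from the prompt
        logits = model.step(t, cache)
    out = list(prompt)
    for _ in range(max_new_tokens):
        nxt = max(range(len(logits)), key=logits.__getitem__)  # argmax
        out.append(nxt)
        logits = model.step(nxt, cache)  # feed only the newly chosen token
    return out
```

A useful self-check for any solution is to compare the logits from the cached `step` path against a full recomputation over the same prefix; the two should agree at every position. The cache grows by one entry per token, so each step costs time linear in the current sequence length instead of redoing the key and value projections for the whole prefix.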