Easy

Implement scaled dot-product attention with a causal mask in numpy

Transformer Architecture Internals · Problem 1 of 7

Chapter 01: Transformer Architecture Internals

Implement scaled dot-product attention with a causal mask in numpy.

Complete the function or class skeleton provided in the editor. Any correct approach is accepted.
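For orientation, here is a minimal sketch of one possible approach. It is not the reference solution. It computes softmax(Q K^T / sqrt(d_k)) V, with entries above the diagonal set to negative infinity before the softmax so that position i attends only to positions j <= i. The function name causal_attention, its argument names, and the choice to return the attention weights alongside the output are assumptions made for illustration; the skeleton in the editor may use a different signature.

```python
import numpy as np


def causal_attention(q, k, v):
    """Scaled dot-product attention with a causal mask (illustrative sketch).

    Assumed shapes: q and k are (..., seq_len, d_k), v is (..., seq_len, d_v).
    Returns a tuple (output, weights).
    """
    d_k = q.shape[-1]

    # Similarity scores, scaled by sqrt(d_k) so their magnitude does not grow with d_k.
    scores = q @ np.swapaxes(k, -1, -2) / np.sqrt(d_k)

    # Causal mask: True marks "future" positions (j > i), which must be hidden.
    seq_q, seq_k = scores.shape[-2], scores.shape[-1]
    future = np.triu(np.ones((seq_q, seq_k), dtype=bool), k=1)
    scores = np.where(future, -np.inf, scores)

    # Numerically stable softmax over the key axis; masked entries become exactly 0.
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)

    return weights @ v, weights


# Usage: self-attention over 4 positions with 8-dimensional vectors.
rng = np.random.default_rng(0)
x = rng.standard_normal((4, 8))
out, w = causal_attention(x, x, x)
```

Masking with negative infinity before the softmax, rather than zeroing weights afterwards, keeps each row of the weights summing to 1. Because the first position can see only itself, the first row of the output equals the first row of v, which makes a convenient sanity check.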
