Chapter 13 · Inference & Serving

Super-hard · Problem 5 / 5

Implement a continuous-batching scheduler simulator with a paged KV cache that admits and evicts requests and reports throughput and P99 latency.

Implement the function/class skeleton in the editor. Any correct approach is accepted.

solution.py (Python)

import numpy as np
from collections import deque

class Request:

    def __init__(self, rid, arrival, prompt_len, gen_len):
        raise NotImplementedError

class PagedScheduler:

    def __init__(self, total_pages, page_size, max_batch):
        raise NotImplementedError

    def _pages_needed(self, seq_len):
        raise NotImplementedError

    def _try_admit(self):
        raise NotImplementedError

    def _evict(self, req, t):
        raise NotImplementedError

    def run(self, arrivals):
        raise NotImplementedError
