Agenda — Deployment
Session length: 3 hours | Difficulty: Intermediate | Prerequisites: Observability, FastAPI basics, async Python
What you will build today
A production-ready FastAPI service that wraps an LLM pipeline with streaming responses, async handling, semantic caching, and a serverless-compatible structure.
Schedule
| Time | Topic | File |
|---|---|---|
| 0:00–0:20 | FastAPI wrappers: request/response models, auth | 01-fastapi-wrappers |
| 0:20–0:55 | Streaming responses: SSE and chunked transfer | 02-streaming-responses |
| 0:55–1:20 | Async patterns: concurrent requests, background tasks | 03-async-patterns |
| 1:20–1:45 | Serverless deployment: AWS Lambda, Modal, Fly.io | 04-serverless |
| 1:45–2:10 | Caching strategies: exact match and semantic cache | 05-caching-strategies |
| 2:10–2:45 | Practice exercises | 06-practice-exercises |
| 2:45–3:00 | Interview questions review | 07-interview-questions |