- Experience building and operating ML infrastructure or model serving systems in production
- Proficiency in Golang or Python, with strong systems engineering fundamentals
- Hands-on experience with Kubernetes and container orchestration at scale
- Familiarity with ML serving frameworks such as Ray Serve, Triton, TorchServe, or similar
- Understanding of distributed systems, API design, and system reliability
- Strong collaboration and communication skills in a remote-first environment
- Experience with feature stores, feature pipelines, or online/offline feature serving (nice-to-have)
- Background in ad tech, real-time bidding, or programmatic advertising systems (nice-to-have)
- Familiarity with infrastructure-as-code tools such as Terraform (nice-to-have)
- Experience with observability tooling such as Prometheus, Grafana, or OpenTelemetry (nice-to-have)
- Background in real-time data pipelines, caching layers, or low-latency serving systems (nice-to-have)