Infrastructure, Senior, Full-time

Backend Engineer — LLM Infrastructure

Work on LLM routing, cost tracking, and high-throughput proxy infrastructure. Python, asyncio, Postgres, and a lot of provider-specific edge cases.

Location
Remote — US / EU
Employment
Full-time
Base salary
$150k – $210k /yr
Python, FastAPI, asyncio, PostgreSQL, Redis, LLM Routing

About the role

You will work on the always-in-path FastAPI proxy that sits between every customer request and every provider. That means guardrails, prompt management, A/B testing, cost attribution, rate limits, and streaming. You will own the parts of the request lifecycle that have to stay correct under real production load.

What you'll do

  • Extend the Nemo Backend FastAPI proxy with new guardrails and features
  • Own streaming, retries, and provider failover correctness
  • Build and maintain cost attribution from x-nemo-request-cost
  • Profile and tune hot paths — every millisecond is in the user-facing latency budget
  • Harden multi-tenancy isolation at the request layer

Required experience

  • 5+ years of backend Python in production
  • Deep experience with asyncio and high-concurrency services
  • Comfortable with Postgres, connection pooling, and query optimization
  • Production experience with streaming APIs or proxies

Nice to have

  • Prior work on LLM APIs or SSE streaming
  • Experience with LLM routing engines or model gateways
  • Performance profiling and flame graph literacy

Compensation & location

Base salary range
$150k – $210k /yr

Offer depends on experience and location. Equity offered on top of base.

Remote policy
Remote — US / EU

We hire through employer-of-record services in 11+ countries.

Ready to apply?

We reply to every application within 5 business days.