Design and ship large-scale AI systems end-to-end, from training and inference to evaluation and observability, while improving their performance and reliability.
What You’ll Do
Collaborate with engineers, researchers, and product teams to build AI-powered products that transform user experiences.
Design, develop, test, deploy, and maintain large-scale AI systems, including model training, inference, similarity search, evaluation, and observability.
Work with modern AI tech stacks (e.g., AWS, PyTorch, Hugging Face, vector databases, NeMo Guardrails).
Develop and optimize LLMs for scalability, latency, cost, and output quality.
Define the technical vision and roadmap for foundational AI infrastructure.
Qualifications
Bachelor’s or Master’s in Computer Science, AI, Electrical/Computer Engineering, or a related field.
6+ years (or 4+ with a Master’s) developing AI/ML algorithms or technologies.
Strong programming experience in Python, Go, Java, or Scala.
Proven experience deploying scalable AI systems on cloud platforms (e.g., AWS, GCP, Azure).
Deep understanding of AI system design, optimization, and performance engineering.
Ability to mentor teams and influence cross-functional stakeholders.
Strong communication skills and a passion for advancing responsible AI.