About the Role
As a Senior ML Software Engineer on the ML Infrastructure team, you will build and maintain the foundational ML infrastructure that powers our cutting-edge AI products. Your work will directly affect our ability to research, develop, and deploy state-of-the-art machine learning models efficiently and reliably. You will also develop the tools and platforms that enable our ML engineers and researchers to iterate quickly and deliver high-quality models to production.
What you’ll do
- Design, develop, and maintain robust and scalable ML infrastructure and platforms (e.g., model training pipelines, data versioning, experiment tracking, model deployment, monitoring systems).
- Collaborate with ML engineers and researchers to understand their needs and develop solutions that accelerate ML development and deployment cycles.
- Implement and improve CI/CD pipelines for ML models, ensuring smooth and reliable deployment processes.
- Develop tools and libraries to streamline the ML workflow, from data ingestion to model inference.
- Optimize ML systems for performance, reliability, and cost-efficiency.
- Participate in the on-call rotation to support production systems.
Who you are
- Bachelor’s or Master’s degree in Computer Science, Engineering, or a related field (or equivalent practical experience).
- 5+ years of experience in software engineering, with a strong focus on machine learning infrastructure or MLOps.
- Proficiency in Python and experience with relevant ML frameworks (e.g., TensorFlow, PyTorch).
- Solid understanding of cloud platforms (e.g., AWS, GCP, Azure) and containerization technologies (e.g., Docker, Kubernetes).
- Experience with data processing technologies (e.g., Spark, Flink, Kafka) is a plus.
- Familiarity with distributed systems and microservices architecture.
- Strong problem-solving skills and the ability to work both independently and as part of a team.
- Excellent communication and collaboration skills.