About the Role
Our mission is to bring community and belonging to everyone in the world. As of today, Reddit is the 3rd most visited site in the US and the 6th most visited in the world. We have over 500 million users who visit 100,000+ communities every month. Reddit is evolving at a breakneck pace, and our AI/ML infrastructure is growing to support this evolution.
The AI Platform team provides the foundation for ML at Reddit, empowering product teams to build innovative experiences. We manage the entire ML lifecycle, including offline training, online serving, feature engineering, and inference. As a Senior Research Engineer, Post-training & Evaluation, on the AI Platform team, you will play a crucial role in ensuring the quality and reliability of Reddit’s large language models (LLMs) and other deep learning models. You will design and implement advanced evaluation methodologies, develop robust evaluation infrastructure, and provide actionable insights to improve model performance across various product initiatives.
What you’ll do:
- Design and implement methodologies for evaluating large language models (LLMs) and other deep learning models for various Reddit products, focusing on post-training evaluation.
- Develop robust, scalable, and reproducible evaluation pipelines and infrastructure.
- Conduct in-depth analysis of model performance, identify failure modes, and provide actionable insights to improve model quality.
- Collaborate with research scientists and product teams to define evaluation metrics and criteria aligned with product goals.
- Stay up to date with the latest advancements in LLM evaluation, responsible AI, and machine learning research.
- Where applicable, publish research findings at top-tier conferences and in journals, and contribute to the open-source community.
- Mentor junior engineers and promote a culture of technical excellence.
What you’ll bring:
- PhD or Master’s degree in Computer Science, Machine Learning, Statistics, or a related field, or equivalent practical experience.
- 5+ years of experience in machine learning, with a focus on deep learning, natural language processing, or information retrieval.
- Strong background in designing and implementing evaluation methodologies for large-scale ML systems, especially LLMs.
- Proficiency in Python and experience with ML frameworks such as PyTorch or TensorFlow.
- Experience with cloud platforms (e.g., AWS, GCP, Azure) and distributed computing frameworks (e.g., Spark, Ray).
- Excellent communication, collaboration, and problem-solving skills.
- Demonstrated ability to lead projects, mentor junior engineers, and drive technical initiatives.