Work with the newest AI models at scale. Evaluation Engineers develop and run evaluation strategies that assess the performance, reliability, and safety of large language models (LLMs) and other AI systems. By designing comprehensive tests, analyzing results, and collaborating with research and engineering teams, they help ensure that models meet rigorous standards before deployment.