Senior LLM Evaluator · Former NLP Researcher
Top-decile profile: published methodology, runs evaluation pods, owns red-teaming output that has shipped to model trainers. Premium vendors will compete for this candidate.
Authored evaluator-onboarding docs adopted by 60+ contractors.
Designs rubrics and reasons about failure modes statistically.
Python, evaluation harnesses, light fine-tuning experience.
Published two papers and a methodology blog post.
Clear, structured prose at researcher level.
Deep NLP + safety domain expertise.
Consistent multi-year contracts with renewals.
9 years fully remote across 3 timezones.
Owns evaluation strategy end-to-end.
Trained and audited 9-person grader pods.