The Effective Altruism Opportunities Board

Work on the world's most pressing problems. Browse jobs, fellowships, internships, courses, and more at high-impact organisations.
                Research Scientist, Safety Post-Training
                Scale
                San Francisco, CA / New York, NY
                3 weeks ago
                AI safety & policy
                Full-time
                Salary
                $216,000 - $270,000 USD
                Routes to impact
                Direct high impact on an important cause
                Skill-building & building career capital
                Learning about important cause areas
                Description
                Conduct research on post-training and interpretability techniques to improve frontier AI safety and robustness.
                • Design RLHF and post-training safety evaluation pipelines
                • Study deceptive or unsafe model behaviors using interpretability tools
                • Translate findings into safety standards and evaluation benchmarks
                • Collaborate across policy, engineering, and research teams
                Mathematical Scientist, AI Safety Research
                LawZero
                Montreal, Canada
                3 weeks ago
                Researcher, AI Cognition Initiative (Technical Focus)
                Remote
                3 weeks ago
                Research Scientist/Engineer (Science of Scheming)
                Apollo Research
                London, United Kingdom
                3 months ago
                Research Scientist/Engineer (Evaluations)
                Apollo Research
                London, United Kingdom
                3 months ago
                Research Scientist, Manipulation Evaluations
                Apart Research
                Remote (Europe preferred)
                3 days ago
                Research Engineer, Scalable Interpretability
                Transluce
                San Francisco, CA
                4 days ago
                Associate Machine Learning Engineer, Secure AI Lab
                Carnegie Mellon University
                Pittsburgh, PA | Arlington, VA
                6 days ago
                Expression of Interest, Red Team
                London, UK
                2 weeks ago