The Effective Altruism Opportunities Board
Work on the world's most pressing problems. Browse jobs, fellowships, internships, courses, and more at high-impact organisations.
Research Engineer, Scalable Interpretability
Transluce
San Francisco, CA
Today
Routes to impact
Direct high impact on an important cause
Skill-building & building career capital
Description
Develop scalable interpretability systems to improve oversight of advanced AI models.
- Build evaluations for undesirable model behaviors
- Design architectures and training objectives for interpretability assistants
- Scale training and inference for frontier models
- Conduct research on model activations and behavior prediction
This text was generated by AI.
Related opportunities
Alignment Scientist / Engineer
AE Studio
Remote (US)
Yesterday
Software Engineer, Safeguards Evaluations
Anthropic
San Francisco, CA | New York City, NY
Yesterday
Associate Machine Learning Engineer, Secure AI Lab
Carnegie Mellon University
Pittsburgh, PA | Arlington, VA
2 days ago
Machine Learning Engineer
10a Labs
Remote (US)
1 week ago
Senior Software Engineer, AI Security
Carnegie Mellon University
Arlington, VA / Pittsburgh, PA
2 weeks ago
Software Engineer, AI Security
Carnegie Mellon University
Pittsburgh, PA / Arlington, VA
2 weeks ago
Researcher, AI Cognition Initiative (Technical Focus)
Rethink Priorities
Remote
2 weeks ago
Research Scientist, Safety Post-Training
Scale
San Francisco, CA / New York, NY
2 weeks ago