Research Scientist, Safety Post-Training

Scale

San Francisco, CA / New York, NY

Today

AI safety & policy

Full-time

Routes to impact

Direct high impact on an important cause

Skill-building & building career capital

Learning about important cause areas

Description

Conduct research on post-training and interpretability techniques to improve frontier AI safety and robustness.

Design RLHF and post-training safety evaluation pipelines
Study deceptive or unsafe model behaviors using interpretability tools
Translate findings into safety standards and evaluation benchmarks
Collaborate across policy, engineering, and research teams

This text was generated by AI. If you notice any inconsistencies, please let us know using this form.

Related opportunities

Research Scientist/Engineer (Science of Scheming)

Apollo Research

London, United Kingdom

2 months ago

Research Scientist/Engineer (Evaluations)

Apollo Research

London, United Kingdom

2 months ago

Team Member, Search and AI Evaluations

National Institute of Standards and Technology (NIST)

Gaithersburg, MD

Yesterday

Director, Evaluations

LawZero

Montreal, Canada

Yesterday

Team Member, Model Policy

OpenAI

San Francisco, CA

2 days ago

Member of Technical Staff

CivAI

Berkeley, USA

2 weeks ago

Data Scientist

Center for Security and Emerging Technology (CSET)

Washington, USA

2 weeks ago

Researcher, Misalignment Research

OpenAI

San Francisco, USA

3 weeks ago
