Jan Leike, a prominent AI researcher who recently left OpenAI while openly voicing concerns about the company's approach to AI safety, has joined rival Anthropic to lead a newly formed "superalignment" team.
In a post on X, Leike said his team at Anthropic would focus on several facets of AI safety and security, including "scalable oversight," "weak-to-strong generalisation," and automated alignment research.
According to an insider, Leike will report directly to Jared Kaplan, Anthropic's chief science officer. Researchers at the company who currently work on methods to steer the behaviour of large-scale AI systems in predictable and desirable ways will report to Leike as his team gets up and running.
Leike's new team has goals similar in some respects to those of OpenAI's recently disbanded Superalignment team. Co-led by Leike, that team aimed to solve the core technical challenges of controlling superintelligent AI within four years, but it frequently ran up against constraints imposed by OpenAI's leadership.
Anthropic has consistently positioned itself as a company that places a higher priority on safety than OpenAI does.
Dario Amodei, the CEO of Anthropic, previously served as Vice President of Research at OpenAI and is said to have parted ways with the company over a disagreement about its trajectory, namely its growing emphasis on commercial endeavours. To establish Anthropic, Amodei recruited a number of former OpenAI employees, including Jack Clark, who had been OpenAI's policy head.