Why I work on self-improving AI despite the risks

Many people ask me why I work on AI-generating algorithms (aka self-improving AI) given the risks. The question is especially relevant given that I recently co-founded a company focused on self-improving AI called Recursive Superintelligence.

I want to share my reasons.

I do not take this question lightly. I've thought deeply about it for over a decade and have literally lost sleep many nights thinking about it. I want to be clear that I take the risks of superintelligence extremely seriously. I've been a longtime advocate of the need to prioritize AI safety and alignment, and I have been clear that there are tremendous risks associated with this extremely powerful technology. For example, I've co-authored a paper in Science [1] calling for humanity to prioritize AI safety and be aware of the risks. I've also co-authored articles highlighting that there are unique risks to harnessing open-ended algorithms, because they are creative, can surprise us, and produce endless discovery [2, 3, 4]. I have provided invited expert input to senior members and/or committees of the U.S., European, and Canadian governments on AI risks. Additionally, I have given advice to non-profits focused on AI safety and alignment. I have deep empathy for those who have concerns about AI safety, especially with respect to self-improving AI. You should know that I am one of you.

So why do it? I personally have come to the conclusion that there are many reasons I should work on this technology.

First, I think there are tremendous potential benefits, and that they outweigh the risks.

Every day there is tremendous suffering on Earth. A substantial amount is due to the diseases that plague us. People are saying goodbye to their children who are dying of leukemia, or watching their parents slide into the darkness of Alzheimer's. Many people on Earth lack basic resources, such as healthy food and clean water, and others want for education. We are destroying our planet and causing mass extinction because we don't know how to give humans what they want without annihilating delicate ecosystems and causing climate change.

In my view, artificial superintelligence could alleviate, if not eliminate, these tragedies. It could give us the ability to cure all diseases. It should create tremendous abundance. It should provide the resources we need without further damaging our wonderfully beautiful planet and its complex, delicate, essential ecosystems. It will deliver an excellent, virtually free, personalized education for everyone. And with it, we can explore the stars. Beyond that, it will help us dream up and create entirely new types of amazing things that we can't even conceive of yet. Of course, there will be challenges, downsides, and massive disruption along the way, and we need to completely rethink resource allocation to help the least well-off benefit from AI. But, ultimately, I think the positives will outweigh the negatives, and that the technology is likely to be massively net beneficial.

Second, I believe it's inevitable that humanity will make self-improving AI. Therefore, I think the best thing I can do is to work hard to make it as safe, aligned, and net beneficial for humanity as possible. There are real risks, so we must prioritize working carefully to minimize them. We want the best and brightest minds, especially those who care about AI safety, working on this technology. I respect those who don't want to work on it, but I feel compelled to try to make it go as well as possible.

Third, there's an opportunity to harness the power of open-ended algorithms and recursive self-improvement to make the development of AI safer. In fact, others at Recursive and I have worked throughout our careers to do just this (e.g., with Rainbow Teaming [5], Automated Capability Discovery [6], and Foundation Model Self-Play [7]). These methods automatically find surprising capabilities and failure modes of frontier models, and thus can help us understand those weaknesses and automatically patch them. As we develop ever more powerful AI, we will continue to develop such technology.

At Recursive, we take these issues and this responsibility seriously. We will prioritize safety and alignment as we proceed and, as the technology becomes ever more powerful, scale those efforts as needed. We are deeply committed to safety and will do our utmost to steward the creation of this technology in a way that benefits humanity. Because I believe it's going to be invented one way or another, and because the benefits and opportunities are tremendous and are likely to outweigh the risks, I personally believe I should embrace the opportunity and the challenges to try to make the development of self-improving AI go as well as possible.

References

[1] Yoshua Bengio et al., “Managing extreme AI risks amid rapid progress,” Science, vol. 384, no. 6698, pp. 842–845, 2024.

[2] Jeff Clune, “AI-GAs: AI-generating algorithms, an alternate paradigm for producing general artificial intelligence,” arXiv, 2019.

[3] Adrien Ecoffet, Jeff Clune, and Joel Lehman, “Open Questions in Creating Safe Open-ended AI: Tensions Between Control and Creativity,” Proceedings of the Artificial Life Conference, vol. 32, pp. 27–35, 2020.

[4] Joel Lehman, Jeff Clune, Dusan Misevic, et al., “The Surprising Creativity of Digital Evolution: A Collection of Anecdotes from the Evolutionary Computation and Artificial Life Research Communities,” Artificial Life, vol. 26, no. 2, pp. 274–306, 2020.

[5] Mikayel Samvelyan et al., “Rainbow Teaming: Open-Ended Generation of Diverse Adversarial Prompts,” arXiv, 2024.

[6] Cong Lu, Shengran Hu, and Jeff Clune, “Automated Capability Discovery via Model Self-Exploration,” ICLR 2025 FM-Wild Workshop, 2025.

[7] Aaron Dharna, Cong Lu, and Jeff Clune, “Foundation Model Self-Play: Open-Ended Strategy Innovation via Foundation Models,” Reinforcement Learning Journal, vol. 6, pp. 276–342, 2025.