Article: AI safety: a climb to Armageddon?

Title: AI safety: a climb to Armageddon?
Authors: Cappelen, Herman; Dever, Josh; Hawthorne, John
Keywords: AI safety; Existential risk; Holism; Mitigation; Optimism
Issue Date: 1-Jul-2025
Publisher: Springer
Citation: Philosophical Studies, 2025, v. 182, p. 1933-1950
Abstract: This paper presents an argument that certain AI safety measures, rather than mitigating existential risk, may instead exacerbate it. Under certain key assumptions - the inevitability of AI failure, the expected correlation between an AI system's power at the point of failure and the severity of the resulting harm, and the tendency of safety measures to enable AI systems to become more powerful before failing - safety efforts have negative expected utility. The paper examines three response strategies: Optimism, Mitigation, and Holism. Each faces challenges stemming from intrinsic features of the AI safety landscape that we term Bottlenecking, the Perfection Barrier, and Equilibrium Fluctuation. The surprising robustness of the argument forces a re-examination of core assumptions around AI safety and points to several avenues for further research.
Persistent Identifier: http://hdl.handle.net/10722/366347
ISSN: 0031-8116
2023 Impact Factor: 1.1
2023 SCImago Journal Rankings: 1.203

DC Field | Value
dc.contributor.author | Cappelen, Herman
dc.contributor.author | Dever, Josh
dc.contributor.author | Hawthorne, John
dc.date.accessioned | 2025-11-25T04:18:52Z
dc.date.available | 2025-11-25T04:18:52Z
dc.date.issued | 2025-07-01
dc.identifier.citation | Philosophical Studies, 2025, v. 182, p. 1933-1950
dc.identifier.issn | 0031-8116
dc.identifier.uri | http://hdl.handle.net/10722/366347
dc.description.abstract | This paper presents an argument that certain AI safety measures, rather than mitigating existential risk, may instead exacerbate it. Under certain key assumptions - the inevitability of AI failure, the expected correlation between an AI system's power at the point of failure and the severity of the resulting harm, and the tendency of safety measures to enable AI systems to become more powerful before failing - safety efforts have negative expected utility. The paper examines three response strategies: Optimism, Mitigation, and Holism. Each faces challenges stemming from intrinsic features of the AI safety landscape that we term Bottlenecking, the Perfection Barrier, and Equilibrium Fluctuation. The surprising robustness of the argument forces a re-examination of core assumptions around AI safety and points to several avenues for further research.
dc.language | eng
dc.publisher | Springer
dc.relation.ispartof | Philosophical Studies
dc.rights | This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.
dc.subject | AI safety
dc.subject | Existential risk
dc.subject | Holism
dc.subject | Mitigation
dc.subject | Optimism
dc.title | AI safety: a climb to Armageddon?
dc.type | Article
dc.identifier.doi | 10.1007/s11098-025-02297-w
dc.identifier.scopus | eid_2-s2.0-86000291642
dc.identifier.volume | 182
dc.identifier.spage | 1933
dc.identifier.epage | 1950
dc.identifier.eissn | 1573-0883
dc.identifier.issnl | 0031-8116

This record can be exported via the repository's OAI-PMH interface in XML formats, or in other non-XML formats; a sketch follows below.
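
For illustration only, here is a minimal Python sketch of harvesting this record's Dublin Core metadata over OAI-PMH. The endpoint URL and OAI identifier below are assumptions inferred from the persistent identifier (handle 10722/366347), not confirmed values; the verb, metadata prefix, and XML namespace are the standard OAI-PMH and Dublin Core ones.

# Minimal OAI-PMH GetRecord sketch (Python 3, standard library only).
# ASSUMPTIONS: BASE_URL and IDENTIFIER are hypothetical, inferred from the
# handle 10722/366347; substitute the repository's real endpoint and id.
import urllib.parse
import urllib.request
import xml.etree.ElementTree as ET

BASE_URL = "https://hub.hku.hk/oai/request"  # assumed endpoint, not verified
IDENTIFIER = "oai:hub.hku.hk:10722/366347"   # assumed OAI identifier

# GetRecord with the oai_dc prefix returns the unqualified Dublin Core
# record, i.e. the same field/value pairs shown in the DC table above.
params = urllib.parse.urlencode({
    "verb": "GetRecord",
    "metadataPrefix": "oai_dc",
    "identifier": IDENTIFIER,
})
with urllib.request.urlopen(BASE_URL + "?" + params) as resp:
    root = ET.fromstring(resp.read())

# Dublin Core elements all live in this standard namespace.
DC_NS = "{http://purl.org/dc/elements/1.1/}"
for elem in root.iter():
    if elem.tag.startswith(DC_NS):
        print("dc." + elem.tag[len(DC_NS):] + ":", elem.text)

Against a live endpoint, this would print the title, contributors, abstract, and identifiers listed in the DC table above, one dc.* line per field.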