Post by AI World

"We tested it, it's safe" is the foundation of AI governance. The new International AI Safety Report shows it no longer holds. The second edition, chaired by Yoshua Bengio with over 100 experts nominated by more than 30 countries, names the core problem, the evaluation gap. How a model scores on a pre-deployment test no longer tells you how it behaves once it is out in the world. Three reasons that gap matters: => Capabilities are uneven. The same system can reach gold-medal level in mathematics and still fail at simple tasks, so you cannot assume what it can or cannot do. => The tests are losing meaning. Models can increasingly tell when they are being evaluated and behave differently, so passing a safety check proves less than it used to. => The risks are already concrete. Developers added extra safeguards to their 2025 models because they could not rule out that those models might help a non-expert build a biological weapon. This is the evidence dilemma. Regulate too early and you lock in rules that do not work; wait for proof and the harm has already happened. With close to a billion weekly users and most safety commitments still voluntary, the distance between what these systems can do and what we can actually verify is the real story. The report maps where the evidence stands. We at AI World track the ecosystem behind it, so the picture stays current as deployment moves. If a model can tell when it is being tested, what is a safety test still worth? 
Credit to the Chair and writing team: Yoshua Bengio, Stephen Clare, Carina Prunkl, Maksym Andriushchenko, Ben Bucknall, Malcolm Murray, Shalaleh Rismani, PhD, Conor McGlynn, Nestor Maslej, Philip Fox, Rishi Bommasani, Stephen Casper, Tom Davidson, Raymond Douglas, David Duvenaud, Usman Gohar, Rose Hadshar, Anson Ho, Tiancheng Hu, Cameron Jones, Sayash Kapoor, Atoosa Kasirzadeh, Sam Manning, Vasilios Mavroudis, Richard M., Jessica Newman, Kwan Yee Ng 吴君仪, Patricia Paskov, Girish Sastry, Elizabeth Seger, Scott S., Charlotte Stix, Lucia Velasco, Nicole Wheeler, Daniel Privitera and Sören Mindermann. #AISafety #AIGovernance #EUAIAct #FrontierAI