AI Safety vs. Cybersecurity: Is America Making Its Own Defenders Weaker?
++ EU Parliament approves AI Act simplification measures and a “nudifier” app ban; Israel approves a national AI plan & more
This week’s highlights:
On June 12, the US government ordered Anthropic to suspend foreign-national access to its powerful Fable 5 and Mythos 5 AI models over national-security concerns. Because Anthropic could not filter users by nationality in real time, the company had to abruptly disable the models for all users worldwide.
Dozens of cybersecurity experts are now urging the US to lift restrictions on these models, arguing that powerful AI cyber capabilities should help defenders find and fix vulnerabilities before criminals or rival states exploit them. Their argument is not that the models carry no risk: they accept that AI can make hacking easier. But they say Anthropic had built safeguards into Fable, similar cyber capabilities already exist in other leading and open models, and abruptly removing these tools may hurt security teams more than attackers. They are therefore calling for clear, science-based, and transparent rules, not an opaque shutdown based on a claimed jailbreak risk that the government has not publicly detailed.
Key aspects highlighted:
AI is having significant impacts on cybersecurity, including by greatly reducing the difficulty of finding flaws in software and writing exploits for those flaws.
Anthropic’s Mythos-class models are quite good at finding flaws and weaponizing exploits.
However, they are not uniquely good at these tasks, and many of the undersigned individuals use other foundation and open-source models for security audits and red-teaming every day.
Anthropic has built multiple protections into the Fable model to prevent its use in offensive cyber operations. These protections were so aggressive that they became a source of humor in the cyber community on launch day.
It is essential to provide AI to coders and security teams so they can find and fix flaws in both their newly written code and decades of legacy code faster than our adversaries.
The Chinese open-weight models are only months behind the best American models, and those are the models we know about. It seems likely that the PRC government has access to private capabilities beyond what has been published.
To pull the best capabilities away from defenders without a good reason when our adversaries are rapidly advancing is dangerous.
At the School of Responsible AI (SoRAI), we help both individuals and organizations build practical, real-world AI literacy and Responsible AI capability through structured, engaging, and action-oriented programs. For individuals, this includes AI Literacy, globally relevant certification training such as AIGP, RAI, and AAIA, as well as career transition and advisory support for professionals moving into AI governance roles. For organizations, we offer customized enterprise AI literacy training, Responsible AI strategy and governance setup, and AI assurance support to help teams understand, operationalize, and validate AI responsibly. At the core of SoRAI is a progressive three-layer approach: first helping people understand AI, then build the right governance foundations, and finally validate readiness through assurance and audit-focused thinking. Want to learn more? Explore our AI Literacy programs, certification trainings, and career support offerings, or write to us for customized enterprise solutions.
⚖️ AI Ethics
Israel Approves National AI Plan to Boost Technological Self-Reliance and Global Competitiveness
Israel’s government has approved a national AI plan aimed at strengthening technological self-reliance and positioning the country as a global leader in artificial intelligence. The strategy includes expanding sovereign computing infrastructure with a target of 100,000 processing units, setting up a national quantum computer, and creating a National Institute for Artificial Intelligence to link government, academia, industry, and investors. The plan also focuses on AI education and workforce training, support for innovation programs, and deeper international research partnerships. Other priorities include using AI in public services, advancing cyber and physical AI, and addressing risks such as deepfakes and job disruption from automation.
European Parliament Approves AI Act Simplification, Delays Key Deadlines, and Bans AI Nudifier Apps
The European Parliament has approved changes to the EU AI Act under the digital omnibus package, backing delays to some compliance deadlines and adding a ban on so-called “nudifier” apps. Under the amended rules, obligations for stand-alone high-risk AI systems will start on 2 December 2027, while AI safety components covered by sectoral laws will follow on 2 August 2028; watermarking requirements for AI-generated content are postponed to 2 December 2026. The law also bans AI tools used to create child sexual abuse material or non-consensual sexual deepfakes of identifiable people, with companies required to comply by 2 December 2026. Other changes aim to cut overlapping rules for AI in machinery, clarify which AI functions count as safety components, allow limited personal data use to detect bias, extend some exemptions to small mid-cap firms, and centralise some enforcement through the EU AI Office.
Europe 2031 Initiative Warns AI Inaction Could Leave Europe Economically and Politically Dependent
A group of European AI researchers, think-tank figures and investors has published “Europe 2031,” a warning scenario arguing that Europe could lose economic and political relevance in AI if it fails to act quickly. The paper blends real recent developments, including DeepSeek R1, the Paris AI Action Summit and debate around GPT-5, with a fictional but plausible path in which Europe becomes dependent on the US or China after years of underestimating AI’s speed and impact. It says the problem is not bad intent but slow institutions and incentives that delay hard decisions. The initiative calls for far bigger investment in compute, energy and data centres, closer coordination with other “AI middle powers,” labour-market reform, stronger protection of industrial assets and data, and a broader positive political vision for the AI era.
LifeSciBench Sets New Standard for Evaluating AI on Real-World Life Science Research Tasks
LifeSciBench is a new benchmark aimed at testing whether AI models can handle real-world life sciences work, not just answer textbook biology questions. It includes 750 expert-written tasks across seven research workflows and seven biological domains, with grading based on 19,020 detailed rubric criteria and supported by 1,062 research artifacts such as figures, PDFs, and sequence files. The dataset was built by 173 PhD-level scientists with biotech and pharma experience and independently reviewed by 453 experts, with more than 96% agreement that the tasks reflect realistic scientific work. Results show newer frontier models improved over earlier ones, especially in scientific communication and translation tasks, but still struggled on artifact-heavy, design-focused, and exact-output problems, highlighting that AI remains far from ready to reliably support complex research without human oversight.
Deployment Simulation Helps Predict Model Behavior and Risks Before Release Through Realistic Traffic Replays
OpenAI said it has started using a “Deployment Simulation” method to predict how a new AI model may behave before release by replaying past, privacy-preserving user conversations with a candidate model. The company said the approach gave better estimates of real-world harmful behavior rates than traditional stress tests across several GPT-5-series Thinking deployments, with a median prediction error of about 1.5x and limits for very rare risks below roughly 1 in 200,000 messages. It also said the system surfaced a new misalignment issue called “calculator hacking” before release and reduced the chance that models could tell they were being tested, a growing concern in AI safety work. OpenAI added that the method can extend beyond normal chat to tool-using coding agents, but said it is meant to complement—not replace—red-teaming and targeted evaluations, especially for rare, high-severity failures.
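To make the "about 1.5x" figure concrete, here is a minimal sketch of how a multiplicative (fold) prediction error could be computed. The summary above does not say how OpenAI defines its error metric, so the fold-error definition, the function names, and the sample rates below are all illustrative assumptions, not OpenAI's method or data.

```python
from statistics import median


def fold_error(predicted: float, observed: float) -> float:
    """Multiplicative error between two rates.

    1.0 means a perfect prediction; 2.0 means the prediction was off by a
    factor of two in either direction (too high or too low).
    """
    if predicted <= 0 or observed <= 0:
        raise ValueError("rates must be positive to form a ratio")
    ratio = predicted / observed
    return max(ratio, 1 / ratio)


def median_fold_error(pairs: list[tuple[float, float]]) -> float:
    """Median fold error across (predicted, observed) rate pairs."""
    return median(fold_error(p, o) for p, o in pairs)


# Hypothetical rates, in flagged messages per 100,000, for three behaviors:
# (rate predicted by the simulation, rate seen after release).
pairs = [(30, 20), (1, 2), (5, 5)]

print(median_fold_error(pairs))  # prints 1.5
```

Under this assumed definition, a median of 1.5x means that for half of the measured behaviors the simulated rate landed within a factor of 1.5 of the rate seen in deployment.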
DOJ Backs xAI Turbines, Citing National Security in Memphis Data Center Pollution Lawsuit
The U.S. Department of Justice has backed xAI in a lawsuit over dozens of unpermitted natural gas turbines powering its Memphis data centers, arguing that shutting them down would harm U.S. national, economic, and energy security, according to Wired. In a court filing, the DOJ said Grok is among four AI models supporting mission-critical military operations, while the NAACP’s lawsuit alleges the trailer-mounted turbines violate federal air pollution rules despite xAI’s claim that they are temporarily exempt. The suit, filed by the Southern Environmental Law Center on the NAACP’s behalf, says the number of turbines has grown to 57 and worsened pollution in an area already facing poor air quality. The groups cite rising levels of PM2.5, formaldehyde, and nitrogen oxides near the sites, while court filings and records indicate the company plans to expand turbine purchases further over the next three years.
Survey Finds 60% of US Consumers Are Turned Off by AI in Brand Messaging
A new WordPress VIP report suggests that while brands are trying to boost their visibility in AI search results, many U.S. consumers remain wary of AI-driven content. In a survey of 2,000 people conducted in April, 60% said brands using “AI” in their messaging are a turnoff, while 86% said they do not fully trust AI and still want to check original sources. The report also found that 42% trust AI answers without clear attribution less than things like airline fees, privacy policies, and medical bills, and nearly three in four said the internet feels less human than it did a decade ago. At the same time, 60% of enterprise respondents said traffic from AI search and answer platforms has risen over the past year, showing that brands are now under pressure to balance AI discoverability with transparency and human trust.
Pew Study Finds Only 16% of Americans Expect AI to Benefit Society Long Term
A new Pew Research study finds that Americans remain largely skeptical about AI’s long-term impact, with only 16% saying it will benefit society over the next 20 years and about 40% expecting harm. The survey also shows deep distrust in both government and industry, as 67% doubt the U.S. government will effectively regulate AI and 59% do not trust companies to build it safely, while nearly two-thirds say the technology is advancing too fast. Even so, AI use is rising: about a quarter of Americans say they use chatbots daily, and 44% report using ChatGPT, ahead of Gemini, Copilot, and other tools. Younger adults are among the most negative about AI’s future, while older Americans are the least likely to use chatbots, with nearly 75% of those 65 and older saying they never use them.
FERC Orders Grid Operators to Fast-Track AI Data Center Connections Amid Power Capacity Strains
The Federal Energy Regulatory Commission has ordered six major U.S. grid operators to speed up power connections for data centers and other large electricity users, with operators required to report spare generation capacity within 30 days and justify or update regional electricity rates within 60 days. The agency said data centers must pay their own interconnection costs and also told grid operators to consider alternative transmission technologies and be more flexible with behind-the-meter power. The move aims to ease long delays that have slowed both data centers and new power plants, even as data center electricity demand is projected to nearly triple by 2035 and wholesale power prices have surged in some regions. But the order does not solve the deeper shortage of generating capacity, which remains a major constraint on the grid.
Match Survey Finds Nearly Half of US Singles Hold Negative Views on AI Dating
Match Group says nearly half of U.S. singles view AI negatively in romantic settings, based on a survey of 1,000 people aged 18 to 39. While dating apps such as Tinder, Bumble, and Hinge are adding more AI tools, 47% of respondents said they dislike AI’s role in dating, and about 40% said they would not date someone who uses an AI companion app. At the same time, the survey found that many singles are open to limited AI support, with 64% saying it could help with tasks like improving profiles, picking photos, or suggesting messages. The findings suggest users are more comfortable with AI assisting the dating process than replacing real human connection.
At G7 Summit, PM Modi Warns AI Misuse Could Fuel Deepfakes, Misinformation, Child Exploitation
At the G7 outreach session in France, Prime Minister Narendra Modi warned that artificial intelligence, if left unchecked, could expose children to misinformation, deepfakes and exploitation. He said AI also has strong potential to support education through local-language learning, creativity and personalised teaching, but stressed that its impact will depend on values, design and governance. Modi called for stronger global cooperation on deepfakes, cyber fraud and misinformation, along with common standards, testing frameworks and safe-by-design AI systems. He also said democratic countries should have access to secure AI models and urged that AI’s benefits should reach the Global South in an inclusive way.
🚀 AI Breakthroughs
Anthropic Becomes First AI Startup to Join Frontier’s $915 Million Carbon Removal Funding Round
Anthropic has become the first AI startup to join Frontier, a carbon removal coalition founded by companies such as Stripe, Google, and Shopify, as part of a new $915 million funding round that lifts Frontier’s total pledges to $1.8 billion. Since launching in 2022, Frontier has contracted nearly $700 million across more than 50 projects aimed at removing 1.8 million tons of carbon, using credits that companies can count against emissions they cannot yet avoid. Anthropic’s entry is notable because AI companies have faced growing scrutiny over rising energy use, and this marks the company’s first climate-related deal despite not yet publishing a sustainability report. Frontier also said it will now back fewer, larger projects with stronger long-term potential, focusing on carbon removal approaches that could eventually scale to at least 1 billion metric tons of CO2 annually with government support.
Microsoft Makes Copilot Cowork Generally Available Worldwide With Usage-Based Pricing and New Cost Controls
Microsoft has made Copilot Cowork generally available worldwide for Microsoft 365 Copilot customers after a three-month Frontier preview, saying more than half of Fortune 500 companies have already used it alongside firms such as Accenture and Zurich Insurance. The product is designed to handle complex, long-running tasks across multiple tools, with Microsoft positioning it as more secure, enterprise-ready, and cheaper to run through cloud hosting, model choice, plugins, and Microsoft 365 security controls. Pricing combines a Microsoft 365 Copilot subscription with usage-based billing in Copilot Credits, with pay-as-you-go set at $0.01 per credit and new budget controls, usage caps, alerts, and reporting added for administrators. At launch, Cowork runs on Anthropic’s Opus 4.8 and Sonnet 4.6 models, with GPT-5.5 available in Frontier and Microsoft’s lower-cost Cowork 1 model due in the coming weeks, while Microsoft also claims internal testing found Copilot Cowork was on average 30% to 40% cheaper than Claude Cowork with a Microsoft 365 connector.
Google Releases Android 17 With Multitasking Upgrades and Expanded Gemini Features Across Pixel Devices
Google has released the final version of Android 17 and Wear OS 7, with the update arriving first on Pixel devices alongside a Pixel Drop centered on new Gemini AI features. The release adds tools such as music generation with Lyria 3, video editing through Gemini Omni, and improved speech translation on the Pixel 10a, while also expanding device features like personalized caller messages, emergency detection on Pixel Watch, and broader support for message-taking tools. Android 17 also brings new multitasking and creator-focused features, including a bubble-based recent apps bar, dual recording with the selfie camera and screen, and a gaming layout for foldables. Google said the update also strengthens security and parental controls, while Wear OS gains battery improvements, live phone app updates on watches, and more Gemini-powered personalization features later this summer.
Meta Rolls Out Facebook AI Mode Using Public Posts to Power Search and Engagement
Meta is rolling out new AI features on Facebook, led by “AI Mode,” which lets users ask questions in plain language and get summarized answers pulled from public posts, Groups, and Reels. The move follows Meta’s recent launch of Forum, a Reddit-like app with a similar AI Q&A tool, but it also raises concerns about accuracy because the answers are based on user discussions rather than verified sources. Facebook is also adding AI-powered editing tools for videos, Stories, and profile pictures, including virtual outfit and style changes. The updates build on earlier AI additions such as animated profile photos, automated Marketplace replies, and a creator assistant, showing Meta’s push to make Facebook more engaging while expanding future AI-linked subscription options.
NASA Trains AI on Billions of Earth Observations to Speed Climate Research and Analysis
NASA is training artificial intelligence on billions of Earth observation records, including years of satellite data on forests, oceans, glaciers, cities and farmland, to speed up climate and Earth science research. The agency said AI could help scientists detect patterns, spot trends and connect information across massive datasets much faster than traditional methods allow. Rather than relying on separate tools for each problem, NASA is working on broader AI models that can learn how Earth systems behave over time. The effort is aimed at helping researchers spend less time sorting data and more time interpreting results as climate change makes faster scientific analysis increasingly urgent.
OpenAI Launches Partner Network With $150 Million Investment to Accelerate Enterprise AI Adoption
OpenAI has launched a global Partner Network aimed at helping enterprises build, sell, and deploy AI solutions, arguing that the main challenge is no longer model capability but execution, integration, workflow redesign, and adoption at scale. The company said it will invest $150 million in the ecosystem, start with a select group of consulting, systems integration, technology, and data partners, and train 300,000 certified consultants by the end of 2026. The program includes three partner tiers—Select, Advanced, and Elite—and plans for specializations in areas such as Codex, cybersecurity, and AI agents. OpenAI also cited early enterprise collaborations with partners including Accenture, Bain, BCG, and Artium, with customers such as Agilent, eBay, Paychex, and T-Mobile.
ChatGPT Health Upgrades Improve Medical Guidance, Urgent Care Detection, and Response Accuracy for Millions
OpenAI said health and wellness has become one of ChatGPT’s biggest use cases, with more than 230 million people each week using it for tasks such as understanding lab results, preparing for doctor visits, and navigating insurance. The company said GPT-5.5 Instant delivers major gains in health-related responses, including better recognition of urgent situations, clearer explanations of uncertainty, and easier-to-understand guidance, with performance now approaching its frontier reasoning models. OpenAI said these gains were measured through physician-led benchmarks and reviews of 3,500 responses, where GPT-5.5 Instant was rated higher than both older models and physician-written answers across several criteria. It also said privacy-preserving monitoring of billions of health-related chats showed the share of replies with at least one flagged factuality issue fell 71% over the past two months, while a network of more than 260 physicians continues to help evaluate and improve the system.
ChatGPT Enterprise Adds Credit Usage Analytics and Updated Spend Controls for Enterprise Administrators
OpenAI has added new credit usage analytics and updated spend controls for ChatGPT Enterprise, aimed at helping companies track adoption, usage, and costs more closely as AI use expands at work. In the Global Admin Console, enterprise admins can now view ChatGPT and Codex credit consumption in one place, with breakdowns by user, product, and model, and can also access the same data through the Cost API for deeper internal analysis. The company has also expanded spend controls so admins can set workspace-wide default limits, assign limits to specific groups, and create individual overrides for heavier users. Employees can now monitor their own credit usage, request more credits when needed, and share context for those requests, giving organizations tighter cost control without broadly restricting access.
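The three layers of limits described above (workspace default, group limits, individual overrides) can be pictured as a simple lookup. This is a hypothetical sketch, not OpenAI's API: the `SpendPolicy` class, its field names, and the precedence order (individual override first, then the most generous matching group limit, then the workspace default) are all assumptions made for illustration, since the announcement summary does not state how conflicts are resolved.

```python
from dataclasses import dataclass, field


@dataclass
class SpendPolicy:
    """Hypothetical model of layered credit limits (not a real API)."""

    workspace_default: int  # credits per user per billing period
    group_limits: dict[str, int] = field(default_factory=dict)
    user_overrides: dict[str, int] = field(default_factory=dict)

    def effective_limit(self, user: str, groups: list[str]) -> int:
        # Assumption: an individual override always wins.
        if user in self.user_overrides:
            return self.user_overrides[user]
        # Assumption: with several matching groups, the highest limit applies.
        matching = [self.group_limits[g] for g in groups if g in self.group_limits]
        if matching:
            return max(matching)
        # Otherwise fall back to the workspace-wide default.
        return self.workspace_default


policy = SpendPolicy(
    workspace_default=1_000,
    group_limits={"engineering": 5_000, "research": 8_000},
    user_overrides={"dana": 20_000},
)

print(policy.effective_limit("dana", ["engineering"]))             # prints 20000
print(policy.effective_limit("lee", ["engineering", "research"]))  # prints 8000
print(policy.effective_limit("sam", []))                           # prints 1000
```

An organization that wanted the strictest group limit to apply instead would swap `max` for `min`; which rule the real product uses is not stated in the source.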
🎓 AI Academia
Study Details Barriers Marginalized Grassroots Groups Face in Shaping AI Policy and Governance
A new paper set to appear at ACM FAccT 2026 examines why grassroots groups, especially those representing marginalized communities, face steep barriers in shaping AI policy despite growing public debates over privacy, labor, intellectual property, energy use, and other risks tied to AI. Using a case study from Queer in AI, the paper describes efforts to apply participatory design methods to U.S. AI policymaking and to develop policy positions centered on queer communities. It says limited access to networks, lobbying power, and institutional influence makes meaningful participation difficult, even when public input is seen as essential to accountable AI governance. The authors close with practical recommendations aimed at helping policymakers and community organizers make AI policy processes more inclusive and workable for marginalized groups.
Fujitsu Research Study Defines AI Sandbox Threat Model, Taxonomy, and Measurement Framework for Assurance
A new paper argues that “AI sandboxes” need clearer definitions as they become central to testing everything from digital models to robots, AIoT, and cyber-physical systems. It describes sandboxes as controlled environments for evaluation, verification, and validation, and says their value depends on whether the boundary of the test makes the evidence meaningful for real-world deployment. The study lays out a taxonomy of sandbox types, a cyber-physical threat model that also considers attacks on the testing setup itself, and a framework to measure factors such as fidelity, controllability, observability, containment, reproducibility, and governance. The goal is to show what different sandboxes can truly test, which risks they can contain, and what evidence they can provide for safety, security, and regulatory assurance.
Study Proposes Framework to Detect and Measure AI Risks to Democratic Institutions
A new academic paper argues that AI is becoming a growing risk to democracy not because it creates entirely new problems, but because it intensifies old ones in areas such as elections, information systems, and public administration. The study says these risks can be better understood through principal–agent theory, where democratic institutions hand over important tasks to AI systems and private providers without being able to fully monitor them. It combines that approach with the NIST AI Risk Management Framework to suggest measurable ways to assess accountability, transparency, and trustworthiness across different democratic domains. The paper’s central conclusion is that democratic control over AI depends on whether institutions can meaningfully evaluate these systems, while warning that decisions about how much harm is acceptable are still often left, quietly, to private vendors.
Study Proposes Commons-Governed AI Taxonomy for Collective Oversight of Data, Compute, Models, and Energy
A new academic paper argues that AI governance should not be seen only as a choice between private companies controlling resources or governments regulating them from above. It says a third model is already emerging in practice, where communities collectively manage key parts of the AI stack such as data, compute, models, evaluation systems, and even energy use. Using Elinor Ostrom’s commons theory, the paper lays out a two-part taxonomy to classify these arrangements by both the shared resource and the governance role being performed. It identifies 10 recurring institutional archetypes, highlights issues such as openwashing, free-riding, and limited access to compute, and treats AI’s energy demands as a core governance challenge rather than a side effect.
Study Examines Open Source AI Contributor Policies Amid Rising Governance and Compliance Gaps
A new paper argues that open-source software rules are struggling to keep up with AI coding agents that can plan changes, edit files, run tests, and submit pull requests with limited human oversight. It says recent incidents involving agent-driven mistakes, spam-like AI contributions, and platform shutdowns show that current contributor policies were built for legally accountable humans, not autonomous systems. The study compares policies at six major open-source organizations and proposes a six-part framework to judge how they handle disclosure, responsibility, oversight, licensing, enforcement, and maintainer workload. It also finds that these policies do not yet fully line up with emerging governance standards such as the EU AI Act, NIST AI Risk Management Framework, and ISO AI management guidelines, leaving gaps that could affect how AI contributions are reviewed and controlled.
Study Warns Failed AI Systems Leave Lasting Risks Beyond Decommissioning and Model Withdrawal
A new paper argues that the risks from AI do not end when a system is shut down, warning that failed or withdrawn tools can leave behind “AI debris” that continues to affect institutions. It says this residue can show up as biased workflows, contaminated data, deskilling, weaker accountability, and loss of trust, even after the model is no longer in use. The paper lays out a practical decommissioning protocol to help regulators, auditors, and organizations document past decisions, review harm, enable challenges, and assign responsibility after withdrawal. Using Amazon’s scrapped hiring tool as an example, it shows how AI-driven screening habits and categories can survive a rollback and keep shaping real-world decisions.
About SoRAI: SoRAI is committed to advancing AI literacy through practical, accessible, and high-quality education. Our programs emphasize responsible AI use, equipping learners with the skills to anticipate and mitigate risks effectively. Our flagship AIGP certification courses, built on real-world experience, drive AI governance education with innovative, human-centric approaches, laying the foundation for quantifying AI governance literacy. Subscribe to our free newsletter to stay ahead of the AI Governance curve.




