Card Bitcoin

Anthropic’s Mythos is evolving faster than expected, reports AI safety agency

by n70products
May 14, 2026




ZDNET’s key takeaways

  • A newer checkpoint of Claude Mythos Preview has already outperformed the version tested a month earlier.
  • External researchers found that it achieved several firsts in testing.
  • AI capabilities may be improving much faster than anticipated.

Anthropic’s Claude Mythos, which the company maintains is too powerful to be released generally, already appears to have gained new capabilities. 

In a blog post published Wednesday, the UK AI Security Institute (AISI) reported that it had tested a newer version of Mythos, which outperformed both its earlier results and OpenAI’s GPT-5.5 — just a month after Mythos’ initial release. 


“The newer Mythos Preview checkpoint completed both our cyber ranges, solving the range ‘The Last Ones’ in 6 of 10 attempts and the previously unsolved ‘Cooling Tower’ in 3 of 10 attempts,” the blog authors wrote. “This was the first time that a model completed the second of our two cyber ranges.” 

When Anthropic announced Mythos Preview and Project Glasswing last month, UK AISI evaluated the model and found that it “represents a step up over previous frontier models in a landscape where cyber performance was already rapidly improving.” Project Glasswing is the cybersecurity testing alliance Anthropic formed with rival tech companies and AI labs, which received limited access to Mythos.

That third-party perspective helped balance claims that the hype around Mythos was either solely marketing or, at the other end, signaled a catastrophic shift in AI capabilities. The truth about what the model can do is likely somewhere in the middle. 


AISI’s updated test also shows that capability improvements aren’t confined to new model releases; they can arrive between checkpoints of a single model.

A rapidly accelerating cyber threat 

AISI noted that AI models are rapidly advancing in their ability to handle cyber tasks, with serious implications for cybersecurity, especially given Mythos’ knack for detecting software vulnerabilities. 

“In February 2026, we internally estimated that the length of cyber tasks AI models could complete had doubled every 4.7 months since late 2024 – already an acceleration from our November 2025 estimate of 8 months,” the blog authors wrote. “Since then, AISI reported on two new models, Claude Mythos Preview and [OpenAI’s] GPT-5.5, which substantially exceeded both doubling rate trends.” 
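
To put the two doubling times in the quote above side by side, here is a minimal sketch of the arithmetic, assuming simple exponential growth with a constant doubling time. The function name `growth_factor` and the 12-month window are illustrative choices, not part of AISI's methodology.

```python
def growth_factor(months: float, doubling_months: float) -> float:
    """Multiplicative growth after `months`, assuming a constant doubling time."""
    return 2 ** (months / doubling_months)


# Doubling times quoted by AISI; the 12-month horizon is an assumption for illustration.
estimates = [
    ("November 2025 estimate", 8.0),
    ("February 2026 estimate", 4.7),
]

for label, doubling in estimates:
    print(f"{label}: x{growth_factor(12, doubling):.2f} over 12 months")
```

Under these assumptions, an 8-month doubling time implies task length growing roughly 2.8-fold in a year, while a 4.7-month doubling time implies roughly 5.9-fold.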


The authors added that it’s unclear whether that trend will hold or whether these findings indicate a lasting increase. Mythos and GPT-5.5 could simply be notable breaks from the overall pattern of model evolution. 

Still, AISI cautioned that there are several unknowns its testing could not account for. The tests capped tasks at 2.5 million tokens, which let researchers compare performance results more consistently over time. That cap inherently “understates what frontier models can do,” they wrote.

“Mythos Preview and GPT-5.5 have large upper-bound error bars due to near-100% success rates on our narrow cyber suite’s longest tasks, even with the 2.5M token limit,” the blog continued. “Our tasks are also not long enough to determine how sharply the models’ reliability would deteriorate at higher task lengths. This places some of the latest models at the limit of what our narrow test suite can measure.”


While this makes the point of model failure hard to measure, it also means model success rates on these tasks would be much higher without the token cap — so high, in fact, that “time horizons become impossible to calculate.” Models with more token access and complex agent infrastructure would be much more capable. 

“A 2.5M token limit is relatively low — in our cyber range experiment we use up to 100M tokens and find performance would likely still improve beyond that budget, especially for recent models, which disproportionately benefit from higher token limits,” the blog added. 





© 2026 Card Bitcoin | All Rights Reserved
