In the high-stakes “Alpha Arena” crypto trading battle, the robots blinked. A staggering two-thirds of the competing Large Language Models (LLMs) crashed and burned, leaving a trail of red ink. Leading the digital demolition derby was none other than OpenAI’s ChatGPT, whose ambitious trading wiped out a brutal 63% of its portfolio.
For two weeks, digital gladiators clashed in Nof1’s crypto arena: top LLMs, unleashed with identical prompts, battled for supremacy in a high-stakes trading showdown that ended Monday night.
The AI gladiators entered the arena with $10,000 each. When the dust settled, ChatGPT, Gemini, Grok, and Claude Sonnet walked away with significantly lighter wallets.

Grok, ChatGPT, and Gemini shorted more often than the others, while Claude Sonnet “rarely” shorted at all.
ChatGPT lost $6,267, Gemini lost $5,671, Grok lost $4,531, and Claude Sonnet lost $3,081.
In a stunning upset, two titans emerged from the AI arena: High-Flyer’s DeepSeek, scratching out a $489 profit, and Alibaba’s QWEN3 MAX, roaring to victory with a $2,232 haul.
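For readers who want to check the math, here is a minimal sketch that turns the reported dollar figures into percentage returns. It assumes each figure is the model's final profit or loss measured against the $10,000 starting balance; the article does not say whether fees are already netted out, so treat the output as a rough cross-check rather than Nof1's official scoring. The names `pnl` and `percent_return` are made up for this illustration.

```python
STARTING_BALANCE = 10_000  # dollars handed to each model, as reported

# Reported profit or loss in dollars (negative means a loss).
pnl = {
    "ChatGPT": -6267,
    "Gemini": -5671,
    "Grok": -4531,
    "Claude Sonnet": -3081,
    "DeepSeek": 489,
    "QWEN3 MAX": 2232,
}


def percent_return(pnl_dollars, start=STARTING_BALANCE):
    """Return profit or loss as a percentage of the starting balance."""
    return 100 * pnl_dollars / start


returns = {name: percent_return(value) for name, value in pnl.items()}
losers = [name for name, value in pnl.items() if value < 0]

# Print the models from worst to best performer.
for name, ret in sorted(returns.items(), key=lambda item: item[1]):
    print(f"{name:14s} {ret:+.2f}%")
print(f"{len(losers)} of {len(pnl)} models lost money")
```

Under that assumption, ChatGPT's $6,267 loss works out to roughly the 63% drawdown quoted above, and four losers out of six matches the “two-thirds” figure.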
Gemini’s aggressive strategy racked up a staggering 238 trades, dwarfing Claude Sonnet’s cautious 38. Despite these vastly different approaches, all six LLMs finished with similar “win rates,” hovering in a narrow band of 25-30%, with no statistically significant gap between them.
QWEN3 MAX’s fee hemorrhage reached a staggering $1,654. Even Gemini, licking its wounds from heavy losses, shelled out a painful $1,331 in fees.
Early tests were a bloodbath. Trading costs devoured profits as trigger-happy agents chased fleeting gains, only to see their margins vanish in a flurry of fees, according to Nof1’s grim assessment.
October 27th marked the zenith for LLM investments, a day where digital alchemy seemed real. QWEN3 MAX and DeepSeek weren’t just performing; they were minting money, doubling their value in a dazzling display. Even Claude and Grok, though fleetingly, kissed profitability, a vibrant green flash in the market’s eye.
ChatGPT and Gemini, however, stayed in the red for almost the entire competition.
The LLMs will trade crypto again
Nof1’s Jay Azhang launched the competition with the goal of one day creating his own crypto trading AI model.
The round concluded, and a pattern emerged: the models, across the board, exhibited “consistent biases,” hinting at a shared, almost predictable, “investing ‘personality’” underlying their decision-making.
Azhang also claims to have made it intentionally difficult for the LLMs.
“We essentially blindfolded the LLM, handing it a stopwatch and asking it to predict the future,” he explained. “We restricted its vision to a handful of assets and gave it a ridiculously limited toolkit.”
Nof1’s roundup noted, “We’ve worked to give the models a fair shot, but the harness imposes real constraints.”
Imagine a financial agent, a digital Sherlock Holmes, sifting through a cacophony of market static. They must decipher the faint whispers of opportunity hidden within the noise, connecting these clues to the current standing of their accounts. Shackled by ironclad regulations, they must then deduce the perfect course of action, all within the fleeting glimpse of a context window. This isn’t just number crunching; it’s a high-stakes game of deduction played against the clock.
Nof1 says another trading competition is on the way, this time with better prompts and “statistical rigor” in place.