Grok 4.20 beats all other AI models in Alpha Arena test | Qwik News

9 points by terryds 10 hours ago | 2 comments

gavinray 9 hours ago |
> Grok 4.20 reportedly uses real-time data like market trends and news to make fast decisions.
I assumed all of the models were doing that, using at least Web Search tools.
My hunch is that Grok's other model placed in the top 3 because of its access to Tweets, which are a sentiment-analysis gold mine for ticker symbols.
ben_w 8 hours ago |
> I assumed all of the models were doing that, using at least Web Search tools.
Sometimes. The other week I was asking ChatGPT about the UK PM, and had to stop the generation early because it started ~"Prime Minister Rishi Sunak…"
The unreliability is also why techniques as simple as "ask 5 times and have it take a vote of its own answers" boost performance. The same goes for "thinking" modes, which are approximately just replacing the end token with "Wait." and continuing for ten rounds.
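For anyone who wants to see how little is involved, here is a rough Python sketch of both tricks. It is only an illustration: `ask` and `step` are hypothetical stand-ins for whatever call returns model text, not any vendor's API, and real "thinking" modes are certainly more involved than appending a word.

```python
from collections import Counter
from typing import Callable


def majority_vote(ask: Callable[[str], str], prompt: str, n: int = 5) -> str:
    """Ask the same question n times and return the most common answer."""
    # Light normalisation so "42" and " 42" count as the same vote.
    answers = [ask(prompt).strip().lower() for _ in range(n)]
    # most_common(1) returns [(answer, count)]; ties go to the answer seen first.
    return Counter(answers).most_common(1)[0][0]


def keep_thinking(step: Callable[[str], str], prompt: str, rounds: int = 10) -> str:
    """Each time the model would stop, append "Wait." and let it continue."""
    text = prompt
    for _ in range(rounds):
        text += step(text)  # hypothetical: model generates until it wants to stop
        text += " Wait."    # instead of ending, nudge it to reconsider
    return text + step(text)  # the final continuation is allowed to end
```

Calling `majority_vote` with a model that answers "42" three times out of five returns "42" even though two of the individual samples were wrong, which is the whole point: the vote smooths over one-off bad generations.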
