Grok 4.20 in 2026
Few AI models in recent memory have arrived with as much momentum — or as much controversy — as Grok 4.20. Built by xAI, Elon Musk’s artificial intelligence company, Grok 4.20 was officially launched as a beta on February 17, 2026, and rapidly iterated into Grok 4.20 0309 v2 by April 7, 2026, cementing its place as the flagship reasoning model in the Grok 4 series. Under the hood, it introduces a multi-agent architecture in which four specialized agents collaborate in real time to debate, fact-check, and refine answers — a structural shift that fundamentally separates it from the single-pass inference models that dominated earlier generations. Trained on xAI’s Colossus supercluster, which now houses 555,000 NVIDIA GPUs at an estimated cost of $18 billion, Grok 4.20 operates with a 2 million token context window and achieves output speeds that sit meaningfully above the industry average for reasoning models at its price tier.
What makes Grok 4.20 globally significant in 2026 is not just its raw benchmark performance — it is the scale and speed at which it has captured real-world usage. Grok.com recorded 314 million monthly visits in January 2026, making it the third most-visited generative AI platform on the planet, behind only ChatGPT and Google Gemini. The broader xAI platform crossed 64 million monthly active users, and the company completed a landmark $20 billion Series E round in January 2026 at a $230 billion valuation — followed by an even more dramatic corporate transformation when SpaceX acquired xAI in February 2026, valuing the combined entity at $1.25 trillion. Against that backdrop, understanding Grok 4.20’s actual benchmark scores, pricing, traffic profile, and competitive position gives any marketer, developer, or enterprise decision-maker a sharper picture of exactly where this model sits in the global AI landscape today.
Interesting Facts About Grok 4.20 in 2026
GROK 4.20 AT A GLANCE — KEY FACTS SNAPSHOT (2026)
══════════════════════════════════════════════════════════════
Release Date (v2) April 7, 2026
Developer xAI (Elon Musk)
Context Window ██████████████████████████████ 2,000,000 tokens
Intelligence Index Score ████████████████████████████ 49 / 100 (Artificial Analysis)
Output Speed ████████████████████████████████ 106.1 tokens/second
Input Pricing $2.00 / 1M tokens
Output Pricing $6.00 / 1M tokens
Global Monthly Visits ████████████████████████████████████████ 314M (Jan 2026)
Monthly Active Users ██████████████████████████████████ 64 Million
xAI Valuation (Jan 2026) ████████████████████████████████████████ $230 Billion
══════════════════════════════════════════════════════════════
| Fact | Detail |
|---|---|
| Model Name & Version | Grok 4.20 (latest: 0309 v2, released April 7, 2026) |
| Developer | xAI (founded 2023 by Elon Musk) |
| Model Architecture | Multi-agent: 4 specialized agents collaborating in real time for fact-checking and reasoning |
| Context Window | 2,000,000 tokens — largest among mainstream frontier models |
| Artificial Analysis Intelligence Index | 49 / 100 — well above peer median of 35 |
| Output Speed (API) | 106.1 tokens/second — above the reasoning model average of 63.3 t/s |
| Time to First Token (TTFT) | 22.31 seconds — higher-end latency typical of deep-reasoning models |
| Input Pricing | $2.00 per 1M tokens |
| Output Pricing (API) | $6.00 per 1M output tokens (via xAI API) |
| Benchmark Suite Cost | $514.16 to run the full Artificial Analysis Intelligence Index v4.0 |
| Hallucination Rate (Grok 4.1) | Reduced ~3x — from ~12% to ~4% in production traffic |
| Grok 4 AIME 2025 Score | 91.7% — mathematics olympiad benchmark |
| Grok 4 MMLU Score | 92.1% — general knowledge across 57 subjects |
| Grok 4 GPQA Score | 87.5% — graduate-level science reasoning |
| Grok 4 HLE Score | 40.0% (Heavy variant: 50.7%) — Humanity’s Last Exam |
| Grok 4.1 FActScore | 97.0% — factual accuracy benchmark |
| LMArena Elo (Grok 4.1 Thinking) | 1,483 Elo — ranked #1 globally, 31 points ahead of nearest non-xAI model |
| GDPval-AA Elo (Grok 4.20 v2) | 1,179 (Grok 4.3 subsequently improved this to 1,500) |
| Global Rank (Cloudflare Radar 2025) | 9th among all generative AI services globally |
| Colossus GPU Cluster | 555,000 NVIDIA GPUs, estimated cost $18 billion |
| xAI Series E Funding (Jan 2026) | $20 billion at a $230 billion valuation |
| SpaceX–xAI Merger (Feb 2026) | Combined valuation: $1.25 trillion |
Source: Artificial Analysis (April 2026), xAI official announcements, Business of Apps (2026), SQ Magazine (May 2026), Reuters (Feb 2026), FatJoe (April 2026)
The facts table above tells a story of a model that is pushing two different frontiers simultaneously: raw reasoning capability and sheer global scale. The 2 million token context window, the largest of any mainstream frontier model, gives Grok 4.20 a structural advantage in tasks like analyzing lengthy financial documents, auditing codebases, or processing scientific literature end-to-end without chunking. Combined with the multi-agent architecture in which four agents actively debate and fact-check outputs, the 4% production hallucination rate reported for Grok 4.1 (down from roughly 12% in earlier iterations) makes a compelling case for enterprise reliability. The 106.1 tokens-per-second output speed sits about 67% above the reasoning model average of 63.3 tokens per second, so users are not trading throughput for depth, a tradeoff that historically plagued chain-of-thought-heavy models. The cost shows up in latency instead: the 22.31-second time to first token means each response starts slowly even though it streams quickly once generation begins.
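To make the throughput and latency figures concrete, here is a minimal Python sketch that reproduces the speed comparison and estimates how long a streamed response might take. The function names and the 1,000-token response length are illustrative assumptions. Only the 106.1 t/s, 63.3 t/s, and 22.31 s figures come from the facts table, and the simple model used here (time to first token plus tokens divided by throughput) ignores network overhead and variable reasoning time.

```python
# Figures from the facts table; everything else is an illustrative assumption.
GROK_420_TOKENS_PER_SEC = 106.1
REASONING_AVG_TOKENS_PER_SEC = 63.3
GROK_420_TTFT_SEC = 22.31


def percent_above(value: float, baseline: float) -> float:
    """Return how far `value` sits above `baseline`, as a percentage."""
    return (value - baseline) / baseline * 100.0


def estimated_response_seconds(output_tokens: int,
                               tokens_per_sec: float,
                               ttft_sec: float) -> float:
    """Rough wall-clock estimate: wait for the first token, then stream the rest."""
    return ttft_sec + output_tokens / tokens_per_sec


speed_gain = percent_above(GROK_420_TOKENS_PER_SEC, REASONING_AVG_TOKENS_PER_SEC)
total = estimated_response_seconds(1_000, GROK_420_TOKENS_PER_SEC, GROK_420_TTFT_SEC)

print(f"Speed vs. reasoning average: +{speed_gain:.1f}%")
print(f"Estimated time for a 1,000-token answer: {total:.1f} s")
```

Under these assumptions the time to first token dominates short answers, so the throughput advantage matters most for long outputs. The computed gain is about 67.6%, which the article quotes as 67%.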
The corporate and infrastructure facts are just as striking as the technical ones. The $20 billion Series E in January 2026 — secured at a $230 billion valuation — ranks among the largest single AI funding rounds in history, and the SpaceX acquisition of xAI at a $1.25 trillion combined valuation just weeks later fundamentally rewrote the ownership and strategic context around Grok. With 555,000 NVIDIA GPUs deployed in the Colossus cluster and xAI spending approximately $1 billion per month on infrastructure and training, the resource advantage behind Grok 4.20 is a structural moat that few companies on earth can match. The Grok 4 Heavy variant’s 50.7% score on Humanity’s Last Exam — the benchmark explicitly designed to resist AI saturation — cemented xAI’s claim to frontier-level reasoning in a way that is difficult to dispute.
Grok 4.20 Global Benchmark Performance in 2026
GROK 4.20 BENCHMARK SCORES — GLOBAL COMPARISON
═══════════════════════════════════════════════════════════════
Benchmark Grok 4.20 Peer Median Top Score (2026)
───────────────────────────────────────────────────────────────
Intelligence Index 49 35 53 (Grok 4.3)
MMLU (General) 92.1% ~88-90% ~93-94%
GPQA (Science) 87.5% ~70-80% 94.6% (Claude Mythos)
AIME 2025 (Math) 91.7% ~60-70% ~95%+
HLE (Hard Reasoning) 40.0% ~15-25% 50.7% (Grok 4 Heavy)
LiveCodeBench 79.0% ~50-60% ~80-85%
FActScore (Grok 4.1) 97.0% ~85-90% 97.0% (Grok 4.1)
LMArena Elo (4.1 T.) 1,483 ~1,300-1,400 1,483 (Grok 4.1 T.)
═══════════════════════════════════════════════════════════════
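T. = Thinking variant. Per the facts table, the MMLU, GPQA, AIME, and HLE figures are Grok 4 results, and the FActScore and LMArena figures are Grok 4.1 results.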
| Benchmark | Score (Grok 4.20 unless noted) | What It Measures | Context vs. Global Peers |
|---|---|---|---|
| Artificial Analysis Intelligence Index | 49 / 100 | Composite: reasoning, knowledge, math, coding | 40% above peer median of 35 |
| MMLU (General Knowledge) | 92.1% (Grok 4) | 57 academic subjects, general knowledge | Top-tier; benchmark near saturation at frontier |
| GPQA Diamond (Science) | 87.5% (Grok 4) | Graduate-level biology, physics, chemistry | Above average; top score 94.6% (Claude Mythos) |
| AIME 2025 (Mathematics) | 91.7% (Grok 4) | Olympiad-level math problems | Exceptional; human top competitors ~90-95% |
| HMMT25 | 90.0% | Harvard-MIT Math Tournament (2025) | Among highest globally |
| LiveCodeBench (Coding) | 79.0% | Real-world competitive programming | Above the ~50–60% peer average |
| Humanity’s Last Exam (HLE) | 40.0% (Grok 4) / 50.7% (Heavy) | Multi-domain expert reasoning — hardest benchmark | First model to score 50% on HLE (Heavy variant) |
| FActScore (Grok 4.1) | 97.0% | Factual accuracy in free-form generation | Industry-leading factual reliability |
| LMArena Elo (Grok 4.1 Thinking) | 1,483 | Human-preference chat arena | #1 globally, 31 Elo points clear of the nearest non-xAI model |
| GDPval-AA (Grok 4.20 v2) | 1,179 Elo | Real-world agentic tasks | Exceeded by Grok 4.3’s 1,500 Elo in April 2026 |
| ARC-AGI V2 (Grok 4) | 15.9% | Abstract visual reasoning — AGI proxy | Nearly doubled prior record (~8.6%) |
| USAMO 2025 (Grok 4 Heavy) | 61.9% | US Math Olympiad — proof-based problems | #1 globally on this benchmark |
| Vending-Bench (Agentic) | $4,694 net worth | Autonomous multi-step agentic task simulation | Vastly outperforms human baseline ($844) |
Source: Artificial Analysis (April 2026), xAI official Grok 4 launch page (July 2025), SQ Magazine Grok AI Statistics (May 2026), LLM Benchmarks 2026 (iternal.ai)
Looking at the benchmark profile globally, the clearest strength of the Grok 4 family lies in mathematical reasoning and hard multi-domain problems, precisely the areas where the gap between frontier models and everything else remains widest. Several of the headline figures were reported for earlier members of the series rather than for Grok 4.20 itself. The 91.7% AIME 2025 score (Grok 4) and the 90.0% HMMT25 result place the series in elite company, while the Grok 4 Heavy variant’s 50.7% on Humanity’s Last Exam was a landmark: the first AI model to cross the 50% threshold on a benchmark deliberately engineered to resist saturation by advanced AI. The 97.0% FActScore for Grok 4.1, together with the drop in production hallucination rate from 12% to 4%, is the kind of reliability improvement that matters most to enterprises deploying AI in regulated or accuracy-critical workflows.
Where Grok 4.20 sits more modestly is in the composite Artificial Analysis Intelligence Index score of 49, which, while 40% above the peer median of 35, is still 4 points behind Grok 4.3’s score of 53 released just weeks later in April 2026. This rapid iteration — from Grok 4.20 to Grok 4.20 v2 to Grok 4.3 within weeks — underscores that benchmark leadership in 2026 is a moving target, and Grok’s own version succession has been aggressive even by the accelerated standards of the current AI race. The GPQA Diamond score of 87.5% — while impressive in absolute terms — also sits below the 94.6% achieved by Claude Mythos Preview, signaling that scientific reasoning at the very frontier is still a competitive space where no single model dominates cleanly.
Grok 4.20 Global Traffic & User Growth Statistics in 2026
GROK.COM MONTHLY VISITS — GLOBAL GROWTH TRAJECTORY
═══════════════════════════════════════════════════════════════
Aug 2025 ████████████████ ~140-150M visits
Nov 2025 ████████████████████████ 234.4M visits (+14% MoM)
Jan 2026 ████████████████████████████████ 314M visits (record high)
Feb 2026 ███████████████████████████████ 298.6M visits (-4.9% MoM)
Mar 2026 ████████████████████████████████████ 326.3M visits (NEW RECORD +9.3%)
═══════════════════════════════════════════════════════════════
Year-over-Year Growth (Mar 2025 → Mar 2026): +61.03%
| Traffic Metric | Data Point | Period / Source |
|---|---|---|
| Monthly Web Visits (Record) | 326.3 million | March 2026 — all-time high |
| Monthly Web Visits (Jan 2026) | 314 million | January 2026 (Similarweb / Forbes) |
| Monthly Web Visits (Feb 2026) | 298.6 million | February 2026 (Similarweb) |
| Month-over-Month Change (Feb→Mar) | +9.3% | March 2026 |
| Year-over-Year Change (Mar 2025→2026) | +61.03% | March 2026 |
| Global Website Rank | 53rd globally | February 2026 (Similarweb) |
| Avg. Visit Duration | 12 min 57 sec | February 2026 (Similarweb) |
| Avg. Pages Per Visit | 21.41 | February 2026 |
| Bounce Rate | 26.48% | February 2026 |
| Desktop vs. Mobile Split | 78.62% desktop / 21.38% mobile | 2026 |
| Direct Traffic Share | 72.93% – 78.37% | 2026 (Similarweb) |
| Daily Queries Processed | ~134 million | 2026 (humanizeai.io) |
| Global Rank — Generative AI Services | 9th (new entry) | Cloudflare Radar 2025 Year in Review |
| Platform Global Rank vs. Competitors | 3rd (after ChatGPT, Gemini) | January 2026 |
Source: Similarweb via Business of Apps, FatJoe (April 2026), HumanizeAI.io (April 2026), Cloudflare Radar 2025
The traffic data for Grok.com in 2026 paints a picture of a platform that has moved decisively from niche AI curiosity to mainstream consumer destination. The 326.3 million visits in March 2026 — a 61% year-over-year increase and an all-time record — is a number that most standalone AI platforms would be proud to carry for their entire existence, let alone a product barely two-and-a-half years old. The 12 minutes and 57 seconds average session duration is particularly telling: it places Grok among the most deeply engaging AI products globally, ahead of competitors like Gemini, and suggests users are not dropping in for a quick search-engine-style query but returning for sustained, substantive interactions.
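The growth percentages quoted above can be checked directly from the monthly visit counts. The sketch below is a minimal Python example using the January to March 2026 figures from the traffic table. The March 2025 baseline it prints is not reported in the article; it is only the value implied by the stated +61.03% year-over-year change, so treat it as a derived estimate.

```python
# Monthly visits in millions, taken from the traffic table above.
visits = {"Jan 2026": 314.0, "Feb 2026": 298.6, "Mar 2026": 326.3}


def pct_change(current: float, previous: float) -> float:
    """Percentage change from `previous` to `current`."""
    return (current - previous) / previous * 100.0


months = list(visits)
mom = {
    months[i]: pct_change(visits[months[i]], visits[months[i - 1]])
    for i in range(1, len(months))
}

# Back out the implied March 2025 baseline from the reported +61.03% YoY figure.
YOY_GROWTH_PCT = 61.03
implied_mar_2025 = visits["Mar 2026"] / (1 + YOY_GROWTH_PCT / 100.0)

for month, change in mom.items():
    print(f"{month}: {change:+.1f}% MoM")
print(f"Implied March 2025 visits: ~{implied_mar_2025:.1f}M")
```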
The 26.48% bounce rate — one of the lowest in the generative AI category — reinforces this pattern of high-intent usage. For context, industry-standard acceptable bounce rates for web applications hover around 40–60%; Grok’s near-27% figure implies that roughly 73 out of every 100 visitors engage meaningfully beyond the landing page. The 78.62% desktop dominance is also notable: it suggests Grok’s primary use case in 2026 is still driven by professional, developer, and research workflows conducted on larger screens — not casual mobile scrolling. The 21.41 pages per visit further confirms this depth, indicating users are navigating across features, switching models, or running multi-step workflows rather than asking a single question and leaving.
Grok 4.20 Global User Base & Market Share Statistics in 2026
GROK GLOBAL TRAFFIC SHARE BY COUNTRY (2026)
════════════════════════════════════════════════════════
United States ████████████████████████ 21–24% of visits
India ██████████ 8–10%
UK ██████ 5.5%
Pakistan █████ 5.4%
Brazil █████ 4–5%
Vietnam ████ 4%
South Korea ████ 3.5%
Hong Kong ███ 3.1%
Other Markets ██████████████████████████████████████████ ~40–45%
════════════════════════════════════════════════════════
| User / Market Metric | Figure | Source / Period |
|---|---|---|
| Global Monthly Active Users | ~60–64 million | January 2026 (xAI internal, Business of Apps) |
| Total App Downloads (All Time) | ~100 million | 2026 (Business of Apps) |
| Google Play Store Downloads | 50 million+ | 2026 (Google Play) |
| iOS Daily Downloads (Post Grok 4 Launch) | 197,000 / day (+279%) | July 11, 2025 (App Store data) |
| Top Traffic Country | United States (21–24% of visits) | 2026 (Similarweb) |
| 2nd Largest Market | India (~8–10%) | 2026 |
| Fastest-Growing Markets | India (+42% MoM), Brazil, Vietnam | March 2026 |
| Gender Split | 60.19% male / 39.81% female | 2026 |
| Largest Age Group | 25–34 years (51.4% of all users) | 2026 |
| Second Largest Age Group | 18–24 years (12.2%) | 2026 |
| Average Session Time (Mobile) | 4 min 58 sec | 2026 (HumanizeAI.io) |
| Coding-Related Queries | 18–25% of all queries | 2026 |
| News & Trends Queries | ~30% of all queries | 2026 |
| Developer Engagement Growth | +50% year-over-year | 2026 |
| Grok Imagine Video Generation (Jan 2026) | 1.245 billion videos | January 2026 |
Source: Business of Apps (March 2026), SEOProfy (Feb 2026), FatJoe (April 2026), Bayelsawatch (April 2026), HumanizeAI.io (April 2026)
The global user profile of Grok in 2026 reveals a platform with genuine international reach that extends well beyond its American origins. The United States leads with 21–24% of traffic, which means roughly three-quarters of visits originate outside the US. India (~8–10%), the UK (~5.5%), Pakistan (~5.4%), Brazil (~4–5%), and Vietnam (~4%) are the largest of those markets, a meaningful shift from the platform’s early profile as a primarily US-centric product. India’s 42% month-over-month traffic increase in March 2026 is the most striking single geographic signal: it points to an emerging-market uptake curve that, if sustained, could reshape the platform’s global center of gravity within 12–18 months.
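A quick way to sanity-check the geographic split is to sum the per-country ranges and see what remains. The Python sketch below does that under one loud assumption: it treats the listed shares as if they all came from a single consistent dataset, which the article does not confirm, so the remainder should be read as a rough bound rather than a reported figure.

```python
# (low, high) share of visits in percent, from the country chart above.
country_share = {
    "United States": (21.0, 24.0),
    "India": (8.0, 10.0),
    "Brazil": (4.0, 5.0),
    "South Korea": (3.5, 3.5),
    "Vietnam": (4.0, 4.0),
    "Hong Kong": (3.1, 3.1),
    "UK": (5.5, 5.5),
    "Pakistan": (5.4, 5.4),
}

listed_low = sum(low for low, _ in country_share.values())
listed_high = sum(high for _, high in country_share.values())

# The remainder is largest when the listed shares are at their low end.
other_high = 100.0 - listed_low
other_low = 100.0 - listed_high

us_low, us_high = country_share["United States"]
non_us_low, non_us_high = 100.0 - us_high, 100.0 - us_low

print(f"Listed countries: {listed_low:.1f}% to {listed_high:.1f}%")
print(f"Unlisted markets: {other_low:.1f}% to {other_high:.1f}%")
print(f"Outside the US:   {non_us_low:.1f}% to {non_us_high:.1f}%")
```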
The demographic concentration in the 25–34 age cohort (51.4% of all users) reflects a user base that is overwhelmingly composed of young professionals, developers, researchers, and early adopters — exactly the audience that drives enterprise AI adoption cycles. The 18–25% coding query share and 50% year-over-year developer engagement growth confirm that Grok is establishing a real foothold in the developer toolchain, not just as a consumer chatbot. The 1.245 billion videos generated by Grok Imagine in January 2026 alone also underscores that multimodal usage — particularly AI-generated video — has become a genuinely mainstream behavior on the platform, with scale numbers that rival or exceed dedicated image generation services.
Grok 4.20 Global Revenue & Funding Statistics in 2026
xAI FUNDING ROUNDS — CUMULATIVE CAPITAL RAISED
════════════════════════════════════════════════════════
Series A (Nov 2023) ██ $134.7M @ $673M valuation
Series B (May 2024) █████ $6B @ $24B valuation
Series C (Dec 2024) ████████ $6B @ $50B valuation
Series D (Jun 2025) █████████████ $10B ($5B debt + $5B equity)
Series E (Jan 2026) ████████████████████████████████████ $20B @ $230B valuation
════════════════════════════════════════════════════════
Total Raised: $42+ Billion | SpaceX Merger Valuation: $1.25 Trillion
| Revenue / Funding Metric | Figure | Source / Period |
|---|---|---|
| xAI Total Funding Raised | $42+ billion | All rounds through 2026 |
| Series E Round (Jan 2026) | $20 billion at $230 billion valuation | Reuters, January 2026 |
| xAI Q3 2025 Revenue | $107 million (quarter ended Sept 30, 2025) | Reuters |
| xAI Q3 2025 Net Loss | $1.46 billion | Reuters |
| Grok 2025 Full-Year Revenue Estimate | ~$300–$350 million | Business of Apps, 2026 |
| Projected 2026 Revenue | ~$2 billion | Business of Apps (2026) |
| Monthly Infrastructure Spend (xAI) | ~$1 billion/month | 2026 estimates |
| SuperGrok Subscription Price | $30/month or $300/year | xAI official pricing |
| SuperGrok Heavy Price | $300/month | xAI official pricing |
| Premium+ (X Platform) Price | $40/month or $395/year | xAI official pricing |
| Grok Business (per seat) | $30/seat/month | xAI official pricing |
| API Input Pricing (Grok 4.20) | $2.00 per 1M tokens | xAI API (Artificial Analysis) |
| API Output Pricing (Grok 4.20) | $6.00 per 1M tokens | xAI API |
| US DoD AI Contract (xAI) | Up to $200 million | July 2025 |
| SpaceX–xAI Merger Valuation | $1.25 trillion (all-stock, Feb 2026) | Reuters, February 2026 |
| App Monthly Revenue (Mar 2026 est.) | ~$12 million/month | Sensor Tower estimate |
Source: Reuters (Jan & Feb 2026), Business of Apps (March 2026), xAI official pricing, SQ Magazine (May 2026), Bayelsawatch (April 2026)
The revenue and funding trajectory of xAI is one of the most striking capital stories in the technology industry in 2026. Between the $134.7 million Series A in November 2023 (at a $673 million valuation) and the $20 billion Series E in January 2026 (at a $230 billion valuation), a period of just 26 months, the company’s valuation rose more than 340x, one of the fastest climbs in startup history. Yet the $1.46 billion net loss in Q3 2025 alone, against $107 million in quarterly revenue, makes clear that xAI is still deep in the investment phase of its growth curve, spending approximately $1 billion per month on the Colossus cluster expansion, model training runs, and infrastructure. The projected $2 billion in 2026 revenue, driven primarily by SuperGrok subscriptions, X Premium bundles, API revenue, and the DoD contract worth up to $200 million, would represent a roughly 6x year-over-year increase over the estimated $300–$350 million for 2025, but still leaves the company a long way from break-even.
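The multiples in this paragraph follow from simple ratios of figures in the funding table. As a minimal sketch, the Python below recomputes them. The variable and function names are illustrative, and the month count assumes the rounds closed in the calendar months stated in the table.

```python
from datetime import date

# Figures from the funding table and chart above (USD).
SERIES_A_VALUATION = 673e6
SERIES_E_VALUATION = 230e9
Q3_2025_REVENUE = 107e6
Q3_2025_NET_LOSS = 1.46e9
REVENUE_2025_EST = (300e6, 350e6)  # low and high estimate
REVENUE_2026_PROJECTED = 2e9


def months_between(start: date, end: date) -> int:
    """Whole calendar months from `start` to `end`, ignoring the day of month."""
    return (end.year - start.year) * 12 + (end.month - start.month)


valuation_multiple = SERIES_E_VALUATION / SERIES_A_VALUATION
span_months = months_between(date(2023, 11, 1), date(2026, 1, 1))
loss_per_revenue_dollar = Q3_2025_NET_LOSS / Q3_2025_REVENUE
growth_low = REVENUE_2026_PROJECTED / REVENUE_2025_EST[1]
growth_high = REVENUE_2026_PROJECTED / REVENUE_2025_EST[0]

print(f"Valuation multiple: {valuation_multiple:.1f}x over {span_months} months")
print(f"Q3 2025 net loss per revenue dollar: ${loss_per_revenue_dollar:.2f}")
print(f"Projected 2026 revenue growth: {growth_low:.1f}x to {growth_high:.1f}x")
```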
The $1.25 trillion combined valuation of the SpaceX–xAI merger in February 2026 is the corporate event that most profoundly changes the long-term context for Grok’s global trajectory. Elon Musk cited “orbital data centres” as the strategic rationale — the vision of deploying AI compute infrastructure in space via SpaceX’s Starship and Starlink networks. If that infrastructure ambition materializes, it would give Grok a distribution channel and compute substrate that no other AI company on earth currently possesses. The $300/month SuperGrok Heavy tier and $2.00 input / $6.00 output API pricing also position Grok 4.20 as a product that is priced at the premium end for general consumers but significantly below competing frontier models for developers — a deliberate commercial strategy that reflects xAI’s goal of capturing the developer ecosystem as its primary long-term revenue base.
Grok 4.20 Model Versions & Global Release Timeline in 2026
GROK MODEL EVOLUTION — RELEASE TIMELINE
════════════════════════════════════════════════════════════════
Nov 2023 ██ Grok 1 — 314B params, Apache 2.0 open-source
Mar 2024 ██ Grok 1.5 — 128K context window, open-sourced
Aug 2024 ████ Grok 2 — Image generation added
Feb 2025 ████████ Grok 3 — 10x compute vs predecessors
Jul 2025 ████████████ Grok 4 — HLE 50.7% (Heavy), Colossus 200K GPU training
Nov 2025 ██████████████ Grok 4.1 — 97% FActScore, 1,483 LMArena Elo (#1)
Feb 2026 ████████████████ Grok 4.20 Beta — Multi-agent architecture launch
Apr 2026 ████████████████████ Grok 4.20 0309 v2 — Speed & accuracy refinements
Apr 2026 ████████████████████████ Grok 4.3 — Intelligence Index 53; $395 run cost
════════════════════════════════════════════════════════════════
Total: 7 major versions in ~2.5 years | Avg: ~3 releases per year
| Model | Release Date | Key Capability Added | Context Window |
|---|---|---|---|
| Grok 1 | November 3, 2023 | Base LLM; Python/Rust architecture; 314B parameters | Standard |
| Grok 1.5 | March 28, 2024 | 128,000-token context; enhanced reasoning | 128K |
| Grok 2 | August 14, 2024 | Image generation capabilities added | 128K |
| Grok 3 | February 17, 2025 | 10x compute vs. prior models; drove +436% traffic spike | 128K |
| Grok 4 | July 9, 2025 | HLE 50.7% (Heavy); ARC-AGI V2 15.9%; native tool use; real-time search | 256K |
| Grok 4.1 | November 17, 2025 | 97% FActScore; hallucinations cut 3x; #1 LMArena Elo (1,483) | 2M |
| Grok 4.20 Beta | February 17, 2026 | Multi-agent architecture (4 agents); ultra-precise answers | 2M |
| Grok 4.20 0309 v2 | April 7, 2026 | Speed, hallucination fixes, better LaTeX, multi-image rendering | 2M |
| Grok 4.3 | April 30, 2026 | Intelligence Index 53; GDPval-AA 1,500 Elo; ~23% lower benchmark cost | 1M |
Source: SQ Magazine (May 2026), xAI official, Artificial Analysis (April 2026), Business of Apps (March 2026)
The Grok model release timeline is arguably the clearest expression of xAI’s competitive philosophy: move faster than anyone else and iterate in public. Reaching 7 major model versions in approximately 2.5 years — an average of nearly 3 major releases per year — is a cadence that exceeds most frontier AI labs, including OpenAI and Anthropic, who have historically favored longer release cycles. The leap from Grok 3’s 10x compute training jump in February 2025 to Grok 4’s landmark HLE 50.7% score just five months later in July 2025 shows how rapidly Colossus-scale compute investment translates into benchmark breakthroughs. And the jump from 128,000 tokens (Grok 3) to 2,000,000 tokens (Grok 4.1 and 4.20) is not an incremental context expansion — it is an order-of-magnitude change that unlocks entirely different enterprise use cases.
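The cadence described here can be measured from the release dates in the timeline table. The following Python sketch computes the gap in days between consecutive releases and the context-window ratio mentioned above. It counts every row in the table, including the 0309 v2 revision, which is a choice made for illustration and may differ from how the article counts "major" versions.

```python
from datetime import date

# Release dates as listed in the timeline table above.
releases = [
    ("Grok 1", date(2023, 11, 3)),
    ("Grok 1.5", date(2024, 3, 28)),
    ("Grok 2", date(2024, 8, 14)),
    ("Grok 3", date(2025, 2, 17)),
    ("Grok 4", date(2025, 7, 9)),
    ("Grok 4.1", date(2025, 11, 17)),
    ("Grok 4.20 Beta", date(2026, 2, 17)),
    ("Grok 4.20 0309 v2", date(2026, 4, 7)),
    ("Grok 4.3", date(2026, 4, 30)),
]

# Days elapsed between each release and the one before it.
gaps = [
    (prev_name, name, (day - prev_day).days)
    for (prev_name, prev_day), (name, day) in zip(releases, releases[1:])
]

total_days = (releases[-1][1] - releases[0][1]).days
context_ratio = 2_000_000 / 128_000  # Grok 4.1/4.20 window vs. Grok 3 window

for prev_name, name, days in gaps:
    print(f"{prev_name} -> {name}: {days} days")
print(f"Span covered: {total_days} days (~{total_days / 365.25:.1f} years)")
print(f"Context window growth: {context_ratio:.1f}x")
```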
The Grok 4.20 multi-agent architecture — launched February 2026 — represents the most architecturally significant change in the series since Grok 4’s reinforcement learning training methodology. Rather than a single model responding to a query, four specialized agents debate and fact-check in real time, a design that directly targets the hallucination and reliability issues that have been the main enterprise objection to deploying large language models in production. The subsequent 0309 v2 release on April 7, 2026, focused specifically on speed, LaTeX rendering quality, image handling, and instruction-following, signals that xAI is now iterating on polish and reliability rather than purely on benchmark-maximizing capability — a sign of a model transitioning from research showcase to production product.
Grok 4.20 Global Competitive Position & Pricing in 2026
INTELLIGENCE INDEX vs. COST — FRONTIER MODEL COMPARISON (Apr 2026)
════════════════════════════════════════════════════════════════
Model Intelligence Output Price Speed (t/s)
Index Score (per 1M tokens)
────────────────────────────────────────────────────────────────
Grok 4.3 53 $2.50 83.3 t/s
Grok 4.20 0309 v2 49 $6.00 106.1 t/s ← FASTEST
Peer Median 35-36 $8.00 (median) 63 t/s
════════════════════════════════════════════════════════════════
Grok 4.20 costs $514 to run full benchmark suite
Grok 4.3 costs $395 to run full benchmark suite (-23%)
| Model | Intelligence Index | Input Price (1M tokens) | Output Price (1M tokens) | Context Window | Speed (t/s) |
|---|---|---|---|---|---|
| Grok 4.3 (Apr 2026) | 53 | $1.25 | $2.50 | 1M | 83.3 |
| Grok 4.20 0309 v2 | 49 | $2.00 | $6.00 | 2M | 106.1 |
| Peer Reasoning Median | ~35–36 | ~$1.68–1.71 | ~$8.00 | Varies | ~61–63 |
| Grok 4.20 vs. Median | +40% above | Slightly above avg | Below avg output cost | Largest (2M) | +67% faster |
Source: Artificial Analysis (April 2026)
When benchmarked against the global field of frontier reasoning models, Grok 4.20 0309 v2 occupies a notable position: above-average intelligence at above-average speed, with output pricing that is below the peer median despite input pricing that is slightly elevated. The $6.00 per million output tokens — compared to the peer median of $8.00 — makes Grok 4.20 approximately 25% cheaper on the output side than the average competing model at its intelligence tier, which is the pricing dimension that matters most for high-volume API users generating large responses. Its 106.1 tokens-per-second output speed, the highest among the models evaluated, means that at scale, Grok 4.20 is not just cheaper per token — it delivers those tokens 67% faster than the average competing reasoning model.
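To see how the per-token prices translate into a bill, the Python sketch below costs out one workload at each price point in the comparison table. The workload size (500K input tokens, 200K output tokens) is hypothetical, and the peer-median input price of $1.70 is an assumed midpoint of the ~$1.68–1.71 range quoted in the table.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pricing:
    """USD per one million tokens."""
    input_per_m: float
    output_per_m: float


# Prices from the comparison table; the peer input price is an assumed midpoint.
PRICES = {
    "Grok 4.20 0309 v2": Pricing(2.00, 6.00),
    "Grok 4.3": Pricing(1.25, 2.50),
    "Peer median (approx.)": Pricing(1.70, 8.00),
}


def request_cost(pricing: Pricing, input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one workload at the given per-million-token prices."""
    return (input_tokens * pricing.input_per_m
            + output_tokens * pricing.output_per_m) / 1_000_000


# Hypothetical workload: 500K prompt tokens in, 200K tokens out.
costs = {name: request_cost(p, 500_000, 200_000) for name, p in PRICES.items()}
output_discount = (8.00 - 6.00) / 8.00 * 100.0

for name, cost in costs.items():
    print(f"{name}: ${cost:.3f}")
print(f"Grok 4.20 output discount vs. peer median: {output_discount:.0f}%")
```

Because Grok 4.20’s input price is slightly above the peer median, its saving on a given workload depends on the mix: the more output-heavy the workload, the larger the advantage.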
The comparison between Grok 4.20 and its own successor Grok 4.3 is equally instructive. Grok 4.3 scores 53 on the Intelligence Index (four points higher) and costs $395 to run the benchmark suite versus Grok 4.20’s $514 — a 23% cost reduction alongside a capability improvement. This pattern of xAI releasing more capable models at lower cost within weeks of each other is a deliberate competitive signal: the company is actively trying to undercut the “frontier models are expensive” narrative that has slowed enterprise AI adoption. For businesses evaluating which model to deploy in production, the 2 million token context window that Grok 4.20 retains — vs. Grok 4.3’s 1 million tokens — means there is still a genuine use-case argument for 4.20 even after 4.3’s release, particularly for long-document or full-codebase analysis tasks.
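The trade-off described above (cheaper and smarter versus larger context) can be expressed as a small selection rule. The Python sketch below is purely illustrative: `cheapest_fitting_model` is a hypothetical helper, not part of any xAI SDK, and it assumes that prompt and output tokens together must fit within the context window. It also recomputes the benchmark-suite saving from the two run costs quoted in the article.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    name: str
    context_tokens: int
    input_per_m: float   # USD per 1M input tokens
    output_per_m: float  # USD per 1M output tokens


# Context windows and prices from the comparison table above.
MODELS = [
    Model("Grok 4.3", 1_000_000, 1.25, 2.50),
    Model("Grok 4.20 0309 v2", 2_000_000, 2.00, 6.00),
]


def cheapest_fitting_model(prompt_tokens: int, output_tokens: int) -> Optional[Model]:
    """Pick the lowest-cost model whose context window holds prompt plus output."""
    needed = prompt_tokens + output_tokens
    candidates = [m for m in MODELS if m.context_tokens >= needed]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda m: prompt_tokens * m.input_per_m + output_tokens * m.output_per_m,
    )


# Benchmark-suite run costs quoted above: $514.16 (Grok 4.20) vs. $395 (Grok 4.3).
benchmark_saving_pct = (514.16 - 395.00) / 514.16 * 100.0

print(cheapest_fitting_model(200_000, 5_000))
print(cheapest_fitting_model(1_500_000, 5_000))
print(f"Benchmark-suite saving: {benchmark_saving_pct:.1f}%")
```

With these numbers, any job that fits in 1 million tokens goes to Grok 4.3, and only prompts beyond that size fall through to Grok 4.20.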
