xAI: Grok Model Benchmarks

Models Tracked: 4

Founded: 2023

Best Reasoning (GPQA Diamond): 84.6

AIME 2025 (Grok-3): 93.3

About xAI

xAI is an AI company founded by Elon Musk in 2023 with the mission to “understand the true nature of the universe.” The company develops the Grok family of AI models, which power the Grok chatbot integrated into the X (formerly Twitter) platform.

Grok-3 (February 2025) represents a major leap in capability: in “Think” mode it scores 84.6% on GPQA Diamond (graduate-level science reasoning) and 93.3% on AIME 2025 (competition mathematics). xAI has rapidly scaled its compute infrastructure, building one of the largest AI training clusters in the world.

Grok Model Timeline

Verified scores from official xAI announcements.

February 2025

Grok-3

xAI’s flagship model. With “Think” mode, Grok-3 achieves 84.6% on GPQA Diamond and 93.3% on AIME 2025 (cons@64). Grok-3 mini achieves 95.8% on AIME 2024.

GPQA Diamond: 84.6 | AIME 2025: 93.3 | LiveCodeBench: 79.4

August 2024

Grok-2

A significant upgrade with improved reasoning and multimodal capabilities. Integrated into X’s platform features.

April 2024

Grok-1.5

An incremental improvement with better reasoning. Introduced vision capabilities for image understanding.

November 2023

Grok-1

xAI’s first model, establishing the Grok brand. Its weights (314B parameters) were later released openly, in March 2024.

Benchmark Performance

Grok-3 scores across verified benchmark categories.

About This Data

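Scores are sourced from xAI’s official Grok-3 announcement. The AIME 2025 score uses the consensus@64 (cons@64) methodology, in which the most common answer across 64 sampled responses per problem is the one that gets graded. LiveCodeBench is a coding benchmark that is not included in the standard 6-category comparison.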
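To make the consensus@64 note concrete, here is a minimal Python sketch of consensus (majority-vote) scoring. It assumes that cons@k means grading the most frequent final answer among k samples per problem. The function names, the tie-breaking rule, and the toy data are hypothetical illustrations, not xAI’s evaluation harness.

```python
from collections import Counter


def consensus_answer(samples):
    """Return the most frequent answer among the sampled responses.

    Assumption: ties go to the answer that appeared first, which is how
    Counter.most_common orders equal counts. A real harness may differ.
    """
    return Counter(samples).most_common(1)[0][0]


def consensus_at_k(problems):
    """Fraction of problems whose majority-vote answer matches the reference.

    `problems` is a list of (sampled_answers, reference_answer) pairs;
    for cons@64 each list of sampled answers would hold 64 entries.
    """
    correct = sum(consensus_answer(samples) == ref for samples, ref in problems)
    return correct / len(problems)


# Toy data (hypothetical): two problems with a handful of samples each.
toy_problems = [
    (["204", "204", "113", "204"], "204"),     # majority is correct
    (["16", "25", "25", "16", "25"], "16"),    # majority is wrong
]
score = consensus_at_k(toy_problems)
print(f"cons@k on toy data: {score:.1%}")
```

Because only the majority answer is graded, a consensus score can sit above a model’s single-attempt accuracy, so it should be compared only against other scores reported with the same methodology.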
