AI Providers Landscape

A current research brief on model families and platform stacks, organized into frontier providers plus Tiers 1-3. It uses provider names and product families, keeps the global market explicit, and prioritizes official docs and announcements as sources.

Last updated: 2026-06-05

Quick source check on 2026-06-05:

OpenAI - current public stack centers on GPT-5.5 / GPT-5.4 / GPT-5 mini / GPT-5 nano, GPT-5.3-Codex, gpt-realtime-2 / 1.5 / translate / whisper, gpt-audio-1.5 / mini, and gpt-oss
Anthropic - current Claude line is Opus 4.7 / Sonnet 4.6 / Haiku 4.5
Google - Gemini page is on Gemini 3 Pro Preview, with Gemini 2.5 Pro / Flash / Flash-Lite below
xAI - centers on Grok 4.3 and Grok Build 0.1
China-side providers - now emphasize DeepSeek V4-Pro / V4-Flash, Qwen3.5 / Qwen3-Max / Qwen3-Omni / Qwen3-ASR / Qwen3-TTS, ERNIE 5.0 / X1.1 / X1 Turbo, GLM-5.1 / 4.7, Kimi K2.6, MiniMax M2.1, and Tencent Hy3 preview
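
For readers who want to filter this brief programmatically, the tiering can be sketched as a small lookup structure. This is only an illustrative sketch: the `Provider` class, the `CATALOG` list, and the helper functions are hypothetical names introduced here, and the entries simply mirror the section headings of this brief (region tags are set only where a heading carries one).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Provider:
    name: str
    tier: str
    region: str = ""  # filled in only where the brief's heading tags a region


# Hypothetical catalog mirroring the section headings of this brief.
CATALOG = [
    Provider("OpenAI", "Frontier"),
    Provider("Anthropic", "Frontier"),
    Provider("Google / Gemini", "Frontier"),
    Provider("xAI / Grok", "Frontier"),
    Provider("Mistral AI", "Tier 1", "EU"),
    Provider("DeepSeek", "Tier 1", "China"),
    Provider("Qwen / Alibaba", "Tier 1", "China"),
    Provider("Meta / Llama", "Tier 1"),
    Provider("Cohere", "Tier 2"),
    Provider("Amazon Nova / Bedrock", "Tier 2"),
    Provider("Microsoft Phi", "Tier 2"),
    Provider("Baidu ERNIE", "Tier 2", "China"),
    Provider("Zhipu AI / GLM", "Tier 2", "China"),
    Provider("Moonshot / Kimi", "Tier 2", "China"),
    Provider("MiniMax", "Tier 2", "China"),
    Provider("Tencent Hunyuan", "Tier 2", "China"),
    Provider("Ollama", "Tier 3"),
    Provider("Hugging Face", "Tier 3"),
]


def providers_in(tier: str) -> list[str]:
    """Return provider names listed under the given tier heading."""
    return [p.name for p in CATALOG if p.tier == tier]


def providers_by_region(region: str) -> list[str]:
    """Return provider names whose heading carries the given region tag."""
    return [p.name for p in CATALOG if p.region == region]
```

A call such as `providers_in("Tier 1")` returns the four Tier 1 names in the order they appear below; the structure carries no ranking beyond the tier label itself.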

Frontier Providers

OpenAI

OpenAI remains the broadest stack for agentic coding and multimodal app building. The current line spans the GPT-5.5 / GPT-5.4 frontier models, GPT-5 mini / nano for cheaper throughput, Codex for repo work, realtime/audio APIs, and open-weight gpt-oss.

Latest model families

GPT-5.5 / GPT-5.5 pro - flagship reasoning and coding
GPT-5.4 / GPT-5.4 pro - affordable frontier workhorse
GPT-5.4 mini / GPT-5.4 nano - low-cost, high-volume tiers
GPT-5 mini / GPT-5 nano - cheaper low-latency GPT tiers
GPT-5.3-Codex - agentic coding and repo operations
gpt-realtime-2 / gpt-realtime-1.5 / gpt-realtime-mini / gpt-realtime-translate / gpt-realtime-whisper - realtime voice and transcription
gpt-audio-1.5 / gpt-audio / gpt-audio-mini - audio-in, audio-out workflows
gpt-oss-120b / gpt-oss-20b - open-weight reasoning line

Best for

Agentic coding and repo operations
Realtime multimodal and voice products
High-iteration prototypes with tools, speech, and open weights

Sources: developers.openai.com/api/docs/models/all/ · developers.openai.com/api/docs/models/model-archive · developers.openai.com/api/docs/models/gpt-5.4-nano · developers.openai.com/api/docs/models/gpt-5.3-codex · developers.openai.com/api/docs/models/gpt-realtime · openai.com/open-models

Anthropic

Anthropic's current Claude line is concentrated around Opus 4.7, Sonnet 4.6, and Haiku 4.5. Opus is the long-horizon specialist, Sonnet is the balanced workhorse, and Haiku is the low-latency tier.

Latest model families

Claude Opus 4.7 / Opus 4.6 / Opus 4.1 - top-end reasoning and coding
Claude Sonnet 4.6 / 4.5 / 4 - balanced high-performance workhorse
Claude Haiku 4.5 / 3.5 - fast lightweight tiers

Best for

Complex coding and refactoring
Long-running agent tasks
Writing, analysis, and planning

Sources: anthropic.com/product · anthropic.com/system-cards · anthropic.com/news/claude-opus-4-1 · anthropic.com/news/claude-haiku-4-5

Google / Gemini

Google's flagship model family is Gemini. The current public stack is centered on Gemini 3 Pro Preview, with Gemini 2.5 Pro / Flash / Flash-Lite beneath it and Gemma as the open-weight line.

Latest model families

Gemini 3 Pro Preview / Gemini 3 Pro Image Preview - flagship multimodal models
Gemini 2.5 Pro - advanced reasoning and long-context work
Gemini 2.5 Flash / Gemini 2.5 Flash-Lite - lower-cost, high-throughput tiers
Gemma 3 / Gemma 3n - open-weight Google family

Best for

Multimodal reasoning and document synthesis
Long-context workflows
Search, Workspace, and device-first apps

Sources: ai.google.dev/models/gemini · ai.google.dev/gemini-api/docs/models/gemini · ai.google.dev/gemma/docs/core · ai.google.dev/gemma/docs/gemma-3n

xAI / Grok

xAI is a Grok-centric stack with chat, coding, voice, and media APIs. Treat it as a product platform: Grok 4.3 is the current general model, Grok Build 0.1 is the coding line, and Voice / Imagine cover audio and media.

Latest model families

Grok 4.3 - current flagship text model
Grok Build 0.1 - coding and agentic workflows
Voice API - realtime conversation, TTS, and STT
Imagine API - image and video generation

Best for

Conversational products with search
Agentic coding and tool use
Voice and media generation

Sources: docs.x.ai/developers/models · docs.x.ai/developers/models/grok-4.3 · docs.x.ai/docs · docs.x.ai/developers/pricing

Tier 1 - Global Scale Leaders

Mistral AI (EU)

Europe's strongest independent model lab. Mistral is the best choice when you want efficient multilingual models, open weights, enterprise deployment, and serious code / document / speech tooling.

Latest model families

Mistral Large 3 - flagship open-weight multimodal model
Mistral Medium 3.5 - frontier-class multimodal workhorse
Mistral Small 4 - compact general-purpose model
Ministral 3 - edge-friendly 3B / 8B / 14B family
Magistral Medium 1.2 / Small 1.2 - reasoning-focused family
Devstral 2 - frontier code agent model
Codestral - low-latency code generation
OCR 3 Premier / Voxtral Mini Transcribe 2 / Voxtral Mini Transcribe Realtime / Voxtral TTS - document and audio stack

Best for

Multilingual European workloads
Enterprise deployment and customization
Efficient coding and agent workflows

Sources: mistral.ai/models · docs.mistral.ai/getting-started/models/ · mistral.ai/news/mistral-3 · mistral.ai/en/news/devstral-2-vibe-cli/

DeepSeek (China)

DeepSeek remains the cost/performance pressure valve in the market. The current API line is V4-Pro / V4-Flash, with V3.2 and V3.1-Terminus still important for migration and compatibility.

Latest model families

DeepSeek-V4-Pro / V4-Flash - current API line
DeepSeek-V3.2 / V3.2-Speciale - reasoning-first successor family
DeepSeek-V3.1 / V3.1-Terminus - compatibility and migration line
DeepSeek-R1 / R1-0528 - legacy reasoning reference line

Best for

Reasoning-heavy workloads
Code generation and analysis
High-value deployments with tight cost control

Sources: api-docs.deepseek.com/updates/ · api-docs.deepseek.com/news/news251201 · api-docs.deepseek.com/news/news250929 · api-docs.deepseek.com/news/news250120

Qwen / Alibaba (China)

One of the broadest stacks anywhere. Qwen spans proprietary flagships, open-weight multimodal models, coding, translation, audio, image, and retrieval families.

Latest model families

Qwen3.5-Plus / Qwen3.5-Flash - flagship multimodal general models
Qwen3-Max - current text flagship
Qwen3-VL-Plus / Qwen3-VL-Flash - vision-language families
Qwen3-Omni-Flash - text, image, audio, and video in one model
Qwen3-ASR-Flash / Qwen3-TTS-Flash - speech recognition and speech synthesis
Qwen3.5-LiveTranslate-Flash - simultaneous interpretation
Qwen3-Coder - agentic coding
Qwen-MT / Qwen-Image-Plus / Qwen-Image-Edit - translation and image stack

Best for

Chinese-English and multilingual workflows
Open-weight experimentation
Vision, translation, speech, and coding tools

Sources: qwen.ai/apiplatform · qwenlm.github.io/blog/qwen3-coder/ · qwen.ai/blog?id=qwen3asr · qwen.ai/blog?id=qwen3-tts · qwen.ai/blog?id=qwen3.5-livetranslate

Meta / Llama

The dominant open-weight ecosystem outside China. Meta's current headline models are Llama 4 Scout and Maverick, with Llama Guard 4 as the safety line.

Latest model families

Llama 4 Scout - natively multimodal, long-context
Llama 4 Maverick - multimodal general-purpose model
Llama Guard 4 - safety and policy filtering
Llama 3.2 / Llama 3.1 - legacy broad-deployment baseline

Best for

Self-hosting and fine-tuning
Open-weight multimodal apps
Safety filtering and local control

Sources: ai.meta.com/llama/get-started/ · ai.meta.com/blog/llama-4-multimodal-intelligence/ · ai.meta.com/open/

Tier 2 - Enterprise, Regional, and Platform Providers

Cohere

Enterprise-first provider focused on retrieval, grounding, multilingual control, and private deployment. Command A+ is the newest open-source workhorse, while Command A Reasoning, Rerank, Embed, and Transcribe stay central to the stack.

Latest model families

Command A+ - newest open-source enterprise workhorse
Command A / Command A Reasoning - flagship enterprise models
Command A Vision / Command A Translate - multimodal and translation lines
Command R7B / Command R+ / Command R - retrieval and grounding family
Aya Expanse / Aya Vision - multilingual and multimodal lines
Embed / Rerank / Transcribe - retrieval and audio stack

Best for

RAG and enterprise search
Grounded assistants
Private and sovereign deployments

Sources: docs.cohere.com/docs/models · cohere.com/blog/command-a-plus · docs.cohere.com/docs/command-a-reasoning · cohere.com/models-overview

Amazon Nova / Bedrock

A production platform rather than a single model lab. Bedrock is the control plane; Nova is Amazon's own model line for multimodal, reasoning, and speech workloads.

Latest model families

Amazon Nova 2 Omni / Nova 2 Pro / Nova 2 Lite
Nova 2 Sonic - speech and conversational voice
Nova Multimodal Embeddings - semantic retrieval
Third-party model access via Bedrock
Agent and guardrail tooling around model use

Best for

AWS-native production apps
Enterprise governance and routing
Multi-provider deployments

Sources: docs.aws.amazon.com/nova/ · docs.aws.amazon.com/nova/latest/nova2-userguide/what-is-nova-2.html · docs.aws.amazon.com/nova/latest/nova2-userguide/using-conversational-speech.html

Microsoft Phi

Microsoft's small-model family is optimized for strong performance per parameter and edge use cases. The current work centers on compact reasoning, multimodal small language models (SLMs), and vision reasoning.

Latest model families

Phi-4-reasoning-vision-15B - compact multimodal reasoning
Phi-4-reasoning / Phi-4-reasoning-plus - compact reasoning
Phi-4 - small-model flagship
Phi-4-mini - compact variant
Phi-4-multimodal - vision, audio, text
Phi-3.5 - still widely used in light-footprint setups

Best for

Small-footprint deployments
STEM-heavy tasks
Edge and local inference

Sources: microsoft.com/en-us/research/blog/phi-4-reasoning-vision-and-the-lessons-of-training-a-multimodal-reasoning-model/ · microsoft.com/en-us/research/articles/phi-reasoning-once-again-redefining-what-is-possible-with-small-and-efficient-ai/ · microsoft.com/en-us/research/publication/phi-4-technical-report/

Baidu ERNIE (China)

Baidu's China enterprise stack is still one of the clearest in the market, with a strong emphasis on search, multimodal understanding, and document-heavy workflows.

Latest model families

ERNIE 5.0 / ERNIE 5.0 Thinking / ERNIE 5.0 Preview
ERNIE X1.1 / X1.1 Preview
ERNIE X1 Turbo / X1 Turbo Preview
ERNIE 4.5 Turbo / ERNIE 4.5 Turbo VL
PaddleOCR-VL / PP-StructureV3 - document stack

Best for

Chinese enterprise deployments
Search-augmented assistants
Document and multimodal knowledge work

Sources: cloud.baidu.com/product-s/qianfan_home · cloud.baidu.com/doc/qianfan/s/rmh4stp0j · cloud.baidu.com/doc/qianfan-docs/s/7m95lyy43 · cloud.baidu.com/product/model.html

Zhipu AI / GLM (China)

One of China's strongest general-purpose model lines, with a clear push into long-horizon coding, multimodal reasoning, and image generation.

Latest model families

GLM-5.1 / GLM-5.1-HighSpeed - flagship agentic coding line
GLM-4.7 / GLM-4.7-FlashX - high-intelligence general line
GLM-5V-Turbo / GLM-4.6V-Flash - multimodal coding and vision
GLM-4.6 / GLM-4.5 - still used in broader deployments
CogView-4 - image generation

Best for

Agentic coding and long tasks
Chinese-language workflows
Multimodal app stacks

Sources: docs.bigmodel.cn/cn/update/new-releases · docs.bigmodel.cn/cn/guide/models/text/glm-5.1 · docs.bigmodel.cn/cn/guide/models/text/glm-4.7 · docs.bigmodel.cn/cn/guide/models/image-generation/cogview-4

Moonshot / Kimi (China)

A fast-moving China-side assistant platform with a strong long-context and coding angle. Kimi is increasingly relevant for agentic workflows.

Latest model families

Kimi K2.6 - latest flagship
Kimi K2.5 - multimodal general model
Kimi K2-thinking / Kimi K2-thinking-turbo - reasoning models
moonshot-v1 vision previews - older long-context line still in circulation

Best for

Long-context assistants
Chinese first-party product experiences
Agent and code-heavy workflows

Sources: platform.kimi.com/docs/models · platform.kimi.com/docs/guide/use-kimi-k2-thinking-model · platform.kimi.com

MiniMax (China)

MiniMax is broader than many outsiders realize: text, speech, video, and music are all active product lines.

Latest model families

MiniMax M2.1 / M2.1-lightning / M2 - text and agent models
MiniMax M2-her - role-play and long-dialogue model
MiniMax Speech 2.6 / 2.6 HD - voice agents
MiniMax Hailuo 2.3 / 2.3 Fast - video
MiniMax Music 2.6 / 2.5+ - music generation

Best for

Voice and media products
Agentic coding
Multimodal consumer apps

Sources: platform.minimaxi.com/docs/guides/models-intro · platform.minimaxi.com/docs/guides/pricing-paygo · platform.minimaxi.com/docs/token-plan/faq

Tencent Hunyuan (China)

Tencent's Hunyuan stack is moving through a major refresh. The current public signal is Hy3 preview for language work, Hunyuan-role-latest for role-play, and HY-3D-3.1 for 3D.

Latest model families

Hy3 preview - latest flagship language / reasoning model
Hunyuan-role-latest - role-play and conversational character model
HY-MT2-Pro - translation-focused model
HY-3D-3.1 / HY-3D-3.0 - 3D generation
HunyuanVideo / HunyuanVideo-Avatar - video generation

Best for

3D, simulation, and interactive content
Creative tooling
Media-heavy AI experiences

Sources: cloud.tencent.com/product/hunyuan · cloud.tencent.com/document/product/1823/130051 · cloud.tencent.com/document/product/1729/131925 · cloud.tencent.com/product/ai3d/

Tier 3 - Open-Weight, Local, and Ecosystem

Ollama

Runtime and ecosystem layer, not a model lab. It is the simplest way to run open models locally, switch families, and keep work private.

What it provides

Local model runtime and serving
Cloud offload for larger models
Easy switching between open families
Private/offline experimentation

Common families people run

gpt-oss
Qwen3 / Qwen3.5 / Qwen3-Coder
DeepSeek-R1 / DeepSeek-V4-Flash / DeepSeek-V4-Pro
Gemma 3 / Gemma 3n
Llama 4 / Llama 3.2 / Llama 3.1
Phi-4 / Kimi K2.6 / gpt-oss-safeguard

Best for

Local prototyping
Privacy-first workflows
Developer experimentation

Sources: docs.ollama.com · ollama.com/library · ollama.com/blog/cloud-models · ollama.com/library/gpt-oss
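
As a rough illustration of the local runtime described above, the sketch below talks to an Ollama server over its HTTP API using only the Python standard library. It assumes a server is already running on the default local port (11434) and that the requested model has been pulled; the helper names `build_generate_request` and `generate` are hypothetical, and the model tag in the usage note is an example, not a recommendation.

```python
import json
import urllib.request

# Assumption: default local endpoint of a running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for the /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str, timeout: float = 120.0) -> str:
    """Send the request to the local server and return the generated text."""
    request = build_generate_request(model, prompt)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        # With "stream": False the server replies with a single JSON object.
        return json.loads(resp.read())["response"]
```

With the server running and a model available locally, something like `generate("gpt-oss:20b", "Explain open weights in one sentence.")` would return the completion text. Switching families is then a matter of changing the model tag, which is the "easy switching" point made above.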

Hugging Face

The distribution and discovery layer for open models. It is the default place to watch releases, host weights, and track ecosystem momentum.

What it provides

Model hub and hosting
Open-weight distribution
Community evaluation and tooling

Best for

Model discovery
Open-model release tracking
Community experimentation

Source: huggingface.co
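
One way to use the hub for release tracking is to query its public model-listing endpoint. The sketch below is a minimal example under stated assumptions: it uses the `https://huggingface.co/api/models` endpoint with the `search`, `sort`, `direction`, and `limit` query parameters, and the helper names `build_search_url` and `top_models` are hypothetical. It is not an official client; the `huggingface_hub` package is the supported route for anything beyond a quick check.

```python
import json
import urllib.parse
import urllib.request

HUB_MODELS_API = "https://huggingface.co/api/models"


def build_search_url(query: str, limit: int = 5) -> str:
    """Build a model-search URL sorted by downloads, most downloaded first."""
    params = {
        "search": query,
        "sort": "downloads",
        "direction": "-1",  # descending order
        "limit": str(limit),
    }
    return f"{HUB_MODELS_API}?{urllib.parse.urlencode(params)}"


def top_models(query: str, limit: int = 5) -> list[str]:
    """Fetch matching repo ids from the hub (requires network access)."""
    with urllib.request.urlopen(build_search_url(query, limit), timeout=30) as resp:
        return [entry["id"] for entry in json.loads(resp.read())]
```

Calling `top_models("gemma")` would list the most-downloaded repositories matching that term; the results change over time, so no expected output is shown here.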