chore(pricing): Update vertex-ai pricing #708

Closed
siddharthsambharia-portkey wants to merge 2 commits into main from pricing-update/vertex-ai-24415671801

@siddharthsambharia-portkey
Collaborator

🔄 Pricing Update: vertex-ai

📊 Summary (complete_diff mode)

| Change Type | Count |
| --- | --- |
| ➕ Models added | 49 |
| 🔄 Models updated (merged) | 21 |

➕ New Models

  • gemini-2.5-pro-exp-03-25
  • gemini-2.5-pro-computer-use-preview
  • gemini-2.0-flash-lite-preview-12-17
  • gemini-2.0-flash-thinking-exp-01-21
  • gemini-2.0-flash-image-generation
  • gemini-2.5-flash-preview-tts
  • gemini-2.5-pro-preview-tts
  • gemini-3-1-pro-preview
  • gemini-3-1-flash-image-preview
  • gemini-3-1-flash-lite-preview
  • imagen-3.0-capability-preview-0930
  • imagen-4.0-capability-001
  • veo-3.1-lite-generate-001
  • gemini-embedding-exp-03-07
  • gemma-4-26b-a4b-it-maas
  • gemini-2.0-pro-exp-02-05
  • gemini-2.0-pro-exp
  • gemini-1.5-flash-8b-001
  • claude-opus-4-1
  • gpt-oss-20b-maas
  • ... and 29 more

🔄 Updated Models

  • gemini-2.5-pro
  • gemini-2.5-pro-preview-05-06
  • gemini-2.5-pro-preview-03-25
  • gemini-2.5-flash-preview-04-17
  • gemini-2.5-flash-preview-05-20
  • gemini-2.5-flash-lite-preview-06-17
  • gemini-2.0-flash-lite
  • gemini-2.0-flash-exp
  • gemini-3-pro-image-preview
  • gemini-3-flash-preview
  • veo-3.0-generate-001
  • veo-3.1-generate-001
  • veo-3.1-fast-generate-001
  • text-embedding-005
  • text-embedding-004
  • text-multilingual-embedding-002
  • text-embedding-large-exp-03-07
  • multimodalembedding@001
  • textembedding-gecko@001
  • textembedding-gecko@003
  • textembedding-gecko-multilingual@001

Model-to-Pricing-Page Mapping

Google – Gemini (token pricing, $/1M)

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| gemini-2.5-pro | Google – Gemini 2.5 | API | Standard ≤200K input $1.25, output $10; cache hit $0.13; batch $0.625/$5; web_search 3.5¢, enterprise 4.5¢ |
| gemini-2.5-pro-preview-05-06 | Google – Gemini 2.5 | API | Same as gemini-2.5-pro |
| gemini-2.5-pro-preview-03-25 | Google – Gemini 2.5 | API | Same as gemini-2.5-pro |
| gemini-2.5-pro-exp-03-25 | Google – Gemini 2.5 | API | Same as gemini-2.5-pro |
| gemini-2.5-flash | Google – Gemini 2.5 | API | Input $0.30, output $2.50; cache $0.03; batch $0.15/$1.25; web_search 3.5¢, enterprise 4.5¢ |
| gemini-2.5-flash-preview-04-17 | Google – Gemini 2.5 | API | Same as gemini-2.5-flash |
| gemini-2.5-flash-preview-05-20 | Google – Gemini 2.5 | API | Same as gemini-2.5-flash |
| gemini-2.5-flash-image | Google – Gemini 2.5 | API | Input $0.30, output $2.50; image_token $30/1M; web_search 3.5¢ |
| gemini-2.5-flash-lite | Google – Gemini 2.5 | API | Input $0.10, output $0.40; cache $0.01; batch $0.05/$0.20; web_search 3.5¢ |
| gemini-2.5-flash-lite-preview-06-17 | Google – Gemini 2.5 | API | Same as gemini-2.5-flash-lite |
| gemini-2.5-pro-computer-use-preview | Google – Gemini 2.5 | API | Same as gemini-2.5-pro (Computer Use variant) |
| gemini-2.5-flash-preview-tts | Google – Gemini 2.5 | API – price not found | TTS variant; no pricing row found |
| gemini-2.5-pro-preview-tts | Google – Gemini 2.5 | API – price not found | TTS variant; no pricing row found |
| gemini-2.0-flash-001 | Google – Gemini 2.0 | API | Input $0.15, output $0.60; batch $0.075/$0.30; web_search 3.5¢, enterprise 4.5¢ |
| gemini-2.0-flash | Google – Gemini 2.0 | API | Alias for gemini-2.0-flash-001; same pricing |
| gemini-2.0-flash-lite-001 | Google – Gemini 2.0 | API | Input $0.075, output $0.30; batch $0.0375/$0.15; web_search 3.5¢ |
| gemini-2.0-flash-lite | Google – Gemini 2.0 | API | Alias for gemini-2.0-flash-lite-001; same pricing |
| gemini-2.0-flash-lite-preview-12-17 | Google – Gemini 2.0 | API | Same as gemini-2.0-flash-lite |
| gemini-2.0-flash-exp | Google – Gemini 2.0 | API | Same pricing as gemini-2.0-flash-001 |
| gemini-2.0-flash-thinking-exp-01-21 | Google – Gemini 2.0 | API | Same base pricing as gemini-2.0-flash; thinking bundled into output |
| gemini-2.0-flash-image-generation | Google – Gemini 2.0 | API | Input $0.15, output $0.60; image_token $30/1M |
| gemini-2.0-pro-exp-02-05 | Google – Gemini 2.0 | API – price not found | Experimental; no dedicated pricing row |
| gemini-2.0-pro-exp | Google – Gemini 2.0 | API – price not found | Experimental; no dedicated pricing row |
| gemini-1.5-pro-001 | Google – Gemini 1.5 | API – price not found | Legacy; no longer on pricing page |
| gemini-1.5-pro-002 | Google – Gemini 1.5 | API – price not found | Legacy; no longer on pricing page |
| gemini-1.5-flash-001 | Google – Gemini 1.5 | API – price not found | Legacy; no longer on pricing page |
| gemini-1.5-flash-002 | Google – Gemini 1.5 | API – price not found | Legacy; no longer on pricing page |
| gemini-1.5-flash-8b-001 | Google – Gemini 1.5 | API – price not found | Legacy; no longer on pricing page |
| gemini-1.0-pro-001 | Google – Gemini 1.0 | API – price not found | Legacy; no longer on pricing page |
| gemini-1.0-pro-002 | Google – Gemini 1.0 | API – price not found | Legacy; no longer on pricing page |
| gemini-1.0-pro-vision-001 | Google – Gemini 1.0 | API – price not found | Legacy; no longer on pricing page |
| gemini-3-pro-preview | Google – Gemini 3 | API | Input $2, output $12; cache $0.20; batch $1/$6; web_search 1.4¢ |
| gemini-3-pro-image-preview | Google – Gemini 3 | API | Input $2, output $12; image_token $120/1M; web_search 1.4¢ |
| gemini-3-flash-preview | Google – Gemini 3 | API | Input $0.50, output $3; cache $0.05; batch $0.25/$1.5; web_search 1.4¢ |
| gemini-3-1-pro-preview | Google – Gemini 3.1 | API | Input $2, output $12; cache $0.20; batch $1/$6; web_search 1.4¢ |
| gemini-3-1-flash-image-preview | Google – Gemini 3.1 | API | Input $0.50, output $3; image_token $60/1M; web_search 1.4¢ |
| gemini-3-1-flash-lite-preview | Google – Gemini 3.1 | API | Input $0.25, output $1.50; cache $0.03; batch $0.13/$0.75; web_search 1.4¢ |
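
As a rough illustration of how the $/1M token prices above turn into a per-request cost, here is a minimal sketch using the gemini-2.5-pro row (standard tier, ≤200K input). The dictionary layout and the `request_cost` helper are hypothetical and are not taken from the pricing config in this PR.

```python
from decimal import Decimal

# USD per 1M tokens, copied from the gemini-2.5-pro row (≤200K input).
# The key names are illustrative, not the repository's schema.
GEMINI_2_5_PRO = {
    "input": Decimal("1.25"),
    "output": Decimal("10"),
    "batch_input": Decimal("0.625"),
    "batch_output": Decimal("5"),
}

TOKENS_PER_UNIT = Decimal(1_000_000)


def request_cost(prices, input_tokens, output_tokens, batch=False):
    """Return the USD cost of one request priced per 1M tokens."""
    in_key, out_key = ("batch_input", "batch_output") if batch else ("input", "output")
    total = prices[in_key] * input_tokens + prices[out_key] * output_tokens
    return total / TOKENS_PER_UNIT


# 100K input tokens and 2K output tokens on the standard tier.
print(request_cost(GEMINI_2_5_PRO, 100_000, 2_000))
```

Using `Decimal` avoids float rounding noise on sub-cent prices; cache and web_search charges are left out of this sketch.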

Google – Gemma

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| gemma-4-26b-a4b-it-maas | Google – Gemma | API | Input $0.15, output $0.60; free until Apr 16, 2026 |

Google – Imagen (per-image)

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| imagen-4.0-ultra-generate-001 | Google – Imagen 4 Ultra | API | $0.06/image |
| imagen-4.0-generate-001 | Google – Imagen 4 | API | $0.04/image |
| imagen-4.0-fast-generate-001 | Google – Imagen 4 Fast | API | $0.02/image |
| imagen-3.0-generate-001 | Google – Imagen 3 | API | $0.04/image |
| imagen-3.0-generate-002 | Google – Imagen 3 | API | $0.04/image |
| imagen-3.0-fast-generate-001 | Google – Imagen 3 Fast | API | $0.02/image |
| imagen-3.0-capability-001 | Google – Imagen 3 capability | API | Uses imagen-3.0-generate pricing: $0.04/image |
| imagen-3.0-capability-preview-0930 | Google – Imagen 3 capability | API | Uses imagen-3.0-generate pricing: $0.04/image |
| imagen-4.0-capability-001 | Google – Imagen 4 capability | API | Uses imagen-4.0-generate pricing: $0.04/image |

Google – Veo (per-second video)

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| veo-2.0-generate-001 | Google – Veo 2 | API | $0.50/sec; default 8s, 1 sample |
| veo-3.0-generate-001 | Google – Veo 3 | API | $0.40/sec (video+audio); default 8s, 1 sample |
| veo-3.0-fast-generate-001 | Google – Veo 3 Fast | API | $0.10/sec; default 8s, 1 sample |
| veo-3.1-generate-001 | Google – Veo 3.1 | API | $0.40/sec (video+audio 720p/1080p); default 8s, 1 sample |
| veo-3.1-fast-generate-001 | Google – Veo 3.1 Fast | API | $0.10/sec (720p); default 8s, 1 sample |
| veo-3.1-lite-generate-001 | Google – Veo 3.1 Lite | API | $0.05/sec (720p); default 8s, 1 sample |
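
The per-second Veo prices can be read as cost = price per second × duration × number of samples. The sketch below assumes that "default 8s, 1 sample" means those values apply when a request does not specify them; the `video_cost` helper and its parameter names are hypothetical.

```python
from decimal import Decimal

# USD per second of generated video, copied from the Veo table above.
VEO_PRICE_PER_SECOND = {
    "veo-2.0-generate-001": Decimal("0.50"),
    "veo-3.0-generate-001": Decimal("0.40"),
    "veo-3.0-fast-generate-001": Decimal("0.10"),
    "veo-3.1-generate-001": Decimal("0.40"),
    "veo-3.1-fast-generate-001": Decimal("0.10"),
    "veo-3.1-lite-generate-001": Decimal("0.05"),
}


def video_cost(model_id, seconds=8, samples=1):
    """Estimate USD cost; the 8s / 1 sample defaults are an assumption from the notes."""
    return VEO_PRICE_PER_SECOND[model_id] * seconds * samples


print(video_cost("veo-3.1-generate-001"))          # one 8-second clip
print(video_cost("veo-3.1-lite-generate-001", 4, 2))  # two 4-second clips
```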

Google – Embedding

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| gemini-embedding-001 | Google – Gemini Embedding | API | $0.00015/1K tokens |
| gemini-embedding-exp-03-07 | Google – Gemini Embedding | API | Same as gemini-embedding-001 |
| text-embedding-005 | Google – Text Embedding | API | $0.000025/1K chars |
| text-embedding-004 | Google – Text Embedding | API | $0.000025/1K chars (same family) |
| text-multilingual-embedding-002 | Google – Text Multilingual Embedding | API | $0.000025/1K chars |
| text-embedding-large-exp-03-07 | Google – Text Embedding Large | API | $0.00015/1K tokens (same as Gemini Embedding) |
| multimodalembedding@001 | Google – Multimodal Embedding | API | $0.0002/1K chars; image $0.0001; video plus $0.0020/s; standard $0.0010/s; essential $0.0005/s |
| textembedding-gecko@001 | Google – Text Embedding (legacy) | API | Legacy; $0.000025/1K chars |
| textembedding-gecko@003 | Google – Text Embedding (legacy) | API | Legacy; $0.000025/1K chars |
| textembedding-gecko-multilingual@001 | Google – Text Embedding (legacy) | API | Legacy; $0.000025/1K chars |
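
The embedding rows mix two billing units (per 1K tokens and per 1K characters), while the Gemini section quotes prices per 1M. A small sketch of converting to a per-1M figure while keeping the unit attached is shown below; the tuple layout and the `per_million` helper are hypothetical.

```python
from decimal import Decimal

# (USD per 1K units, unit), copied from two rows of the embedding table above.
EMBEDDING_PRICES = {
    "gemini-embedding-001": (Decimal("0.00015"), "tokens"),
    "text-embedding-005": (Decimal("0.000025"), "chars"),
}


def per_million(model_id):
    """Convert a per-1K price to per-1M, returning (price, unit)."""
    price_per_1k, unit = EMBEDDING_PRICES[model_id]
    return price_per_1k * 1000, unit


for model_id in EMBEDDING_PRICES:
    price, unit = per_million(model_id)
    print(f"{model_id}: ${price.normalize()} per 1M {unit}")
```

Keeping the unit next to the number guards against comparing a per-character price with a per-token one.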

Anthropic – Claude

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| claude-opus-4-6 | Anthropic – Claude Opus | API | @default stripped; input $5, output $25; 5m cache write $6.25, cache hit $0.50; batch $2.50/$12.50 |
| claude-opus-4-5 | Anthropic – Claude Opus | API | @default stripped; input $5, output $25; 5m cache write $6.25, cache hit $0.50; batch $2.50/$12.50 |
| claude-opus-4-1 | Anthropic – Claude Opus | API | @default stripped; input $15, output $75; cache write $18.75, hit $1.50; batch $7.50/$37.50 |
| claude-opus-4@20250514 | Anthropic – Claude Opus | API | Pinned date; input $15, output $75; cache write $18.75, hit $1.50; batch $7.50/$37.50 |
| claude-sonnet-4-6 | Anthropic – Claude Sonnet | API | @default stripped; input $3, output $15; cache write $3.75, hit $0.30; batch $1.50/$7.50 |
| claude-sonnet-4-5@20250929 | Anthropic – Claude Sonnet | API | Pinned date; input $3, output $15; cache write $3.75, hit $0.30; batch $1.50/$7.50 |
| claude-sonnet-4@20250514 | Anthropic – Claude Sonnet | API | Pinned date; input $3, output $15; cache write $3.75, hit $0.30; batch $1.50/$7.50 |
| claude-haiku-4-5@20251001 | Anthropic – Claude Haiku | API | Pinned date; input $1, output $5; cache write $1.25, hit $0.10; batch $0.50/$2.50 |
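
The notes distinguish IDs where an `@default` suffix was stripped from IDs that keep a pinned date such as `@20250514`. A minimal sketch of that normalization follows, assuming the upstream ID has the form `<model>@default`; the `normalize_model_id` helper is hypothetical and is not the Pricing Agent's actual code.

```python
def normalize_model_id(raw_id: str) -> str:
    """Drop a trailing '@default' alias; leave pinned versions like '@20250514' alone."""
    return raw_id.removesuffix("@default")  # str.removesuffix needs Python 3.9+


print(normalize_model_id("claude-opus-4-6@default"))   # alias suffix removed
print(normalize_model_id("claude-opus-4@20250514"))    # pinned date kept
```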

OpenAI – GPT

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| gpt-oss-120b-maas | OpenAI – GPT OSS | API | input $0.09, output $0.36; batch $0.045/$0.18 |
| gpt-oss-20b-maas | OpenAI – GPT OSS | API | input $0.07, output $0.25; cache hit $0.007; batch $0.035/$0.125 |

Excluded OpenAI models: openclip (non-generative vision classification), gpt-oss-20b (self-deploy, no -maas), gpt-oss (self-deploy).
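
In the rows of this PR that list batch prices, the batch figure is half the standard price, up to cent rounding (for example $0.13 against an input price of $0.25). A reviewer could sanity-check new rows with something like the sketch below. The half-price relationship is an observation from the tables here, not a documented rule, and `batch_is_half` is a hypothetical helper.

```python
from decimal import Decimal


def batch_is_half(standard, batch, tolerance=Decimal("0.005")):
    """True if the batch price is half the standard price, within a rounding tolerance."""
    return abs(Decimal(batch) - Decimal(standard) / 2) <= tolerance


# gpt-oss-120b-maas: input $0.09 / batch $0.045, output $0.36 / batch $0.18
print(batch_is_half("0.09", "0.045"), batch_is_half("0.36", "0.18"))
# gemini-3-1-flash-lite-preview: input $0.25 / batch $0.13 (rounded from $0.125)
print(batch_is_half("0.25", "0.13"))
```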

Meta – Llama

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| llama-4-maverick-17b-128e-instruct-maas | Meta – Llama 4 | API | input $0.35, output $1.15; batch $0.175/$0.575 |
| llama-4-scout-17b-16e-instruct-maas | Meta – Llama 4 | API | input $0.25, output $0.70; batch $0.125/$0.35 |
| llama-3.3-70b-instruct-maas | Meta – Llama 3.3 | API | input $0.72, output $0.72; batch $0.36/$0.36 |
| llama-3.1-405b-instruct-maas | Meta – Llama 3.1 | API | input $5, output $16 |
| llama-3.1-70b-instruct-maas | Meta – Llama 3.1 | API – price not found | No dedicated row; added with price 0 |
| llama-3.1-8b-instruct-maas | Meta – Llama 3.1 | API – price not found | No dedicated row; added with price 0 |
| llama-3.2-90b-vision-instruct-maas | Meta – Llama 3.2 | API – price not found | No dedicated row; added with price 0 |
| llama-3.2-11b-vision-instruct-maas | Meta – Llama 3.2 | API – price not found | No dedicated row; added with price 0 |

Excluded Meta models: llama-guard-* (safety/guard models), non-maas self-deploy variants.

DeepSeek

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| deepseek-r1-0528-maas | DeepSeek – R1 | API | input $1.35, output $5.40; batch $0.675/$2.70 |
| deepseek-r1-0528-instruct-maas | DeepSeek – R1 | API | Same as deepseek-r1-0528-maas |
| deepseek-v3.1-maas | DeepSeek – V3.1 | API | input $0.60, output $1.70; cache hit $0.06; batch $0.30/$0.85 |
| deepseek-v3.2-maas | DeepSeek – V3.2 | API | input $0.56, output $1.68; cache hit $0.056; batch $0.28/$0.84 |
| deepseek-r1-maas | DeepSeek – R1 | API – price not found | Older variant; added with price 0 |
| deepseek-v3-maas | DeepSeek – V3 | API – price not found | Older variant; added with price 0 |
| deepseek-v2-5-maas | DeepSeek – V2.5 | API – price not found | Older variant; added with price 0 |

Excluded DeepSeek models: deepseek-ocr-maas (OCR model excluded per global rules).

MiniMax

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| minimax-m2-maas | MiniMax – M2 | API | input $0.30, output $1.20; cache hit $0.03 |
| minimax-text-01-maas | MiniMax | API – price not found | No dedicated row; added with price 0 |

Moonshot / Kimi

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| kimi-k2-thinking-maas | Moonshot – Kimi K2 | API | input $0.60, output $2.50; cache hit $0.06 |
| kimi-k1.5-thinking-maas | Moonshot – Kimi | API – price not found | No dedicated row; added with price 0 |
| kimi-k1.5-maas | Moonshot – Kimi | API – price not found | No dedicated row; added with price 0 |

Qwen (Alibaba)

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| qwen3-235b-a22b-instruct-2507-maas | Qwen – Qwen3 | API | input $0.22, output $0.88; batch $0.11/$0.44 |
| qwen3-coder-480b-a35b-instruct-maas | Qwen – Qwen3 Coder | API | input $0.22, output $1.80; cache hit $0.022; batch $0.11/$0.90 |
| qwen3-next-80b-a3b-instruct-maas | Qwen – Qwen3 Next | API | input $0.15, output $1.20 |
| qwen3-next-80b-a3b-thinking-maas | Qwen – Qwen3 Next Thinking | API | input $0.15, output $1.20 |
| qwen2-5-72b-instruct-maas | Qwen – Qwen2.5 | API – price not found | Older variant; added with price 0 |
| qwen2-5-coder-32b-instruct-maas | Qwen – Qwen2.5 | API – price not found | Older variant; added with price 0 |
| qwen2-5-vl-72b-instruct-maas | Qwen – Qwen2.5 VL | API – price not found | Older variant; added with price 0 |
| qwen2-5-vl-7b-instruct-maas | Qwen – Qwen2.5 VL | API – price not found | Older variant; added with price 0 |
| qwen2-5-32b-instruct-maas | Qwen – Qwen2.5 | API – price not found | Older variant; added with price 0 |
| qwen2-72b-instruct-maas | Qwen – Qwen2 | API – price not found | Older variant; added with price 0 |

Excluded Qwen models: qwen-image (explicitly excluded per policy).

ZAI.org / GLM

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| glm-4.7-maas | ZAI.org – GLM-4.7 | API | input $0.60, output $2.20 |
| glm-5-maas | ZAI.org – GLM-5 | API | input $1, output $3.20; cache hit $0.10; free until Feb 19, 2026 |
| glm-4-maas | ZAI.org – GLM-4 | API – price not found | Older variant; added with price 0 |
| glm-4v-maas | ZAI.org – GLM-4V | API – price not found | Older variant; added with price 0 |
| glm-4v-plus-maas | ZAI.org – GLM-4V Plus | API – price not found | Older variant; added with price 0 |

Excluded ZAI.org models: glm-image (explicitly excluded per policy).

Mistral AI

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| mistral-medium-3-maas | Mistral – Medium 3 | API | input $0.40, output $2.00 |
| mistral-small-3-1-25-03-maas | Mistral – Small 3.1 | API | input $0.10, output $0.30 |
| codestral-2-maas | Mistral – Codestral 2 | API | input $0.30, output $0.90 |
| mistral-nemo-instruct-2407-maas | Mistral – Nemo | API – price not found | Older variant; added with price 0 |
| mistral-large-2411-maas | Mistral – Large | API – price not found | Older variant; added with price 0 |
| mistral-large-instruct-2407-maas | Mistral – Large | API – price not found | Older variant; added with price 0 |
| codestral-2405-maas | Mistral – Codestral | API – price not found | Older variant; added with price 0 |

Excluded Mistral models: mistral-ocr-* (OCR, excluded per policy), codestral-2501-self-deploy (self-deploy, excluded per global rules).

AI21

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| jamba-large-1.6-maas | AI21 – Jamba | API – price not found | No dedicated row on pricing page; added with price 0 |
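
Many rows above are marked "price not found" and "added with price 0". Since a price of 0 makes usage of those models appear free in cost reports, it may help to list them for follow-up. The sketch below assumes a hypothetical mapping of model ID to price fields; it does not reflect the actual schema of the pricing files in this repository.

```python
# Hypothetical shape: model ID -> price fields in USD per 1M tokens.
SAMPLE_PRICING = {
    "llama-3.1-405b-instruct-maas": {"input": 5, "output": 16},
    "llama-3.1-70b-instruct-maas": {"input": 0, "output": 0},
    "jamba-large-1.6-maas": {"input": 0, "output": 0},
}


def zero_priced(models):
    """Return model IDs whose every price field is 0, sorted for stable output."""
    return sorted(
        model_id
        for model_id, prices in models.items()
        if prices and all(value == 0 for value in prices.values())
    )


print(zero_priced(SAMPLE_PRICING))
```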

Generated by Pricing Agent on 2026-04-14
