chore(pricing): Update vertex-ai pricing #691

Closed
siddharthsambharia-portkey wants to merge 2 commits into main from pricing-update/vertex-ai-24359569787

@siddharthsambharia-portkey
Collaborator

🔄 Pricing Update: vertex-ai

📊 Summary (complete_diff mode)

| Change Type | Count |
| --- | --- |
| ➕ Models added | 35 |
| 🔄 Models updated (merged) | 13 |

➕ New Models

  • gemini-2.5-pro-tts
  • gemini-2.5-flash-tts
  • gemma-4-26b-a4b-it-maas
  • veo-3.1-lite-generate-001
  • llama-4-scout-17b-16e-instruct-maas
  • llama-3.1-70b-instruct-maas
  • llama-3.1-8b-instruct-maas
  • llama-3.2-90b-vision-instruct-maas
  • llama3-405b-instruct-maas
  • llama3-70b-instruct-maas
  • llama3-8b-instruct-maas
  • llama3_1-70b-instruct-maas
  • llama3_1-8b-instruct-maas
  • llama3_1-405b-instruct-maas
  • codellama-34b-instruct-maas
  • deepseek-r1-maas
  • deepseek-v3-maas
  • deepseek-r1-0528
  • minimax-m2
  • glm-4.7
  • ... and 15 more

🔄 Updated Models

  • gemini-2.5-pro
  • gemini-3-flash-preview
  • gemini-3.1-pro-preview
  • gemini-3.1-flash-lite-preview
  • gemini-3-pro-image-preview
  • gemini-3.1-flash-image-preview
  • veo-2.0-generate-001
  • veo-3.0-generate-001
  • veo-3.0-fast-generate-001
  • veo-3.1-generate-001
  • veo-3.1-fast-generate-001
  • multimodalembedding@001
  • gemini-embedding-2-preview

Model-to-Pricing-Page Mapping

| Model ID | Publisher / Section | Source | Notes |
| --- | --- | --- | --- |
| gemini-2.5-pro | Google – Gemini 2.5 Pro | API | Standard tier pricing ($1.25/$10/M); long-context >200K ($2.50/$15/M) noted |
| gemini-2.5-flash | Google – Gemini 2.5 Flash | API | $0.30/$2.50/M; audio input $1/M |
| gemini-2.5-flash-lite | Google – Gemini 2.5 Flash-Lite | API | $0.10/$0.40/M |
| gemini-2.5-flash-preview-09-2025 | Google – Gemini 2.5 Flash | API | Preview alias; matched to Gemini 2.5 Flash pricing |
| gemini-2.5-flash-lite-preview-09-2025 | Google – Gemini 2.5 Flash-Lite | API | Preview alias; matched to Gemini 2.5 Flash-Lite pricing |
| gemini-2.5-computer-use-preview-10-2025 | Google – Gemini 2.5 Pro Computer Use Preview | API | Matched to Computer Use Preview row ($1.25/$10/M) |
| gemini-2.5-flash-image | Google – Gemini 2.5 Flash Image | API | Token + image_token $30/M |
| gemini-2.5-pro-tts | Google – Gemini TTS | API – price not found | No dedicated TTS pricing row on page; added with price 0 |
| gemini-2.5-flash-tts | Google – Gemini TTS | API – price not found | No dedicated TTS pricing row on page; added with price 0 |
| gemini-2.0-flash-001 | Google – Gemini 2.0 Flash | API | $0.15/$0.60/M; audio $1/M |
| gemini-2.0-flash-lite-001 | Google – Gemini 2.0 Flash-Lite | API | $0.075/$0.30/M |
| gemini-3-pro-preview | Google – Gemini 3 Pro Preview | API | $2/$12/M standard; long-context >200K ($4/$18/M) noted |
| gemini-3-flash-preview | Google – Gemini 3 Flash Preview | API | $0.50/$3/M |
| gemini-3-pro-image-preview | Google – Gemini 3 Pro Image Preview | API | Token + image_token $120/M |
| gemini-3.1-pro-preview | Google – Gemini 3.1 Pro Preview | API | $2/$12/M; long-context >200K ($4/$18/M) noted |
| gemini-3.1-flash-lite-preview | Google – Gemini 3.1 Flash-Lite Preview | API | $0.25/$1.50/M |
| gemini-3.1-flash-image-preview | Google – Gemini 3.1 Flash Image Preview | API | Token + image_token $60/M |
| gemma-4-26b-a4b-it-maas | Google – Gemma 4 26B (MaaS) | API | $0.15/$0.60/M; free until Apr 16, 2026 per page note |
| gemini-embedding-001 | Google – Gemini Embedding | API | $0.00015/1K tokens (online) |
| gemini-embedding-2-preview | Google – Gemini Embedding 2 Preview | API | Text $0.2/M; image $0.00012/image; video $0.00079/frame |
| text-embedding-005 | Google – Text Embedding | API | $0.000025/1K chars |
| text-multilingual-embedding-002 | Google – Text Multilingual Embedding | API | $0.000025/1K chars |
| text-embedding-large-exp-03-07 | Google – Text Embedding Large (experimental) | API | Matched to text-embedding-005 pricing ($0.000025/1K chars) |
| textembedding-gecko@003 | Google – Text Embedding (legacy) | API – price not found | Legacy model; no pricing row found; price 0 |
| textembedding-gecko-multilingual@001 | Google – Text Embedding (legacy multilingual) | API – price not found | Legacy model; no pricing row found; price 0 |
| multimodalembedding@001 | Google – Multimodal Embedding | API | Text $0.0002/1K chars; image $0.0001/image; video plus $0.0020/sec |
| imagen-4.0-ultra-generate-001 | Google – Imagen 4 Ultra | API | $0.06/image |
| imagen-4.0-generate-001 | Google – Imagen 4 | API | $0.04/image |
| imagen-4.0-fast-generate-001 | Google – Imagen 4 Fast | API | $0.02/image |
| imagen-3.0-generate-002 | Google – Imagen 3 | API | $0.04/image |
| imagen-3.0-capability-001 | Google – Imagen 3 (capability) | API | Uses Imagen 3 generate pricing ($0.04/image); no own row |
| imagen-3.0-capability-002 | Google – Imagen 3 (capability) | API | Uses Imagen 3 generate pricing ($0.04/image); no own row |
| veo-2.0-generate-001 | Google – Veo 2 | API | $0.50/sec; default 8s, 1 sample |
| veo-3.0-generate-001 | Google – Veo 3 | API | $0.20/sec (video-only 720p/1080p); default 8s, 1 sample |
| veo-3.0-fast-generate-001 | Google – Veo 3 Fast | API | $0.08/sec (720p); default 8s, 1 sample |
| veo-3.1-generate-001 | Google – Veo 3.1 | API | $0.20/sec; default 8s, 1 sample |
| veo-3.1-fast-generate-001 | Google – Veo 3.1 Fast | API | $0.08/sec (720p); default 8s, 1 sample |
| veo-3.1-lite-generate-001 | Google – Veo 3.1 Lite | API | $0.03/sec (720p); default 8s, 1 sample |
| claude-opus-4-6 | Anthropic – Claude Opus 4.6 | API | @default stripped; $5/$25/M; cache write 5m $6.25/M; cache read $0.50/M |
| claude-sonnet-4-6 | Anthropic – Claude Sonnet 4.6 | API | @default stripped; $3/$15/M; cache write 5m $3.75/M; cache read $0.30/M |
| claude-opus-4-5@20251101 | Anthropic – Claude Opus 4.5 | API | $5/$25/M; cache write 5m $6.25/M; cache read $0.50/M |
| claude-haiku-4-5@20251001 | Anthropic – Claude Haiku 4.5 | API | $1/$5/M; cache write 5m $1.25/M; cache read $0.10/M |
| claude-sonnet-4-5@20250929 | Anthropic – Claude Sonnet 4.5 | API | $3/$15/M; cache write 5m $3.75/M; cache read $0.30/M |
| claude-opus-4-1@20250805 | Anthropic – Claude Opus 4.1 | API | $15/$75/M; cache write 5m $18.75/M; cache read $1.50/M |
| claude-sonnet-4@20250514 | Anthropic – Claude Sonnet 4 | API | $3/$15/M; cache write 5m $3.75/M; cache read $0.30/M |
| claude-opus-4@20250514 | Anthropic – Claude Opus 4 | API | $15/$75/M; cache write 5m $18.75/M; cache read $1.50/M |
| gpt-oss-120b-maas | OpenAI – GPT OSS 120B | API | $0.09/$0.36/M; matched to GPT OSS 120B row |
| clip-vit-base-patch32 | OpenAI – CLIP | Excluded | Non-generative embedding/vision; excluded per openai.md |
| openclip | OpenAI – OpenCLIP | Excluded | Non-generative; excluded per openai.md |
| whisper-large | OpenAI – Whisper | Excluded | Audio transcription model; excluded per openai.md |
| gpt-oss | OpenAI – GPT OSS (self-deploy) | Excluded | has_deploy:true without -maas; excluded per openai.md |
| llama-3.1-405b-instruct-maas | Meta – Llama 3.1 405B | API | $5/$16/M |
| llama-3.3-70b-instruct-maas | Meta – Llama 3.3 70B | API | $0.72/$0.72/M |
| llama-4-maverick-17b-128e-instruct-maas | Meta – Llama 4 Maverick | API | $0.35/$1.15/M |
| llama-4-scout-17b-16e-instruct-maas | Meta – Llama 4 Scout | API | $0.25/$0.70/M |
| llama-3.1-70b-instruct-maas | Meta – Llama 3.1 70B | API – price not found | No dedicated row found; price 0 |
| llama-3.1-8b-instruct-maas | Meta – Llama 3.1 8B | API – price not found | No dedicated row found; price 0 |
| llama-3.2-90b-vision-instruct-maas | Meta – Llama 3.2 90B Vision | API – price not found | No dedicated row found; price 0 |
| llama3-405b-instruct-maas | Meta – Llama 3 405B (legacy ID) | API – price not found | Legacy model ID; price 0 |
| llama3-70b-instruct-maas | Meta – Llama 3 70B (legacy ID) | API – price not found | Legacy model ID; price 0 |
| llama3-8b-instruct-maas | Meta – Llama 3 8B (legacy ID) | API – price not found | Legacy model ID; price 0 |
| llama3_1-70b-instruct-maas | Meta – Llama 3.1 70B (legacy ID) | API – price not found | Legacy model ID; price 0 |
| llama3_1-8b-instruct-maas | Meta – Llama 3.1 8B (legacy ID) | API – price not found | Legacy model ID; price 0 |
| llama3_1-405b-instruct-maas | Meta – Llama 3.1 405B (legacy ID) | API – price not found | Legacy model ID; price 0 |
| codellama-34b-instruct-maas | Meta – Code Llama 34B | API – price not found | No dedicated row; price 0 |
| mistral-small-2503 | Mistral AI – Mistral Small 3.1 | API | $0.10/$0.30/M; matched to Mistral Small 3.1 (25.03) |
| mistral-medium-3 | Mistral AI – Mistral Medium 3 | API | $0.40/$2.00/M |
| codestral-2 | Mistral AI – Codestral 2 | API | $0.30/$0.90/M |
| mistral-ocr-2503 | Mistral AI – OCR | Excluded | OCR model; excluded per partners.md |
| deepseek-r1-0528-maas | DeepSeek – DeepSeek-R1 0528 | API | $1.35/$5.40/M |
| deepseek-v3.1-maas | DeepSeek – DeepSeek-V3.1 | API | $0.60/$1.70/M; cache read $0.06/M |
| deepseek-v3.2-maas | DeepSeek – DeepSeek-V3.2 | API | $0.56/$1.68/M; cache read $0.056/M |
| deepseek-r1-maas | DeepSeek – DeepSeek-R1 (legacy) | API – price not found | Older R1; no dedicated row; price 0 |
| deepseek-v3-maas | DeepSeek – DeepSeek-V3 (legacy) | API – price not found | Older V3; no dedicated row; price 0 |
| deepseek-r1-0528 | DeepSeek – DeepSeek-R1 0528 (no-maas) | API – price not found | Non-MaaS variant; price 0 |
| deepseek-ocr | DeepSeek – OCR | Excluded | OCR model; excluded per global rules |
| kimi-k2-thinking-maas | Moonshot AI – Kimi K2 Thinking | API | $0.60/$2.50/M; cache read $0.06/M |
| kimi-k2-5 | Moonshot AI – Kimi K2.5 | API – price not found | No dedicated row; price 0 |
| kimi-k2 | Moonshot AI – Kimi K2 | API – price not found | No dedicated row; price 0 |
| minimax-m2-maas | MiniMax – MiniMax M2 | API | $0.30/$1.20/M; cache read $0.03/M |
| minimax-m2 | MiniMax – MiniMax M2 (non-MaaS) | API – price not found | Non-MaaS; price 0 |
| glm-4.7-maas | ZAI.org – GLM-4.7 | API | $0.60/$2.20/M |
| glm-5-maas | ZAI.org – GLM-5 | API | $1/$3.20/M; cache read $0.10/M; free until Feb 19, 2026 per page note |
| glm-4.7 | ZAI.org – GLM-4.7 (non-MaaS) | API – price not found | Non-MaaS; price 0 |
| glm-5 | ZAI.org – GLM-5 (non-MaaS) | API – price not found | Non-MaaS; price 0 |
| glm-4.5 | ZAI.org – GLM-4.5 | API – price not found | No dedicated row; price 0 |
| glm-ocr | ZAI.org – GLM OCR | Excluded | OCR model; excluded per global rules |
| glm-image | ZAI.org – GLM Image | Excluded | Image generation; excluded per explicit exception in partners.md |
| jamba-large-1.6 | AI21 – Jamba Large 1.6 | API – price not found | Self-deploy only; no MaaS row found; price 0 |
| qwen3-235b-a22b-instruct-2507 | Qwen – Qwen3 235B | API | $0.22/$0.88/M |
| qwen3-coder-480b-a35b-instruct-maas | Qwen – Qwen3 Coder 480B | API | $0.22/$1.80/M; cache read $0.022/M |
| qwen3-next-80b-instruct | Qwen – Qwen3 Next 80B Instruct | API | $0.15/$1.20/M |
| qwen3-next-80b-thinking | Qwen – Qwen3 Next 80B Thinking | API | $0.15/$1.20/M |
| qwen2.5-72b-instruct-maas | Qwen – Qwen2.5 72B | API – price not found | No dedicated row found; price 0 |
| qwen2.5-coder-32b-instruct-maas | Qwen – Qwen2.5 Coder 32B | API – price not found | No dedicated row found; price 0 |
| qwen-vl-plus-0809 | Qwen – Qwen VL Plus | API – price not found | No dedicated row found; price 0 |
| qwen-vl-max-0809 | Qwen – Qwen VL Max | API – price not found | No dedicated row found; price 0 |
| qwen-long | Qwen – Qwen Long | API – price not found | No dedicated row found; price 0 |
| qwen-plus-latest | Qwen – Qwen Plus | API – price not found | No dedicated row found; price 0 |
| qwen-turbo-latest | Qwen – Qwen Turbo | API – price not found | No dedicated row found; price 0 |
| qwen-max-latest | Qwen – Qwen Max | API – price not found | No dedicated row found; price 0 |
| qwen-image | Qwen – Qwen Image | Excluded | Image generation; excluded per explicit exception in partners.md |
| grok-* (xAI) | xAI – Grok | Pricing page only | Grok models appear on pricing page but not returned by any get_vertex_models publisher call; not added per skill rules |
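
For reviewers who want to sanity-check the figures in the Notes column, here is a minimal sketch of how they might translate into per-request cost. It rests on assumptions not stated in this PR: that a pair like `$1.25/$10/M` means input/output USD per 1M tokens, that the long-context rate applies to the whole request once input exceeds 200K tokens, and that Veo cost is per-second rate × duration × samples. The names `TokenPrice`, `token_cost`, and `video_cost` are hypothetical and are not part of the pricing config or the agent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TokenPrice:
    """Hypothetical container for one row's token pricing (USD per 1M tokens)."""
    input_per_m: float
    output_per_m: float
    long_input_per_m: Optional[float] = None   # long-context tier, if the row notes one
    long_output_per_m: Optional[float] = None
    long_threshold: int = 200_000              # ">200K" from the Notes column


def token_cost(price: TokenPrice, input_tokens: int, output_tokens: int) -> float:
    """Estimate request cost in USD.

    Assumption: when input exceeds the threshold, the long-context rates
    apply to all tokens in the request, not just the excess.
    """
    use_long = (
        price.long_input_per_m is not None
        and price.long_output_per_m is not None
        and input_tokens > price.long_threshold
    )
    in_rate = price.long_input_per_m if use_long else price.input_per_m
    out_rate = price.long_output_per_m if use_long else price.output_per_m
    return (input_tokens * in_rate + output_tokens * out_rate) / 1_000_000


def video_cost(per_second: float, seconds: int = 8, samples: int = 1) -> float:
    """Estimate video cost in USD; defaults mirror 'default 8s, 1 sample'."""
    return per_second * seconds * samples


# Values copied from the gemini-2.5-pro row of the table.
GEMINI_25_PRO = TokenPrice(1.25, 10.0, long_input_per_m=2.50, long_output_per_m=15.0)

if __name__ == "__main__":
    print(token_cost(GEMINI_25_PRO, 100_000, 10_000))  # standard tier
    print(token_cost(GEMINI_25_PRO, 300_000, 10_000))  # long-context tier
    print(video_cost(0.20))                            # veo-3.1-generate-001 default clip
```

Rows marked "price 0" are placeholders for "price not found", so a cost of 0 from such a row should be read as unknown rather than free.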

Generated by Pricing Agent on 2026-04-13
