Skip to content

fix(ai): map Anthropic max_tokens to OpenAI max_completion_tokens#225

Open
dimakis wants to merge 2 commits into
praxis-proxy:mainfrom
dimakis:fix/max-completion-tokens
Open

fix(ai): map Anthropic max_tokens to OpenAI max_completion_tokens#225
dimakis wants to merge 2 commits into
praxis-proxy:mainfrom
dimakis:fix/max-completion-tokens

Conversation

@dimakis

@dimakis dimakis commented Jul 2, 2026

Copy link
Copy Markdown

Summary

  • The anthropic_to_openai request transform maps Anthropic's max_tokens to OpenAI's max_tokens in Chat Completions requests
  • GPT-5.x and reasoning models (o1, o3) reject max_tokens, returning: "Unsupported parameter: 'max_tokens' is not supported with this model. Use 'max_completion_tokens' instead."
  • One-line fix: emit max_completion_tokens instead, which is backwards-compatible with older models

Context

OpenAI has three different token limit parameters across their APIs:

API Parameter Endpoint
Anthropic Messages max_tokens /v1/messages
OpenAI Chat Completions max_completion_tokens /v1/chat/completions
OpenAI Responses max_output_tokens /v1/responses

The max_completion_tokens parameter was introduced with the o1 reasoning models to distinguish between reasoning tokens and output tokens. GPT-5.x models continued this convention, rejecting the legacy max_tokens entirely.

This fix addresses the Chat Completions path only. The Responses API (/v1/responses) uses max_output_tokens and would need separate translation logic.

Evidence

Direct API error (GPT-5.5 via Chat Completions through praxis-proxy):

"Unsupported parameter: 'max_tokens' is not supported with this model.
 Use 'max_completion_tokens' instead."

OpenAI API reference (developers.openai.com/api/docs/api-reference/evals):

The max_completion_tokens parameter sets the maximum number of tokens that can be included in the generated output from the model.

Backwards compatibility: max_completion_tokens is accepted by both legacy models (GPT-4.1, GPT-3.5-turbo) and current frontier models (GPT-5.x, o3). Safe to apply unconditionally.

Verification

Tested locally with praxis-proxy proxying Anthropic-format requests to OpenAI:

Test Model Result
Non-streaming gpt-4.1-mini Pass
Streaming SSE gpt-4.1-mini Pass
Tool calling gpt-4.1-mini Pass
Non-streaming gpt-5.5 Pass (failed before fix)
Streaming SSE gpt-5.5 Pass

Existing unit tests updated and passing (14/14).

Test plan

  • Unit tests pass (14/14)
  • Manual curl: GPT-5.5 non-streaming through proxy
  • Manual curl: GPT-5.5 streaming through proxy
  • Manual curl: GPT-4.1-mini backwards compat

@dimakis dimakis requested a review from shaneutt as a code owner July 2, 2026 01:27
@praxis-bot-app

praxis-bot-app Bot commented Jul 2, 2026

Copy link
Copy Markdown

Unsigned commits: b1f811f. Please sign your commits.

OpenAI's GPT-5.x and reasoning models (o1/o3) reject the legacy
`max_tokens` parameter in Chat Completions requests, returning:

  "Unsupported parameter: 'max_tokens' is not supported with this
   model. Use 'max_completion_tokens' instead."

The `anthropic_to_openai` request transform was mapping Anthropic's
`max_tokens` field to OpenAI's `max_tokens`, which works for older
models (GPT-4.1, GPT-3.5) but fails on all current frontier models.

`max_completion_tokens` is backwards-compatible with older models
that still accept `max_tokens`, so this change is safe to apply
unconditionally.

Signed-off-by: Dimitri Saridakis <dimitri.saridakis@gmail.com>
Signed-off-by: dimakis <dimitri.saridakis@gmail.com>
@dimakis dimakis force-pushed the fix/max-completion-tokens branch from b1f811f to 6b4e9a9 Compare July 2, 2026 01:35
@praxis-bot praxis-bot requested review from a team and aslakknutsen and removed request for a team and shaneutt July 2, 2026 01:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants