Skip to content

Reasoning loss diagnostic logging#123

Open
mikeperry-tor wants to merge 3 commits into
13rac1:mainfrom
mikeperry-tor:reasoning_logging
Open

Reasoning loss diagnostic logging#123
mikeperry-tor wants to merge 3 commits into
13rac1:mainfrom
mikeperry-tor:reasoning_logging

Conversation

@mikeperry-tor

@mikeperry-tor mikeperry-tor commented Jul 2, 2026

Copy link
Copy Markdown
Collaborator

Diagnostic logging for common sources of agent reasoning loss.

Bug 1: Most frameworks omit reasoning_content for "openai compatible" apis like teep -> WARN if detected in same turn; INFO if detected across turns.

Bug 2: Onyx and maybe others append a reminder user message, which causes most model types (GLM, DeepSeek, Qwen) to purge prior reasoning, thinking it is a new turn -> WARN if detected.

Bug 3: If an API request comes in trying to preserve all reasoning including prior turns (typical of properly written coding agents), but the API omits a model-specific reasoning preservation flag (can happen if framework does not detect teep's model type) -> WARN.

None are bugs caused by teep per-se, but they are easy hazards that agent frameworks can mess up, especially when using teep's oddly formatted model names and unknown base url and specific provider API type.

Arguably we could make a camoflage mode that might help mitigate these (for example, by normalizing model names to their OpenRouter versions, and officially speaking OpenRouter API instead of "OpenAI Compatible", or some similar option, but then this becomes a user configuration hazard or flat out impossibility because of the lack of API base url input field).

This should become a teep factor for reports, if it works.
This causes GLM to strip prior reasoning via JINJA template.
Copilot AI review requested due to automatic review settings July 2, 2026 04:49
@mikeperry-tor mikeperry-tor changed the title Reasoning logging Reasoning loss diagnostic logging Jul 2, 2026

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds diagnostic logging in the proxy’s chat-completions path to help identify common causes of “reasoning loss” (e.g., frameworks stripping assistant reasoning fields, model-specific chat template flags that discard reasoning, and trailing user addendums after tool output).

Changes:

  • Introduces request-metadata extraction (chatRequestStats) and structured slog output (logChatRequestStats) for chat requests.
  • Adds a per-server hourly log limiter to rate-limit repeated reasoning-loss warnings.
  • Adds unit tests covering reasoning-loss classification, model-flag warnings/suppression, and rate limiting behavior.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
internal/proxy/proxy.go Adds chat request reasoning-loss metadata parsing, rate-limited diagnostic logging, and hooks it into the chat endpoint handler.
internal/proxy/proxy_internal_test.go Adds focused unit tests for stats extraction and emitted log signals (including rate-limiting behavior).

Comment thread internal/proxy/proxy.go
Comment on lines +626 to +628
if string(bytes.TrimSpace(raw)) == "null" {
return "", false, true
}
Comment thread internal/proxy/proxy.go
Comment on lines +648 to +650
if string(bytes.TrimSpace(raw)) == "null" {
return false, true, false
}
Comment thread internal/proxy/proxy.go
Comment on lines +767 to +770
func logChatRequestStats(ctx context.Context, limiter *hourlyLogLimiter, model, providerName, upstreamModel, path string, body []byte) {
stats, err := chatRequestStats(body)
if err != nil {
if allowHourlyLog(limiter, "chat_reasoning_metadata_unavailable") {

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This recommended fix drops the rate limiter for some reason. We should instead also use the rate limiter to avoid parsing, in addition to the log level checks.

@mikeperry-tor

Copy link
Copy Markdown
Collaborator Author

We should also reference #124 in these log messages for more information.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants