
feat: add native DeepSeek provider#11

Merged
yanmxa merged 3 commits into genai-io:main from zhujian7:feat/deepseek-provider
May 15, 2026

Conversation

@zhujian7
Contributor

@zhujian7 zhujian7 commented May 15, 2026

Summary

  • Add native DeepSeek API provider (api.deepseek.com) using OpenAI-compatible SDK
  • Support deepseek-v4-flash and deepseek-v4-pro models with reasoning_effort extraction
  • Include model catalog with token limits and USD pricing, with cost tracking wired into model.go

Test plan

  • Build passes
  • All new unit tests pass (model list error propagation, stream request, cost estimation, thinking support)
  • Existing provider tests unaffected
  • Manual end-to-end test with valid DEEPSEEK_API_KEY

🤖 Generated with Claude Code

Add native DeepSeek API provider (api.deepseek.com) using the OpenAI-compatible SDK pattern. Supports deepseek-chat (V3) and deepseek-reasoner (R1) models with reasoning_content extraction for the reasoner model.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>
@zhujian7
Contributor Author

/hold

- Fix llm.IsReady to use secret.Resolve instead of os.Getenv, so provider
  readiness checks are consistent with the UI layer (RenderEnvVarStatus and
  providerIsEnvReady). This fixes the API key input prompt not appearing.
- Update DeepSeek catalog to V4 models (deepseek-v4-flash, deepseek-v4-pro)
  with 1M context and current pricing; keep legacy chat/reasoner as aliases.
- Default model changed to deepseek-v4-flash.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>
@zhujian7
Contributor Author

/hold cancel

Member

@yanmxa yanmxa left a comment


Six inline comments on the items I think need addressing before merge. Full review context in the PR conversation.

var catalog = []modelCatalogEntry{
{
info: llm.ModelInfo{
ID: "deepseek-v4-flash",
Member


PR description says V3/R1, but the catalog ships deepseek-v4-flash / deepseek-v4-pro as primary entries with 1M input / 384K output limits and a specific deprecation date for the legacy names (line 38). These IDs aren't in DeepSeek's published lineup — please cite the source for the V4 family + pricing, or drop the V4 entries and keep only deepseek-chat / deepseek-reasoner.

Contributor Author


DeepSeek V4 launched April 2025 and is the current production lineup. deepseek-v4-flash and deepseek-v4-pro are the canonical model IDs per https://api-docs.deepseek.com/quick_start/pricing. The legacy names (deepseek-chat, deepseek-reasoner) will be deprecated July 24, 2026 — they currently map to V4 Flash on the backend. Updated the PR description accordingly.

Contributor Author


The PR description is updated. We still use deepseek-v4-flash / deepseek-v4-pro, since "The model names deepseek-chat and deepseek-reasoner will be deprecated on 2026/07/24".

Contributor Author


I also removed deepseek-chat / deepseek-reasoner, because when I call the DeepSeek list-models API, those two are no longer returned:

$ curl -s https://api.deepseek.com/models -H "Authorization: Bearer $DEEPSEEK_API_KEY"

{"object":"list","data":[{"id":"deepseek-v4-flash","object":"model","owned_by":"deepseek"},{"id":"deepseek-v4-pro","object":"model","owned_by":"deepseek"}]}

return llm.ModelInfo{}, false
}

func EstimateCost(modelID string, usage llm.Usage) (llm.Money, bool) {
Member


EstimateCost is dead code — internal/app/model.go only switches on llm.MinMax for cost roll-up, nothing calls this. Either add a case llm.DeepSeek: branch in model.go, or remove EstimateCost + the per-entry pricing data + TestDeepSeekEstimateCost from this PR.
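For reference, the per-token cost math behind `EstimateCost` is straightforward. This standalone sketch assumes a `llm.Usage`-like struct; the prices are illustrative placeholders, not DeepSeek's published rates.

```go
package main

import "fmt"

// Usage mirrors the assumed shape of llm.Usage.
type Usage struct {
	InputTokens, OutputTokens int64
}

// pricing is USD per 1M tokens; the numbers are illustrative placeholders,
// not DeepSeek's actual rates.
var pricing = map[string]struct{ in, out float64 }{
	"deepseek-v4-flash": {in: 0.27, out: 1.10},
}

// EstimateCost returns the USD cost for a usage record and reports
// whether the model ID was found in the catalog.
func EstimateCost(modelID string, u Usage) (float64, bool) {
	p, ok := pricing[modelID]
	if !ok {
		return 0, false
	}
	return float64(u.InputTokens)/1e6*p.in + float64(u.OutputTokens)/1e6*p.out, true
}

func main() {
	cost, ok := EstimateCost("deepseek-v4-flash", Usage{InputTokens: 500000, OutputTokens: 100000})
	fmt.Println(cost, ok)
}
```

The second return value matters for the roll-up caller: an unknown model ID should surface as "no estimate" rather than a silent zero cost.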

Contributor Author


fixed

Comment thread internal/llm/deepseek/client.go Outdated
// V4 models (deepseek-v4-*) and legacy deepseek-reasoner all support thinking.
func supportsThinking(model string) bool {
lower := strings.ToLower(model)
return strings.Contains(lower, "reasoner") || strings.Contains(lower, "v4")
Member


Allow-list defaults unknown IDs to non-thinking. BigModel uses the opposite shape (bigmodel/client.go:58-68): switch on known non-thinking IDs and default true. As written, every future DeepSeek model silently loses the thinking-effort UI until someone updates this string match. Recommend flipping to a deny-list.
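The deny-list shape the reviewer describes might look like this; the model ID in the switch is an assumption for illustration:

```go
package main

import (
	"fmt"
	"strings"
)

// supportsThinking defaults unknown IDs to true: only IDs known NOT to
// support thinking are listed, so future DeepSeek models keep the
// thinking-effort UI without a code change.
func supportsThinking(model string) bool {
	switch strings.ToLower(model) {
	case "deepseek-chat": // assumed legacy non-reasoning alias
		return false
	}
	return true
}

func main() {
	fmt.Println(supportsThinking("deepseek-v4-flash"))
}
```

The trade-off is symmetric: an allow-list fails closed (new models lose a feature), a deny-list fails open (a new non-thinking model briefly shows a no-op control) — the latter is the less disruptive failure.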

Contributor Author


fixed

Comment thread internal/llm/deepseek/client.go Outdated
func (c *Client) ListModels(ctx context.Context) ([]llm.ModelInfo, error) {
page, err := c.client.Models.List(ctx)
if err != nil {
return StaticModels(), nil
Member


Silently swallowing the API error and returning the static catalog masks real failures (bad/expired key, network) — user sees a populated model list, then gets a confusing 401 mid-stream. Moonshot/BigModel propagate the error; please match. (Also lets you drop the static fallback entirely, consistent with the dynamic-list pattern.)

func NewAPIKeyClient(ctx context.Context) (llm.Provider, error) {
baseURL := os.Getenv("DEEPSEEK_BASE_URL")
if baseURL == "" {
baseURL = "https://api.deepseek.com"
Member


Missing /v1. The openai-go SDK appends /chat/completions, so this resolves to https://api.deepseek.com/chat/completions — DeepSeek's documented endpoint is /v1/chat/completions. Moonshot/BigModel both include /v1 in the default. Suggest https://api.deepseek.com/v1.

Contributor Author


Per the DeepSeek API docs, both /chat/completions and /v1/chat/completions are valid endpoints. The base URL https://api.deepseek.com is the documented default. Additionally, the Models.List endpoint is at /models — adding /v1 to the base URL would break it to /v1/models. Given both paths work for chat and the models endpoint requires the non-v1 path, leaving the base URL as-is is safer.


case "bigmodel":
return "glm-5.1"
case "deepseek":
return "deepseek-v4-flash"
Member


Default depends on deepseek-v4-flash actually existing at the API. If the V4 IDs aren't real (see catalog.go comment), every fresh user picks a model the API will reject. Recommend deepseek-chat as a safer default until V4 ships.

Contributor Author


DeepSeek V4 is the current production lineup (launched April 2025). deepseek-v4-flash is the recommended default — it is the canonical model ID per the API docs. The legacy deepseek-chat name will be deprecated July 24, 2026.

- Wire up DeepSeek cost estimation in model.go
- Remove legacy deepseek-chat/deepseek-reasoner (API only returns V4)
- Propagate ListModels error instead of silently falling back
- Simplify supportsThinking (all V4 models support reasoning_effort)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Signed-off-by: zhujian <jiazhu@redhat.com>
@zhujian7
Contributor Author

@yanmxa PTAL

@yanmxa yanmxa merged commit b7dca93 into genai-io:main May 15, 2026
1 check passed