feat(dotAI): Dot AI LangChain4J integration (First provider/OpenAI)#35150

Merged
ihoffmann-dot merged 77 commits into main from dot-ai-langchain-integration on Apr 20, 2026

@ihoffmann-dot ihoffmann-dot commented Mar 27, 2026

Summary

Replaces the direct OpenAI HTTP client (OpenAIClient) with a LangChain4J abstraction layer, enabling multi-provider AI support in dotCMS. This PR covers Phase 1: OpenAI via LangChain4J.

Changes

  • LangChain4jAIClient.java: New AIProxiedClient implementation that delegates chat, embeddings, and image requests to LangChain4J models.
  • LangChain4jModelFactory.java: Factory that builds ChatModel, EmbeddingModel, and ImageModel instances from a ProviderConfig. Only place with provider-specific builder logic.
  • ProviderConfig.java: Deserializable POJO for the providerConfig JSON secret (per provider section: model, apiKey, endpoint, maxTokens, maxCompletionTokens, etc.).
  • AppConfig.java: Replaced legacy individual-field secrets (apiKey, model, etc.) with a single providerConfig JSON string. isEnabled() now only checks this field.
  • AIAppValidator.java: Removed the OpenAI /v1/models validation call, which is incompatible with the multi-provider architecture.
  • CompletionsResource.java: Updated /api/v1/ai/completions/config to derive model names and config values from AppConfig getters instead of iterating raw AppKeys.
  • dotAI.yml: Removed legacy hidden fields; added providerConfig as the single configuration entry point.
  • Tests: Added unit tests for ProviderConfig, LangChain4jModelFactory, and LangChain4jAIClient; updated AIProxyClientTest integration test to use providerConfig-based setup.
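
The "only place with provider-specific builder logic" idea from the list above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: `SketchChatModel`, `SectionConfig`, and `SketchModelFactory` are hypothetical stand-ins, and the real factory builds LangChain4J model instances rather than the placeholder shown here.

```java
import java.util.Locale;

/** Hypothetical stand-in for a provider-agnostic chat model; NOT the LangChain4J interface. */
interface SketchChatModel {
    String describe();
}

/** Hypothetical, trimmed-down view of one providerConfig section (chat, embeddings, or image). */
final class SectionConfig {
    final String provider;
    final String model;
    final String apiKey;

    SectionConfig(String provider, String model, String apiKey) {
        this.provider = provider;
        this.model = model;
        this.apiKey = apiKey;
    }
}

/** Sketch of a factory that keeps all provider-specific branching in one place. */
final class SketchModelFactory {

    private SketchModelFactory() {}

    static SketchChatModel buildChatModel(SectionConfig cfg) {
        if (cfg == null || cfg.provider == null) {
            throw new IllegalArgumentException("provider is required");
        }
        // Phase 1 covers OpenAI only; later providers would add cases here
        // without touching the callers.
        switch (cfg.provider.toLowerCase(Locale.ROOT)) {
            case "openai":
                return () -> "openai:" + cfg.model;
            default:
                throw new IllegalArgumentException("Unsupported provider: " + cfg.provider);
        }
    }
}
```

Callers depend only on the model interface, so adding a provider means adding one `case` in the factory.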

Motivation

The previous implementation was tightly coupled to OpenAI's API contract (hardcoded HTTP calls, OpenAI-specific parameters, model validation via /v1/models). LangChain4J provides a provider-agnostic model interface, allowing future phases to add Azure OpenAI, AWS Bedrock, and Vertex AI without touching the core request/response flow.

The providerConfig JSON secret replaces multiple individual secrets with a single structured configuration, supporting per-section (chat/embeddings/image) provider and model settings.

Related Issue

This PR fixes #35183
EPIC: dotAI Multi-Provider Support #33970


Note

High Risk
High risk because it replaces the core AI client/provider path (OpenAI HTTP + model fallback/validation) with a new LangChain4J-backed implementation and a new providerConfig secret format, impacting chat, embeddings, and image generation behavior and configuration compatibility.

Overview
dotAI now routes chat, embeddings, and image requests through a new LangChain4J-backed client (LangChain4jAIClient) and sets LangChain4J as the default provider, replacing the direct OpenAI HTTP client and removing the model-fallback strategy.

Configuration is migrated from many per-field secrets to a single providerConfig JSON (with hashing + per-host model caching), updating AppConfig.isEnabled(), dotAI.yml, and the /v1/ai/completions/config output (including credential redaction). Several OpenAI-specific model management/validation classes and tests are removed, and integration/unit tests are updated/added for the new providerConfig + LangChain4J flow.
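
As an illustration of the "hashing + per-host model caching" mentioned above, here is a minimal sketch under stated assumptions: the class name `ModelCacheSketch`, the choice of SHA-256, and the rebuild-on-hash-change policy are all hypothetical and are not taken from the PR.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical per-host cache that rebuilds a model only when the config hash changes. */
final class ModelCacheSketch<M> {

    /** Cached model together with the hash of the config it was built from. */
    private static final class Entry<T> {
        final String hash;
        final T model;

        Entry(String hash, T model) {
            this.hash = hash;
            this.model = model;
        }
    }

    private final Map<String, Entry<M>> byHost = new ConcurrentHashMap<>();

    /** Hex-encoded SHA-256 of the raw providerConfig JSON (assumed hash function). */
    static String sha256(String text) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(text.getBytes(StandardCharsets.UTF_8))) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 not available", e);
        }
    }

    /** Returns the cached model for the host, rebuilding it if the config changed. */
    M get(String host, String providerConfigJson, Function<String, M> builder) {
        String hash = sha256(providerConfigJson);
        Entry<M> entry = byHost.compute(host, (h, existing) ->
                existing != null && existing.hash.equals(hash)
                        ? existing
                        : new Entry<>(hash, builder.apply(providerConfigJson)));
        return entry.model;
    }
}
```

Keying on host and comparing hashes means a saved config change invalidates only that host's model.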

Embeddings/image handling is adjusted: embeddings requests now send raw text (with token-count fallback when encoding is unavailable), the async thread pool key is renamed to AIThreadPool, max-token resolution is made more resilient, and image temp-file creation now supports base64 (b64_json) responses.
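
To make the base64 (`b64_json`) case above concrete, here is a minimal sketch of decoding such a payload into a temp file. The class and method names, the `dotai-image-` prefix, and the `.png` suffix are assumptions for illustration; the PR's actual temp-file code is not shown in this conversation.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Base64;

/** Hypothetical helper: turns a b64_json image payload into a temp file. */
final class ImageTempFileSketch {

    private ImageTempFileSketch() {}

    static Path writeBase64Image(String b64Json) {
        // Throws IllegalArgumentException if the payload is not valid base64.
        byte[] bytes = Base64.getDecoder().decode(b64Json);
        try {
            // Prefix and suffix are illustrative; the real format depends on the provider response.
            Path file = Files.createTempFile("dotai-image-", ".png");
            Files.write(file, bytes);
            return file;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```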

Reviewed by Cursor Bugbot for commit 31cb86e.

@github-actions github-actions Bot added the Area : Backend PR changes Java/Maven backend code label Mar 27, 2026
claude Bot commented Mar 27, 2026

Claude finished @ihoffmann-dot's task in 3m 15s.


Rollback Safety Analysis

  • Read rollback-unsafe categories reference
  • Analyzed full PR diff against all categories
  • Posted analysis comment and applied label

Result: Unsafe to Rollback — two categories matched:

  1. 🟠 HIGH, H-1 (analogous): One-Way Application Configuration Migration. dotAI.yml drops all individual secret fields (apiKey, textModelNames, etc.) and replaces them with a single providerConfig JSON. Once an admin saves under N, N-1's isEnabled() silently returns false, because it requires apiKey, which no longer exists in the secrets store.

  2. 🟡 MEDIUM, M-3: REST API Contract Change. /api/v1/ai/completions/config drops ~14 response fields and replaces them with providerConfig. Clients built against N's contract break on rollback.

Label AI: Not Safe To Rollback has been applied.

@ihoffmann-dot ihoffmann-dot marked this pull request as ready for review March 30, 2026 23:15
Comment thread dotcms-integration/src/test/java/com/dotcms/ai/AiTest.java Outdated
Comment thread dotCMS/src/main/java/com/dotcms/ai/api/EmbeddingsAPIImpl.java
Comment thread dotCMS/src/main/java/com/dotcms/ai/api/EmbeddingsAPIImpl.java Outdated
Comment thread dotCMS/src/main/java/com/dotcms/ai/client/langchain4j/ProviderConfig.java Outdated
Comment thread dotCMS/src/main/java/com/dotcms/ai/client/langchain4j/ProviderConfig.java Outdated
Comment thread dotCMS/src/main/java/com/dotcms/ai/app/AppConfig.java Outdated
@dotCMS dotCMS deleted a comment from claude Bot Apr 17, 2026
@dotCMS dotCMS deleted a comment from claude Bot Apr 17, 2026
Comment thread dotCMS/src/main/java/com/dotcms/ai/api/CompletionsAPIImpl.java Outdated
@dotCMS dotCMS deleted a comment from claude Bot Apr 17, 2026
@dotCMS dotCMS deleted a comment from claude Bot Apr 17, 2026
claude Bot commented Apr 20, 2026

Pull Request Unsafe to Rollback!!!

  • Category: M-3 — REST / GraphQL / Headless API Contract Change
  • Risk Level: 🟡 MEDIUM
  • Why it's unsafe: The /api/v1/ai/completions/config endpoint response has been restructured. N removes ~14 response fields (textModelNames, imageModelNames, embeddingsModelNames, textModelMaxTokens, imageModelMaxTokens, embeddingsModelMaxTokens, all *PerMinute fields, availableModels, and apiKey) and replaces them with a single providerConfig JSON field plus a handful of prompt fields. Any consumer of this endpoint — including the dotCMS Angular admin if it is updated separately — built against N's contract will break on rollback to N-1, which restores the old per-field format.
  • Code that makes it unsafe: dotCMS/src/main/java/com/dotcms/ai/rest/CompletionsResource.java lines 177–207 — the config endpoint now builds a map with AppKeys.PROVIDER_CONFIG.key, AppKeys.ROLE_PROMPT.key, etc. instead of iterating over all AppKeys.values() and appending availableModels.
  • Alternative (if possible): Keep the old per-field keys in the response alongside providerConfig (derived from it) so N-1 and N consumers both receive what they expect. Remove the legacy keys in a subsequent release once N-1 is outside the rollback window.

  • Category: H-1 (analogous) — One-Way Application Configuration Migration
  • Risk Level: 🟠 HIGH
  • Why it's unsafe: dotAI.yml removes all individual secret fields (apiKey, textModelNames, imageModelNames, embeddingsModelNames, and all per-model rate/token fields) and replaces them with a single providerConfig JSON secret. App secrets are persisted in the dotcms_apps_secrets database table. Once an administrator saves the dotAI configuration under N (which writes only providerConfig to the DB), rolling back to N-1 is effectively a one-way migration: N-1's AppConfig.isEnabled() evaluates Stream.of(apiUrl, apiImageUrl, apiEmbeddingsUrl, apiKey).allMatch(StringUtils::isNotBlank) and returns false because apiKey is now blank in the secrets store. All AI features (chat completions, embeddings, image generation) are silently disabled after rollback — with no error thrown and no automatic recovery path.
  • Code that makes it unsafe:
    • dotCMS/src/main/resources/apps/dotAI.yml — removes apiKey, textModelNames, imageModelNames, embeddingsModelNames, and 12 other individual fields; adds providerConfig as the sole required entry.
    • dotCMS/src/main/java/com/dotcms/ai/app/AppConfig.java lines 382–392 — N's new isEnabled() requires only providerConfig to be non-blank; N-1's isEnabled() requires apiKey + URL fields to be non-blank. After N's config save, apiKey no longer exists in the secret store.
  • Alternative (if possible): Follow the two-phase migration pattern: in N, keep apiKey (and the other legacy fields) in dotAI.yml as hidden/deprecated but still present, so N-1 can still read them after rollback. Populate providerConfig from those fields as a convenience in N. Remove the legacy fields only in N+1, once N-1 is outside the rollback window.
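
The H-1 failure mode above can be reproduced in isolation. The sketch below models the secrets store as a plain map, which is an assumption for illustration. The N-1 check mirrors the expression quoted in the analysis, the N check follows its description ("requires only providerConfig to be non-blank"), and `isNotBlank` is a local stand-in for `StringUtils::isNotBlank`.

```java
import java.util.Map;
import java.util.stream.Stream;

/** Hypothetical model of the N vs N-1 isEnabled() checks over a secrets map. */
final class RollbackSketch {

    private RollbackSketch() {}

    /** Local stand-in for StringUtils.isNotBlank. */
    static boolean isNotBlank(String s) {
        return s != null && !s.trim().isEmpty();
    }

    /** N-1: every legacy field must be present and non-blank. */
    static boolean isEnabledLegacy(Map<String, String> secrets) {
        return Stream.of("apiUrl", "apiImageUrl", "apiEmbeddingsUrl", "apiKey")
                .map(secrets::get)
                .allMatch(RollbackSketch::isNotBlank);
    }

    /** N: only providerConfig is checked. */
    static boolean isEnabledNew(Map<String, String> secrets) {
        return isNotBlank(secrets.get("providerConfig"));
    }
}
```

With a store holding only `providerConfig`, `isEnabledNew` returns true while `isEnabledLegacy` returns false, which is the silent disablement described above. If the legacy keys are kept alongside `providerConfig`, as the two-phase alternative suggests, both checks pass.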

@dotCMS dotCMS deleted a comment from claude Bot Apr 20, 2026
@ihoffmann-dot ihoffmann-dot added this pull request to the merge queue Apr 20, 2026
Merged via the queue into main with commit 4286773 Apr 20, 2026
49 checks passed
@ihoffmann-dot ihoffmann-dot deleted the dot-ai-langchain-integration branch April 20, 2026 21:22
riccardoruocco pushed a commit to riccardoruocco/core that referenced this pull request Apr 27, 2026
…MS#35445)

## Summary

Replaces the generic Apps UI textarea for `providerConfig` with a
dedicated Angular screen that shows the current config and an example
JSON side-by-side. Also ports backend fixes from dotCMS#35426 and adds a `PUT
/api/v1/ai/completions/config` endpoint with credential-preserving
merge.

- New `DotAiConfigDetailComponent`: two-column layout — editable
textarea on the left, formatted example JSON on the right
- `DotAiConfigDetailResolver`: dedicated resolver that hardcodes `dotAI`
as `appKey`, fixing the 404 caused by the generic resolver reading
`null` from route params
- Route `dotAI/edit/:id` added before the generic `:appKey/edit/:id` so
dotAI navigates to the custom screen
- `ChangeDetectorRef.detectChanges()` after async config load to fix
textarea not rendering value until user interaction
- `PUT /api/v1/ai/completions/config` (admin-only): saves
`providerConfig` JSON; `ProviderConfigMerger` preserves stored
credentials when the payload contains `*****` sentinel values
- `dotAI.yml` description updated to reference OpenAI/ChatGPT directly
- Ported from dotCMS#35426: flush SSE chunks, `cancelled` flag on
`IOException`, `maxRetries` warn for streaming, null check in
`parseSection`, `deepCopy` in `injectApiKeyIntoSections`, `maxTokens` →
`max_completion_tokens` routing for OpenAI o-series models, PR review
refactors
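
The `maxTokens` → `max_completion_tokens` routing in the last bullet could look roughly like the sketch below. How the PR actually detects o-series models is not shown here; the name pattern (`o` followed by a digit, for example `o1-mini`) is an assumption, as are the class and method names.

```java
import java.util.regex.Pattern;

/** Hypothetical routing of the configured maxTokens value to the right request parameter. */
final class TokenParamSketch {

    // Assumption: o-series model names start with "o" plus a digit (o1, o3-mini, ...).
    // Note that "gpt-4o" does not match, since the "o" there is a suffix.
    private static final Pattern O_SERIES = Pattern.compile("^o\\d.*");

    private TokenParamSketch() {}

    static boolean isOSeries(String model) {
        return model != null && O_SERIES.matcher(model).matches();
    }

    /** Name of the request parameter that should carry the configured maxTokens value. */
    static String maxTokensParam(String model) {
        return isOSeries(model) ? "max_completion_tokens" : "max_tokens";
    }
}
```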

## Configuration

```json
{
  "chat": {
    "provider": "openai",
    "apiKey": "sk-...",
    "model": "gpt-4o",
    "maxTokens": 16384,
    "temperature": 1.0,
    "maxRetries": 3,
    "rolePrompt": "You are dotCMSbot, an AI assistant to help content creators.",
    "textPrompt": "Use Descriptive writing style."
  },
  "embeddings": {
    "provider": "openai",
    "apiKey": "sk-...",
    "model": "text-embedding-ada-002",
    "listenerIndexer": { "default": "blog,news,webPageContent" }
  },
  "image": {
    "provider": "openai",
    "apiKey": "sk-...",
    "model": "dall-e-3",
    "size": "1792x1024",
    "imagePrompt": "Use 16:9 aspect ratio."
  }
}
```

## Notes

- The custom UI is accessed via `#/apps/dotAI/edit/:siteId`; the "App
screen" link in `render.jsp` already points to this URL
- Credential masking: fields with `*****` in a PUT payload are replaced
with the stored values before saving, so partial edits don't wipe
secrets
- `providerConfig` is not required — omitting it disables the AI
features gracefully
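
The credential-masking note above can be sketched as a merge over nested maps. This is an illustration only: `CredentialMergeSketch` is a hypothetical name, the real `ProviderConfigMerger` may work on a JSON tree instead, and leaving the sentinel untouched when nothing is stored is an assumption.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical merge that restores stored values wherever the payload carries the mask. */
final class CredentialMergeSketch {

    static final String MASK = "*****";

    private CredentialMergeSketch() {}

    static Map<String, Map<String, Object>> merge(
            Map<String, Map<String, Object>> incoming,
            Map<String, Map<String, Object>> stored) {
        Map<String, Map<String, Object>> result = new LinkedHashMap<>();
        for (Map.Entry<String, Map<String, Object>> section : incoming.entrySet()) {
            Map<String, Object> storedSection = stored.getOrDefault(section.getKey(), Map.of());
            Map<String, Object> merged = new LinkedHashMap<>();
            for (Map.Entry<String, Object> field : section.getValue().entrySet()) {
                Object value = field.getValue();
                // Masked field: fall back to the stored value when one exists.
                // Assumption: with nothing stored, the incoming value is kept as-is.
                if (MASK.equals(value) && storedSection.containsKey(field.getKey())) {
                    value = storedSection.get(field.getKey());
                }
                merged.put(field.getKey(), value);
            }
            result.put(section.getKey(), merged);
        }
        return result;
    }
}
```

Sending `"apiKey": "*****"` with a changed `model` therefore updates the model and keeps the stored key.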

## Related Issues

- [feat(dotAI): Dot AI LangChain4J integration (First provider/OpenAI)
dotCMS#35150](dotCMS#35150)
- [fix(dotAI): Dot AI LangChain4J - ProviderConfig fixes
dotCMS#35426](dotCMS#35426)
riccardoruocco pushed a commit to riccardoruocco/core that referenced this pull request Apr 27, 2026
…uery param (dotCMS#35456)

## Summary

Config endpoints were always resolving the target host from the HTTP
`Host` header, making it impossible to read or save configuration for a
specific site (including `SYSTEM_HOST`) through the API.

- `GET /api/v1/ai/completions/config` and `PUT
/api/v1/ai/completions/config` now accept an optional
`?siteId=<identifier>` query param to scope the operation to a specific
site
- `SYSTEM_HOST` is supported as a special-case value
- Falls back to HTTP host resolution when `siteId` is not provided
(backward-compatible)
- Frontend passes the site identifier from the route param
(`dotAI/edit/:id`) on both load and save
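
The resolution order described above (explicit `siteId`, `SYSTEM_HOST` as a special case, otherwise the HTTP host) can be sketched as below. The class and method names are hypothetical, and site identifiers are modelled as plain strings, whereas the real endpoint resolves actual site objects.

```java
/** Hypothetical resolution of the target site for the config endpoints. */
final class SiteResolutionSketch {

    static final String SYSTEM_HOST = "SYSTEM_HOST";

    private SiteResolutionSketch() {}

    /**
     * @param siteIdParam value of the optional ?siteId= query param, may be null or blank
     * @param hostFromRequest identifier of the site resolved from the HTTP Host header
     */
    static String resolveSiteId(String siteIdParam, String hostFromRequest) {
        if (siteIdParam == null || siteIdParam.trim().isEmpty()) {
            // Backward-compatible path: no siteId given, use the request's host.
            return hostFromRequest;
        }
        String trimmed = siteIdParam.trim();
        // SYSTEM_HOST is accepted as an explicit special-case value.
        return SYSTEM_HOST.equals(trimmed) ? SYSTEM_HOST : trimmed;
    }
}
```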

## Notes

- Without this fix, saving config through the custom UI always targeted
the site resolved from the HTTP `Host` header (e.g. `demo.dotcms.com`),
never `SYSTEM_HOST`. Background threads use `SYSTEM_HOST` config, so
embeddings would silently pick up stale configuration.
- `SYSTEM_HOST` can now be explicitly targeted by passing
`?siteId=SYSTEM_HOST`.

## Related Issue

This PR fixes dotCMS#35150
ihoffmann-dot added a commit that referenced this pull request Apr 28, 2026
…35456) (#35494)

## Summary
Reverts the LangChain4J integration and related changes merged in:
- #35150 — LangChain4J integration (Phase 1 / OpenAI)
- #35445 — Custom UI for provider config
- #35456 — Per-site config support via siteId

Restores dotAI to the state at tag `v26.04.28-01` prior to those merges.

## Test plan
- Verify dotAI app configuration UI works as before
- Verify AI completions, embeddings, and image generation function with
the original OpenAI client
Labels

AI: Not Safe To Rollback
Area : Backend (PR changes Java/Maven backend code)

Development

Successfully merging this pull request may close these issues.

[FEATURE] dotAI: LangChain4J integration — Phase 1 (OpenAI)

4 participants