task(Migrate ESMappingAPIImpl) Refs #34933#35289
Conversation
|
Claude finished @fabrizzio-dotCMS's task in 4m 11s —— View job PR Review
Four open issues worth addressing before merge: 1. Thread-unsafe
|
|
Pull Request Unsafe to Rollback!!!
|
…/core into issue-34933-Mapping-Layer
|
Pull Request Unsafe to Rollback!!!
|
…/core into issue-34933-Mapping-Layer
…/core into issue-34933-Mapping-Layer
…Suite Verifies the full createContentIndex pipeline on OS: - index creation and existence - dynamic templates stored in mapping (template_1, textmapping, geomapping, keywordmapping) - dynamic templates fire on real documents (*_dotraw→keyword, *_text→text, *latlon→geo_point) - auto_expand_replicas=0-1 index setting - custom analysers (my_analyzer, dot_comma_analyzer) from os-content-settings.json Also adds OS endpoint config to dotcms-config-cluster.properties for the test runner. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- BulkProcessorListener: workingRecords HashMap → ConcurrentHashMap (concurrent access from ReindexThread + BulkProcessor callback thread) - BulkProcessorListener: contentletsIndexed marked volatile (written on callback thread, read on ReindexThread main loop) - BulkProcessorListener: explicit (float) cast in 50% failure-rate guard (defensive; avoids silent integer division if operands are reordered) - ContentletIndexAPIImpl: fullReindexSwitchover marked synchronized (aligns with fullReindexStart, initIndex, createContentIndex pattern) Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Fix misleading log: drop "ES write succeeded" from OS shadow-write failure message — esException may already be set at that point - Fix double addCustomMapping: remove redundant call from 1-arg createContentIndex overload; the 2-arg overload already calls it Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ConfigurableOpenSearchProvider is a connection-provider class, not the right home for a timeout constant. Mirrors ESIndexAPI.INDEX_OPERATIONS_TIMEOUT_IN_MS pattern — the timeout now lives next to the index operations it governs. Resolves TODO in ConfigurableOpenSearchProvider. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
mbiuki
left a comment
There was a problem hiding this comment.
All four issues from my earlier review are addressed in the latest commits — , , explicit cast, and on . The additional fixes for the misleading log message and field naming are also good improvements. Looks solid. Approving.
|
Pull Request Unsafe to Rollback!!!
|
## Summary Migrates `ContentletIndexAPIImpl` — the central content indexing implementation — to the vendor-neutral Phase-aware router pattern. All Elasticsearch-specific types (`BulkRequest`, `BulkProcessor`, `ActionListener`) are replaced with domain-layer abstractions (`IndexBulkRequest`, `IndexBulkProcessor`, `IndexBulkListener`), enabling dual-write routing to both ES and OS backends during the ES→OS migration without leaking vendor types to callers. ## Changes ### Backend — Core Indexing - **`ContentletIndexAPI` (interface)**: Removed all `org.elasticsearch.*` imports from method signatures; replaced with vendor-neutral `IndexBulkRequest`, `IndexBulkProcessor`, and `IndexBulkListener` domain types. Added `@IndexLibraryIndependent` annotation. Changed `fullReindexStart()` return type from `String` to `IndexStartResult`. Added thread-safe `DateTimeFormatter` alongside deprecated `SimpleDateFormat`. - **`ContentletIndexAPIImpl`**: Full rewrite to a phase-aware router: - Holds two `ContentletIndexOperations` instances (`operationsES`, `operationsOS`) and delegates via `PhaseRouter<ContentletIndexOperations>` - New `DualIndexBulkRequest` inner class fans out synchronous bulk batches to both ES and OS in dual-write phases - New `CompositeBulkProcessor` inner class fans out async bulk processing; OS failures are fire-and-forget (shadow) in phases 1/2, propagating only in phase 3 - `ProviderIndices` inner class resolves working/live/reindex index names per provider, stripping `os::` vendor tags before passing to the OS client - Lazy-initialized `ContentMappingAPI` via `AtomicReference` to break the circular dependency chain - Dropped all direct `org.elasticsearch.*` imports from the implementation - **`IndexTag`**: New utility for stripping vendor-specific index name prefixes (`os::`) - **`IndexStartResult`**: New domain value object replacing the raw `String` returned by `fullReindexStart()` - **`BulkProcessorListener`**: Updated to implement `IndexBulkListener` instead of the ES-specific listener - **`ReindexThread`**: Minor updates to use `IndexBulkProcessor`/`IndexBulkRequest` instead of ES types - **`MappingHelper`**: Replaces `ESMappingUtilHelper` as the vendor-neutral mapping utility - **`ContentletIndexOperationsES` / `ContentletIndexOperationsOS`**: Updated to expose the new domain type APIs ### Testing - `ContentletIndexAPIImplTest`: Updated to use the new domain types and refactored test constructors - `ESSiteSearchAPITest`, `EMAWebInterceptorTest`, `CleanUpFieldReferencesJobTest`: Minor compile fixes following the API changes ## Testing 1. Start the integration test environment: `just test-integration-ide` 2. Run the primary integration test: ``` ./mvnw verify -pl :dotcms-integration -Dcoreit.test.skip=false -Dit.test=ContentletIndexAPIImplTest ``` 3. Verify reindexing still works end-to-end by triggering a full reindex from the dotCMS Admin → System → Reindex 4. Check that `ReindexThread` completes without errors in both Phase 0 (ES only) and Phase 1 (dual-write) configurations ## Breaking Changes - `ContentletIndexAPI.fullReindexStart()` now returns `IndexStartResult` instead of `String`. Callers previously using the raw timestamp string must call `.timestampSuffix()` on the result. - `BulkRequest`, `BulkProcessor`, and `ActionListener<BulkResponse>` removed from the `ContentletIndexAPI` interface. Any code directly casting or importing these types against the interface will not compile. This PR fixes: #34933 --------- Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
|
Meticulous was unable to execute a test run for this PR because the most recent commit is associated with multiple PRs. To execute a test run, please try pushing up a new commit that is only associated with this PR. Last updated for commit |
Summary
Migrates
ContentletIndexAPIImpl— the central content indexing implementation — to the vendor-neutral Phase-aware router pattern. All Elasticsearch-specific types (BulkRequest,BulkProcessor,ActionListener) are replaced with domain-layer abstractions (IndexBulkRequest,IndexBulkProcessor,IndexBulkListener), enabling dual-write routing to both ES and OS backends during the ES→OS migration without leaking vendor types to callers.Changes
Backend — Core Indexing
ContentletIndexAPI(interface): Removed allorg.elasticsearch.*imports from method signatures; replaced with vendor-neutralIndexBulkRequest,IndexBulkProcessor, andIndexBulkListenerdomain types. Added@IndexLibraryIndependentannotation. ChangedfullReindexStart()return type fromStringtoIndexStartResult. Added thread-safeDateTimeFormatteralongside deprecatedSimpleDateFormat.ContentletIndexAPIImpl: Full rewrite to a phase-aware router:ContentletIndexOperationsinstances (operationsES,operationsOS) and delegates viaPhaseRouter<ContentletIndexOperations>DualIndexBulkRequestinner class fans out synchronous bulk batches to both ES and OS in dual-write phasesCompositeBulkProcessorinner class fans out async bulk processing; OS failures are fire-and-forget (shadow) in phases 1/2, propagating only in phase 3ProviderIndicesinner class resolves working/live/reindex index names per provider, strippingos::vendor tags before passing to the OS clientContentMappingAPIviaAtomicReferenceto break the circular dependency chainorg.elasticsearch.*imports from the implementationIndexTag: New utility for stripping vendor-specific index name prefixes (os::)IndexStartResult: New domain value object replacing the rawStringreturned byfullReindexStart()BulkProcessorListener: Updated to implementIndexBulkListenerinstead of the ES-specific listenerReindexThread: Minor updates to useIndexBulkProcessor/IndexBulkRequestinstead of ES typesMappingHelper: ReplacesESMappingUtilHelperas the vendor-neutral mapping utilityContentletIndexOperationsES/ContentletIndexOperationsOS: Updated to expose the new domain type APIsTesting
ContentletIndexAPIImplTest: Updated to use the new domain types and refactored test constructorsESSiteSearchAPITest,EMAWebInterceptorTest,CleanUpFieldReferencesJobTest: Minor compile fixes following the API changesTesting
just test-integration-ideReindexThreadcompletes without errors in both Phase 0 (ES only) and Phase 1 (dual-write) configurationsBreaking Changes
ContentletIndexAPI.fullReindexStart()now returnsIndexStartResultinstead ofString. Callers previously using the raw timestamp string must call.timestampSuffix()on the result.BulkRequest,BulkProcessor, andActionListener<BulkResponse>removed from theContentletIndexAPIinterface. Any code directly casting or importing these types against the interface will not compile.This PR fixes: #34933
This PR fixes: #34933