feat(preference-elicitation): integrate vignettes + BWS card behind feature flag (CORE-428) by nraffa · Pull Request #695 · tabiya-tech/compass

nraffa · 2026-05-28T22:38:44Z

Summary

Ports the preference elicitation conversation (vignettes + Best-Worst Scaling card + Bayesian preference inference) from the Njila / Zambia fork into core compass as an opt-in sub-phase of COUNSELING.

The feature is OFF by default in core main. Partners enable via preferenceElicitation.enabled in config/default.json (which projects to GLOBAL_ENABLE_PREFERENCE_ELICITATION for both backend and frontend), mirroring the existing cv.enabled → GLOBAL_ENABLE_CV_UPLOAD pattern.

Addresses CORE-428.

Why a port, not a cherry-pick

The fork's main was rebased onto an orphan-snapshot commit (1,966 files, no common ancestor with core's pre-cutover history). The Epic 2 / Epic 3 commits live only on now-orphaned feature branches with no shared base with core main, so git cherry-pick is mechanically impossible. This PR copies the agent + supporting modules verbatim and hand-applies the integration diff.

What's included

backend/app/agent/preference_elicitation_agent/ — full agent (vignette engine, personalizer, Bayesian posterior, D-optimal selection, BWS, conversation manager) copied verbatim from the fork.
backend/app/job_preferences/ — new module: Pydantic types, mongo repository, service, dependency provider, REST routes with Firebase auth + per-session ownership check matching the conversation routes pattern.
backend/app/config/vignettes.json + vignette_templates.json + backend/offline_output/*.json — pre-computed vignette assets the agent loads at boot.
New CounselingSubPhase enum on AgentDirectorState; sub-phase routing in LLMAgentDirector (deterministic when the flag is on, falls back to the existing LLM router when off).
ExploreExperiencesAgentDirector emits a bridge message (exploreExperiences.transitionToPreferences) when the flag is on and explore finishes.
Frontend BWS card, QuickReplyButtons for forced-choice vignette answers (self-disables after one click to prevent double-submission), input-disable state while a BWS card is the latest message, PREFERENCE_ELICITATION phase added to the progress bar with a "Discovering preferences" label.
i18n keys for the new strings across all 4 backend (en-GB, en-US, es-ES, es-AR) and 4 frontend locales.

Out of scope (deliberately not ported)

The fork's recommender / matching service (Epic 3).
language_detector.py — Swahili-only classifier, doesn't fit core's locale set.
persona_detector.py — symmetric with dropping language detection; keeps existing agents byte-identical.
DB6 youth database integration — DB6Client / YouthProfile imports, _save_preference_vector_to_db6, and the use_db6_for_fresh_data persisted state field have all been stripped from the agent so they don't pollute every serialized session going forward. 4 DB6-specific test classes are @pytest.mark.skip'd.
Step-skip flow — no use case in core without external preference-source pipelines.

Architectural notes

Sub-phase routing is deterministic, not LLM-routed. Within COUNSELING, when the flag is on the director maps counseling_sub_phase → agent directly. When the flag is off, the existing LLM router runs unchanged.
_compute_next_state is pure. Phase + sub-phase transitions are computed without mutating self._state; the caller in execute() applies the result.
Lazy-init for the new state collection. Sessions created before this PR shipped won't have a row in preference_elicitation_agent_state. The store builds a default PreferenceElicitationAgentState(session_id=...) in memory and persists it on the next save, so a partner flipping the flag on mid-flight doesn't break in-progress sessions. The other 6 required collections preserve their existing "session not found → None" semantic — the lazy-init is scoped to only the new collection.
AgentDirectorState.from_document defaults counseling_sub_phase to EXPLORE_EXPERIENCES for documents written before the field existed.
Conditional route registration. /job-preferences/* is only mounted when the flag is on. With the flag off, the routes return 404 and the agent is never instantiated.
Persistence trigger. ConversationService._maybe_save_preference_vector writes the learned PreferenceVector to the job_preferences collection the turn the agent first reports COMPLETE. Idempotent.
Transition-time history skip is narrowed to WELCOME_AGENT only. The original blanket-skip (commit 6fa2200, inspired by the fork) suppressed every outgoing agent's final message during a phase transition, which would have dropped the PE wrap-up summary and the Farewell closing. We narrow the skip to WELCOME_AGENT so its redundant transition stub is still suppressed while PE and Farewell content reaches the chat. Verified end-to-end.

Country calibration — heads-up for partners outside Kenya

The architecture is country-aware (DEFAULT_COUNTRY_OF_USER → ApplicationConfig.default_country_of_user → agent → personalizer + context extractor), but the content is currently Kenya-calibrated:

Offline vignettes (backend/offline_output/*.json) contain hardcoded KES wage values.
VignettePersonalizer._build_country_context() has a rich if country == Country.KENYA branch (Safaricom, KCB, KES ranges, M-Pesa, NHIF/NSSF); other countries fall back to a generic 4-line block.
UserContextExtractor has hardcoded Kenyan company recognition.
PreferenceVector salary field docstrings explicitly say "in KES".

At runtime the agent runs in hybrid mode (use_offline_with_personalization=True), so the LLM personalizer rewrites the static KES vignettes for the configured country. For Kenya the output is high-quality; for other countries the LLM has to localize without anchor examples. Partners outside Kenya wanting parity should re-run the offline optimization pipeline (backend/app/agent/preference_elicitation_agent/offline_optimization/run_offline_optimization.py) with their own wage ranges and add a country-specific branch in _build_country_context(). Tracking as a follow-up rather than blocking this PR.

Follow-ups (not blocking)

Tests for the new job_preferences/ module (smoke tests for repository + service + auth-gated routes).
Test coverage for the lazy-init store path and the AgentDirectorState.from_document migration.
ConversationService test exercising both flag states.
Chat.test.tsx test for BWS card rendering + input-disabled.
Country calibration for non-Kenya partners (see above) — re-run offline optimization pipeline, extend _build_country_context().
Trim Kenya-specific references in types.py docstrings and the BWS card's hardcoded English "{taskNumber} of {totalTasks}" (cosmetic).

sentry · 2026-06-01T05:14:10Z

+        self._use_offline_with_personalization = use_offline_with_personalization
+        self._personalization_logs: list[PersonalizationLog] = []
+        self._session_logs_dir = Path(__file__).parent.parent.parent.parent / "session_logs"
+        self._session_logs_dir.mkdir(parents=True, exist_ok=True)


Bug: The agent's constructor unconditionally creates a directory on every request. This will crash the server on read-only filesystems when the feature flag is enabled.
_{Severity: CRITICAL}

Suggested Fix

Wrap the self._session_logs_dir.mkdir() call in a try-except block to catch potential PermissionError exceptions and log a warning instead of crashing. Alternatively, defer directory creation until a log is actually written (lazy initialization).

Prompt for AI Agent

Review the code at the location below. A potential bug has been identified by an AI agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not valid. Location: backend/app/agent/preference_elicitation_agent/agent.py#L219 Potential issue: The `PreferenceElicitationAgent` constructor unconditionally attempts to create a `session_logs` directory. Because a new agent is instantiated on every HTTP request when the `enable_preference_elicitation` flag is enabled, this directory creation is attempted repeatedly. In production environments with read-only filesystems, a common practice for containers, this `mkdir()` call will raise a `PermissionError`. Since this operation is not wrapped in a try-except block, the error will be unhandled, causing every API request to fail when the feature is active. This would render the conversation API unusable in such deployments.

_{Did we get this right? 👍 / 👎 to inform future reviews.}

Copilot

Copilot wasn't able to review this pull request because it exceeds the maximum number of lines (20,000). Try reducing the number of changed lines and requesting a review from Copilot again.

sentry · 2026-06-19T06:42:30Z

+        }
+
+        # If no attribute has a meaningful difference, return "mixed"
+        if max(differences.values()) < 0.1:


Bug: The infer_category function can crash with a ValueError if the configuration file contains an empty attributes list, due to a missing validation check before calling max().
_{Severity: LOW}

Suggested Fix

Add a validation check in ProfileGenerator.__init__() to ensure that the self.attributes list is not empty after loading the configuration. If it is empty, raise a ValueError with a descriptive error message, such as "Configuration must contain at least one attribute".

Prompt for AI Agent

Review the code at the location below. A potential bug has been identified by an AI agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not valid. Location: backend/app/agent/preference_elicitation_agent/offline_optimization/vignette_converter.py#L82 Potential issue: The `infer_category()` function in `vignette_converter.py` can crash with a `ValueError` if it processes a configuration file where the `attributes` list is empty. The function builds a `differences` dictionary by iterating over `self.profile_generator.attributes`. If this list is empty, the dictionary remains empty. A subsequent call to `max(differences.values())` on line 82 will then fail, as `max()` cannot be called on an empty sequence. The root cause is a lack of input validation in `ProfileGenerator._load_config()`, which does not ensure the `attributes` array in the config is non-empty. This issue is currently masked in the main offline optimization pipeline because it provides a `category_rotation`, but it is exposed in tests and could affect users providing custom configuration files.

…eature flag (CORE-428) Opt-in conversation sub-phase after experience collection, OFF by default in core. Partners enable via preferenceElicitation.enabled in config/default.json.

sentry · 2026-06-19T06:48:37Z

+            log_file = self._session_logs_dir / f"personalization_log_{self._state.session_id}.jsonl"
+            with open(log_file, 'a', encoding='utf-8') as f:
+                # Write as JSONL (one JSON object per line)
+                json.dump(personalization_log.model_dump(), f, default=str, ensure_ascii=False)
+                f.write('\n')


Bug: The _log_personalization method writes to a file without a try...except block, which can cause unhandled exceptions and feature failure on filesystem errors.
_{Severity: HIGH}

Suggested Fix

Wrap the file open() and write() operations within the _log_personalization method in a try...except block to gracefully handle potential IOError or OSError exceptions. Log the error for debugging purposes but prevent it from propagating and causing the user-facing feature to fail.

Prompt for AI Agent

Review the code at the location below. A potential bug has been identified by an AI agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not valid. Location: backend/app/agent/preference_elicitation_agent/agent.py#L310-L314 Potential issue: The `_log_personalization` method in `agent.py` writes to a log file using a direct `open()` call without any error handling. This method is invoked synchronously on a critical path whenever a user interacts with the vignette personalization feature. If the application encounters a filesystem issue, such as a read-only filesystem, insufficient disk space, or permission errors, the file write will fail and raise an unhandled exception. While a higher-level exception handler prevents a server crash, it will cause the feature to fail and return an error response to the user, effectively breaking the vignette personalization flow.

nraffa marked this pull request as ready for review June 1, 2026 05:11

sentry Bot reviewed Jun 1, 2026

View reviewed changes

irumvanselme requested a review from Copilot June 1, 2026 08:48

Copilot AI reviewed Jun 1, 2026

irumvanselme force-pushed the feat/core-428-preference-elicitation branch from 14a7d10 to 3a92da2 Compare June 19, 2026 06:38

sentry Bot reviewed Jun 19, 2026

View reviewed changes

feat(preference-elicitation): integrate vignettes + BWS card behind f…

39a3643

…eature flag (CORE-428) Opt-in conversation sub-phase after experience collection, OFF by default in core. Partners enable via preferenceElicitation.enabled in config/default.json.

irumvanselme force-pushed the feat/core-428-preference-elicitation branch from 3a92da2 to 39a3643 Compare June 19, 2026 06:45

sentry Bot reviewed Jun 19, 2026

View reviewed changes

rav3n11 merged commit 8470bea into main Jun 22, 2026
8 checks passed

rav3n11 deleted the feat/core-428-preference-elicitation branch June 22, 2026 07:05

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat(preference-elicitation): integrate vignettes + BWS card behind feature flag (CORE-428)#695

feat(preference-elicitation): integrate vignettes + BWS card behind feature flag (CORE-428)#695
rav3n11 merged 1 commit into
mainfrom
feat/core-428-preference-elicitation

nraffa commented May 28, 2026 •

edited

Loading

Uh oh!

sentry Bot Jun 1, 2026

Uh oh!

Copilot AI left a comment

Uh oh!

sentry Bot Jun 19, 2026

Uh oh!

sentry Bot Jun 19, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Conversation

nraffa commented May 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Why a port, not a cherry-pick

What's included

Out of scope (deliberately not ported)

Architectural notes

Country calibration — heads-up for partners outside Kenya

Follow-ups (not blocking)

Uh oh!

sentry Bot Jun 1, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Uh oh!

sentry Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

sentry Bot Jun 19, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

nraffa commented May 28, 2026 •

edited

Loading