Skip to content

feat(preference-elicitation): integrate vignettes + BWS card behind feature flag (CORE-428)#695

Merged
rav3n11 merged 1 commit into
mainfrom
feat/core-428-preference-elicitation
Jun 22, 2026
Merged

feat(preference-elicitation): integrate vignettes + BWS card behind feature flag (CORE-428)#695
rav3n11 merged 1 commit into
mainfrom
feat/core-428-preference-elicitation

Conversation

@nraffa

@nraffa nraffa commented May 28, 2026

Copy link
Copy Markdown
Collaborator

Summary

Ports the preference elicitation conversation (vignettes + Best-Worst Scaling card + Bayesian preference inference) from the Njila / Zambia fork into core compass as an opt-in sub-phase of COUNSELING.

The feature is OFF by default in core main. Partners enable via preferenceElicitation.enabled in config/default.json (which projects to GLOBAL_ENABLE_PREFERENCE_ELICITATION for both backend and frontend), mirroring the existing cv.enabledGLOBAL_ENABLE_CV_UPLOAD pattern.

Addresses CORE-428.

Why a port, not a cherry-pick

The fork's main was rebased onto an orphan-snapshot commit (1,966 files, no common ancestor with core's pre-cutover history). The Epic 2 / Epic 3 commits live only on now-orphaned feature branches with no shared base with core main, so git cherry-pick is mechanically impossible. This PR copies the agent + supporting modules verbatim and hand-applies the integration diff.

What's included

  • backend/app/agent/preference_elicitation_agent/ — full agent (vignette engine, personalizer, Bayesian posterior, D-optimal selection, BWS, conversation manager) copied verbatim from the fork.
  • backend/app/job_preferences/ — new module: Pydantic types, mongo repository, service, dependency provider, REST routes with Firebase auth + per-session ownership check matching the conversation routes pattern.
  • backend/app/config/vignettes.json + vignette_templates.json + backend/offline_output/*.json — pre-computed vignette assets the agent loads at boot.
  • New CounselingSubPhase enum on AgentDirectorState; sub-phase routing in LLMAgentDirector (deterministic when the flag is on, falls back to the existing LLM router when off).
  • ExploreExperiencesAgentDirector emits a bridge message (exploreExperiences.transitionToPreferences) when the flag is on and explore finishes.
  • Frontend BWS card, QuickReplyButtons for forced-choice vignette answers (self-disables after one click to prevent double-submission), input-disable state while a BWS card is the latest message, PREFERENCE_ELICITATION phase added to the progress bar with a "Discovering preferences" label.
  • i18n keys for the new strings across all 4 backend (en-GB, en-US, es-ES, es-AR) and 4 frontend locales.

Out of scope (deliberately not ported)

  • The fork's recommender / matching service (Epic 3).
  • language_detector.py — Swahili-only classifier, doesn't fit core's locale set.
  • persona_detector.py — symmetric with dropping language detection; keeps existing agents byte-identical.
  • DB6 youth database integration — DB6Client / YouthProfile imports, _save_preference_vector_to_db6, and the use_db6_for_fresh_data persisted state field have all been stripped from the agent so they don't pollute every serialized session going forward. 4 DB6-specific test classes are @pytest.mark.skip'd.
  • Step-skip flow — no use case in core without external preference-source pipelines.

Architectural notes

  • Sub-phase routing is deterministic, not LLM-routed. Within COUNSELING, when the flag is on the director maps counseling_sub_phase → agent directly. When the flag is off, the existing LLM router runs unchanged.
  • _compute_next_state is pure. Phase + sub-phase transitions are computed without mutating self._state; the caller in execute() applies the result.
  • Lazy-init for the new state collection. Sessions created before this PR shipped won't have a row in preference_elicitation_agent_state. The store builds a default PreferenceElicitationAgentState(session_id=...) in memory and persists it on the next save, so a partner flipping the flag on mid-flight doesn't break in-progress sessions. The other 6 required collections preserve their existing "session not found → None" semantic — the lazy-init is scoped to only the new collection.
  • AgentDirectorState.from_document defaults counseling_sub_phase to EXPLORE_EXPERIENCES for documents written before the field existed.
  • Conditional route registration. /job-preferences/* is only mounted when the flag is on. With the flag off, the routes return 404 and the agent is never instantiated.
  • Persistence trigger. ConversationService._maybe_save_preference_vector writes the learned PreferenceVector to the job_preferences collection the turn the agent first reports COMPLETE. Idempotent.
  • Transition-time history skip is narrowed to WELCOME_AGENT only. The original blanket-skip (commit 6fa2200, inspired by the fork) suppressed every outgoing agent's final message during a phase transition, which would have dropped the PE wrap-up summary and the Farewell closing. We narrow the skip to WELCOME_AGENT so its redundant transition stub is still suppressed while PE and Farewell content reaches the chat. Verified end-to-end.

Country calibration — heads-up for partners outside Kenya

The architecture is country-aware (DEFAULT_COUNTRY_OF_USERApplicationConfig.default_country_of_user → agent → personalizer + context extractor), but the content is currently Kenya-calibrated:

  • Offline vignettes (backend/offline_output/*.json) contain hardcoded KES wage values.
  • VignettePersonalizer._build_country_context() has a rich if country == Country.KENYA branch (Safaricom, KCB, KES ranges, M-Pesa, NHIF/NSSF); other countries fall back to a generic 4-line block.
  • UserContextExtractor has hardcoded Kenyan company recognition.
  • PreferenceVector salary field docstrings explicitly say "in KES".

At runtime the agent runs in hybrid mode (use_offline_with_personalization=True), so the LLM personalizer rewrites the static KES vignettes for the configured country. For Kenya the output is high-quality; for other countries the LLM has to localize without anchor examples. Partners outside Kenya wanting parity should re-run the offline optimization pipeline (backend/app/agent/preference_elicitation_agent/offline_optimization/run_offline_optimization.py) with their own wage ranges and add a country-specific branch in _build_country_context(). Tracking as a follow-up rather than blocking this PR.

Follow-ups (not blocking)

  • Tests for the new job_preferences/ module (smoke tests for repository + service + auth-gated routes).
  • Test coverage for the lazy-init store path and the AgentDirectorState.from_document migration.
  • ConversationService test exercising both flag states.
  • Chat.test.tsx test for BWS card rendering + input-disabled.
  • Country calibration for non-Kenya partners (see above) — re-run offline optimization pipeline, extend _build_country_context().
  • Trim Kenya-specific references in types.py docstrings and the BWS card's hardcoded English "{taskNumber} of {totalTasks}" (cosmetic).

@nraffa nraffa marked this pull request as ready for review June 1, 2026 05:11
self._use_offline_with_personalization = use_offline_with_personalization
self._personalization_logs: list[PersonalizationLog] = []
self._session_logs_dir = Path(__file__).parent.parent.parent.parent / "session_logs"
self._session_logs_dir.mkdir(parents=True, exist_ok=True)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: The agent's constructor unconditionally creates a directory on every request. This will crash the server on read-only filesystems when the feature flag is enabled.
Severity: CRITICAL

Suggested Fix

Wrap the self._session_logs_dir.mkdir() call in a try-except block to catch potential PermissionError exceptions and log a warning instead of crashing. Alternatively, defer directory creation until a log is actually written (lazy initialization).

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.

Location: backend/app/agent/preference_elicitation_agent/agent.py#L219

Potential issue: The `PreferenceElicitationAgent` constructor unconditionally attempts
to create a `session_logs` directory. Because a new agent is instantiated on every HTTP
request when the `enable_preference_elicitation` flag is enabled, this directory
creation is attempted repeatedly. In production environments with read-only filesystems,
a common practice for containers, this `mkdir()` call will raise a `PermissionError`.
Since this operation is not wrapped in a try-except block, the error will be unhandled,
causing every API request to fail when the feature is active. This would render the
conversation API unusable in such deployments.

Did we get this right? 👍 / 👎 to inform future reviews.

@irumvanselme irumvanselme requested a review from Copilot June 1, 2026 08:48

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot wasn't able to review this pull request because it exceeds the maximum number of lines (20,000). Try reducing the number of changed lines and requesting a review from Copilot again.

@irumvanselme irumvanselme force-pushed the feat/core-428-preference-elicitation branch from 14a7d10 to 3a92da2 Compare June 19, 2026 06:38
}

# If no attribute has a meaningful difference, return "mixed"
if max(differences.values()) < 0.1:

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: The infer_category function can crash with a ValueError if the configuration file contains an empty attributes list, due to a missing validation check before calling max().
Severity: LOW

Suggested Fix

Add a validation check in ProfileGenerator.__init__() to ensure that the self.attributes list is not empty after loading the configuration. If it is empty, raise a ValueError with a descriptive error message, such as "Configuration must contain at least one attribute".

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.

Location:
backend/app/agent/preference_elicitation_agent/offline_optimization/vignette_converter.py#L82

Potential issue: The `infer_category()` function in `vignette_converter.py` can crash
with a `ValueError` if it processes a configuration file where the `attributes` list is
empty. The function builds a `differences` dictionary by iterating over
`self.profile_generator.attributes`. If this list is empty, the dictionary remains
empty. A subsequent call to `max(differences.values())` on line 82 will then fail, as
`max()` cannot be called on an empty sequence. The root cause is a lack of input
validation in `ProfileGenerator._load_config()`, which does not ensure the `attributes`
array in the config is non-empty. This issue is currently masked in the main offline
optimization pipeline because it provides a `category_rotation`, but it is exposed in
tests and could affect users providing custom configuration files.

…eature flag (CORE-428)

Opt-in conversation sub-phase after experience collection, OFF by default in core. Partners enable via preferenceElicitation.enabled in config/default.json.
@irumvanselme irumvanselme force-pushed the feat/core-428-preference-elicitation branch from 3a92da2 to 39a3643 Compare June 19, 2026 06:45
Comment on lines +310 to +314
log_file = self._session_logs_dir / f"personalization_log_{self._state.session_id}.jsonl"
with open(log_file, 'a', encoding='utf-8') as f:
# Write as JSONL (one JSON object per line)
json.dump(personalization_log.model_dump(), f, default=str, ensure_ascii=False)
f.write('\n')

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug: The _log_personalization method writes to a file without a try...except block, which can cause unhandled exceptions and feature failure on filesystem errors.
Severity: HIGH

Suggested Fix

Wrap the file open() and write() operations within the _log_personalization method in a try...except block to gracefully handle potential IOError or OSError exceptions. Log the error for debugging purposes but prevent it from propagating and causing the user-facing feature to fail.

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.

Location: backend/app/agent/preference_elicitation_agent/agent.py#L310-L314

Potential issue: The `_log_personalization` method in `agent.py` writes to a log file
using a direct `open()` call without any error handling. This method is invoked
synchronously on a critical path whenever a user interacts with the vignette
personalization feature. If the application encounters a filesystem issue, such as a
read-only filesystem, insufficient disk space, or permission errors, the file write will
fail and raise an unhandled exception. While a higher-level exception handler prevents a
server crash, it will cause the feature to fail and return an error response to the
user, effectively breaking the vignette personalization flow.

@rav3n11 rav3n11 merged commit 8470bea into main Jun 22, 2026
8 checks passed
@rav3n11 rav3n11 deleted the feat/core-428-preference-elicitation branch June 22, 2026 07:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants