feat(preference-elicitation): integrate vignettes + BWS card behind feature flag (CORE-428)#695
Conversation
| self._use_offline_with_personalization = use_offline_with_personalization | ||
| self._personalization_logs: list[PersonalizationLog] = [] | ||
| self._session_logs_dir = Path(__file__).parent.parent.parent.parent / "session_logs" | ||
| self._session_logs_dir.mkdir(parents=True, exist_ok=True) |
There was a problem hiding this comment.
Bug: The agent's constructor unconditionally creates a directory on every request. This will crash the server on read-only filesystems when the feature flag is enabled.
Severity: CRITICAL
Suggested Fix
Wrap the self._session_logs_dir.mkdir() call in a try-except block to catch potential PermissionError exceptions and log a warning instead of crashing. Alternatively, defer directory creation until a log is actually written (lazy initialization).
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.
Location: backend/app/agent/preference_elicitation_agent/agent.py#L219
Potential issue: The `PreferenceElicitationAgent` constructor unconditionally attempts
to create a `session_logs` directory. Because a new agent is instantiated on every HTTP
request when the `enable_preference_elicitation` flag is enabled, this directory
creation is attempted repeatedly. In production environments with read-only filesystems,
a common practice for containers, this `mkdir()` call will raise a `PermissionError`.
Since this operation is not wrapped in a try-except block, the error will be unhandled,
causing every API request to fail when the feature is active. This would render the
conversation API unusable in such deployments.
Did we get this right? 👍 / 👎 to inform future reviews.
14a7d10 to
3a92da2
Compare
| } | ||
|
|
||
| # If no attribute has a meaningful difference, return "mixed" | ||
| if max(differences.values()) < 0.1: |
There was a problem hiding this comment.
Bug: The infer_category function can crash with a ValueError if the configuration file contains an empty attributes list, due to a missing validation check before calling max().
Severity: LOW
Suggested Fix
Add a validation check in ProfileGenerator.__init__() to ensure that the self.attributes list is not empty after loading the configuration. If it is empty, raise a ValueError with a descriptive error message, such as "Configuration must contain at least one attribute".
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.
Location:
backend/app/agent/preference_elicitation_agent/offline_optimization/vignette_converter.py#L82
Potential issue: The `infer_category()` function in `vignette_converter.py` can crash
with a `ValueError` if it processes a configuration file where the `attributes` list is
empty. The function builds a `differences` dictionary by iterating over
`self.profile_generator.attributes`. If this list is empty, the dictionary remains
empty. A subsequent call to `max(differences.values())` on line 82 will then fail, as
`max()` cannot be called on an empty sequence. The root cause is a lack of input
validation in `ProfileGenerator._load_config()`, which does not ensure the `attributes`
array in the config is non-empty. This issue is currently masked in the main offline
optimization pipeline because it provides a `category_rotation`, but it is exposed in
tests and could affect users providing custom configuration files.
…eature flag (CORE-428) Opt-in conversation sub-phase after experience collection, OFF by default in core. Partners enable via preferenceElicitation.enabled in config/default.json.
3a92da2 to
39a3643
Compare
| log_file = self._session_logs_dir / f"personalization_log_{self._state.session_id}.jsonl" | ||
| with open(log_file, 'a', encoding='utf-8') as f: | ||
| # Write as JSONL (one JSON object per line) | ||
| json.dump(personalization_log.model_dump(), f, default=str, ensure_ascii=False) | ||
| f.write('\n') |
There was a problem hiding this comment.
Bug: The _log_personalization method writes to a file without a try...except block, which can cause unhandled exceptions and feature failure on filesystem errors.
Severity: HIGH
Suggested Fix
Wrap the file open() and write() operations within the _log_personalization method in a try...except block to gracefully handle potential IOError or OSError exceptions. Log the error for debugging purposes but prevent it from propagating and causing the user-facing feature to fail.
Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent. Verify if this is a real issue. If it is, propose a fix; if not, explain why it's
not valid.
Location: backend/app/agent/preference_elicitation_agent/agent.py#L310-L314
Potential issue: The `_log_personalization` method in `agent.py` writes to a log file
using a direct `open()` call without any error handling. This method is invoked
synchronously on a critical path whenever a user interacts with the vignette
personalization feature. If the application encounters a filesystem issue, such as a
read-only filesystem, insufficient disk space, or permission errors, the file write will
fail and raise an unhandled exception. While a higher-level exception handler prevents a
server crash, it will cause the feature to fail and return an error response to the
user, effectively breaking the vignette personalization flow.
Summary
Ports the preference elicitation conversation (vignettes + Best-Worst Scaling card + Bayesian preference inference) from the Njila / Zambia fork into core compass as an opt-in sub-phase of
COUNSELING.The feature is OFF by default in core
main. Partners enable viapreferenceElicitation.enabledinconfig/default.json(which projects toGLOBAL_ENABLE_PREFERENCE_ELICITATIONfor both backend and frontend), mirroring the existingcv.enabled→GLOBAL_ENABLE_CV_UPLOADpattern.Addresses CORE-428.
Why a port, not a cherry-pick
The fork's
mainwas rebased onto an orphan-snapshot commit (1,966 files, no common ancestor with core's pre-cutover history). The Epic 2 / Epic 3 commits live only on now-orphaned feature branches with no shared base with coremain, sogit cherry-pickis mechanically impossible. This PR copies the agent + supporting modules verbatim and hand-applies the integration diff.What's included
backend/app/agent/preference_elicitation_agent/— full agent (vignette engine, personalizer, Bayesian posterior, D-optimal selection, BWS, conversation manager) copied verbatim from the fork.backend/app/job_preferences/— new module: Pydantic types, mongo repository, service, dependency provider, REST routes with Firebase auth + per-session ownership check matching the conversation routes pattern.backend/app/config/vignettes.json+vignette_templates.json+backend/offline_output/*.json— pre-computed vignette assets the agent loads at boot.CounselingSubPhaseenum onAgentDirectorState; sub-phase routing inLLMAgentDirector(deterministic when the flag is on, falls back to the existing LLM router when off).ExploreExperiencesAgentDirectoremits a bridge message (exploreExperiences.transitionToPreferences) when the flag is on and explore finishes.QuickReplyButtonsfor forced-choice vignette answers (self-disables after one click to prevent double-submission), input-disable state while a BWS card is the latest message,PREFERENCE_ELICITATIONphase added to the progress bar with a "Discovering preferences" label.en-GB,en-US,es-ES,es-AR) and 4 frontend locales.Out of scope (deliberately not ported)
language_detector.py— Swahili-only classifier, doesn't fit core's locale set.persona_detector.py— symmetric with dropping language detection; keeps existing agents byte-identical.DB6Client/YouthProfileimports,_save_preference_vector_to_db6, and theuse_db6_for_fresh_datapersisted state field have all been stripped from the agent so they don't pollute every serialized session going forward. 4 DB6-specific test classes are@pytest.mark.skip'd.Architectural notes
COUNSELING, when the flag is on the director mapscounseling_sub_phase→ agent directly. When the flag is off, the existing LLM router runs unchanged._compute_next_stateis pure. Phase + sub-phase transitions are computed without mutatingself._state; the caller inexecute()applies the result.preference_elicitation_agent_state. The store builds a defaultPreferenceElicitationAgentState(session_id=...)in memory and persists it on the next save, so a partner flipping the flag on mid-flight doesn't break in-progress sessions. The other 6 required collections preserve their existing "session not found → None" semantic — the lazy-init is scoped to only the new collection.AgentDirectorState.from_documentdefaultscounseling_sub_phasetoEXPLORE_EXPERIENCESfor documents written before the field existed./job-preferences/*is only mounted when the flag is on. With the flag off, the routes return 404 and the agent is never instantiated.ConversationService._maybe_save_preference_vectorwrites the learnedPreferenceVectorto thejob_preferencescollection the turn the agent first reportsCOMPLETE. Idempotent.Country calibration — heads-up for partners outside Kenya
The architecture is country-aware (
DEFAULT_COUNTRY_OF_USER→ApplicationConfig.default_country_of_user→ agent → personalizer + context extractor), but the content is currently Kenya-calibrated:backend/offline_output/*.json) contain hardcoded KES wage values.VignettePersonalizer._build_country_context()has a richif country == Country.KENYAbranch (Safaricom, KCB, KES ranges, M-Pesa, NHIF/NSSF); other countries fall back to a generic 4-line block.UserContextExtractorhas hardcoded Kenyan company recognition.PreferenceVectorsalary field docstrings explicitly say "in KES".At runtime the agent runs in hybrid mode (
use_offline_with_personalization=True), so the LLM personalizer rewrites the static KES vignettes for the configured country. For Kenya the output is high-quality; for other countries the LLM has to localize without anchor examples. Partners outside Kenya wanting parity should re-run the offline optimization pipeline (backend/app/agent/preference_elicitation_agent/offline_optimization/run_offline_optimization.py) with their own wage ranges and add a country-specific branch in_build_country_context(). Tracking as a follow-up rather than blocking this PR.Follow-ups (not blocking)
job_preferences/module (smoke tests for repository + service + auth-gated routes).AgentDirectorState.from_documentmigration.ConversationServicetest exercising both flag states.Chat.test.tsxtest for BWS card rendering + input-disabled._build_country_context().types.pydocstrings and the BWS card's hardcoded English"{taskNumber} of {totalTasks}"(cosmetic).