Fix/mongo location index #19
Open
rav3n11 wants to merge 2 commits into
Add structured timing blocks so production requests log where time goes, not just total fetch_parallel_wall_ms / scoring_thread_pool_ms blobs.

- backend/app/database.py: split get_all_jobs_with_timing into filter_build_ms / cursor_create_ms / cursor_first_doc_ms / cursor_drain_remaining_ms / python_build_jobs_ms, emitted as a "mongo /jobs" log_match_step block. Lets us see whether Mongo cost is server-side exec, network transfer, or BSON deserialisation.
- backend/app/services/match_concat_gemini_ce_service.py: instrument the v3/v4/v5 engine path with prep_job_vectors_ms / job_mat_build_ms / gemini_embed_ms / cosine_shortlist_ms / ce_rerank_ms / format_user_recs_ms / engine_total_ms, emitted as an "engine concat_gemini_ce" block per call (fires twice per request: once for jobs, once for occupations).
- backend/app/services/bm25_scoring/bm25library.py: guard BM25Okapi against empty corpora that would otherwise raise ZeroDivisionError in rank_bm25._calc_idf. Substitutes a sentinel token in empty docs; downstream scoring returns 0 for those docs as before.

Why: prod /match_v4 averages 13s wall-clock with scoring_thread_pool_ms ≈ 7s, but we can't see whether the engine cost is Gemini API, CPU-bound CrossEncoder rerank, or downstream preference scoring. These blocks make that visible in Cloud Logging without changing any scoring behavior.
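For reviewers, the phase split described for backend/app/database.py could look roughly like the sketch below. This is not the code in this PR: `fetch_with_timing`, `build_filter`, `find`, and `build_job` are hypothetical names, and `find` stands in for whatever returns the Mongo cursor. Only the timing key names are taken from the commit message. The sketch assumes a lazy cursor, so the first `next()` is where the query would actually run.

```python
import time
from typing import Any, Callable, Dict, Iterable, List, Tuple

_MISSING = object()  # sentinel so a legitimate None document is not mistaken for "no result"


def fetch_with_timing(
    build_filter: Callable[[], Any],          # hypothetical: builds the query filter
    find: Callable[[Any], Iterable[Any]],     # hypothetical: returns a lazy cursor/iterable
    build_job: Callable[[Any], Any],          # hypothetical: raw document -> job object
) -> Tuple[List[Any], Dict[str, float]]:
    """Run a query in phases and report how long each phase took, in milliseconds."""
    timings: Dict[str, float] = {}

    t0 = time.perf_counter()
    query = build_filter()
    t1 = time.perf_counter()
    timings["filter_build_ms"] = (t1 - t0) * 1000.0

    cursor = iter(find(query))
    t2 = time.perf_counter()
    timings["cursor_create_ms"] = (t2 - t1) * 1000.0

    # With a lazy cursor, fetching the first document includes server-side execution.
    first = next(cursor, _MISSING)
    t3 = time.perf_counter()
    timings["cursor_first_doc_ms"] = (t3 - t2) * 1000.0

    # Draining the rest is dominated by transfer and decoding of the remaining documents.
    docs: List[Any] = [] if first is _MISSING else [first]
    docs.extend(cursor)
    t4 = time.perf_counter()
    timings["cursor_drain_remaining_ms"] = (t4 - t3) * 1000.0

    # Pure-Python conversion, timed separately from the fetch.
    jobs = [build_job(doc) for doc in docs]
    t5 = time.perf_counter()
    timings["python_build_jobs_ms"] = (t5 - t4) * 1000.0

    return jobs, timings
```

A large cursor_first_doc_ms next to a small cursor_drain_remaining_ms would point at server-side execution, while the reverse would point at transfer or deserialisation. The returned dict is what would be handed to the logging helper.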
…te index collation
https://tabiya-tech.atlassian.net/jira/software/c/projects/CORE/boards/105?issueType=10017%2C10015%2C10014&selectedIssue=CORE-481