Conversation
- Improved hallucination detection logic in `hallucination_detector.py` with better claim extraction and sensitive token handling.
- Added stable ID generation for documents in `ingest_html_adaptive.py` to ensure unique identification of sections and documents.
- Updated RAG pipeline in `rag_pipeline.py` to incorporate dynamic top-k estimation and improved document retrieval methods.
- Enhanced reranking logic in `reranker.py` to include keyword taxonomy and scoring adjustments based on document reliability and information density.
- Refactored server API in `server.py` to include API key validation and improved request handling with Pydantic models.

- Updated `ingest_html_adaptive.py` to improve document processing and metadata enrichment.
- Replaced direct usage of `OllamaEmbeddings` with a factory method `get_embeddings` for better flexibility.
- Enhanced section extraction logic to streamline summary and individual section handling.
- Improved metadata enrichment with additional reliability signals and automatic categorization.
- Refined `rag_pipeline.py` to simplify retriever creation and remove legacy methods.
- Updated the `ask` method in `SimpleRAG` to enhance response generation and context formatting.
- Modified `server.py` to initialize the RAG pipeline within the lifespan context, ensuring proper setup before handling requests.

@codex review these optimizations and check whether they introduce security vulnerabilities and whether they deliver real improvements. Reply in French.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 60906f2327
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

@codex changes have been made following your first round of feedback. Review the pull request again and check whether it introduces security vulnerabilities and whether it delivers real improvements. Reply in French.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: bb71cd2d4a
result = await asyncio.to_thread(
    rag.ask,
    question,
    True,
    request.dynamic_k,
Serialize access to SimpleRAG during threaded calls
Switching to `asyncio.to_thread` lets several `/ask` requests run in parallel against the global `rag` instance, while the pipeline mutates shared state during `ask` (for example, `_bm25_retriever.k` is overwritten and then restored in `rag_pipeline.py`). With two simultaneous requests on the same worker, these writes can interleave and produce non-deterministic retrieval results (incorrect top-k, inconsistent responses or caching).
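
One possible way to address this, sketched under assumptions: guard the threaded call with a lock so that only one request at a time touches the shared state. `FakeRAG`, `rag_lock`, `ask_serialized`, and `handle_ask` are hypothetical names used for illustration; `FakeRAG` only imitates the overwrite-then-restore pattern described above and is not the project's `SimpleRAG`.

```python
import asyncio
import threading
import time


class FakeRAG:
    """Hypothetical stand-in for SimpleRAG: ask() temporarily mutates shared state."""

    def __init__(self) -> None:
        self.k = 4  # plays the role of _bm25_retriever.k

    def ask(self, question: str, k: int) -> dict:
        previous_k = self.k
        self.k = k  # overwrite the shared value for this request
        try:
            time.sleep(0.01)  # simulate retrieval work that later reads self.k
            return {"question": question, "k_used": self.k}
        finally:
            self.k = previous_k  # restore the shared value


rag = FakeRAG()
rag_lock = threading.Lock()  # one lock per shared pipeline instance


def ask_serialized(question: str, k: int) -> dict:
    # Runs in a worker thread; the lock ensures the overwrite/restore
    # sequence of one request cannot interleave with another's.
    with rag_lock:
        return rag.ask(question, k)


async def handle_ask(question: str, k: int) -> dict:
    # The event loop stays free while the blocking call runs in a thread.
    return await asyncio.to_thread(ask_serialized, question, k)


async def main() -> list:
    # Eight concurrent requests, each asking for a different k.
    return await asyncio.gather(*(handle_ask(f"q{i}", i + 1) for i in range(8)))


results = asyncio.run(main())
```

The lock trades throughput for determinism, since requests on one worker are processed one after another. An alternative design would avoid the shared mutation altogether, for instance by passing `k` through to the retriever per call, but whether that is feasible depends on the actual pipeline code.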
if sys.platform.startswith("linux"):
    return "/home/rag/chroma_db"
Restore a local fallback for CHROMA_DB_PATH on Linux
This change now forces `/home/rag/chroma_db` on every Linux machine, without checking that the path exists or that its permissions allow access. In a Linux dev or CI environment running as a user other than `rag`, Chroma initialization can fail by default, whereas it previously worked with a project-local path; this introduces a startup regression when `RAG_CHROMA_DB_PATH` is not set explicitly.
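
A minimal sketch of such a fallback, assuming a resolution order of explicit environment variable, then the Linux default if it is usable, then a project-local directory. The function names and the local directory name `chroma_db` are hypothetical; only `RAG_CHROMA_DB_PATH` and `/home/rag/chroma_db` come from the reviewed code.

```python
import os
import sys
from pathlib import Path

LINUX_DEFAULT = Path("/home/rag/chroma_db")
# Assumption: the project-local directory name is illustrative only.
LOCAL_FALLBACK = Path.cwd() / "chroma_db"


def _is_usable(path: Path) -> bool:
    """True if path is a writable directory, or could be created under a writable parent."""
    if path.exists():
        return path.is_dir() and os.access(path, os.W_OK | os.X_OK)
    parent = path.parent
    return parent.is_dir() and os.access(parent, os.W_OK | os.X_OK)


def resolve_chroma_db_path(
    env=None,
    platform=None,
    linux_default: Path = LINUX_DEFAULT,
    local_fallback: Path = LOCAL_FALLBACK,
) -> str:
    """Pick a Chroma directory: explicit env var, usable Linux default, else local path."""
    env = os.environ if env is None else env
    platform = sys.platform if platform is None else platform

    explicit = env.get("RAG_CHROMA_DB_PATH")
    if explicit:
        return explicit  # an explicit setting always wins
    if platform.startswith("linux") and _is_usable(linux_default):
        return str(linux_default)
    return str(local_fallback)  # dev/CI machines without the rag user land here
```

The `env` and `platform` parameters exist only to make the function easy to exercise in tests; production code would call `resolve_chroma_db_path()` with no arguments.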