Big opti #4

Open
masskrdjn wants to merge 4 commits into main from big-opti

@masskrdjn

Owner

No description provided.

masskrdjn added 2 commits May 6, 2026 00:12
- Improved hallucination detection logic in `hallucination_detector.py` with better claim extraction and sensitive token handling.
- Added stable ID generation for documents in `ingest_html_adaptive.py` to ensure unique identification of sections and documents.
- Updated RAG pipeline in `rag_pipeline.py` to incorporate dynamic top-k estimation and improved document retrieval methods.
- Enhanced reranking logic in `reranker.py` to include keyword taxonomy and scoring adjustments based on document reliability and information density.
- Refactored server API in `server.py` to include API key validation and improved request handling with Pydantic models.
- Updated `ingest_html_adaptive.py` to improve document processing and metadata enrichment.
- Replaced direct usage of `OllamaEmbeddings` with a factory method `get_embeddings` for better flexibility.
- Enhanced section extraction logic to streamline summary and individual section handling.
- Improved metadata enrichment with additional reliability signals and automatic categorization.
- Refined `rag_pipeline.py` to simplify retriever creation and remove legacy methods.
- Updated the `ask` method in `SimpleRAG` to enhance response generation and context formatting.
- Modified `server.py` to initialize the RAG pipeline within the lifespan context, ensuring proper setup before handling requests.
@masskrdjn

Owner Author

@codex review these optimizations and check whether they introduce security vulnerabilities and whether they deliver real improvements. Reply in French.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 60906f2327

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread rag_pipeline.py Outdated
Comment thread config.py Outdated
Comment thread server.py Outdated
@masskrdjn

Owner Author

@codex changes have been made following your first round of feedback. Review the pull request again and check whether it introduces security vulnerabilities and whether it delivers real improvements. Reply in French.

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: bb71cd2d4a

Comment thread server.py Outdated
Comment on lines +117 to +121
result = await asyncio.to_thread(
    rag.ask,
    question,
    True,
    request.dynamic_k,

P1: Serialize access to SimpleRAG during threaded calls

Switching to asyncio.to_thread lets several /ask requests run in parallel against the global rag instance, while the pipeline mutates shared state during ask (for example, _bm25_retriever.k is overwritten and then restored in rag_pipeline.py). With two simultaneous requests on the same worker, these writes can interleave and produce non-deterministic retrieval results (wrong top-k, inconsistent answers or caching).
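
One possible mitigation, sketched under assumptions: guard the shared pipeline with a `threading.Lock` inside the function handed to `asyncio.to_thread`. `FakeRAG`, `rag_lock`, `ask_locked`, and `handle_ask` are hypothetical names for illustration; `FakeRAG` only imitates the "overwrite k, then restore" pattern described above and is not the project's `SimpleRAG`.

```python
import asyncio
import threading


class FakeRAG:
    """Hypothetical stand-in for SimpleRAG: ask() mutates shared state."""

    def __init__(self) -> None:
        self.k = 4  # plays the role of _bm25_retriever.k

    def ask(self, question: str, k: int) -> int:
        previous = self.k
        self.k = k  # temporary override of the shared value
        try:
            # Real retrieval would happen here and read self.k.
            return self.k
        finally:
            self.k = previous  # restore the shared value


rag = FakeRAG()
rag_lock = threading.Lock()  # one lock per shared pipeline instance


def ask_locked(question: str, k: int) -> int:
    # Runs in a worker thread; only one thread at a time may enter ask().
    with rag_lock:
        return rag.ask(question, k)


async def handle_ask(question: str, k: int) -> int:
    # The event loop stays free while a worker thread waits for the lock.
    return await asyncio.to_thread(ask_locked, question, k)


async def main() -> list:
    # Three concurrent requests, each asking for a different top-k.
    return await asyncio.gather(*(handle_ask("q", k) for k in (2, 5, 8)))


if __name__ == "__main__":
    print(asyncio.run(main()))
```

The lock serializes the whole `ask` call, so it trades throughput for determinism. If that cost is too high, an alternative is to pass `k` down as an argument so that `ask` never writes to shared retriever state.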

Comment thread config.py Outdated
Comment on lines +95 to +96
if sys.platform.startswith("linux"):
    return "/home/rag/chroma_db"

P2: Restore a local fallback for CHROMA_DB_PATH on Linux

This change now forces /home/rag/chroma_db on every Linux machine without checking that the path exists or that permissions allow access. In a Linux dev or CI environment running as a user other than rag, Chroma initialization can fail by default, whereas it previously worked with a project-local path. This introduces a startup regression whenever RAG_CHROMA_DB_PATH is not set explicitly.
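
A minimal sketch of such a fallback, assuming the resolution order "explicit `RAG_CHROMA_DB_PATH`, then the Linux default if it is usable, then a project-local directory". The function names, the `project_root` parameter, and the local directory name `chroma_db` are hypothetical; the actual layout of `config.py` is not shown in this thread.

```python
import os
import sys
from pathlib import Path
from typing import Mapping, Optional

LINUX_DEFAULT = Path("/home/rag/chroma_db")  # path taken from the diff under review


def _is_usable(path: Path) -> bool:
    # Usable if the directory exists and is writable, or if its parent
    # exists and is writable so the directory could be created there.
    target = path if path.exists() else path.parent
    return target.is_dir() and os.access(target, os.W_OK | os.X_OK)


def resolve_chroma_db_path(
    project_root: Path,
    env: Optional[Mapping[str, str]] = None,
    platform: str = sys.platform,
    linux_default: Path = LINUX_DEFAULT,
) -> str:
    """Return the Chroma directory, preferring explicit configuration."""
    env = os.environ if env is None else env

    # 1. An explicit setting always wins.
    explicit = env.get("RAG_CHROMA_DB_PATH")
    if explicit:
        return explicit

    # 2. Use the Linux default only when it can actually be used.
    if platform.startswith("linux") and _is_usable(linux_default):
        return str(linux_default)

    # 3. Otherwise fall back to a directory inside the project (assumed name).
    return str(project_root / "chroma_db")
```

On a Linux CI runner where `/home/rag` does not exist, this returns the project-local directory instead of a path that Chroma cannot create.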
