test(backend): implement end-to-end integration tests for celery ingestion task closes #448 #480

Open
suhaniiz wants to merge 1 commit into param20h:dev from suhaniiz:feature/celery-ingestion-tests-448

Conversation

suhaniiz commented Jun 5, 2026

📋 PR Checklist

🔗 Related Issue

Closes #448

📝 What does this PR do?

This PR adds defensive type-checking to the text extraction loop inside generate_document_summary. Previously, if a chunk was malformed or missing a "text" key, chunk.get("text") returned None, which was appended to the list. This caused an unhandled TypeError when " ".join() was called on the list.

I've updated the loop to use isinstance(text, str) to ensure only valid strings are collected.
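
For illustration, here is a minimal sketch of the guard described above. It assumes each chunk is a dict. The helper name `collect_chunk_texts` and the sample data are hypothetical, and this is not the actual body of generate_document_summary.

```python
from typing import Any


def collect_chunk_texts(chunks: list[dict[str, Any]]) -> str:
    """Join chunk texts, skipping missing or non-string values.

    Hypothetical helper: it mirrors the guard described in this PR,
    not the real code inside generate_document_summary.
    """
    texts: list[str] = []
    for chunk in chunks:
        text = chunk.get("text")
        # Keep only real strings; None or other types would break " ".join().
        if isinstance(text, str):
            texts.append(text)
    return " ".join(texts)


# Sample data (made up): one chunk has no "text" key, one has text=None.
chunks = [
    {"text": "First part."},
    {"page": 2},
    {"text": None},
    {"text": "Second part."},
]

# Without the guard, the None values reach str.join and raise TypeError.
try:
    " ".join(chunk.get("text") for chunk in chunks)
except TypeError as exc:
    print(f"unguarded join failed: {exc}")

print(collect_chunk_texts(chunks))  # First part. Second part.
```

With the guard in place, malformed chunks are skipped silently. Whether they should also be logged is a separate design choice that this sketch does not cover.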


🗂️ Type of Change

  • 🐛 Bug fix
  • ✨ New feature
  • 🔧 Refactor / code cleanup
  • 📝 Documentation update
  • 🎨 UI / styling change
  • ⚙️ CI / tooling / config change
  • 🧪 Tests

🧪 How was this tested?

  • Ran the backend locally (uvicorn app.main:app --reload)
  • Tested the affected API endpoints manually
  • Added / updated tests

⚠️ Anything to flag for reviewers?

No tricky logic here. This is an isolated, non-breaking safety guard for the chunk text join operation.


✅ Self-Review Checklist

  • My branch is based on dev, not main
  • I have not added any secrets / API keys
  • I have not modified main branch or any HuggingFace deployment config
  • My code follows the existing style (no unnecessary formatting changes)
  • I have updated relevant docs / comments if needed

suhaniiz requested a review from param20h as a code owner June 5, 2026 18:03
Author

suhaniiz commented Jun 5, 2026

Hey @param20h, this PR is under GSSoC 2026.

Author

suhaniiz commented Jun 7, 2026

@param20h, kindly review it and let me know if any changes are needed!

Owner

param20h commented Jun 7, 2026

failed test cases @suhaniiz

Author

suhaniiz commented Jun 7, 2026

Hi @param20h,

It looks like the Playwright E2E tests workflow failed during the authentication flow (auth-and-chat.spec.ts).

I believe this failure is unrelated to the changes in this branch. This PR is strictly scoped to adding a defensive isinstance(text, str) check inside the backend's generate_document_summary function. It doesn't touch any frontend routing, login selectors, or authentication flows, so it should not be able to affect auth-and-chat.spec.ts.

The failure appears to be due to an intermittent CI environment timeout or an existing issue on the base branch. Since I don't have write permissions to trigger a workflow re-run on this repository, feel free to either restart the failed job or proceed with merging the changes directly.

Thank you!

Development

Successfully merging this pull request may close these issues.

test(backend): Add integration tests for Celery document ingestion tasks

2 participants