Fix/summary typeerror 476 #503
@param20h, this PR is under GSSoC 2026!

Hi @param20h, it looks like one of the checks failed. I wanted to confirm that this failure is unrelated to the changes in this branch. This PR is strictly scoped to adding defensive type validation in `generate_document_summary`. The failure appears to be due to an intermittent CI environment timeout or an existing issue on the base branch. Since I don't have write permissions to trigger a workflow re-run on this repository, feel free to either restart the failed job or proceed with merging the changes directly. Thank you!

This PR is under GSSoC 2026, so you can freely merge it and add the relevant labels!
📋 PR Checklist
🔗 Related Issue
Closes #476
📝 What does this PR do?
Defensively validates the data type extracted from document chunks in `generate_document_summary`.

Previously, if a chunk was malformed and contained a non-string value (like an integer or boolean) that evaluated as truthy, it would bypass the `if text:` check and get appended to `chunk_texts`. This caused a `TypeError: sequence item: expected str instance, NoneType/int/bool found` when `" ".join(chunk_texts)` was executed, crashing the summarization process.

This PR replaces the implicit truthy check with an explicit `isinstance(text, str)` validation and uses `text.strip()` to ensure blank whitespace chunks are also safely skipped.
🗂️ Type of Change
🧪 How was this tested?
`TypeError`.
📸 Screenshots (if UI change)
N/A
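For reviewers, the validation described under "What does this PR do?" can be sketched roughly as follows. This is a minimal illustration, not the actual diff: the helper name `collect_chunk_texts` and the assumption that each chunk is a dict with a `"text"` key are hypothetical, and only `chunk_texts`, `isinstance(text, str)`, `text.strip()`, and `" ".join(chunk_texts)` come from the description.

```python
from typing import Any, Iterable, List


def collect_chunk_texts(chunks: Iterable[Any]) -> List[str]:
    """Hypothetical helper: keep only non-blank string texts from chunks."""
    chunk_texts: List[str] = []
    for chunk in chunks:
        # Assumption: a well-formed chunk is a dict with a "text" key.
        text = chunk.get("text") if isinstance(chunk, dict) else None
        # Explicit type check instead of the old truthy `if text:` check,
        # so truthy non-strings such as 42 or True are skipped.
        if isinstance(text, str) and text.strip():
            chunk_texts.append(text)
    return chunk_texts


chunks = [
    {"text": "First chunk."},
    {"text": 42},        # truthy int: would have passed `if text:`
    {"text": True},      # truthy bool: would have passed `if text:`
    {"text": None},
    {"text": "   "},     # whitespace only: skipped via text.strip()
    {"text": "Second chunk."},
]

# Joining is now safe because every collected item is a str.
combined = " ".join(collect_chunk_texts(chunks))
print(combined)
```

With the old truthy check, the `42` and `True` entries would have reached `" ".join(...)` and raised a `TypeError`; here they are filtered out before the join.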
The fix was kept tightly scoped to just the loop block inside `generate_document_summary` to maintain a clean git diff and prevent any unnecessary file formatting churn.
✅ Self-Review Checklist
`dev`, not `main`
`main` branch or any HuggingFace deployment config