chore: added bg task for chunking #6
Merged
Changelog
All notable changes to LexAI are documented here.
[Unreleased] — In Progress
Fixed
- `ingest_pdf()` call inside `run_ingest_job`, which was accidentally left commented out, causing all PDF uploads to silently do nothing
- In `run_ingest_job`, a Spaces failure is treated as a non-fatal operation: it now logs a warning and continues rather than blocking the ingestion pipeline
Changed
[1.1.0] — 2026-03-12
Fixed — 504 Gateway Timeout on /ingest
- `/ingest` now returns a `job_id` immediately (< 1 second) and processes the document in a FastAPI `BackgroundTask`. The frontend polls `/ingest/status/{job_id}` every 2 seconds until status is `done` or `error`
- `/ingest/status/{job_id}` endpoint returning job status, full result on completion, and error message on failure
- `request_timeout=25` on `ChatOpenAI` in `agent.py`, to prevent `/chat` from hitting the same 30-second wall on slow LLM responses
Changed
- `uploadDocument()` in `client.js` updated to two-phase flow: POST to `/ingest` for job submission, then poll `/ingest/status/{job_id}` until completion
- `DocumentUpload.jsx` now shows live elapsed time during processing (`Analysing document... 12s`) so users know the upload is progressing
- `jobs` dict added to `main.py` to track background job state per `job_id`
[1.0.1] — 2026-03-12
Fixed — fastembed model downloading at runtime (exit code 128 / OOM)
- The model was cached under `/root/.cache`, but the app ran as `myuser`, whose cache is at `/home/myuser/.cache`. A cache miss on every startup triggered a 90 MB HuggingFace download, exhausting the 512 MB container RAM and crashing with exit code 128
- `FASTEMBED_CACHE_PATH=/app/.cache/fastembed` and `HF_HOME=/app/.cache/huggingface` environment variables set before the model download step, so build and runtime now use identical paths and the cache always hits
- `list(model.embed(['test']))` forces all 5 model files to download
Fixed — CORS blocking frontend requests
- `https://lexai-frontend-43cn4.ondigitalocean.app` added to `allow_origins` in `main.py`
- `FRONTEND_URL` now read from environment variable so CORS origins update without code changes
[1.0.0] — 2026-03-11
Added — Initial release
Backend
- `POST /ingest` - PDF upload pipeline: text extraction (pdfplumber + pytesseract OCR fallback), section-aware chunking, fastembed embeddings, FAISS indexing, and proactive risky clause detection
- `POST /chat` - RAG question answering scoped to uploaded document via LangGraph `retrieve_node → generate_node` pipeline
- `POST /ticket` - Lawyer review request with mock ticket ID (`LEX-XXXXXX`)
- `GET /health` - Health check endpoint for DO App Platform and CI/CD
AI Pipeline
- `llama3-8b-instruct` using OpenAI-compatible endpoint (`https://inference.do-ai.run/v1`)
- `fastembed` (`BAAI/bge-small-en-v1.5`, 384-dim) - no torch, no CUDA, no external embedding API
- FAISS `IndexFlatIP` with L2 normalisation for cosine similarity search, persisted to disk
- `StateGraph` wrapped with `gradient-adk` `@entrypoint` decorator for ADK compatibility and automatic trace capture
Frontend
Infrastructure
- `docker-compose.yml` for single-command local development
- `test → deploy-backend → deploy-frontend` via `doctl apps create-deployment`
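For readers new to the submit-then-poll flow introduced in 1.1.0, the idea can be sketched roughly as below. This is a minimal, framework-free illustration and not the project's code: a plain thread stands in for the FastAPI background task, `fake_ingest` is a placeholder for the real ingestion work, and the function names, the `processing` and `unknown` status strings, and the timings are all assumptions made for the example. Only the `done` and `error` statuses and the per-`job_id` `jobs` dict come from the entries above.

```python
import threading
import time
import uuid

# Hypothetical in-memory registry, standing in for the `jobs` dict in main.py.
jobs = {}
jobs_lock = threading.Lock()


def fake_ingest(payload):
    """Placeholder for the slow ingestion work (an assumption, not ingest_pdf)."""
    time.sleep(0.05)
    if not payload:
        raise ValueError("empty document")
    return {"chunks": len(payload.split())}


def run_job(job_id, payload):
    """Do the slow work and record the outcome under job_id."""
    try:
        update = {"status": "done", "result": fake_ingest(payload)}
    except Exception as exc:  # any failure becomes an "error" status
        update = {"status": "error", "error": str(exc)}
    with jobs_lock:
        jobs[job_id] = update


def submit(payload):
    """Register the job and return its id without waiting for the work."""
    job_id = uuid.uuid4().hex
    with jobs_lock:
        jobs[job_id] = {"status": "processing"}  # assumed interim status name
    threading.Thread(target=run_job, args=(job_id, payload), daemon=True).start()
    return job_id


def get_status(job_id):
    """Return a copy of the job state, or an assumed 'unknown' marker."""
    with jobs_lock:
        return dict(jobs.get(job_id, {"status": "unknown"}))


def poll(job_id, interval=0.01, timeout=5.0):
    """Check the status repeatedly until it is done or error, or time out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        state = get_status(job_id)
        if state["status"] in ("done", "error"):
            return state
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
```

Calling `poll(submit("some text"))` returns the final job state. In the deployed flow the client would presumably pass the 2-second interval mentioned above, and the status lookup would be an HTTP request rather than a local function call.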