fix(memory): escape user input in LanceDBStorage SQL filters (#5728) #5729

Open
devin-ai-integration[bot] wants to merge 1 commit into main from devin/1778048378-fix-lancedb-sqli

@devin-ai-integration
Contributor

Summary

Fixes #5728. LanceDBStorage interpolated caller-supplied scope paths and record IDs directly into the WHERE clauses passed to LanceDB's where(), which accepts a raw Apache DataFusion SQL expression and does not support parameterized queries. This allowed:

  • Scope-isolation bypass. A caller restricted to one scope could escape that sandbox and read or delete records belonging to any other scope. The most damaging example, reproduced before the fix:
    storage.delete(scope_prefix="/alpha' OR scope LIKE '/%")
    deleted every record in the table instead of just /alpha's subtree, because the resulting WHERE evaluated to scope LIKE '/alpha' OR scope LIKE '/%' OR scope = '/'.
  • Crashes on legitimate input. Ordinary scope paths or record IDs containing an apostrophe (e.g. "/O'Brien", "O'Reilly-42") raised RuntimeError: Unterminated string literal in list_records, list_scopes, get_scope_info, list_categories, delete(record_ids=…), and reset(scope_prefix=…).
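To illustrate how both failure modes arise from plain string interpolation, here is a minimal sketch. `naive_scope_filter` is a hypothetical stand-in for the pre-fix style of filter building, not the actual code from lancedb_storage.py, so the clause it produces differs in detail from the one quoted above.

```python
def naive_scope_filter(scope_prefix: str) -> str:
    """Hypothetical pre-fix style: the caller's value is pasted into the SQL text."""
    return f"scope LIKE '{scope_prefix}%'"


benign = naive_scope_filter("/alpha")
injected = naive_scope_filter("/alpha' OR scope LIKE '/%")
broken = naive_scope_filter("/O'Brien")

print(benign)    # one LIKE predicate, as intended
print(injected)  # the payload's quote closes the literal and adds a second predicate
print(broken)    # odd number of quotes: the SQL parser sees an unterminated literal
```

The injected case stays syntactically valid because the payload supplies its own balancing quote, which is why it silently widens the match instead of raising an error.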

The 4 reported sinks all routed through lib/crewai/src/crewai/memory/storage/lancedb_storage.py:

  1. search(scope_prefix=…)
  2. delete(scope_prefix=…)
  3. delete(record_ids=…)
  4. reset(scope_prefix=…) (plus everything that goes through _scan_rows: list_records, list_scopes, get_scope_info, list_categories, count).

Fix

Added two private helpers on LanceDBStorage:

  • _escape_sql_str(value): doubles single quotes for string literals (O'Brien → O''Brien).
  • _escape_like(value): additionally escapes the SQL LIKE metacharacters %, _, and \, so that a caller-supplied prefix is matched as a literal, not as a glob.
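A minimal sketch of what two such helpers might look like, written as free functions rather than methods. The bodies are an assumption inferred from the descriptions above, not copied from the diff.

```python
def escape_sql_str(value: str) -> str:
    """Double single quotes so the value is safe inside a '...' SQL literal."""
    return value.replace("'", "''")


def escape_like(value: str) -> str:
    """Escape LIKE metacharacters with a backslash, then escape quotes.

    The backslash itself is escaped first; otherwise the backslashes added
    for % and _ would be doubled by the later replacement.
    """
    escaped = (
        value.replace("\\", "\\\\")
        .replace("%", "\\%")
        .replace("_", "\\_")
    )
    return escape_sql_str(escaped)
```

A pattern escaped this way only behaves as a literal when the LIKE clause also declares the backslash as its escape character.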

Every user-controlled value in search, delete, reset, _scan_rows, update, get_record, and touch_records is now routed through the appropriate helper. LIKE clauses now use ESCAPE '\\' so %/_ in scope paths are treated as literals.
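The effect of escaping plus an ESCAPE clause can be checked end to end with a small stand-in. The sketch below uses the standard-library sqlite3 module instead of LanceDB/DataFusion, so it only demonstrates the general technique; the table layout, the `scope_filter` helper, and the exact rendering of the ESCAPE clause are assumptions, and SQL dialects may differ in detail.

```python
import sqlite3


def escape_sql_str(value: str) -> str:
    return value.replace("'", "''")


def escape_like(value: str) -> str:
    escaped = value.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")
    return escape_sql_str(escaped)


def scope_filter(scope_prefix: str) -> str:
    # Hypothetical filter builder: literal prefix, one trailing wildcard,
    # and a single backslash declared as the LIKE escape character.
    return f"scope LIKE '{escape_like(scope_prefix)}%' ESCAPE '\\'"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE records (id TEXT, scope TEXT)")
conn.executemany(
    "INSERT INTO records VALUES (?, ?)",
    [
        ("a1", "/alpha/notes"),
        ("b1", "/beta/notes"),
        ("p1", "/100%/raw"),
        ("o1", "/O'Brien/inbox"),
    ],
)


def matching_ids(scope_prefix: str) -> list:
    sql = f"SELECT id FROM records WHERE {scope_filter(scope_prefix)} ORDER BY id"
    return [row[0] for row in conn.execute(sql)]


print(matching_ids("/alpha"))
print(matching_ids("/alpha' OR scope LIKE '/%"))
print(matching_ids("/O'Brien"))
print(matching_ids("/100%"))
```

With the escaping in place, the injection payload is just an unusual prefix that matches no scope, the apostrophe-containing scope is found normally, and "/100%" matches only the scope that literally contains a percent sign.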

Note: update, get_record, and touch_records already escaped IDs inline via replace("'", "''"). Those callsites were switched to the shared helper for consistency, but their behaviour is unchanged.
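For ID-based filters only quote doubling is needed, since no LIKE pattern is involved. The sketch below shows one way an IN list could be built for several record IDs; `ids_filter` and the `id` column name are hypothetical and are not taken from the diff.

```python
def escape_sql_str(value: str) -> str:
    """Double single quotes so the value is safe inside a '...' SQL literal."""
    return value.replace("'", "''")


def ids_filter(record_ids: list[str]) -> str:
    """Build an `id IN (...)` predicate from caller-supplied IDs (hypothetical helper)."""
    if not record_ids:
        # An empty IN () list is not valid SQL, so refuse it explicitly.
        raise ValueError("record_ids must not be empty")
    quoted = ", ".join(f"'{escape_sql_str(rid)}'" for rid in record_ids)
    return f"id IN ({quoted})"


print(ids_filter(["O'Reilly-42", "x"]))
print(ids_filter(["x') OR ('1'='1"]))
```

In the second call every quote in the payload is doubled, so the whole payload stays inside a single string literal.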

Tests

Added lib/crewai/tests/memory/test_lancedb_storage_security.py with 12 regression tests covering every sink:

  • helper unit tests for _escape_sql_str and _escape_like,
  • injection payloads against search, delete(scope_prefix), delete(record_ids), and reset — including the delete(scope_prefix="/alpha' OR scope LIKE '/%") payload from the report — asserting that no unintended record is touched,
  • legitimate-input round-trips for apostrophe-containing scopes and IDs across all scan-based readers,
  • assertions that % in a caller-supplied scope_prefix is treated as a literal, not a wildcard.

All existing memory tests in lib/crewai/tests/memory/ continue to pass.

Review & Testing Checklist for Human

  • Confirm the ESCAPE '\\' clause is supported by the minimum LanceDB version pinned in lib/crewai/pyproject.toml. DataFusion has supported it for a long time, but worth a quick sanity check against the lockfile.
  • Verify that no production caller relied on % or _ in a scope_prefix actually behaving as a wildcard (this PR turns them into literals — which matches the documented "scope path" semantics, but is a behaviour change for any code that was using LIKE-style globs).
  • Skim the diff against lancedb_storage.py to confirm no other f-string-built WHERE clause was missed (I searched for all of them; the placeholder-row delete on line 210 is the only remaining one and uses a hard-coded literal).
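As an aid for the last checklist item, a reviewer could scan a source file for f-strings passed directly to filter-taking methods. The sketch below is one rough way to do that with the standard-library ast module; the method names `where` and `delete` are assumptions about which calls accept a filter string, and the scan does not catch filters that are first assigned to a variable.

```python
import ast


def find_fstring_filters(source: str, method_names=("where", "delete")):
    """Return (line, method) pairs where an f-string is passed straight to a method call."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if not (isinstance(func, ast.Attribute) and func.attr in method_names):
            continue
        values = list(node.args) + [kw.value for kw in node.keywords]
        # ast.JoinedStr is the AST node type for an f-string.
        if any(isinstance(v, ast.JoinedStr) for v in values):
            hits.append((node.lineno, func.attr))
    return sorted(hits)


sample = '''
tbl.search(q).where(f"scope LIKE '{prefix}%'")
tbl.delete("id = 'placeholder'")
tbl.delete(f"id = '{rid}'")
'''
print(find_fstring_filters(sample))
```

Each hit still needs a manual look: an f-string that only interpolates escaped values is fine, while a plain string literal, like the placeholder delete in the sample, is not flagged at all.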

Notes

Reproduction script (pre-fix output) and a per-sink demonstration are included in the linked session. The 4 sinks called out in the issue all flow through this file; no other module in crewai (and no module in crewai-tools) builds a LanceDB where() from caller-supplied input.

The reporter (@ibondarenko1) does not need their CVSS-9.6 PoC to verify the fix: the deterministic delete(scope_prefix="/alpha' OR scope LIKE '/%") test in this PR exercises the same class of bug.

Link to Devin session: https://app.devin.ai/sessions/75235e5dd2a74ca4a347a558ef7cc052

LanceDBStorage interpolated caller-supplied scope paths and record IDs
directly into the WHERE clauses passed to LanceDB's where(), which
accepts a raw DataFusion SQL expression and does not support
parameterized queries. A malicious or unprivileged caller could escape
the configured scope sandbox -- for example, calling
delete(scope_prefix="/alpha' OR scope LIKE '/%") would wipe every
record in the table instead of just the /alpha subtree -- and ordinary
strings containing apostrophes (e.g. "O'Brien") could crash the SQL
parser.

Add _escape_sql_str() and _escape_like() helpers and route every
user-controlled value through them in search(), delete(), reset(), and
the shared _scan_rows() reader. The LIKE clauses now also use
ESCAPE '\\' so % and _ in caller-supplied prefixes are treated as
literals instead of wildcards.

Adds tests/memory/test_lancedb_storage_security.py covering each
sink (search, delete by scope, delete by id, reset, scan-based
readers) with both injection payloads and legitimate apostrophe-
containing scopes/IDs.
@devin-ai-integration
Contributor Author

Prompt hidden (unlisted session)

@devin-ai-integration
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@github-actions github-actions Bot added the size/L label May 6, 2026

Development

Successfully merging this pull request may close these issues.

Security: Request to enable Private Vulnerability Reporting / coordinate channel
