feat: disambiguation firewall — halt on undefined data concepts instead of guessing #4

@stackbilt-admin

Context

LLMs don't fix garbage data architectures; they hallucinate faster and with more confidence. When a user asks a vague question about data or metrics and the concept in question isn't strictly defined in the MCP tool schema, AEGIS should halt and request clarification instead of guessing.

This is the OSS-applicable variant of Stackbilt-dev/aegis#344.

What to implement

Disambiguation middleware (pre-routing interceptor)

When an incoming request references a data concept:

  1. Check if the concept maps to a typed MCP tool parameter or known schema definition
  2. If yes → route normally
  3. If no → invoke request_clarification instead of forwarding to an executor
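
The three steps above could be sketched roughly as follows. This is a minimal illustration, not the AEGIS implementation: the `ConceptRegistry` shape, the `disambiguate` function, and the tool names `get_seat_churn` / `get_mrr_churn` are all hypothetical, and extracting the concept from the raw request is assumed to happen upstream.

```typescript
// Hypothetical registry: normalized concept name -> typed MCP tool that defines it.
type ConceptRegistry = ReadonlyMap<string, string>;

type RoutingDecision =
  | { kind: "route"; tool: string }
  | { kind: "clarify"; concept: string; message: string };

// Normalize so "Seat Churn" and "seat-churn" resolve to the same key.
function normalizeConcept(raw: string): string {
  return raw.trim().toLowerCase().replace(/[\s-]+/g, "_");
}

// Pre-routing check: route only when the concept has a schema-backed definition.
function disambiguate(concept: string, registry: ConceptRegistry): RoutingDecision {
  const key = normalizeConcept(concept);
  const tool = registry.get(key);
  if (tool !== undefined) {
    return { kind: "route", tool };
  }
  // No definition found: halt and surface the known options instead of guessing.
  const known = Array.from(registry.keys()).sort().join(", ");
  return {
    kind: "clarify",
    concept: key,
    message: `"${concept}" is not defined in any tool schema. Known concepts: ${known}`,
  };
}

// Illustrative registry contents (assumed, not taken from AEGIS).
const registry: ConceptRegistry = new Map([
  ["seat_churn", "get_seat_churn"],
  ["mrr_churn", "get_mrr_churn"],
]);

const defined = disambiguate("Seat Churn", registry); // routes to get_seat_churn
const vague = disambiguate("churn", registry);        // asks for clarification
```

An exact-match lookup is deliberately strict here: a bare "churn" does not silently resolve to either definition, which is the behavior the issue asks for.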

System prompt guidance

"If a user asks for data or metrics and the definition of that data is not strictly defined in your context window, DO NOT GUESS. You must invoke the request_clarification function. Point out the ambiguity. For example: 'Do you mean churn by seat count or MRR?' You are an auditor, not a psychic."
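
For the prompt to work, the model needs a `request_clarification` tool to call. One possible shape is sketched below. The `name` / `description` / `inputSchema` layout follows the general form of an MCP tool definition, but the argument names (`ambiguous_term`, `candidate_meanings`, `question`) and the `buildClarification` helper are assumptions made for illustration.

```typescript
// Hypothetical tool definition; the argument names are illustrative only.
const requestClarificationTool = {
  name: "request_clarification",
  description:
    "Call this instead of answering when a data or metric concept is not strictly defined.",
  inputSchema: {
    type: "object",
    properties: {
      ambiguous_term: { type: "string" },
      candidate_meanings: { type: "array", items: { type: "string" }, minItems: 2 },
      question: { type: "string" },
    },
    required: ["ambiguous_term", "candidate_meanings", "question"],
    additionalProperties: false,
  },
} as const;

interface ClarificationArgs {
  ambiguous_term: string;
  candidate_meanings: string[];
  question: string;
}

// Join options as "a or b" or "a, b or c".
function formatOptions(options: string[]): string {
  if (options.length <= 1) return options.join("");
  return options.slice(0, -1).join(", ") + " or " + options[options.length - 1];
}

// Build the arguments for a request_clarification call.
function buildClarification(term: string, candidates: string[]): ClarificationArgs {
  if (candidates.length < 2) {
    // With fewer than two readings there is nothing to disambiguate.
    throw new Error("a clarification needs at least two candidate meanings");
  }
  return {
    ambiguous_term: term,
    candidate_meanings: candidates,
    question: `Do you mean ${term} by ${formatOptions(candidates)}?`,
  };
}

const clarification = buildClarification("churn", ["seat count", "MRR"]);
// clarification.question === "Do you mean churn by seat count or MRR?"
```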

Tool-mediated access enforcement

  • All data access should go through typed MCP tool interfaces
  • LLMs should never see raw D1 schemas directly
  • Document the pattern: expose strict, Zod-validated endpoints as the canonical data model. The LLM calls tools, not tables.
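
A rough sketch of the "tools, not tables" pattern follows. To stay dependency-free it uses a hand-written validator in place of Zod; in a real deployment a Zod schema would presumably fill the same role. The `get_churn`-style tool, its `basis` and `period` parameters, and the injected `ChurnQuery` function are all hypothetical. The point is that the LLM-facing surface is a validated parameter object, and the D1 query stays hidden behind it.

```typescript
// The only vocabulary the LLM is allowed to use for this metric.
type ChurnBasis = "seat_count" | "mrr";

interface ChurnParams {
  basis: ChurnBasis;
  period: string; // "YYYY-MM"
}

type Parsed<T> = { ok: true; value: T } | { ok: false; errors: string[] };

// Strict validation: unknown keys and out-of-vocabulary values are rejected.
function parseChurnParams(input: unknown): Parsed<ChurnParams> {
  if (typeof input !== "object" || input === null) {
    return { ok: false, errors: ["input must be an object"] };
  }
  const rec = input as Record<string, unknown>;
  const errors: string[] = [];

  const extra = Object.keys(rec).filter((k) => k !== "basis" && k !== "period");
  if (extra.length > 0) errors.push(`unknown keys: ${extra.join(", ")}`);

  const basis = rec["basis"];
  if (basis !== "seat_count" && basis !== "mrr") {
    errors.push('basis must be "seat_count" or "mrr"');
  }

  const period = rec["period"];
  if (typeof period !== "string" || !/^\d{4}-(0[1-9]|1[0-2])$/.test(period)) {
    errors.push("period must be formatted as YYYY-MM");
  }

  if (errors.length > 0) return { ok: false, errors };
  return { ok: true, value: { basis: basis as ChurnBasis, period: period as string } };
}

// The data access itself is injected; the LLM never sees the schema behind it.
type ChurnQuery = (params: ChurnParams) => number;

function makeChurnTool(query: ChurnQuery) {
  return (input: unknown): Parsed<number> => {
    const parsed = parseChurnParams(input);
    if (!parsed.ok) return parsed; // validation errors go back to the caller
    return { ok: true, value: query(parsed.value) };
  };
}

// Stub query returning a placeholder value; a real one would read from D1.
const churnTool = makeChurnTool(() => 0);

const accepted = churnTool({ basis: "mrr", period: "2024-01" });
const rejected = churnTool({ basis: "revenue", period: "2024-13", table: "subscriptions" });
```

Because `basis` is a closed enum, a request for plain "churn" cannot be expressed as a valid call at all, which pushes the ambiguity back to the clarification path rather than into a query.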

Relevance to OSS users

Any team deploying AEGIS-OSS with D1 backends and MCP tool registries faces the same antipattern: the model answers questions about metrics that nobody has strictly defined. This gives OSS users a built-in guard against the text-to-SQL trap without requiring a separate semantic layer product.
