Merged
35 commits
8235c01
chore(audit): fix env smoke scripts and proof pack status
hungthinh1104 Jun 25, 2026
8f99a85
chore: harden post-mvp runtime safety and quality gates
hungthinh1104 Jun 25, 2026
3c4e5cc
refactor: extract shared runtime and policy utilities
hungthinh1104 Jun 25, 2026
b909a69
refactor(auth): isolate dev-login policy and explicit auth mode
hungthinh1104 Jun 25, 2026
817eb0d
refactor(worker): extract embedding application boundary
hungthinh1104 Jun 25, 2026
42fbf8b
feat(multi-repo): expose merged report capabilities
hungthinh1104 Jun 26, 2026
48f8e7b
fix(multi-repo): harden merged report lifecycle
hungthinh1104 Jun 26, 2026
2ec3bed
refactor(application): move impact analysis runtime to application pa…
hungthinh1104 Jun 26, 2026
df7073d
refactor(worker): move document job processor to worker app
hungthinh1104 Jun 26, 2026
eceaf60
fix(domain-packs): harden runtime scope and lint debt
hungthinh1104 Jun 27, 2026
2263485
fix(domain-packs): remove multi-domain capability debt
hungthinh1104 Jun 27, 2026
5afebce
chore(ci): add stability verification gate
hungthinh1104 Jun 27, 2026
1f4e90d
feat(contracts): add domain pack selection metadata
hungthinh1104 Jun 27, 2026
96d2bdc
feat(api): resolve and persist selected domain pack
hungthinh1104 Jun 27, 2026
ef20fbb
feat(web): add backend-driven domain pack selector
hungthinh1104 Jun 27, 2026
166aa6d
test(domain-packs): cover healthcare partial invariants
hungthinh1104 Jun 27, 2026
e418653
feat(document): include domain pack provenance in reports
hungthinh1104 Jun 27, 2026
f91b383
docs(domain-packs): document explicit partial domain selection
hungthinh1104 Jun 27, 2026
cd0bb1a
feat(domain-packs): persist domain pack provenance first-class
hungthinh1104 Jun 27, 2026
8861e13
feat(domain-packs): expose full report provenance
hungthinh1104 Jun 27, 2026
b904100
test(domain-packs): add pack governance validation
hungthinh1104 Jun 27, 2026
084659a
test(domain-packs): enforce versioning and alias guardrails
hungthinh1104 Jun 27, 2026
74b2e1d
feat(domain-packs): add ecommerce partial domain pack
hungthinh1104 Jun 27, 2026
7dea9c6
test(domain-packs): add cross-domain evaluation summary
hungthinh1104 Jun 27, 2026
d283dc5
refactor(scanner): harden scan pipeline persistence boundary
hungthinh1104 Jun 27, 2026
8ca7945
refactor(scanner): harden snapshot publication boundaries
hungthinh1104 Jun 28, 2026
2258617
feat(evidence): classify evidence quality for impact reports
hungthinh1104 Jun 29, 2026
756ceba
feat(review): add report review coverage summary
hungthinh1104 Jun 29, 2026
786ffd6
feat(review): enforce critical review coverage before approval
hungthinh1104 Jun 29, 2026
435824b
feat(security): add public beta runtime guardrails
hungthinh1104 Jun 29, 2026
e1b9393
feat(scanner): add scan workspace retention cleanup
hungthinh1104 Jun 29, 2026
9a6f51b
feat(system): expose beta operations health summary
hungthinh1104 Jun 29, 2026
2eb6389
fix(review): sync traceability decision status
hungthinh1104 Jun 29, 2026
01faf2d
fix(release): harden preview release guardrails
hungthinh1104 Jun 29, 2026
9159b40
docs(release): clarify controlled beta boundary
hungthinh1104 Jun 29, 2026
7 changes: 0 additions & 7 deletions .eslintrc.cjs

This file was deleted.

12 changes: 11 additions & 1 deletion .github/workflows/ci.yml
@@ -2,8 +2,15 @@ name: ci

"on":
push:
branches: ["main"]
branches:
- "**"
pull_request:
branches:
- main

concurrency:
group: ci-${{ github.ref }}
cancel-in-progress: true

jobs:
quality-and-tests:
@@ -80,3 +87,6 @@ jobs:

- name: Golden path demo
run: pnpm demo:golden-path

- name: Multi-repo golden path demo
run: pnpm demo:multi-repo-golden-path
33 changes: 27 additions & 6 deletions AGENTS.md
@@ -4,6 +4,21 @@

This repository builds a **Requirement-to-Code Impact Analyzer for Technical BA**.

The core product key is not "multi-domain" and not "AI code analysis".
The core path is:

```text
Requirement change
-> impacted code artifacts
-> source evidence
-> unknowns / risks / QA scenarios
-> human review
-> approved traceable report
```

The value is reducing risk when requirements change by making impact analysis
evidence-backed, reviewable, and provenance-locked.

The MVP is deliberately narrow:

```text
@@ -41,14 +56,18 @@ Update docs + contracts + tests before completion.
Work is currently focused on:

```text
1. Snapshot drift and freshness lifecycle
2. Drift-based stale/re-analysis warnings
3. Incremental scan foundation
4. Evaluation packs for impact quality
5. Domain Pack architecture
6. Public beta hardening
1. Scan pipeline atomicity and snapshot publication safety
2. Evidence quality scoring and weak/missing evidence detection
3. Impact precision evaluation packs
4. Review coverage gates
5. Report trust UX and provenance visibility
6. Snapshot drift/freshness and public beta hardening
```

Do not add new domains as the center of gravity. Domain packs are controlled
terminology/risk/QA hint layers. Evidence is the source of truth and human
review is the final authority.
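
Reviewer sketch of the rule above, in TypeScript: a claim's certainty label is derived only from persisted evidence, never from where the suggestion originated. This is an illustration under assumptions; `Claim`, `EvidenceRef`, `ClaimOrigin`, and `resolveClaimStatus` are hypothetical names and are not taken from the repository's contracts.

```typescript
// Hypothetical types for illustration only; not the repository's real contracts.
type ClaimStatus = "EVIDENCED" | "UNKNOWN";
type ClaimOrigin = "SCANNER" | "DOMAIN_PACK_HINT" | "LLM_SUGGESTION";

interface EvidenceRef {
  evidenceId: string; // id of a persisted evidence record
  commitSha: string; // repository snapshot the evidence was extracted from
}

interface Claim {
  text: string;
  origin: ClaimOrigin;
  evidence: EvidenceRef[];
}

// The label depends only on persisted evidence. The origin is deliberately
// ignored, so a domain-pack hint or LLM suggestion cannot upgrade itself.
function resolveClaimStatus(claim: Claim): ClaimStatus {
  return claim.evidence.length > 0 ? "EVIDENCED" : "UNKNOWN";
}
```

Under this shape, a domain-pack hint with an empty `evidence` array stays `UNKNOWN` until a reviewer or scanner attaches a persisted record.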

## Instruction Loading And Workflow

This file is the repository-level instruction source.
@@ -115,6 +134,8 @@ errors/logging, TypeScript/lint/CI configuration, or async worker behavior.
- EVIDENCED = current MVP name for evidence-backed claim.
- Long-term target naming should be CONFIRMED / INFERRED / UNKNOWN / CONFLICTING.
- UI must not invent additional certainty labels.
- Domain-pack hints, LLM suggestions, and retrieval candidates cannot create
`EVIDENCED` claims without persisted source evidence.
4. Missing support becomes `UNKNOWN`, `CONFLICTING`, or a stakeholder question, never an invented business rule.
5. Every analysis and generated artifact is tied to a repository snapshot and
its `commitSha`; moving-ref freshness is computed through its selected
38 changes: 28 additions & 10 deletions README.md
@@ -1,6 +1,10 @@
# BA Helper: Requirement-to-Code Impact Analyzer

**BA Helper** is a specialized impact analyzer for backend teams. It bridges the gap between changing business requirements and backend architecture. In research contexts, the engine is referred to as **ReqImpact**.
**BA Helper** is an evidence-backed Requirement-to-Code Impact Analyzer for backend teams. It helps teams understand what a requirement change may affect in backend systems, with source evidence, unknowns, risks, QA scenarios, human review, and traceable reports. In research contexts, the engine is referred to as **ReqImpact**.

The core value is reducing risk when requirements change. The product is not a
generic repo chatbot, an AI coding assistant, an auto-BRD generator, or a
multi-domain intelligence platform.

## 1. The Problem
When a business requirement changes (e.g., "allow users to cancel paid bookings for a refund"), Technical Business Analysts (BAs) and QA Engineers must manually trace how that change cascades through the backend codebase. This process is historically slow, heavily reliant on tribal knowledge, and lacks an immutable audit trail—often resulting in missed edge cases and unhandled regression risks.
@@ -13,6 +17,15 @@ BA Helper automates the heavy lifting of traceability while enforcing strict hum
4. **Snapshot:** Freezes the reviewed decisions into an immutable reviewed snapshot.
5. **Final Export:** Generates a deterministic, audited markdown report directly from the locked snapshot.

```text
Requirement change
-> impacted code artifacts
-> source evidence
-> unknowns / risks / QA scenarios
-> human review
-> approved traceable report
```

## 3. Why It Is Different from a Repo Chatbot
Unlike generic AI coding assistants or repo chatbots:
- **No Hallucinated Claims:** Every insight must link to a persisted code `Evidence` record.
@@ -37,7 +50,7 @@ pnpm demo:golden-path
```

**Visual Case Study:**
For a step-by-step visual walkthrough of this workflow, see the [Demo Case Study](docs/portfolio/case-study.md), which features an 8-screen proof pack demonstrating the full end-to-end audit and lifecycle process.
For a step-by-step visual walkthrough of this workflow, see the [Demo Case Study](docs/portfolio/case-study.md), which features a partial visual proof pack demonstrating key milestones of the audit and lifecycle process.

**Sample Requirement:**
> "When a paid booking is cancelled, the system must refund the tenant, prevent double refunds, update booking/payment state, and notify relevant parties."
@@ -112,8 +125,7 @@ pnpm install
Create the environment files from their examples. The examples contain safe, pre-configured local placeholders (including a fake AI provider).

```bash
cp apps/api/.env.example apps/api/.env
cp apps/web/.env.example apps/web/.env.local
cp .env.example .env
```

For containerized web runtime, keep two URLs straight:
@@ -167,7 +179,9 @@ pnpm dev:worker
# Start frontend web app (Port 3000)
pnpm dev:web
```
Open `http://localhost:3000/login` and sign in using the dev-login bypass.
Open `http://localhost:3000/login` and use the dev sign-in form. In local
development, `ENABLE_DEV_LOGIN=true` lets you enter with a demo operator email
and role; do not expose that endpoint on a public API host.
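
Reviewer sketch of how the dev-login warning could be enforced at startup rather than left to documentation. Only the `ENABLE_DEV_LOGIN` variable name comes from the README; the `NODE_ENV` check, the `RuntimeEnv` shape, and both function names are assumptions for illustration and may not match the actual auth policy module.

```typescript
// Hypothetical guard. ENABLE_DEV_LOGIN is the README's variable name;
// everything else here is assumed for the sake of the example.
interface RuntimeEnv {
  NODE_ENV?: string;
  ENABLE_DEV_LOGIN?: string;
}

// Dev-login is allowed only when explicitly requested and not in production.
function isDevLoginAllowed(env: RuntimeEnv): boolean {
  const requested = env.ENABLE_DEV_LOGIN === "true";
  const isProduction = env.NODE_ENV === "production";
  return requested && !isProduction;
}

// Fail fast at bootstrap instead of silently exposing the endpoint.
function assertDevLoginPolicy(env: RuntimeEnv): void {
  if (env.ENABLE_DEV_LOGIN === "true" && env.NODE_ENV === "production") {
    throw new Error("ENABLE_DEV_LOGIN must not be enabled when NODE_ENV=production");
  }
}
```

A bootstrap file could call `assertDevLoginPolicy(process.env)` before the HTTP server starts listening, so a misconfigured public host refuses to boot.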

### 10. Real Runtime Smoke Lanes
The default CI and golden path stay on fake providers. Real-provider smoke is explicit and manual:
@@ -238,21 +252,25 @@ Built as a TypeScript modular monolith to balance speed of development with even
- Unsupported route patterns, file scan blind spots, artifact uncertainty, and dependency boundaries become diagnostics, `UNKNOWN`, or `RISK` items requiring review.
- Experimental scanners must not be presented as production-grade language support.
- Domain packs are hints, not evidence.
- Domain packs are context adapters for terminology and risk/QA hints; evidence
and review remain the trust anchors.
- LLM output is constrained by extracted evidence and human review; it is not allowed to finalize reports by itself.
- Evaluation metrics are internal quality signals, not public benchmarks.
- Automated CI golden path uses fake providers; manual UI demo runs with Gemini real LLM when configured.
- Production SaaS concerns such as GitHub App auth, billing, and hosted multi-tenant deployment are not complete.

## Roadmap
1. Keep TypeScript/NestJS as the primary public demo story.
2. Harden pilot scanner adapters while keeping capability status explicit.
3. Improve visual review and traceability flows without weakening the evidence hierarchy.
4. Native OAuth and GitHub App integrations.
1. Harden scan pipeline atomicity and snapshot publication safety.
2. Add evidence quality scoring for weak/missing/conflicting support.
3. Improve impact precision evaluation packs and scorecards.
4. Tighten review coverage gates and report trust UX.
5. Continue drift/freshness hardening and controlled beta readiness.
6. Expand domains/languages only behind explicit capability status and evaluation coverage.

## Documentation & Assets
- **[Golden Path Demo Guide](docs/demo/golden-path.md)**
- **[Sample Requirement Change](docs/demo/sample-requirement-change.md)**
- **[Public Beta Release Note](docs/demo/public-beta-release-note.md)**
- **[Controlled Beta Release Note](docs/demo/public-beta-release-note.md)**
- **[Portfolio Proof Pack](docs/demo/portfolio-proof-pack.md)**
- **[Public Demo Checklist](docs/demo/public-demo-checklist.md)**
- **[Impact Evaluation Docs](docs/evaluation/impact-evaluation.md)**
5 changes: 3 additions & 2 deletions SECURITY.md
@@ -1,12 +1,13 @@
# Security Policy

## Supported Status
This project is currently in **Public Beta / Experimental**.
While we take security seriously, please note that we do not have a formal security certification yet.
This project is currently in **Controlled Beta / Experimental**.
While we take security seriously, please note that we do not have a formal security certification yet.

## Current Limitations
- Production SaaS concerns such as GitHub App auth, billing, and hosted multi-tenant deployment are not complete.
- We rely on deterministic limits (bounded diagnostics, explicitly skipped large files) rather than formalized sandboxing for repo ingestion.
- Dev-login is for local development and private controlled demos only. Do not expose a hosted API publicly with dev-login enabled.

## Reporting a Vulnerability

7 changes: 4 additions & 3 deletions apps/api/package.json
@@ -6,10 +6,10 @@
"scripts": {
"build": "tsc -p tsconfig.json",
"dev": "dotenv -e ../../.env -- pnpm exec ts-node -r tsconfig-paths/register --project tsconfig.json src/main.ts",
"lint": "echo \"lint api\"",
"lint": "eslint \"src/**/*.ts\" \"test/**/*.ts\"",
"smoke:public-github": "dotenv -e ../../.env -- tsx src/smoke-e2e.ts",
"smoke:public-github:real-llm": "REAL_LLM_SMOKE=true dotenv -e ../../.env -- tsx src/smoke-e2e.ts",
"smoke:public-github:real-path": "REAL_PATH_SMOKE=true dotenv -e ../../.env -- tsx src/smoke-e2e.ts",
"smoke:public-github:real-llm": "dotenv -e ../../.env -v REAL_LLM_SMOKE=true -- tsx src/smoke-e2e.ts",
"smoke:public-github:real-path": "dotenv -e ../../.env -v REAL_PATH_SMOKE=true -- tsx src/smoke-e2e.ts",
"test": "jest",
"typecheck": "tsc --noEmit",
"prisma:generate": "dotenv -e ../../.env -- prisma generate",
@@ -19,6 +19,7 @@
"dependencies": {
"@anthropic-ai/sdk": "^0.99.0",
"@ba-helper/analyzer": "workspace:*",
"@ba-helper/application": "workspace:*",
"@ba-helper/contracts": "workspace:*",
"@google/generative-ai": "^0.24.1",
"@nestjs/bullmq": "11.0.4",
@@ -0,0 +1,88 @@
-- CreateEnum
CREATE TYPE "DomainPackCapabilityStatus" AS ENUM ('STABLE', 'PARTIAL', 'EXPERIMENTAL', 'FALLBACK');

-- CreateEnum
CREATE TYPE "DomainPackSelectionSource" AS ENUM ('EXPLICIT', 'REPOSITORY_PROFILE', 'FALLBACK');

-- AlterTable
ALTER TABLE "ImpactAnalysis"
ADD COLUMN "requestedDomainPackId" TEXT,
ADD COLUMN "resolvedDomainPackId" TEXT NOT NULL DEFAULT 'general',
ADD COLUMN "resolvedDomainPackVersion" TEXT NOT NULL DEFAULT '0.0.0',
ADD COLUMN "resolvedDomainPackStatus" "DomainPackCapabilityStatus" NOT NULL DEFAULT 'FALLBACK',
ADD COLUMN "domainPackSelectedBy" "DomainPackSelectionSource" NOT NULL DEFAULT 'FALLBACK',
ADD COLUMN "domainPackResolvedAt" TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
ADD COLUMN "domainPackManifestDigest" TEXT,
ADD COLUMN "domainPackRegistryVersion" TEXT;

-- AlterTable
ALTER TABLE "MultiRepoAnalysisRun"
ADD COLUMN "requestedDomainPackId" TEXT,
ADD COLUMN "resolvedDomainPackId" TEXT NOT NULL DEFAULT 'general',
ADD COLUMN "resolvedDomainPackVersion" TEXT NOT NULL DEFAULT '0.0.0',
ADD COLUMN "resolvedDomainPackStatus" "DomainPackCapabilityStatus" NOT NULL DEFAULT 'FALLBACK',
ADD COLUMN "domainPackSelectedBy" "DomainPackSelectionSource" NOT NULL DEFAULT 'FALLBACK',
ADD COLUMN "domainPackResolvedAt" TIMESTAMP(3) NOT NULL DEFAULT CURRENT_TIMESTAMP,
ADD COLUMN "domainPackManifestDigest" TEXT,
ADD COLUMN "domainPackRegistryVersion" TEXT;

-- Backfill ImpactAnalysis from the resolved selection persisted in metadata.
UPDATE "ImpactAnalysis"
SET
"requestedDomainPackId" = NULLIF("metadata" #>> '{selectedDomainPack,requestedDomainPackId}', ''),
"resolvedDomainPackId" = COALESCE(NULLIF("metadata" #>> '{selectedDomainPack,resolvedDomainPackId}', ''), "resolvedDomainPackId"),
"resolvedDomainPackVersion" = COALESCE(NULLIF("metadata" #>> '{selectedDomainPack,resolvedDomainPackVersion}', ''), "resolvedDomainPackVersion"),
"resolvedDomainPackStatus" = CASE
WHEN "metadata" #>> '{selectedDomainPack,resolvedDomainPackStatus}' IN ('STABLE', 'PARTIAL', 'EXPERIMENTAL', 'FALLBACK')
THEN ("metadata" #>> '{selectedDomainPack,resolvedDomainPackStatus}')::"DomainPackCapabilityStatus"
ELSE "resolvedDomainPackStatus"
END,
"domainPackSelectedBy" = CASE
WHEN "metadata" #>> '{selectedDomainPack,selectedBy}' IN ('EXPLICIT', 'REPOSITORY_PROFILE', 'FALLBACK')
THEN ("metadata" #>> '{selectedDomainPack,selectedBy}')::"DomainPackSelectionSource"
ELSE "domainPackSelectedBy"
END,
"domainPackResolvedAt" = CASE
WHEN ("metadata" #>> '{selectedDomainPack,resolvedAt}') ~ '^\d{4}-\d{2}-\d{2}T'
THEN ("metadata" #>> '{selectedDomainPack,resolvedAt}')::timestamp
ELSE "domainPackResolvedAt"
END
WHERE "metadata" ? 'selectedDomainPack';

-- Backfill multi-repo runs from the first child analysis. v1 creates child
-- analyses with the same explicit run-level selection when a run-level pack is
-- requested; mixed or legacy runs retain conservative defaults.
UPDATE "MultiRepoAnalysisRun" AS run
SET
"requestedDomainPackId" = child."requestedDomainPackId",
"resolvedDomainPackId" = child."resolvedDomainPackId",
"resolvedDomainPackVersion" = child."resolvedDomainPackVersion",
"resolvedDomainPackStatus" = child."resolvedDomainPackStatus",
"domainPackSelectedBy" = child."domainPackSelectedBy",
"domainPackResolvedAt" = child."domainPackResolvedAt",
"domainPackManifestDigest" = child."domainPackManifestDigest",
"domainPackRegistryVersion" = child."domainPackRegistryVersion"
FROM LATERAL (
SELECT
analysis."requestedDomainPackId",
analysis."resolvedDomainPackId",
analysis."resolvedDomainPackVersion",
analysis."resolvedDomainPackStatus",
analysis."domainPackSelectedBy",
analysis."domainPackResolvedAt",
analysis."domainPackManifestDigest",
analysis."domainPackRegistryVersion"
FROM "ImpactAnalysis" AS analysis
WHERE analysis."multiRepoRunId" = run."id"
ORDER BY analysis."createdAt" ASC
LIMIT 1
) AS child
WHERE child."resolvedDomainPackId" IS NOT NULL;

-- CreateIndex
CREATE INDEX "ImpactAnalysis_resolvedDomainPackId_resolvedDomainPackVersion_idx"
ON "ImpactAnalysis"("resolvedDomainPackId", "resolvedDomainPackVersion");

-- CreateIndex
CREATE INDEX "MultiRepoAnalysisRun_resolvedDomainPackId_resolvedDomainPackVersion_idx"
ON "MultiRepoAnalysisRun"("resolvedDomainPackId", "resolvedDomainPackVersion");
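
Reviewer sketch of the `ImpactAnalysis` backfill rules above, written as a pure TypeScript function so the fallback behavior is easier to reason about and unit test. It mirrors the `NULLIF`/`COALESCE` and enum-membership checks from the migration for four of the columns. The names `Provenance`, `DEFAULTS`, `nonEmpty`, and `backfillProvenance` are hypothetical. One assumed difference: the SQL `#>>` operator stringifies non-string JSON values, while this sketch treats them as missing.

```typescript
// Hypothetical mirror of the migration's backfill rules; not repository code.
const STATUSES = ["STABLE", "PARTIAL", "EXPERIMENTAL", "FALLBACK"] as const;
type Status = (typeof STATUSES)[number];

interface Provenance {
  requestedDomainPackId: string | null;
  resolvedDomainPackId: string;
  resolvedDomainPackVersion: string;
  resolvedDomainPackStatus: Status;
}

// Same values as the column defaults declared in the migration.
const DEFAULTS: Provenance = {
  requestedDomainPackId: null,
  resolvedDomainPackId: "general",
  resolvedDomainPackVersion: "0.0.0",
  resolvedDomainPackStatus: "FALLBACK",
};

// Mirrors NULLIF(value, ''): empty or non-string values become null.
function nonEmpty(value: unknown): string | null {
  return typeof value === "string" && value !== "" ? value : null;
}

// Mirrors the CASE ... IN (...) guard before the enum cast.
function isStatus(value: unknown): value is Status {
  return typeof value === "string" && (STATUSES as readonly string[]).indexOf(value) !== -1;
}

function backfillProvenance(metadata: Record<string, unknown>): Provenance {
  const selected = metadata["selectedDomainPack"];
  // Rows without a selectedDomainPack object keep the column defaults.
  if (typeof selected !== "object" || selected === null) {
    return { ...DEFAULTS };
  }
  const s = selected as Record<string, unknown>;
  const status = s["resolvedDomainPackStatus"];
  return {
    requestedDomainPackId: nonEmpty(s["requestedDomainPackId"]),
    resolvedDomainPackId: nonEmpty(s["resolvedDomainPackId"]) ?? DEFAULTS.resolvedDomainPackId,
    resolvedDomainPackVersion: nonEmpty(s["resolvedDomainPackVersion"]) ?? DEFAULTS.resolvedDomainPackVersion,
    resolvedDomainPackStatus: isStatus(status) ? status : DEFAULTS.resolvedDomainPackStatus,
  };
}
```

The design point is that an unrecognized status string or an empty id never fails the backfill; it falls back to the conservative `general` / `FALLBACK` defaults, which is the same choice the SQL makes.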