56 deterministic Leads scenarios · Excel reports with embedded screenshots · Risk-based testing · Independent defect verification · Staging-first safety
Quick start · How it works · Reports · Safety · Roadmap
QaAgent-Pro is a local-first TypeScript + Playwright QA system that approaches CRM testing like a careful human QA engineer:
- it starts with an explicit test mission;
- prioritizes by likelihood × business impact;
- applies experience-based testing heuristics;
- executes only registered deterministic actions;
- validates UI, network, persistence, search/table, console, accessibility, and performance evidence;
- independently decides whether a failure belongs to the application, automation, environment, or an unclear product rule;
- produces a stakeholder-ready Excel workbook.
It is intentionally not a random autonomous clicking agent.
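The "likelihood × business impact" prioritization in the list above can be sketched as follows. This is a minimal illustration rather than the repository's code: the 1 to 5 rating scale, the `RiskItem` type, the example ratings, and the `riskScore` and `prioritize` helpers are all assumptions made for the example.

```typescript
// Hypothetical scale and types; the real risk model may differ.
type Rating = 1 | 2 | 3 | 4 | 5;

interface RiskItem {
  scenario: string;
  likelihood: Rating; // how likely the behavior is to fail
  impact: Rating; // business cost if it does fail
}

// Risk score is the product of the two ratings (range 1..25).
function riskScore(item: RiskItem): number {
  return item.likelihood * item.impact;
}

// Highest risk first. The input array is copied, not mutated.
function prioritize(items: readonly RiskItem[]): RiskItem[] {
  return [...items].sort((a, b) => riskScore(b) - riskScore(a));
}

const ordered = prioritize([
  { scenario: "Sort leads by name", likelihood: 2, impact: 2 }, // score 4
  { scenario: "Convert lead to deal", likelihood: 3, impact: 5 }, // score 15
  { scenario: "Duplicate lead creation", likelihood: 4, impact: 3 }, // score 12
]);

// ["Convert lead to deal", "Duplicate lead creation", "Sort leads by name"]
const topScenarios = ordered.map((item) => item.scenario);
```

Multiplying the two ratings keeps a low-likelihood, high-impact scenario ahead of a frequent but harmless one, which is the usual motivation for risk-based ordering.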
Most browser agents optimize for completing a task. A QA system must do more: challenge assumptions, test negative paths, preserve evidence, distinguish test failures from product defects, disclose coverage gaps, and avoid damaging customer data.
QaAgent-Pro separates those responsibilities into explicit layers.
flowchart LR
M["QA Mission<br/>Given / When / Then"] --> P["Mission Planner"]
P --> R["Risk Analyst<br/>Likelihood × Impact"]
R --> H["Human-QA Heuristics<br/>Boundaries · Interruptions · Persistence"]
H --> S{"Safety Gate"}
S -->|"Verified staging"| E["Deterministic Executor"]
S -->|"Gate failed"| RO["Read-only Audit"]
E --> B["Playwright Browser Engine"]
RO --> B
B --> O["Multi-oracle Evidence"]
O --> V["Independent Defect Verifier"]
V --> J["Release Judge"]
J --> X["Excel Report<br/>Screenshots + Evidence + Backlog"]
flowchart TB
MP["Mission Planner<br/>What should be tested?"]
RA["Risk Analyst<br/>What matters most?"]
EX["Executor<br/>Only role allowed to operate Playwright"]
DV["Defect Verifier<br/>Product bug or test problem?"]
RJ["Release Judge<br/>Go / Conditional Go / No-Go"]
MP --> RA --> EX --> DV --> RJ
C["Shared typed run context"] --- MP
C --- RA
C --- EX
C --- DV
C --- RJ
The roles are deterministic TypeScript services—not external LLM agents. Codex is used to build, operate, and debug the repository, never to select arbitrary runtime clicks.
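One way to picture "only registered deterministic actions" is a registry that refuses any action id it does not know. The sketch below is an assumption-laden illustration: `ActionRegistry`, the action ids, and the synchronous string-returning handler signature are hypothetical. A real executor would presumably use async handlers that receive a Playwright page.

```typescript
// Hypothetical handler shape, simplified to stay dependency-free.
type ActionHandler = (input: Record<string, string>) => string;

class ActionRegistry {
  private readonly handlers = new Map<string, ActionHandler>();

  // Each action id can be registered exactly once.
  register(id: string, handler: ActionHandler): void {
    if (this.handlers.has(id)) {
      throw new Error(`Action already registered: ${id}`);
    }
    this.handlers.set(id, handler);
  }

  // Unknown ids are rejected instead of being improvised at runtime.
  execute(id: string, input: Record<string, string>): string {
    const handler = this.handlers.get(id);
    if (handler === undefined) {
      throw new Error(`Unregistered action refused: ${id}`);
    }
    return handler(input);
  }
}

const registry = new ActionRegistry();
registry.register("leads.search", (input) => `search:${input["term"] ?? ""}`);

// A registered action runs normally.
const searchResult = registry.execute("leads.search", { term: "Acme" });

// An action that was never registered is refused.
let refused = false;
try {
  registry.execute("leads.deleteAll", {});
} catch {
  refused = true;
}
```

Because the set of executable actions is fixed at registration time, nothing outside the registry can introduce a new browser action during a run.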
| Area | Capability |
|---|---|
| Leads coverage | 56 numbered scenarios with an independent result for every case |
| Blueprint conformance | Separates confirmed requirements, observed behavior, and product-confirmation questions |
| Functional testing | Validation, search, table, sorting, pagination, page size, detail and lifecycle contracts |
| Human-QA heuristics | Boundaries, interruptions, state transitions, data variation, error guessing and consistency |
| Non-functional testing | Responsive UI, accessibility basics and user-visible performance |
| Evidence | Screenshots, Playwright trace, console errors and action-scoped XHR/fetch evidence |
| Persistence | Reload, navigation and search/table verification |
| Safety | Environment, host, tenant, visible marker and dedicated-account gates |
| Reporting | Nine-sheet XLSX workbook with embedded screenshots |
| Quality | Strict TypeScript, ESLint, Vitest coverage, checksum verification and secret scanning |
QaAgent-Pro never treats a design blueprint as the only truth:
- Confirmed blueprint requirements — explicitly approved product expectations.
- Observed application behavior — what the running application demonstrably does.
- Needs Product Confirmation — behavior that is ambiguous, inferred, or conflicts with incomplete specifications.
This prevents missing features from being mislabeled as runtime bugs and prevents undocumented behavior from being silently accepted.
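The three-way separation above can be sketched as a small classifier. The field names, the `Classification` labels, and the decision order are assumptions for illustration; the repository's finding classification may use different inputs and categories.

```typescript
// Hypothetical model of the three evidence categories described above.
type BlueprintStatus = "confirmed" | "unconfirmed";

interface Observation {
  feature: string;
  blueprint: BlueprintStatus; // is the expectation explicitly approved?
  presentInApp: boolean; // does the running application expose the feature?
  behavesAsExpected: boolean; // only meaningful when presentInApp is true
}

type Classification =
  | "pass"
  | "bug"
  | "feature-gap"
  | "needs-product-confirmation";

function classify(o: Observation): Classification {
  if (o.blueprint === "unconfirmed") {
    // Ambiguous or undocumented behavior is escalated, never silently accepted.
    return "needs-product-confirmation";
  }
  if (!o.presentInApp) {
    // A confirmed but missing feature is a gap, not a runtime bug.
    return "feature-gap";
  }
  return o.behavesAsExpected ? "pass" : "bug";
}

const importGap = classify({
  feature: "Import",
  blueprint: "confirmed",
  presentInApp: false,
  behavesAsExpected: false,
}); // "feature-gap"

const duplicateRule = classify({
  feature: "Duplicate lead handling",
  blueprint: "unconfirmed",
  presentInApp: true,
  behavesAsExpected: true,
}); // "needs-product-confirmation"
```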
The Leads suite covers:
- authentication and page loading;
- blueprint controls;
- required, email, and mobile validation;
- valid creation and duplicate behavior;
- search by name, company, phone, and email;
- details and editable fields;
- notes and activities;
- safe Call, Email, and WhatsApp inspection;
- conversion, archive, read-only archive behavior, unarchive, and bulk archive;
- owner, label, city, activity-date, source, no-activity, overdue, and combined filters;
- sorting, pagination, page size, empty/no-results states;
- responsive UI, keyboard focus, accessible names, and performance;
- Import, Export, and Manage Columns conformance;
- persistence and cleanup reconciliation.
See the complete catalog in docs/leads-test-plan.md.
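To illustrate what an independent result for every case can look like, here is a minimal sketch of a catalog runner in which one failing scenario does not stop the rest. The `Scenario` and `ScenarioResult` shapes, the ids, and the titles are hypothetical, and the scenarios are synchronous only to keep the example short; real Playwright scenarios would be async.

```typescript
// Hypothetical shapes; ids and titles do not reflect the real catalog numbering.
interface Scenario {
  id: number;
  title: string;
  run: () => void; // throws on failure
}

interface ScenarioResult {
  id: number;
  title: string;
  status: "passed" | "failed";
  detail?: string;
}

// Every scenario is isolated in its own try/catch, so each one
// produces exactly one result regardless of what the others do.
function runCatalog(scenarios: readonly Scenario[]): ScenarioResult[] {
  return scenarios.map((s): ScenarioResult => {
    try {
      s.run();
      return { id: s.id, title: s.title, status: "passed" };
    } catch (error) {
      const detail = error instanceof Error ? error.message : String(error);
      return { id: s.id, title: s.title, status: "failed", detail };
    }
  });
}

const results = runCatalog([
  { id: 1, title: "Authentication and page load", run: () => {} },
  {
    id: 2,
    title: "Required-field validation",
    run: () => {
      throw new Error("Expected a validation message");
    },
  },
  { id: 3, title: "Search by company", run: () => {} },
]);

// ["passed", "failed", "passed"]
const statuses = results.map((r) => r.status);
```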
The user-facing deliverable is a single .xlsx workbook:
- Bug Report
- Summary
- UX Issues
- Feature Gaps
- Refresh Persistence
- Next Build Backlog
- Test Execution
- Evidence Log
- Run Metadata
flowchart LR
SR["Scenario Results"] --> BR["Bug Report"]
SR --> UX["UX Issues"]
SR --> FG["Feature Gaps"]
SR --> RP["Refresh Persistence"]
BR --> NB["Next Build Backlog"]
UX --> NB
FG --> NB
SR --> TE["Test Execution"]
SR --> EL["Evidence Log"]
TE --> SU["Summary + Release Verdict"]
EL --> SU
Failed findings can include embedded screenshots, reproduction steps, expected/actual behavior, risk score, oracle results, verification attribution, confidence, and release impact.
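How verification attribution and risk score might feed the release verdict can be sketched like this. The attribution labels mirror the four outcomes named earlier (application, automation, environment, unclear product rule), but the `BLOCKING_RISK` threshold of 15 and the decision rules are assumptions for the example, not the project's actual policy.

```typescript
type Attribution =
  | "application"
  | "automation"
  | "environment"
  | "unclear-product-rule";

type Verdict = "Go" | "Conditional Go" | "No-Go";

interface VerifiedFinding {
  id: string;
  attribution: Attribution;
  riskScore: number; // for example likelihood × impact
}

// Assumed threshold: product defects at or above this score block a release.
const BLOCKING_RISK = 15;

function judgeRelease(findings: readonly VerifiedFinding[]): Verdict {
  // Only failures attributed to the application count as product defects.
  const productDefects = findings.filter((f) => f.attribution === "application");
  if (productDefects.some((f) => f.riskScore >= BLOCKING_RISK)) {
    return "No-Go";
  }
  // Lower-risk defects and open product questions need follow-up.
  const needsFollowUp =
    productDefects.length > 0 ||
    findings.some((f) => f.attribution === "unclear-product-rule");
  return needsFollowUp ? "Conditional Go" : "Go";
}

const automationOnly = judgeRelease([
  { id: "F-1", attribution: "automation", riskScore: 20 },
]); // "Go"

const blocker = judgeRelease([
  { id: "F-2", attribution: "application", riskScore: 20 },
  { id: "F-3", attribution: "environment", riskScore: 9 },
]); // "No-Go"

const openQuestion = judgeRelease([
  { id: "F-4", attribution: "unclear-product-rule", riskScore: 6 },
]); // "Conditional Go"
```

In this sketch a failure attributed to the automation or the environment never blocks a release on its own. A real judge would likely also weigh the coverage those failures left untested.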
Before any mutation, all gates must pass:
flowchart TD
A["Mutation requested"] --> B{"CRM_ENVIRONMENT = staging?"}
B -->|No| RO["Read-only mode"]
B -->|Yes| C{"Hostname allowlisted?"}
C -->|No| RO
C -->|Yes| D{"Tenant allowlisted?"}
D -->|No| RO
D -->|Yes| E{"Dedicated QA account?"}
E -->|No| RO
E -->|Yes| F{"Visible test/staging marker?"}
F -->|No| RO
F -->|Yes| W["Staging mutation permitted"]
Blocked by default:
- production mutations;
- delete and destructive bulk operations;
- real email, WhatsApp, call, payment, billing, and invitation actions;
- sensitive exports;
- unrestricted crawling;
- arbitrary model-selected browser actions.
Credentials, auth state, reports, screenshots, traces, and local run data are Git-ignored.
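The gate sequence in the diagram above can be sketched as an ordered list of checks that falls back to read-only mode at the first failure. `GateInput`, `GateResult`, and `evaluateSafetyGate` are hypothetical names, and the allowlists are assumed to be already parsed into arrays; only the environment variable names in the comments come from the configuration shown in this README.

```typescript
interface GateInput {
  environment: string; // CRM_ENVIRONMENT
  hostname: string; // host part of CRM_BASE_URL
  tenant: string; // CRM_TENANT
  account: string; // CRM_EMAIL
  markerVisible: boolean; // test/staging marker seen in the UI
  hostAllowlist: readonly string[]; // STAGING_HOST_ALLOWLIST
  tenantAllowlist: readonly string[]; // STAGING_TENANT_ALLOWLIST
  accountAllowlist: readonly string[]; // QA_ACCOUNT_ALLOWLIST
}

type GateResult =
  | { mode: "staging-mutation" }
  | { mode: "read-only"; failedGate: string };

function evaluateSafetyGate(input: GateInput): GateResult {
  // Gates are evaluated in the same order as the flowchart.
  const gates: ReadonlyArray<[string, boolean]> = [
    ["environment", input.environment === "staging"],
    ["hostname", input.hostAllowlist.includes(input.hostname)],
    ["tenant", input.tenantAllowlist.includes(input.tenant)],
    ["account", input.accountAllowlist.includes(input.account)],
    ["marker", input.markerVisible],
  ];
  for (const [name, passed] of gates) {
    if (!passed) {
      // Any single failed gate downgrades the run to read-only.
      return { mode: "read-only", failedGate: name };
    }
  }
  return { mode: "staging-mutation" };
}

const verified: GateInput = {
  environment: "staging",
  hostname: "your-staging-crm.example.com",
  tenant: "QA-TENANT",
  account: "qa@example.com",
  markerVisible: true,
  hostAllowlist: ["your-staging-crm.example.com"],
  tenantAllowlist: ["QA-TENANT"],
  accountAllowlist: ["qa@example.com"],
};

const allowed = evaluateSafetyGate(verified); // mode: "staging-mutation"
const blocked = evaluateSafetyGate({ ...verified, environment: "production" }); // read-only
```

Returning the name of the failed gate makes it easy to record in the run metadata why a run was downgraded.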
- Node.js 20+
- npm
- Chromium installed through Playwright
- A dedicated QA account on a staging/test tenant
git clone https://github.com/BAKUGOS1/QaAgent-Pro.git
cd QaAgent-Pro
npm install
npx playwright install chromium
cp .env.example .env

Fill in the local .env without committing it:

CRM_BASE_URL=https://your-staging-crm.example.com
CRM_EMAIL=qa@example.com
CRM_PASSWORD=
CRM_ENVIRONMENT=staging
CRM_TENANT=QA-TENANT
STAGING_HOST_ALLOWLIST=your-staging-crm.example.com
STAGING_TENANT_ALLOWLIST=QA-TENANT
QA_ACCOUNT_ALLOWLIST=qa@example.com
ALLOW_ARCHIVE=true
ALLOW_DELETE=false
ALLOW_DESTRUCTIVE=false
ALLOW_REAL_MESSAGES=false
ALLOW_SENSITIVE_EXPORT=false

Authenticate and execute:

npm run config:check
npm run auth:setup
npm run qa:leads

Debug with a visible browser:

npm run qa:leads -- --headed

Run only refresh/persistence scenarios:

npm run qa:refresh

Compile a deterministic Human-QA mission:

npm run qa:mission -- --file config/blueprint/sample-mission.json

| Command | Purpose |
|---|---|
| `npm run auth:setup` | Log in and save local Playwright storage state |
| `npm run qa:leads` | Execute all 56 Leads scenarios |
| `npm run qa:refresh` | Execute the refresh/persistence subset |
| `npm run qa:mission` | Validate and expand a deterministic QA mission |
| `npm run report:sample` | Generate a sample XLSX report |
| `npm run config:check` | Validate environment configuration without exposing secrets |
| `npm run test:unit` | Run framework unit tests with coverage |
| `npm run quality:gate` | Run the complete repository quality gate |

src/
├── blueprint/ # Confirmed product requirements
├── browser/ # Auth, session, environment and evidence capture
├── config/ # Zod environment contract
├── data/ # Unique QA fixture factories
├── findings/ # Finding classification
├── human-qa/ # Missions, heuristics, risk, verification and release judgment
├── leads/ # 56-case scenario catalog, runner, registry and contracts
├── pages/ # Playwright page objects
├── reporting/ # Excel workbook generation
└── safety/ # Environment and action authorization
Read the deeper architecture, safety runbook, and reference study.
npm run quality:gate
npm audit --audit-level=moderate

The gate checks:
- approved product-input checksums;
- strict TypeScript;
- ESLint;
- unit tests and coverage;
- repository secret scan;
- workbook creation/reopening and embedded images;
- Playwright test discovery.
Current local validation: 54 tests passing, 93%+ line coverage, zero npm audit vulnerabilities.
Concepts were studied—not blindly copied—from:
- Microsoft Playwright
- Agentic QE Fleet
- Test Automation Skills & Agents
- GUITestBench / GUITester
- TestZeus Hercules — AGPL concepts only; no code copied
- BAKUGOS1/QaAgent
Exact reviewed revisions and licensing boundaries are documented in docs/reference-architecture-study.md.
- Framework, safety and Excel foundations
- Human-QA mission/risk/verification layers
- Full 56-scenario Leads runner
- Auth, evidence, persistence and cleanup contracts
- Stabilize CRM-owned test IDs and complete currently blocked lifecycle contracts
- Deals pipeline and won/lost lifecycle
- Activities and Products modules
- Visual regression baselines
- CI staging execution with protected secrets
- Optional local intelligence for report synthesis—never browser action selection
Contributions are welcome. Read CONTRIBUTING.md and follow the architecture and safety constitution in AGENTS.md.
MIT © 2026 QaAgent-Pro contributors.