⚡ UX-Ray

title	UX-Ray AI QA Agent
emoji	⚡
colorFrom	purple
colorTo	indigo
sdk	docker
pinned	false

Autonomous AI QA Agent & User Testing Simulator

UX-Ray is an autonomous AI agent designed for startups and indie builders. It simulates 1,000 first-time users, navigating your web application to identify UX friction, surface behavior patterns, and generate developer-ready checklists before you launch.

✨ Core Features

🧠 Senior QA Algorithmic Explorer: UX-Ray is prompted with strict algorithmic rules. It deduces visual hierarchy based on button sizes (width/height), explicitly tests edge cases, and maps out core workflows without getting stuck in loops.
👁️ Spatial & Semantic Awareness: The internal Playwright engine extracts rich DOM metadata including width, height, disabled states, aria-labels, and hrefs. This gives the AI true "sight" of your app.
⏳ Smart Auto-Wait Engine: Built directly into the execution engine, the auto-wait logic detects active spinners or loading text and waits for your app to finish loading before proceeding, mimicking the patience of a human user.
🛠️ Heuristic UI Evaluation: The AI acts as a Senior UX/UI Engineer, grading your app against Nielsen's 10 Usability Heuristics. It outputs exact, developer-ready CSS and layout fixes (e.g., "Change the text color from #555 to #333 to increase the contrast ratio").
🎥 Live Session Replay: Watch the AI test your app in real-time, complete with a timeline of events, network interactions, and AI "thoughts".
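
As a rough illustration of the element metadata described above, the sketch below models one extracted element and ranks enabled elements by on-screen area as a simple stand-in for visual hierarchy. The `ElementMeta` shape and the `rankByProminence` function are hypothetical names chosen for this example, not UX-Ray's actual internals.

```typescript
// Hypothetical shape of the per-element metadata described above.
interface ElementMeta {
  tag: string;
  width: number; // rendered width in CSS pixels
  height: number; // rendered height in CSS pixels
  disabled: boolean;
  ariaLabel: string | null;
  href: string | null;
}

// Rank interactable elements by area, largest first.
// Assumption: area is used here as a proxy for visual prominence.
function rankByProminence(elements: ElementMeta[]): ElementMeta[] {
  return elements
    .filter((el) => !el.disabled && el.width > 0 && el.height > 0)
    .sort((a, b) => b.width * b.height - a.width * a.height);
}

const sample: ElementMeta[] = [
  { tag: "a", width: 120, height: 32, disabled: false, ariaLabel: "Learn more", href: "/docs" },
  { tag: "button", width: 200, height: 48, disabled: false, ariaLabel: "Sign up", href: null },
  { tag: "button", width: 300, height: 60, disabled: true, ariaLabel: "Submit", href: null },
];

const ranked = rankByProminence(sample);
console.log(ranked.map((el) => el.ariaLabel));
```

The disabled "Submit" button is dropped before ranking, so the agent would not spend a step on an element that cannot be clicked.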

🏗️ System Architecture

UX-Ray operates on a continuous, multi-modal autonomous loop using Set-of-Mark Visual Prompting.

```mermaid
graph TD
    A[Start Session] --> B[Playwright Launches Sandbox]

    subgraph Autonomous Vision Agent Loop
        B --> C[Extract DOM & Inject Bounding Boxes]
        C --> D[Capture Annotated Screenshot]
        D --> E{Is Page Loading?}
        E -- Yes --> F[Smart Auto-Wait]
        F --> C
        E -- No --> G[Gemini 2.0 Multimodal Reasoning]

        G --> H[Infer Next Optimal Target ID]
        H --> I[Playwright Executor: Click/Type/Scroll]
        I --> J[Save Annotated Screenshot to Timeline]
        J --> C
    end

    J -. Session Complete .-> K[Gemini Multimodal UX Audit]
    K --> L[Generate Actionable UI Checklist]
```
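
The decision points in the diagram can be sketched as a small pure function. This is a minimal illustration under stated assumptions, not the project's actual loop: the `Observation` shape, the `nextStep` name, and the idea that a session ends when a fixed step budget is used up are all hypothetical.

```typescript
// What the agent observes after each screenshot (hypothetical shape).
interface Observation {
  spinnerVisible: boolean; // a loading spinner is currently on screen
  visibleText: string; // text content of the page
  stepsTaken: number; // actions executed so far
}

type NextStep = "wait" | "act" | "audit";

// Loading check: a visible spinner or the word "loading" in the page text.
function isPageLoading(obs: Observation): boolean {
  return obs.spinnerVisible || /\bloading\b/i.test(obs.visibleText);
}

// One pass through the decision points in the diagram.
// Assumption: the session is complete once the step budget is exhausted.
function nextStep(obs: Observation, maxSteps: number): NextStep {
  if (obs.stepsTaken >= maxSteps) return "audit";
  if (isPageLoading(obs)) return "wait";
  return "act";
}

const loading: Observation = { spinnerVisible: false, visibleText: "Loading your dashboard...", stepsTaken: 2 };
const ready: Observation = { spinnerVisible: false, visibleText: "Welcome back", stepsTaken: 2 };
const finished: Observation = { spinnerVisible: true, visibleText: "", stepsTaken: 20 };

console.log(nextStep(loading, 20), nextStep(ready, 20), nextStep(finished, 20));
```

Keeping the decision separate from the browser calls makes the loop easy to test without launching Playwright.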

AI Pipeline Details

Set-of-Mark Annotation: Before each decision, the headless browser injects numbered red bounding boxes over every interactable element on the screen.
Visual Autonomous Navigation: The agent sends the annotated screenshot to Gemini 2.0 Flash. The model uses spatial reasoning to "look" at the screen, read any popups, and output the ID of the bounding box it wants to interact with.
Actionable UX Audit: Once the session ends, Gemini Pro uses the captured visual timeline to write a technical UI/UX report.
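
The numbering and lookup steps behind Set-of-Mark prompting can be sketched as follows. The `Candidate` and `Mark` shapes, the function names, and the JSON reply format `{"targetId": <number>}` are assumptions made for illustration; the source does not specify how the model's answer is encoded.

```typescript
interface Box { x: number; y: number; width: number; height: number }
interface Candidate { selector: string; box: Box }
interface Mark extends Candidate { id: number }

// Number every candidate that has a non-empty box, starting at 1.
function assignMarks(candidates: Candidate[]): Mark[] {
  return candidates
    .filter((c) => c.box.width > 0 && c.box.height > 0)
    .map((c, index) => ({ ...c, id: index + 1 }));
}

// Resolve the model's reply to a mark.
// Assumed reply format (hypothetical): {"targetId": <number>}.
function resolveTarget(reply: string, marks: Mark[]): Mark | null {
  let id: unknown;
  try {
    id = (JSON.parse(reply) as { targetId?: unknown } | null)?.targetId;
  } catch {
    return null; // reply was not valid JSON
  }
  if (typeof id !== "number") return null;
  return marks.find((m) => m.id === id) ?? null;
}

const marks = assignMarks([
  { selector: "#login", box: { x: 10, y: 10, width: 100, height: 40 } },
  { selector: "#hidden", box: { x: 0, y: 0, width: 0, height: 0 } },
  { selector: "a.pricing", box: { x: 10, y: 80, width: 80, height: 20 } },
]);

console.log(resolveTarget('{"targetId": 2}', marks)?.selector);
```

Returning `null` for an unparseable or unknown ID lets the caller re-prompt the model instead of clicking an arbitrary element.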

💻 Tech Stack

| Component | Technology | Description |
| --- | --- | --- |
| Framework | Next.js 14 | App Router, API Routes for SSE streaming |
| Database | PostgreSQL | Managed via Prisma ORM for session/event storage |
| Automation | Playwright | Headless browser execution and visual DOM annotation |
| Reasoning | Gemini 2.0 Flash | Powers the deep visual navigation and decision logic |
| Reporting | Gemini Pro | Multimodal visual analysis for the final UX Audit report |
| Styling | Tailwind CSS | Custom highly-polished developer interface |
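
For readers unfamiliar with SSE, the sketch below shows how a single timeline event could be serialized in the `text/event-stream` wire format before being written to the response. The event name `agent-thought` and the payload fields are hypothetical; the source does not describe UX-Ray's actual event schema.

```typescript
// Serialize one event in the text/event-stream format:
// an "event:" line, a "data:" line, and a blank line that ends the message.
function formatSseEvent(eventName: string, payload: unknown): string {
  // JSON.stringify without an indent argument emits no newlines,
  // so the payload fits on a single "data:" line.
  return `event: ${eventName}\ndata: ${JSON.stringify(payload)}\n\n`;
}

// Hypothetical event name and payload, for illustration only.
const message = formatSseEvent("agent-thought", {
  step: 3,
  thought: "Clicking the sign-up button",
});

console.log(message);
```

On the client, an `EventSource` listener registered for the same event name would receive the JSON string in `event.data`.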

⚙️ Environment Variables Reference

Create a .env file in the root directory.

| Variable | Required | Description |
| --- | --- | --- |
| `DATABASE_URL` | Yes | Connection string for your PostgreSQL database (e.g., Supabase) |
| `DIRECT_URL` | Yes | Direct connection string for Prisma migrations |
| `GEMINI_API_KEY` | Yes | API key from Google AI Studio for visual reasoning and reporting |
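
Since all three variables are required, it can help to fail fast at startup when one is missing. The following is a minimal sketch of such a check; `readEnv` is a hypothetical helper, not part of the repository, and the connection strings in the example are placeholders.

```typescript
const REQUIRED_VARS = ["DATABASE_URL", "DIRECT_URL", "GEMINI_API_KEY"] as const;
type RequiredVar = (typeof REQUIRED_VARS)[number];
type AppEnv = Record<RequiredVar, string>;

// Validate an environment-like object and return only the required values.
// Throws one error that lists every missing or empty variable.
function readEnv(source: Record<string, string | undefined>): AppEnv {
  const missing = REQUIRED_VARS.filter((name) => !source[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(", ")}`);
  }
  return {
    DATABASE_URL: source["DATABASE_URL"] as string,
    DIRECT_URL: source["DIRECT_URL"] as string,
    GEMINI_API_KEY: source["GEMINI_API_KEY"] as string,
  };
}

// Placeholder values for illustration only.
const env = readEnv({
  DATABASE_URL: "postgresql://user:password@localhost:5432/uxray",
  DIRECT_URL: "postgresql://user:password@localhost:5432/uxray",
  GEMINI_API_KEY: "example-key",
});

console.log(Object.keys(env));
```

In a Node.js process the function would typically be called once as `readEnv(process.env)` before the server starts.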

🚀 Local Installation

Clone the repository

```
git clone https://github.com/yourusername/ux-ray.git
cd ux-ray
```

Install dependencies
```
npm install
```
Initialize Database
```
npx prisma generate
npx prisma db push
```
Run the Development Server
```
npm run dev
```
Open http://localhost:3000 to access the dashboard.

🕹️ Usage Guide

Enter your target URL in the main dashboard.
Select your personalized Developer Testing Preset (e.g., End-to-End Journey, Aggressive QA Tester, or Conversion Flow).
Watch the live Session Explorer as the AI isolates your site in a sandbox, reads the spatial layout, and systematically tests buttons, forms, and workflows.
Click "Share report to team" to send the actionable UX Audit directly to your engineers.

📜 License

This project is licensed under the MIT License - see the LICENSE file for details.
