Word Embeddings Visualizer

An interactive web application for computing and visualizing word embeddings using transformer models.

Features

Compute embeddings for words and sentences using sentence-transformers/all-MiniLM-L6-v2
Display raw embedding vectors (384 dimensions)
Visualize embeddings in 2D using UMAP dimensionality reduction
Interactive scatter plot showing semantic relationships
Real-time updates as you add more inputs
Clean, minimal UI with responsive design

Architecture

┌─────────────────┐         ┌──────────────────┐
│   Frontend      │         │    Backend       │
│   (nginx)       │────────▶│   (FastAPI)      │
│   - HTML/JS     │  HTTP   │   - Transformers │
│   - Chart.js    │         │   - UMAP         │
└─────────────────┘         └──────────────────┘
      :80                         :8000

Backend

Framework: FastAPI
Package Manager: uv (Python package and project manager)
Model: sentence-transformers/all-MiniLM-L6-v2 (384-dim embeddings)
Dimensionality Reduction: UMAP
Storage: In-memory (no database required)

Frontend

Tech: Vanilla JavaScript, Chart.js
Server: nginx
Features: Interactive form, real-time plot updates, embedding display

Quick Start

Prerequisites

Docker
Docker Compose

Running the Application

Clone this repository
Navigate to the project directory
Start the application:

docker-compose up --build

Open your browser and go to http://localhost
Enter words or sentences to see their embeddings visualized!

The first startup can take a few minutes while the transformer model (~90 MB) is downloaded.

Stopping the Application

docker-compose down

Usage

Enter Text: Type a word or sentence in the input box
Compute: Click "Compute Embedding" or press Enter
View Results:
- See the embedding vector (first 10 dimensions shown)
- Watch the 2D plot update with the new point
Add More: Keep adding words/sentences to see relationships

Example Inputs to Try

Try these to see semantic clustering:

"cat", "dog", "puppy", "kitten"
"king", "queen", "prince", "princess"
"happy", "sad", "joyful", "depressed"
"Paris", "France", "London", "England"

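"Semantic clustering" here means that related inputs get embedding vectors that point in similar directions. One common way to quantify that is cosine similarity. This README does not say which metric the app or its UMAP step uses, so the following is only an illustrative sketch. The 3-dimensional vectors are invented for the example and are not real model outputs (the model produces 384 dimensions).

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors, invented for illustration only (not produced by the model).
toy = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "Paris": [0.0, 0.2, 0.9],
}

sim_related = cosine_similarity(toy["cat"], toy["kitten"])
sim_unrelated = cosine_similarity(toy["cat"], toy["Paris"])
print(f"cat vs kitten: {sim_related:.3f}")
print(f"cat vs Paris:  {sim_unrelated:.3f}")
```

With real embeddings, the same function could be applied to the "embedding" arrays returned by the API to check whether points that look close in the plot are also close in the original 384-dimensional space.
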
API Endpoints

GET /health

Health check endpoint

curl http://localhost:8000/health
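
Because the first startup can take a while, a script that talks to the backend may want to wait until the health check answers. The sketch below polls the endpoint using only the standard library. It assumes that a healthy backend answers with HTTP 200; the response body is not inspected because its shape is not documented here. The function names, timeout, and polling interval are illustrative choices.

```python
import time
import urllib.error
import urllib.request


def http_probe(url):
    """Return True if a GET to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


def wait_until_healthy(url, timeout_s=300.0, interval_s=2.0,
                       probe=http_probe, sleep=time.sleep):
    """Poll `url` until the probe succeeds or `timeout_s` elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if probe(url):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)


# Example (requires the backend to be running or starting up):
#   ok = wait_until_healthy("http://localhost:8000/health")
#   print("backend ready" if ok else "backend did not become healthy in time")
```

The `probe` and `sleep` parameters exist so the loop can be exercised without a network connection.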

POST /embed

Compute embedding for input text

curl -X POST http://localhost:8000/embed \
  -H "Content-Type: application/json" \
  -d '{"text": "hello world"}'

Response:

{
  "id": "uuid",
  "text": "hello world",
  "embedding": [0.123, -0.456, ...]
}
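
The same call can be made from Python. The following is a minimal standard-library sketch that mirrors the curl command above; the function names and the default base URL are illustrative, and the response is assumed to have the "id", "text", and "embedding" fields shown in the example response.

```python
import json
import urllib.request


def build_embed_request(text, base_url="http://localhost:8000"):
    """Build (but do not send) a POST /embed request for `text`."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/embed",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(text, base_url="http://localhost:8000"):
    """Send the request and return the decoded JSON response as a dict."""
    req = build_embed_request(text, base_url)
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example (requires the backend to be running):
#   result = embed("hello world")
#   print(result["id"], len(result["embedding"]))
```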

GET /embeddings

Get all stored embeddings with 2D coordinates

curl http://localhost:8000/embeddings

Response:

{
  "count": 3,
  "embeddings": [
    {
      "id": "uuid",
      "text": "hello world",
      "x": 1.23,
      "y": -0.45,
      "embedding": [...]
    }
  ]
}
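
A client that only needs the plot coordinates can ignore the raw vectors. The sketch below turns a decoded response of the shape shown above into simple point objects. The `Point` class and `parse_points` function are hypothetical helpers, and the sample payload is a trimmed copy of the example response with an empty "embedding" list.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Point:
    id: str
    text: str
    x: float
    y: float


def parse_points(payload):
    """Extract plot points from a decoded GET /embeddings response."""
    return [
        Point(id=item["id"], text=item["text"],
              x=float(item["x"]), y=float(item["y"]))
        for item in payload["embeddings"]
    ]


# Sample payload modeled on the example response above.
sample = json.loads("""
{
  "count": 1,
  "embeddings": [
    {"id": "uuid", "text": "hello world", "x": 1.23, "y": -0.45, "embedding": []}
  ]
}
""")

for p in parse_points(sample):
    print(f"{p.text!r} at ({p.x}, {p.y})")
```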

Development

Running Backend Locally

The backend is a uv-managed Python package:

cd backend
uv sync  # Install dependencies
MODEL_NAME="sentence-transformers/all-MiniLM-L6-v2" uv run uvicorn embeddings_backend.main:app --reload

Backend will be available at http://localhost:8000

Running Frontend Locally

Simply open frontend/index.html in a browser, or use a local server:

cd frontend
python -m http.server 8080

Frontend will be available at http://localhost:8080

Note: If running locally without Docker, update API_BASE_URL in app.js so that it points at the backend (http://localhost:8000).

Project Structure

visualize-embeddings/
├── backend/                     # UV-managed Python package
│   ├── src/
│   │   └── embeddings_backend/
│   │       ├── __init__.py
│   │       ├── main.py          # FastAPI app and endpoints
│   │       ├── embedding_service.py  # Transformer model wrapper
│   │       └── embedding_store.py    # In-memory storage
│   ├── pyproject.toml           # UV package configuration
│   ├── uv.lock                  # Locked dependencies
│   └── Dockerfile
├── frontend/
│   ├── index.html               # UI layout and styling
│   ├── app.js                   # Frontend logic and API calls
│   └── Dockerfile
├── docker-compose.yml           # Container orchestration
└── README.md

Technical Details

UMAP Dimensionality Reduction

Uses UMAP for projecting 384-dim embeddings to 2D
Aims to preserve local neighborhood structure while retaining some global structure
Random state fixed for reproducibility
Handles edge cases (1 or 2 points)
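
The list above says only that the 1-point and 2-point cases are handled, not how. One possible approach, sketched below, is to skip the reducer for very small inputs and return fixed coordinates, since a neighbor-graph method has little to work with when there are only one or two points. The fixed coordinates and the `project_to_2d` and `fake_reducer` names are assumptions for illustration, not the project's actual code.

```python
def project_to_2d(vectors, reduce_fn):
    """Map embedding vectors to (x, y) pairs, short-circuiting tiny inputs.

    `reduce_fn` is a hypothetical callable that takes a list of vectors and
    returns one (x, y) pair per vector, for example a wrapper around
    umap.UMAP(n_components=2, random_state=42).fit_transform.
    """
    n = len(vectors)
    if n == 0:
        return []
    if n == 1:
        # A single point has no neighbors; place it at the origin.
        return [(0.0, 0.0)]
    if n == 2:
        # Two points: place them symmetrically on the x axis.
        return [(-1.0, 0.0), (1.0, 0.0)]
    return [(float(x), float(y)) for x, y in reduce_fn(vectors)]


def fake_reducer(vectors):
    """Stand-in for UMAP so the sketch runs without extra dependencies."""
    return [(v[0], v[1]) for v in vectors]


print(project_to_2d([[0.1, 0.2, 0.3]], fake_reducer))
print(project_to_2d([[1, 2, 3], [4, 5, 6], [7, 8, 9]], fake_reducer))
```

Passing the reducer in as a callable keeps the edge-case logic separate from the UMAP dependency. The fixed `random_state` in the docstring corresponds to the reproducibility point above, although the value 42 is an arbitrary choice.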

In-Memory Storage

Embeddings stored as numpy arrays
Singleton pattern for service instances
No persistence (data lost on restart)
Suitable for demo/exploration purposes
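
As a rough illustration of the points above, an in-memory store with a single shared instance could look like the following. This is a sketch and not the contents of embedding_store.py: the class and function names are hypothetical, plain Python lists stand in for numpy arrays to keep the example dependency-free, and UUID identifiers are assumed from the "id": "uuid" field in the API responses.

```python
import uuid
from functools import lru_cache


class EmbeddingStore:
    """Keeps embeddings in a dict for the lifetime of the process."""

    def __init__(self):
        self._items = {}

    def add(self, text, embedding):
        """Store one embedding and return its generated id."""
        item_id = str(uuid.uuid4())
        self._items[item_id] = {
            "id": item_id,
            "text": text,
            "embedding": list(embedding),
        }
        return item_id

    def all(self):
        """Return all stored items in insertion order."""
        return list(self._items.values())

    def __len__(self):
        return len(self._items)


@lru_cache(maxsize=1)
def get_store():
    """Return the single shared store instance (created on first call)."""
    return EmbeddingStore()


store = get_store()
store.add("hello world", [0.123, -0.456])
print(len(get_store()))  # same instance, so the item added above is visible
```

Because everything lives in a Python dict, restarting the process starts again with an empty store, which matches the "no persistence" point above.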

CORS Configuration

Configured to allow all origins
Suitable for development
Consider restricting in production
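
One way to restrict origins in production is to read an allowlist from the environment and fall back to the current allow-all behavior when nothing is set. The sketch below shows only the parsing step. ALLOWED_ORIGINS is a hypothetical variable name that this project does not define, and the commented middleware call shows how such a list is typically passed to FastAPI rather than how this backend is actually wired.

```python
import os


def parse_allowed_origins(raw):
    """Turn a comma-separated string into a list of origins.

    An empty or missing value falls back to ["*"], i.e. allow all origins.
    """
    if not raw:
        return ["*"]
    origins = [part.strip() for part in raw.split(",")]
    return [o for o in origins if o] or ["*"]


# ALLOWED_ORIGINS is a hypothetical variable name used for illustration.
origins = parse_allowed_origins(os.environ.get("ALLOWED_ORIGINS"))
print(origins)

# With FastAPI, the list would typically be passed to the CORS middleware:
#   from fastapi.middleware.cors import CORSMiddleware
#   app.add_middleware(CORSMiddleware, allow_origins=origins,
#                      allow_methods=["*"], allow_headers=["*"])
```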

Limitations

Data is not persisted (in-memory only)
No authentication or user management
Single-user experience
Model cannot be changed without code modification

Future Enhancements

Add database for persistence
Support multiple models
Download embeddings as CSV
Similarity search functionality
Clustering visualization
Multi-user support with sessions

License

MIT

Acknowledgments

Built with FastAPI
Embeddings from sentence-transformers
Visualization with Chart.js
Dimensionality reduction using UMAP
