BAIO (Bioinformatics AI for Open-set detection) is a web-based metagenomic analysis platform that classifies DNA sequences using machine learning. It uses 6-mer sequence features with trained SVM and RandomForest models to distinguish viral and host DNA, with an optional novelty flag for low-confidence predictions.
- Features
- Prerequisites
- Get the Code
- Get a Google API Key
- Running the Project
- Using the App
- Project Structure
- Common Issues & Fixes
- Testing
- Contributors
- License
- Sequence Classification — Classifies DNA sequences as Virus or Host
- K-mer Analysis — Uses 6-mer frequency features for sequence representation
- Confidence Visualization — Color-coded confidence bars per prediction
- GC Content Analysis — Heatmap showing GC content distribution
- Risk Assessment — Color-coded risk indicators (Low / Moderate / High)
- Dark Mode — Toggle between light and dark themes
- Export Options — Download results as JSON, CSV, or PDF
- AI Assistant — Gemini-powered chat for sequence analysis questions
- Novelty Flagging — Optional flag for low-confidence / out-of-distribution sequences
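To give a feel for the two core features above, here is a hedged sketch of how 6-mer frequencies and GC content can be computed from a DNA sequence. This is an illustration of the technique, not the project's actual code:

```python
from collections import Counter
from itertools import product

def kmer_frequencies(sequence, k=6):
    """Return a frequency vector over all 4**k possible k-mers (4096 for k=6)."""
    sequence = sequence.upper()
    # Slide a window of width k across the sequence and count each k-mer
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    total = max(sum(counts.values()), 1)
    # Fixed vocabulary order so every sequence maps to the same feature layout
    vocab = ["".join(p) for p in product("ACGT", repeat=k)]
    return [counts[kmer] / total for kmer in vocab]

def gc_content(sequence):
    """Fraction of G and C bases, as shown in the GC-content heatmap."""
    sequence = sequence.upper()
    return (sequence.count("G") + sequence.count("C")) / max(len(sequence), 1)

print(gc_content("ATGC"))  # 0.5
```

A vector like this is what an SVM or RandomForest consumes: every sequence, regardless of length, becomes a fixed-size 4096-dimensional frequency profile.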
Before you can run BAIO, you need to install a few tools. Follow the section for your operating system.
Open Terminal (press Cmd + Space, type "Terminal", press Enter) and run:
```bash
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
```

Follow the on-screen prompts. This may take a few minutes.
```bash
brew install git
```

Verify it works:

```bash
git --version
```

You should see something like `git version 2.x.x`.
Download and install Miniconda for macOS from: https://docs.conda.io/en/latest/miniconda.html
Choose the macOS installer for your chip:

- Apple Silicon (M1/M2/M3) → pick the `arm64` version
- Intel Mac → pick the `x86_64` version
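Not sure which chip your Mac has? You can check from the same Terminal window:

```shell
# Prints "arm64" on Apple Silicon, "x86_64" on Intel
uname -m
```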
After installing, close and reopen Terminal, then verify:
```bash
conda --version
```

Then install Node.js:

```bash
brew install node
```

Verify:

```bash
node --version   # should be 18 or higher
npm --version
```

Next, install Docker Desktop. Docker lets you run the whole app with a single command, with no manual Python/Node setup required.
Download Docker Desktop for Mac from: https://www.docker.com/products/docker-desktop
After installing, open Docker Desktop and wait until the whale icon in the menu bar says "Docker Desktop is running."
Download and install from: https://git-scm.com/download/win
During installation, keep all defaults. This also installs Git Bash, which you will use to run commands.
After installation, open Git Bash (search for it in the Start menu) and verify:
```bash
git --version
```

Note for Windows users: Use Git Bash (not Command Prompt or PowerShell) for all commands in this guide unless stated otherwise.
Download Miniconda for Windows from: https://docs.conda.io/en/latest/miniconda.html
Run the .exe installer and follow the prompts. When asked, check "Add Miniconda3 to my PATH environment variable".
After installing, open a new Git Bash window and verify:
```bash
conda --version
```

Then install Node.js. Download the LTS installer from: https://nodejs.org/
Run the .msi installer and keep all defaults.
Open a new Git Bash window and verify:
```bash
node --version   # should be 18 or higher
npm --version
```

Next, install Docker Desktop. Download Docker Desktop for Windows from: https://www.docker.com/products/docker-desktop
Docker Desktop on Windows requires WSL 2 (Windows Subsystem for Linux). The installer will guide you through enabling it if it is not already active.
After installing, open Docker Desktop and wait until it says "Docker Desktop is running."
Open Terminal (macOS) or Git Bash (Windows) and run:
```bash
git clone https://github.com/oss-slu/baio.git
cd baio
```

This downloads the project to a folder called `baio` and moves you into it.
BAIO uses Google Gemini AI for the chat assistant. You need a free API key.
- Go to: https://makersuite.google.com/app/apikey
- Sign in with a Google account
- Click "Create API Key"
- Copy the key
Now create a file called .env in the baio folder:
macOS / Git Bash (Windows):
```bash
echo "GOOGLE_API_KEY=paste_your_key_here" > .env
```

Or open any text editor, create a file named `.env` in the `baio` folder, and add:

```
GOOGLE_API_KEY=paste_your_key_here
```

Replace `paste_your_key_here` with the actual key you copied.
Choose the option that works best for you.
This runs everything with one command. No need to set up Python or Node manually.
Requirements: Docker Desktop must be running.
Copy the example file and fill in your values:
```bash
cp .env.example .env
```

Open `.env` and set your API keys (optional: the app works without them, but the AI chatbot will use mock responses):

```
OPENROUTER_API_KEY=your_openrouter_key_here
GEMINI_API_KEY=your_gemini_key_here
```
No API keys? That's fine. DNA classification works fully without them. The chatbot will fall back to built-in responses instead of a live AI model.
```bash
docker compose up --build
```

The first time this runs, it will download and build everything; this may take 5–10 minutes.
Once you see log lines like uvicorn running and the frontend is ready, open your browser:
| Service | URL |
|---|---|
| Frontend (App) | http://localhost:4173 |
| Backend API | http://localhost:8080 |
| API Docs | http://localhost:8080/docs |
To stop the app:
```bash
docker compose down
```

To run in the background:

```bash
docker compose up -d --build
```

Alternatively, you can run the project manually. This runs the backend and frontend separately; you need two Terminal windows open at the same time.
```bash
conda env create -f environment.yml
conda activate baio
```

If you see `conda: command not found`, close and reopen Terminal after installing Miniconda.
```bash
conda activate baio
uvicorn backend.app.main:app --reload --port 8080
```

You should see:

```
INFO:     Uvicorn running on http://0.0.0.0:8080
```
Leave this terminal running.
Open a second Terminal window:
```bash
cd baio/frontend
npm install    # only needed the first time
npm run dev
```

You should see:

```
VITE v5.x.x  ready in xxx ms
➜  Local:   http://localhost:5173/
```
Go to http://localhost:5173 in your browser.
Use Git Bash for all commands below.
```bash
conda env create -f environment.yml
conda activate baio
```

If `conda activate` does not work in Git Bash, run this once:

```bash
conda init bash
```

Then close and reopen Git Bash.
```bash
conda activate baio
uvicorn backend.app.main:app --reload --port 8080
```

You should see:

```
INFO:     Uvicorn running on http://0.0.0.0:8080
```
Leave this window open.
Open a second Git Bash window:
```bash
cd baio/frontend
npm install    # only needed the first time
npm run dev
```

You should see:

```
➜  Local:   http://localhost:5173/
```
Go to http://localhost:5173 in your browser.
Once the app is open in your browser:
1. **Enter DNA Sequences**
   - Paste sequences directly into the text box, OR
   - Click "Upload FASTA file" to upload a `.fasta` file
2. **Configure Settings** (optional)
   - Confidence threshold (default: 0.75)
   - Enable/disable open-set (novelty) detection
   - Select model type (K-mer or Evo 2 if available)
3. **Click "Classify Sequences"**
4. **View Results**
   - Each sequence gets a Virus or Host classification
   - Confidence bar shows how certain the model is
   - Risk indicator: Low / Moderate / High
   - GC content shown per sequence
5. **Expand a Row**
   - Click any result row to see a detailed explanation, confidence breakdown, and sequence preview
6. **Export Results**
   - Download as JSON, CSV, or PDF using the buttons at the top of the results table
7. **AI Assistant**
   - Click the chat icon to open the AI assistant
   - Ask questions about your sequences or the classification results
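The open-set (novelty) option boils down to a simple rule: if the model's top-class probability falls below the confidence threshold, the prediction is flagged as potentially novel instead of being trusted. A minimal sketch of that logic (function and field names here are illustrative, not BAIO's actual code):

```python
def flag_novelty(probabilities, threshold=0.75):
    """Label a prediction, flagging it when the model is not confident enough.

    probabilities: dict mapping class name -> predicted probability.
    threshold: minimum top-class probability to accept the label
               (BAIO's default threshold is 0.75).
    """
    label, confidence = max(probabilities.items(), key=lambda kv: kv[1])
    return {
        "label": label,
        "confidence": confidence,
        "novel": confidence < threshold,  # low confidence -> possibly out-of-distribution
    }

print(flag_novelty({"Virus": 0.62, "Host": 0.38}))
# {'label': 'Virus', 'confidence': 0.62, 'novel': True}
```

Lowering the threshold makes the flag fire less often; raising it flags more sequences for manual review.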
```
baio/
├── api/                       # FastAPI backend
│   ├── main.py                # API endpoints (/classify, /chat, /health)
│   ├── llm_client.py          # Google Gemini AI client
│   └── Dockerfile             # Docker config for the backend
│
├── frontend/                  # React + Vite frontend (what you see in the browser)
│   ├── src/
│   │   ├── components/
│   │   │   ├── Header.tsx             # Top navigation bar
│   │   │   ├── SequenceInput.tsx      # DNA input form
│   │   │   ├── ConfigPanel.tsx        # Settings panel
│   │   │   └── ResultsDashboard.tsx   # Results table and charts
│   │   ├── App.tsx            # Main React component
│   │   ├── api.ts             # Functions to call the backend
│   │   └── types.ts           # TypeScript type definitions
│   └── package.json           # Node.js dependencies
│
├── binary_classifiers/        # ML classification core
│   ├── predict_class.py       # Takes DNA sequence, returns Virus/Host prediction
│   ├── evaluation.py          # Computes model metrics
│   └── models/                # Saved model files (.pkl)
│
├── data/                      # Sample FASTA files for testing
├── tests/                     # Unit tests
├── scripts/                   # Evaluation and utility scripts
├── .env                       # Your API key (you create this)
├── environment.yml            # Conda environment definition
├── docker-compose.yml         # Docker setup
├── requirements.txt           # Python dependencies reference
└── README.md                  # This file
```
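With the backend running, you can also call the classification endpoint directly instead of using the browser UI. Below is a sketch using only Python's standard library; the request and response field names are assumptions, so check the interactive docs at http://localhost:8080/docs for the real schema:

```python
import json
from urllib import request

BASE_URL = "http://localhost:8080"

def build_classify_payload(sequences, threshold=0.75, open_set=True):
    # Field names here are illustrative guesses; see /docs for the actual schema.
    return {
        "sequences": sequences,
        "confidence_threshold": threshold,
        "open_set": open_set,
    }

def classify(sequences):
    """POST sequences to the /classify endpoint and return the parsed JSON."""
    payload = json.dumps(build_classify_payload(sequences)).encode("utf-8")
    req = request.Request(
        f"{BASE_URL}/classify",
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.load(resp)

# Example (requires the backend to be running):
# results = classify(["ATGCGTACGTTAGC"])
```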
**Miniconda not on your PATH**

Miniconda was not added to your PATH. Close and reopen your terminal after installing. If the issue persists on macOS, run:

```bash
export PATH="$HOME/miniconda3/bin:$PATH"
```

**`python` command not found on macOS**

macOS uses `python3` by default. Try:

```bash
python3 --version
```

**Port 8080 already in use**

Another process is using that port.
macOS:

```bash
lsof -i :8080
kill -9 <PID shown in output>
```

Windows (Git Bash):

```bash
netstat -ano | grep 8080
taskkill /PID <PID shown> /F
```

**Conda environment not activated**

Run:

```bash
conda activate baio
```

**Node.js version too old**

Check with `node --version`; it must be 18 or higher. Re-download from https://nodejs.org/ if needed.
**Missing or invalid API key**

Your `.env` file is missing or the key is wrong. Make sure the file is in the root `baio/` folder and contains:

```
GOOGLE_API_KEY=your_actual_key
```
**Docker build problems**

Try clearing Docker's cache:

```bash
docker system prune -a
docker compose up --build
```

**Adding a Python dependency**

All Python dependencies live in `pyproject.toml`; this is the single source of truth for both local and Docker environments.
1. Add the package to the `dependencies` list in `pyproject.toml`:

   ```
   "your-package>=1.0",
   ```

2. Rebuild the Docker image:

   ```bash
   docker compose build api
   docker compose up -d
   ```

3. For local development, reinstall:

   ```bash
   pip install -e .
   ```

Do not add packages directly to the `Dockerfile`; use `pyproject.toml` so local and Docker environments stay in sync.
**Frontend can't reach the backend**

The backend is not running. Make sure Terminal 1 (uvicorn) is still active and showing no errors.
```bash
# Make sure the environment is active
conda activate baio

# Run all tests
pytest tests/

# Run a specific test file
pytest tests/test_api_classification.py

# Run with coverage report
pytest --cov=. tests/
```

Modal runs the backend on demand: you only pay when a request is processed, and there is no idle cost. Evo2 inference runs on an A10G GPU that spins up automatically.
Cost: ~$0 when idle. ~$0.30/hr only while actively running Evo2 inference.
```bash
pip install modal
modal token new
```

This opens a browser to create a free Modal account and authenticate.
```bash
modal secret create baio-secrets \
  OPENROUTER_API_KEY=your_key_here \
  GEMINI_API_KEY=your_key_here
```

Then upload the trained classifier weights to Modal's persistent storage:

```bash
modal volume put baio-weights weights/random_forest_best_model.pkl random_forest_best_model.pkl
modal volume put baio-weights weights/support_vector_machine_best_model.pkl support_vector_machine_best_model.pkl
```

Deploy:

```bash
modal deploy modal_app.py
```

Modal will print a URL like:

```
https://your-username--baio-fastapi-app.modal.run
```
That is your live API endpoint — update VITE_API_BASE in your frontend .env to point to it.
```bash
modal serve modal_app.py
```

This runs the app locally with the same Modal environment; useful for testing.
| Request type | Container | GPU | Cold start |
|---|---|---|---|
| Classification (RandomForest/SVM) | CPU | No | ~5s |
| Chat (LLM) | CPU | No | ~5s |
| Evo2 inference | GPU (A10G) | Yes | ~30-90s |
The CPU container has `keep_warm=1` so it's always ready. The GPU container scales to zero when not in use and spins up only when Evo2 is selected.
- Mainuddin — Tech Lead
- Luis Palmejar — Developer
- Kevin Yang — Developer
- Vrinda Thakur — Developer
MIT License — see LICENSE
Note: This is a research prototype. For use in clinical or production settings, additional validation and regulatory approval are required.