EOG-IMU Cursor Control System

Head and Eye Controlled Cursor Using Electrooculography (EOG) and Inertial Measurement Units (IMU)

A hands-free computer cursor control system using dual-channel EOG for eye event detection and an IMU for head motion tracking. Built as a capstone project demonstrating embedded systems, real-time signal processing, sensor fusion, and machine learning. See docs/data_flow.md for the complete system pipeline.

Team: Jiayu Yang (Jarrod), Andrew Xie, Gordon Lin, Nicole Le, Ani Sarker

Demo

Final Capstone Demo (video)

System Overview

┌─────────────┐     ┌─────────────┐     ┌──────────────────┐
│  AD8232 x2  │     │   STM32     │     │   PC (Python)    │
│ Vertical EOG│────>│  ADC1 (PA0) │     │  Signal Proc.    │
│ Horiz.  EOG │────>│  ADC2 (PA4) │────>│  State-Space     │
└─────────────┘     │  @200Hz     │     │  Sensor Fusion   │
                    │             │     └────────┬─────────┘
┌─────────────┐     │  I2C Read   │              │
│  MPU9250    │────>│  Raw Gyro   │              ▼
│  IMU        │     └─────────────┘       ┌──────────────┐
└─────────────┘                           │ OS Mouse API │
                                          └──────────────┘

How it works: IMU head motion drives cursor movement (directly proportional in threshold mode, a state-space model with velocity decay in statespace mode). A Kalman filter tracks gyroscope bias drift in real time, separating true angular velocity from the slowly changing sensor offset without requiring a second sensor. Vertical EOG detects blinks (click/double-click) and up/down gaze (scroll). Horizontal EOG detects left/right gaze (back/forward). Triple blink triggers a double click. Looking left/right freezes the cursor; a double head nod while frozen centers the cursor on screen. Scroll and navigation require both eye gaze and head motion to agree, preventing false triggers.
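As a rough illustration of the bias tracker, here is a minimal 2-state Kalman filter per gyro axis (state = [angular velocity, bias]). The noise parameter names mirror config.py, but the class itself is a sketch, not the repo's implementation (that lives in eog_cursor/signal_processing.py; see docs/kalman_filter.md for the derivation):

```python
import numpy as np

class GyroBiasKalman:
    """Minimal 2-state Kalman filter per gyro axis: x = [omega, bias].

    Illustrative sketch only: parameter defaults follow config.py
    (KALMAN_Q_OMEGA, KALMAN_Q_BIAS, KALMAN_R).
    """

    def __init__(self, q_omega=1000.0, q_bias=0.001, r=500.0, bias0=0.0):
        self.x = np.array([0.0, bias0])       # bias0 seeded by startup calibration
        self.P = np.eye(2) * 100.0            # state covariance
        self.Q = np.diag([q_omega, q_bias])   # omega changes fast, bias drifts slowly
        self.R = r                            # gyro measurement noise variance
        self.H = np.array([1.0, 1.0])         # raw reading = omega + bias + noise

    def update(self, z):
        # Predict: both states modeled as random walks (F = I)
        self.P = self.P + self.Q
        # Correct with one raw gyro sample z
        y = z - self.H @ self.x                # innovation
        s = self.H @ self.P @ self.H + self.R  # innovation variance (scalar)
        k = self.P @ self.H / s                # Kalman gain
        self.x = self.x + k * y
        self.P = self.P - np.outer(k, self.H @ self.P)
        return self.x[0]                       # bias-corrected angular velocity
```

Because Q for the bias state is tiny relative to Q for the velocity state, the filter attributes fast signal changes to real motion and only slow trends to bias, which is how it separates the two without a second sensor.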

Features

| Action | Input | Type |
|---|---|---|
| Cursor Move | IMU Gyro X/Y (proportional or state-space, by mode) | Continuous |
| Left Click | Double Blink (two rapid blinks) | Discrete |
| Right Click | Long Blink (eyes closed >=0.4s) | Discrete |
| Double Click | Triple Blink (three rapid blinks) | Discrete |
| Center Cursor | Look Left/Right + Double Head Nod (eog_h + gyro_x) | Freeze + Gesture |
| Scroll Up/Down | Eye Up/Down (enters scroll-ready) → Head Up/Down (eog_v + gx) | Fusion (2-step) |
| Browser Back/Fwd | Eye Left/Right (enters nav-ready) → Head Left/Right (eog_h + gy) | Fusion (2-step) |

Cursor freeze mechanic: Looking left or right (horizontal EOG) freezes the cursor. While frozen, head nods center the cursor on screen. This prevents accidental triggers during normal head movement and eliminates cursor drift during gestures.

Scroll ready mechanic: Looking up or down (vertical EOG) locks the cursor into a scroll-ready state, where the cursor freezes and only head tilt (up/down) can trigger scrolling. Returning the eyes to neutral exits the state.

Nav ready mechanic: Looking left or right (horizontal EOG) locks the cursor into a nav-ready state, where the cursor freezes and only head turn (left/right) can trigger browser back/forward. Returning the eyes to neutral exits the state.

Both scroll and nav use the same two-step design: eye gaze enters the ready state, head motion confirms the action. This prevents missed triggers caused by imperfect eye-head timing synchronization.

Blink detection uses a 4-state machine analyzing full spike waveforms, not simple thresholds. See docs/detection.md for signal zones, state diagrams, and parameters.
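For intuition only, a heavily simplified version of such a state machine might look like the sketch below. The state names, margin, and transitions here are hypothetical; the actual zones, duration checks, and waveform analysis are defined in event_detector.py and docs/detection.md:

```python
from enum import Enum, auto

class BlinkState(Enum):
    IDLE = auto()     # signal near baseline
    RISING = auto()   # spike onset
    PEAK = auto()     # above blink threshold
    FALLING = auto()  # returning toward baseline

def blink_step(state, sample, baseline, threshold=2600, margin=100):
    """One step of a hypothetical simplified blink state machine.

    Returns (new_state, blink_completed). The real detector also
    enforces spike duration and shape checks.
    """
    if state is BlinkState.IDLE and sample > baseline + margin:
        return BlinkState.RISING, False
    if state is BlinkState.RISING:
        if sample > threshold:
            return BlinkState.PEAK, False
        if sample <= baseline + margin:
            return BlinkState.IDLE, False   # transient noise, not a blink
    if state is BlinkState.PEAK and sample < threshold:
        return BlinkState.FALLING, False
    if state is BlinkState.FALLING and sample <= baseline + margin:
        return BlinkState.IDLE, True        # full spike waveform completed
    return state, False
```

The key idea is that a blink is only reported once the whole spike has completed, which is what lets the detector reject threshold-grazing noise that a simple comparator would count.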

Quick Start

1. Install & Setup

Requires a graphical desktop (Windows / macOS / Linux with X11) for cursor control.

pip install -r requirements.txt
python python/scripts/generate_demo_data.py --output data/raw   # ~10s, deterministic (seed=42)
python python/scripts/train_model.py --data data/raw             # ~15s, ~98% CV accuracy
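Conceptually, the training step amounts to a scaler plus an SVM evaluated with cross-validation, roughly as sketched below. The random placeholder data just keeps the sketch runnable on its own; the real windowed features and labels come from the scripts above, and the actual pipeline is in scripts/train_model.py and eog_cursor/ml_classifier.py:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Placeholder random data (10 features x 2 channels = 20 columns,
# labels 0-8); on the real dataset the repo reports ~98% CV accuracy.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 20))
y = rng.integers(0, 9, size=200)

model = make_pipeline(StandardScaler(), SVC(kernel="rbf"))
scores = cross_val_score(model, X, y, cv=5)
print(f"CV accuracy: {scores.mean():.2f}")
```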

2. Collect Training Data (optional, requires hardware)

python python/scripts/collect_data.py --port COM4

Label keys during recording: 0=idle 1=blink 2=double_blink 3=triple_blink 4=long_blink 5=look_up 6=look_down 7=look_left 8=look_right, ESC=stop and save.

Procedure: press the label key ~1 s before the gesture → perform the gesture → wait ~1 s → press 0. The extra buffer ensures the actual gesture falls well within the labeled region; a few hundred ms of timing error is fine because the ML pipeline uses windowed features.

Note: The serial port is exclusive; only one process can open it at a time. Close collect_data before running main.py on the same port.

3. Run

3 modes × 3 data sources; any combination works (run from project root):

| Mode | --replay CSV (offline) | --simulate (no hardware) | --port COM4 (hardware) |
|---|---|---|---|
| threshold | python python/main.py --replay data/raw/demo_replay.csv | python python/main.py --simulate | python python/main.py --port COM4 |
| statespace | python python/main.py --replay data/raw/demo_replay.csv --mode statespace | python python/main.py --simulate --mode statespace | python python/main.py --port COM4 --mode statespace |
| ml | python python/main.py --replay data/raw/demo_replay.csv --mode ml | python python/main.py --simulate --mode ml | python python/main.py --port COM4 --mode ml |

Default mode is threshold. Hardware port: Windows COM4, Linux /dev/ttyACM0 (Nucleo).

Simulator controls: Arrows=move, Space(x2)=left-click, Space(hold)=right-click, Space(x3)=double-click, L/R+N(x2)=center-cursor (look left/right then nod), U+Up=scroll-up, D+Down=scroll-down, L+Left=back, R+Right=forward, Q=quit.

Keyboard overlay (hardware mode): Add --keyboard-overlay (or --kb) to inject EOG events from the keyboard while hardware continues streaming sensor data. Keyboard events are processed through independent detectors and merged with hardware events; they do not modify real EOG values. IMU data still comes from hardware. (Also accepted with --replay for testing, but replay data already contains deterministic events, so the overlay is rarely needed.)

Keyboard overlay controls: Space(x2)=left-click, Space(hold)=right-click, Space(x3)=double-click, U=look-up (scroll fusion with hardware IMU), D=look-down (scroll fusion with hardware IMU), L=look-left (freezes cursor, enables nod from hardware IMU), R=look-right (freezes cursor, enables nod from hardware IMU).

python python/main.py --port COM4 --mode threshold   --kb
python python/main.py --port COM4 --mode statespace  --kb
python python/main.py --port COM4 --mode ml          --kb

Note: The simulator generates square-wave EOG signals (instant jumps), which differ from the smooth waveforms used to train the SVM. As a result, --mode ml with --simulate cannot classify EOG events reliably. Use --replay CSV or real hardware for ML mode.

Project Structure

├── firmware/                     # STM32 reference firmware (C)
│   ├── firmware.ioc              # CubeMX project (STM32F303RETx Nucleo-64)
│   ├── Core/Inc/
│   │   └── mpu9250.h             # MPU9250 I2C driver header
│   └── Core/Src/
│       ├── main.c                # Main loop: dual ADC + I2C + DMA UART @200Hz (TIM6)
│       └── mpu9250.c             # MPU9250 I2C driver
│
├── python/                       # PC-side application
│   ├── main.py                   # Entry point with CLI
│   ├── eog_cursor/               # Core library
│   │   ├── config.py             # All tunable parameters
│   │   ├── serial_reader.py      # STM32 UART data parser (dual-channel)
│   │   ├── signal_processing.py  # Low-pass filter, Kalman filter, sliding window
│   │   ├── event_detector.py     # Blink, gaze, double nod detectors
│   │   ├── feature_extraction.py # 10 features × 2 channels for SVM classifier
│   │   ├── cursor_control.py     # Threshold & state-space controllers
│   │   ├── ml_classifier.py      # SVM training and inference (dual-channel)
│   │   ├── simulator.py          # Keyboard-based hardware simulator
│   │   ├── keyboard_overlay.py   # Keyboard EOG overlay for hardware mode
│   │   └── csv_replay.py         # Offline CSV file replay
│   ├── scripts/
│   │   ├── collect_data.py       # Labeled data collection from hardware
│   │   ├── generate_demo_data.py # Synthetic dual-channel data generator
│   │   ├── train_model.py        # SVM training with cross-validation
│   │   └── visualize.py          # Real-time 3-subplot signal visualization
│   ├── tests/                    # 70 tests (signal, events, ML, state-space, Kalman, keyboard overlay)
│   └── models/                   # Trained SVM model + scaler (.gitignored)
│
├── data/raw/                     # Generated by scripts/generate_demo_data.py
├── docs/                         # Technical deep-dives
│   ├── data_flow.md              # System pipeline (firmware + Python, all 9 run configs)
│   ├── detection.md              # Blink state machine, signal zones, waveform analysis
│   ├── state_space.md            # Matrix derivation, velocity retention analysis, stability proof
│   ├── kalman_filter.md          # Kalman filter derivation, steady-state analysis, parameter tuning
│   ├── performance.md            # Evaluation metrics template (ML + real hardware)
│   └── interview_questions.md    # Technical Q&A for project understanding
└── requirements.txt

Technical Details

  • Kalman filter: 2-state filter per gyro axis tracks bias drift in real time; startup calibration seeds the initial estimate, then the filter adapts continuously. See docs/kalman_filter.md for derivation and steady-state analysis.
  • State-space cursor: Velocity-retention model gives the cursor physical inertia. See docs/state_space.md for matrix derivation and stability proof.

Sensor Fusion

Scroll and navigation require both eye gaze and head motion to agree:

| Action | Eye Signal | Head Signal |
|---|---|---|
| Scroll Up | eog_v > 2400 (look up) | gx < -300 (tilt up) |
| Scroll Down | eog_v < 1600 (look down) | gx > 300 (tilt down) |
| Browser Back | eog_h < 1600 (look left) | gy < -300 (turn left) |
| Browser Fwd | eog_h > 2400 (look right) | gy > 300 (turn right) |
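A minimal sketch of the two-step scroll fusion, using the thresholds from the table above. The function name and state encoding are illustrative; the real logic, including cursor freezing, lives in event_detector.py:

```python
def scroll_step(state, eog_v, gx,
                eog_high=2400, eog_low=1600, deadzone=300):
    """Two-step scroll fusion (illustrative sketch).

    Eye gaze arms the scroll-ready state; head tilt confirms the action.
    Returns (new_state, action_or_None).
    """
    gaze_up, gaze_down = eog_v > eog_high, eog_v < eog_low
    if state is None:                      # neutral: gaze arms the scroll
        if gaze_up:
            return "ready_up", None
        if gaze_down:
            return "ready_down", None
        return None, None
    if not (gaze_up or gaze_down):         # eyes back to neutral: disarm
        return None, None
    if state == "ready_up" and gx < -deadzone:
        return state, "scroll_up"          # head tilt up confirms
    if state == "ready_down" and gx > deadzone:
        return state, "scroll_down"        # head tilt down confirms
    return state, None
```

Because the ready state persists until the eyes return to neutral, the head confirmation does not need to be perfectly synchronized with the gaze, which is the point of the two-step design.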

Hardware

| Component | Qty | Purpose | Interface |
|---|---|---|---|
| STM32 MCU (F3/F4/U5/etc.) | 1 | Data acquisition | USB (UART) |
| AD8232 | 2 | EOG analog front-end (V + H) | ADC pins |
| MPU9250 (or MPU6050) | 1 | IMU head tracking | I2C |
| Ag/AgCl electrodes | 5 | EOG signal pickup (2 pairs + 1 ref) | AD8232 input |

Electrode placement: Vertical pair (V+/V-) above and below one eye → eog_v. Horizontal pair (L/R) at outer canthi of both eyes → eog_h. Reference on forehead.

Firmware: Reference code in firmware/, developed with STM32CubeMX + STM32CubeIDE. The included firmware.ioc is the CubeMX project for STM32F303RETx (Nucleo-64); open it to regenerate HAL code, or create a new project for your board. Data packet format: timestamp,eog_v,eog_h,gyro_x,gyro_y,gyro_z\r\n at 115200 baud. See firmware/README.md for AD8232 wiring, serial debug, and CubeMX regeneration instructions. See docs/data_flow.md for the data pipeline.
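A minimal pyserial reader for this packet format might look like the following. This is a sketch only: the project's actual parser is eog_cursor/serial_reader.py, and parsing the fields as integers is an assumption based on the raw ADC and gyro values:

```python
import serial  # pyserial

def read_samples(port="COM4", baud=115200):
    """Yield parsed (ts, eog_v, eog_h, gx, gy, gz) tuples from the UART stream."""
    with serial.Serial(port, baud, timeout=1) as ser:
        while True:
            line = ser.readline().decode("ascii", errors="ignore").strip()
            fields = line.split(",")
            if len(fields) != 6:
                continue              # skip partial or corrupt packets
            try:
                ts, eog_v, eog_h, gx, gy, gz = (int(f) for f in fields)
            except ValueError:
                continue              # non-numeric field: drop the packet
            yield ts, eog_v, eog_h, gx, gy, gz
```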

Configuration

All parameters in python/eog_cursor/config.py. Key values:

# --- All modes ---
GYRO_DEADZONE = 300             # Below this = noise (cursor deadzone + fusion check)
GYRO_CALIBRATION_SAMPLES = 400  # Startup bias calibration (2s at 200Hz)
KALMAN_Q_OMEGA = 1000.0         # Kalman process noise for angular velocity (fast, trust measurement)
KALMAN_Q_BIAS = 0.001           # Kalman process noise for bias (slow drift, ~6s time constant)
KALMAN_R = 500.0                # Kalman measurement noise (gyro sensor noise variance)

# --- threshold mode only ---
CURSOR_SENSITIVITY = 0.01      # Direct gyro-to-pixel ratio (no inertia)

# --- threshold & statespace modes ---
BLINK_THRESHOLD = 2600         # ADC value for blink detection (ML mode uses SVM instead)

# --- statespace & ml modes ---
SS_VELOCITY_RETAIN = 0.95      # Cursor glide per step (0.8=snappy, 0.99=floaty)
SS_SENSITIVITY = 0.05          # Gyro-to-velocity input gain
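The two statespace parameters combine in a one-line velocity update per axis; a minimal sketch (the project's actual controller is in cursor_control.py):

```python
def statespace_step(vx, vy, gx, gy, retain=0.95, gain=0.05):
    """One tick of the velocity-retention cursor model (illustrative).

    retain = SS_VELOCITY_RETAIN, gain = SS_SENSITIVITY. Since retain < 1,
    velocity decays geometrically when the head stops, so the cursor
    glides to a halt instead of stopping dead.
    """
    vx = retain * vx + gain * gx
    vy = retain * vy + gain * gy
    return vx, vy  # add (vx, vy) to the cursor position each tick
```

This is why 0.8 feels snappy and 0.99 feels floaty: after the input stops, the remaining velocity shrinks by the retain factor every step.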

Real-Time Visualization

Use scripts/visualize.py to display live EOG and IMU signals in 3 subplots (vertical EOG, horizontal EOG, gyroscope 3-axis) with threshold lines overlaid. Useful for verifying hardware connections, tuning thresholds, and observing signal patterns.

cd python

python -m scripts.visualize --port /dev/ttyACM0   # Linux
python -m scripts.visualize --port COM4            # Windows
python -m scripts.visualize --port COM4 --window 10  # Show last 10s of data on screen (default: 5s)

Testing

cd python && python -m pytest tests/ -v

70 tests across 4 files:

| File | Tests | Key Verifications |
|---|---|---|
| test_event_detector.py | 30 | Double blink detected; triple blink detected; triple blink window expired; single blink ignored; long blink fires on release; long blink max duration rejected; sustained close fires once; cooldown prevents re-trigger; sustained gaze detected; transient gaze rejected; double head nod triggers center cursor (only when cursor frozen); single nod ignored; nod ignored when not frozen; state reset on unfreeze |
| test_keyboard_overlay.py | 12 | Double/triple/long blink from Space; look up/down from U/D keys; look left/right from L/R keys; cursor freeze from L/R; idle produces no events; Space does not produce gaze events |
| test_signal_processing.py | 21 | Low-pass preserves DC baseline; high frequency attenuated; sliding window keeps most recent samples; Kalman filter tracks constant bias, passes real motion, tracks drift; 3-axis wrapper corrects all axes; feature vector has correct length; state-space velocity decays to ~0 after 200 iterations |
| test_ml_pipeline.py | 7 | Training accuracy >80%; model save/load roundtrip succeeds; predictions are valid labels (all 9 classes); streaming classifier produces output; blink features clearly separable from idle |

Performance

See docs/performance.md for ML classification accuracy, end-to-end latency, per-action accuracy, and robustness evaluation, all measured with real EOG hardware.

Architecture Decisions

| Decision | Rationale |
|---|---|
| Dual-channel EOG | Enables horizontal gaze for browser back/forward |
| Eye + head fusion | Both must agree; prevents false triggers |
| Processing on PC | Full Python ecosystem, easier debugging |
| Kalman bias tracking | Tracks gyro drift without an accelerometer; separates slow bias from fast motion using process noise tuning |
| State-space model | Physical inertia makes the cursor feel natural |
| SVM over deep learning | Small dataset, low latency (<5ms), interpretable |
| Lazy pyautogui import | Enables testing in headless CI |
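The lazy-import pattern from the last row can be sketched as follows (illustrative only; the repo's actual wrapper may differ):

```python
_pyautogui = None

def move_cursor(dx, dy):
    """Move the OS cursor, importing pyautogui only on first use.

    Deferring the import lets the rest of the package be imported and
    tested in headless CI, where pyautogui would fail for lack of a
    display server.
    """
    global _pyautogui
    if _pyautogui is None:
        import pyautogui
        _pyautogui = pyautogui
    _pyautogui.moveRel(dx, dy)  # relative move, matching gyro-driven deltas
```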

License

MIT
