Proof of concept for webcam-based hand gesture control on Windows.
The first version focuses on actions that can be triggered reliably:
- thumbs up -> configurable (default: approve in VS Code/Windsurf)
- thumbs down -> configurable (default: deny in VS Code/Windsurf)
- open palm -> configurable (default: switch focus between Visual Studio Code and Windsurf)
This is intentionally built around keyboard and window automation, not direct clicks on AI approval buttons. Key presses do not depend on where a button is drawn on screen, which makes the prototype more stable across VS Code and Windsurf.
The app uses:
- MediaPipe Hands for hand landmarks
- OpenCV for webcam capture and preview
- PyAutoGUI for key presses
- PyGetWindow for activating VS Code or Windsurf windows on Windows
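To make the window-activation step more concrete, here is a minimal sketch of how a target window could be chosen by title. The helper name `next_window_title` and the case-insensitive substring matching are assumptions for illustration, not the app's actual code. The trailing comment shows how PyGetWindow could supply the inputs on Windows.

```python
from typing import Optional, Sequence


def next_window_title(
    titles: Sequence[str],
    active_title: Optional[str],
    keywords: Sequence[str],
) -> Optional[str]:
    """Pick the next window whose title contains one of the keywords.

    Hypothetical helper: matching is a case-insensitive substring test.
    """
    lowered = [k.lower() for k in keywords if k]
    matches = [t for t in titles if any(k in t.lower() for k in lowered)]
    if not matches:
        return None
    if active_title in matches:
        # The active window is already a target: cycle to the one after it.
        idx = matches.index(active_title)
        return matches[(idx + 1) % len(matches)]
    # The active window is not a target: jump to the first match.
    return matches[0]


# On Windows the inputs could come from PyGetWindow (not exercised here):
#   import pygetwindow as gw
#   title = next_window_title(gw.getAllTitles(), gw.getActiveWindowTitle(),
#                             ["Visual Studio Code", "Windsurf"])
#   if title:
#       gw.getWindowsWithTitle(title)[0].activate()
```

If only one window matches and it is already active, the function returns that same title, so activating it again changes nothing.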
To reduce accidental triggers, a gesture must be held briefly before the action fires, and each action has a cooldown.
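The hold-then-cooldown behavior can be sketched as a small state machine. This is an illustration under assumptions, not the app's implementation: the class name `GestureGate`, the 2.0 second cooldown default, and the injectable `clock` parameter are all hypothetical. Only the 0.8 second hold matches the value mentioned in this README.

```python
import time
from typing import Callable, Dict, Optional


class GestureGate:
    """Report a gesture only after it was held long enough and is off cooldown."""

    def __init__(
        self,
        hold_s: float = 0.8,       # hold time mentioned in this README
        cooldown_s: float = 2.0,   # assumed value, not taken from the app
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.hold_s = hold_s
        self.cooldown_s = cooldown_s
        self.clock = clock
        self._current: Optional[str] = None
        self._since = 0.0
        self._last_fired: Dict[str, float] = {}

    def update(self, gesture: Optional[str]) -> Optional[str]:
        """Feed the gesture seen in the current frame; return it if it should fire."""
        now = self.clock()
        if gesture != self._current:
            # The gesture changed, so the hold timer restarts.
            self._current = gesture
            self._since = now
            return None
        if gesture is None or now - self._since < self.hold_s:
            return None
        last = self._last_fired.get(gesture)
        if last is not None and now - last < self.cooldown_s:
            return None  # still cooling down for this gesture
        self._last_fired[gesture] = now
        return gesture
```

Calling `update()` once per camera frame keeps the timing logic separate from detection. In this sketch a gesture that stays held fires again once its cooldown has elapsed.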
- Create and activate a Python virtual environment.
- Install dependencies:
    pip install -r requirements.txt
- Run the app using one of these methods:
Option A: Use the run script (recommended)
    .\run.ps1
Option B: Manually with conda
    C:\Users\Bady\anaconda3\Scripts\conda.exe run -p c:\Users\Bady\Documents\Projekty\DetectionCamera\.conda python main.py
To inspect real window titles on your machine:
    .\list_windows.ps1
You can customize behavior by creating settings.json in the project root.
- Copy settings.example.json to settings.json.
- Adjust gesture_actions and optional camera thresholds.
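One way an optional settings.json could be layered over built-in defaults is sketched below. The function name `load_settings` and the one-level merge are assumptions. The default values mirror the defaults described in this README, but the app's real default table may differ.

```python
import copy
import json
from pathlib import Path
from typing import Any, Dict

# Assumed defaults, mirroring the default behavior described in this README.
DEFAULT_SETTINGS: Dict[str, Any] = {
    "gesture_actions": {
        "thumbs_up": "ide:approve",
        "thumbs_down": "ide:deny",
        "open_palm": "switch_window",
    },
    "target_window_keywords": ["Visual Studio Code", "Windsurf"],
}


def load_settings(path: Path) -> Dict[str, Any]:
    """Return the defaults, overridden by whatever the settings file provides."""
    settings = copy.deepcopy(DEFAULT_SETTINGS)
    if not path.exists():
        return settings  # no file yet: run with defaults
    user = json.loads(path.read_text(encoding="utf-8"))
    for key, value in user.items():
        if isinstance(value, dict) and isinstance(settings.get(key), dict):
            # Merge one level deep so a partial gesture_actions keeps the rest.
            settings[key].update(value)
        else:
            settings[key] = value
    return settings
```

With this approach a settings.json that overrides only `thumbs_up` leaves the other two gestures on their defaults.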
Action format in gesture_actions:
"key:enter"for single key press"key:ctrl+shift+p"for hotkey combinations"ide:approve"to focus VS Code or Windsurf and sendEnter"ide:deny"to focus VS Code or Windsurf and sendEsc"switch_window"to switch focus between windows matched bytarget_window_keywords"none"to disable an action for a gesture
Example mapping:
    {
      "gesture_actions": {
        "thumbs_up": "key:ctrl+shift+p",
        "thumbs_down": "key:esc",
        "open_palm": "switch_window"
      }
    }

- Press q to quit the preview window.
- Show one hand clearly to the webcam.
- Keep the gesture stable for about 0.8 seconds.
- Press c to toggle calibration hints.
- Press [ or ] to decrease or increase hold time.
- Press - or = to decrease or increase cooldown.
- Press s to save current hold and cooldown to settings.json.
- Press w to start or stop the calibration wizard.
- Press a to apply wizard suggestion and save it.
- Press k to start guided gesture calibration for open_palm, thumbs_up, and thumbs_down.
The preview now includes:
- a color border based on currently detected gesture,
- a big live gesture label,
- hold progress and cooldown timer,
- recent detection history bars for thumbs_up, thumbs_down, and open_palm,
- wizard target and progress when calibration wizard is active,
- guided gesture-calibration prompts and progress for personalized thresholds.
If gesture angles still feel too strict, press k in the camera window.
The app will ask you to hold these gestures in sequence:
- open_palm
- thumbs_up
- thumbs_down
The captured metrics are converted into a personalized gesture_profile and saved to settings.json automatically. That profile is loaded again on the next run.
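As a rough picture of that last step, the sketch below averages captured samples per gesture and writes the result under `gesture_profile` without touching other keys in the file. The metric name used in the usage note, the averaging rule, and both function names are hypothetical. The real profile format is not documented here.

```python
import json
from pathlib import Path
from statistics import mean
from typing import Dict, List

Samples = Dict[str, List[Dict[str, float]]]  # gesture -> per-frame metrics


def build_profile(samples: Samples) -> Dict[str, Dict[str, float]]:
    """Average every captured metric per gesture (assumed profile shape)."""
    profile: Dict[str, Dict[str, float]] = {}
    for gesture, frames in samples.items():
        if not frames:
            continue  # nothing captured for this gesture
        names = frames[0].keys()
        profile[gesture] = {n: round(mean(f[n] for f in frames), 3) for n in names}
    return profile


def save_profile(path: Path, profile: Dict[str, Dict[str, float]]) -> None:
    """Store the profile under "gesture_profile", keeping other settings intact."""
    data = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    data["gesture_profile"] = profile
    path.write_text(json.dumps(data, indent=2), encoding="utf-8")
```

For example, `build_profile({"thumbs_up": [{"thumb_angle": 80.0}, {"thumb_angle": 90.0}]})` averages the two frames into a single `thumb_angle` entry for `thumbs_up`, where `thumb_angle` is a made-up metric name.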
- The approval actions only press keys. For AI confirmation dialogs, the correct control still needs focus.
- The switching action looks for a window title containing Visual Studio Code or Windsurf.
- If switching does not work, run python list_windows.py and copy matching fragments into target_window_keywords.
- Hotkeys are sent to the currently active window, so focus still matters.
- If your VS Code window title is customized, adjust the keywords in the settings.
- Gesture recognition is heuristic-based for the first version. It avoids model training and keeps iteration fast.
- Add a calibration screen for your camera angle and dominant hand.
- Map gestures to custom VS Code keybindings instead of generic Enter and Esc.
- Add logging and an on-screen action history.
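For readers curious what heuristic recognition without a trained model can look like, here is a minimal landmark-based classifier. It is not the app's actual rule set. It substitutes a simple distance comparison for whatever angle metrics the app uses, and the `margin` default and function names are assumptions. The landmark indices follow the MediaPipe Hands numbering (0 wrist, 4 thumb tip, 8 index tip, and so on).

```python
import math
from typing import Optional, Sequence, Tuple

Point = Tuple[float, float]  # normalized (x, y); y grows downward in the image

WRIST, THUMB_MCP, THUMB_TIP = 0, 2, 4
FINGERS = ((6, 8), (10, 12), (14, 16), (18, 20))  # (PIP, TIP) for index..pinky


def _extended(lm: Sequence[Point], joint: int, tip: int) -> bool:
    # A digit counts as extended when its tip is farther from the wrist
    # than the given lower joint.
    return math.dist(lm[tip], lm[WRIST]) > math.dist(lm[joint], lm[WRIST])


def classify(lm: Sequence[Point], margin: float = 0.1) -> Optional[str]:
    """Map 21 hand landmarks to a gesture name, or None if nothing matches."""
    extended = [_extended(lm, pip, tip) for pip, tip in FINGERS]
    if all(extended) and _extended(lm, THUMB_MCP, THUMB_TIP):
        return "open_palm"
    if not any(extended):
        # Fingers are curled: decide by the vertical direction of the thumb.
        dy = lm[THUMB_MCP][1] - lm[THUMB_TIP][1]  # positive when the thumb points up
        if dy > margin:
            return "thumbs_up"
        if dy < -margin:
            return "thumbs_down"
    return None
```

With the legacy MediaPipe solutions API, the input could be built as `[(p.x, p.y) for p in hand.landmark]` for one detected hand. A calibrated profile could then replace the fixed `margin`.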