Remote streaming mode: first attempt shelved, postponed to v3+ #45

@engmung

Logging an exploration of a "remote streaming" mode where shaders authored
in the browser (GLSL) are streamed live to the ESP32-S3 over WebSocket and
rendered on the HUB75 panel — no firmware rebuild required. Stalled at
13–17 fps with visible artifacts. Shelved for now; revisiting in v3 or later.

Goal

  • Author patterns without firmware build/flash cycles
  • Mirror the in-browser preview directly on the physical panel
  • Enable shaders too heavy to compute on-device (domain warping, FBM, etc.)
  • Eventually integrate with the existing local mode (long-press to toggle)

Architecture attempted

  • Hardware: ESP32-S3 N16R8, HUB75 128×64, Arduino IDE + ESP32 core 3.3.7
  • Data path: Browser WebGL2 → readPixels → 16 chunks per frame (6 B header + 1536 B payload each; 16 × 1536 B = 24,576 B, i.e. 3 bytes per pixel) → WebSocket binary → ESP32 → triple buffer + Core 1 render task → HUB75
  • WS library: ESPAsyncWebServer (failed on large binary frames) → switched to Links2004 WebSockets
  • Channel split: /stream (port 81, binary) and /control (port 82, JSON)
  • Rendering: FreeRTOS task pinned to Core 1, triple buffer with portMUX swap
  • Web page: raw WebGL2, no dependencies, served from PROGMEM (no LittleFS)

Where it stalled

  • rx/push fps capped at 13–17
  • LED output visibly choppier than browser preview (suspected tearing / race)
  • Occasional display blackouts that recover after a few seconds
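
Since tearing or a race in the buffer handoff is only suspected, the following is a sketch of one handoff scheme that rules that class of bug out, not a description of the code used in this attempt. It is a single-producer, single-consumer triple buffer that swaps slot indices with `std::atomic` exchange instead of the portMUX-guarded swap mentioned above. The class name `TripleBuffer` and its methods are hypothetical.

```cpp
#include <array>
#include <atomic>
#include <cstdint>

// Single-producer / single-consumer triple buffer.
// The receiver only touches write_slot(), the render task only touches
// read_slot(); the third ("middle") slot is exchanged atomically.
template <typename Frame>
class TripleBuffer {
public:
    // Producer side: fill this slot, then call publish().
    Frame& write_slot() { return slots_[write_]; }

    // Producer side: offer the finished frame and take back a free slot.
    void publish() {
        const uint8_t offered = static_cast<uint8_t>(write_ | kFresh);
        write_ = static_cast<uint8_t>(middle_.exchange(offered) & kIndexMask);
    }

    // Consumer side: returns true if a newer frame was picked up.
    bool acquire() {
        if ((middle_.load() & kFresh) == 0) return false;  // nothing new
        read_ = static_cast<uint8_t>(middle_.exchange(read_) & kIndexMask);
        return true;
    }

    // Consumer side: the most recently acquired frame.
    const Frame& read_slot() const { return slots_[read_]; }

private:
    static constexpr uint8_t kIndexMask = 0x03;  // low bits: slot index
    static constexpr uint8_t kFresh     = 0x04;  // set when middle holds a new frame

    std::array<Frame, 3> slots_{};
    std::atomic<uint8_t> middle_{1};
    uint8_t write_ = 0;   // owned by the producer
    uint8_t read_  = 2;   // owned by the consumer
};
```

With this scheme the render task never reads a slot that the receiver is writing, and if the receiver publishes twice before the render task runs, the older frame is dropped rather than shown half-updated. It does not address artifacts that originate inside the HUB75 library's own DMA buffer.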

Root cause hypotheses

  1. drawPixel function-call overhead. Calling it 8192 times (once per pixel on the 128×64 panel) takes ~70 ms, which by itself caps the frame rate at about 14 fps. Acknowledged as a fundamental limit by the library author in discussion #117. 30 fps is effectively unreachable via per-pixel calls.
  2. PSRAM DMA buffer + WiFi contention on S3 N16R8. Reported in discussions #258 / #441. Unverified in this attempt — verbose boot log was never captured to confirm whether the DMA buffer landed in PSRAM or internal SRAM.
  3. Browser readPixels is synchronous. Minor; not the bottleneck while the ESP32 side is the limit.
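
The arithmetic behind hypothesis 1 can be written out as a few constants. This is a back-of-the-envelope sketch: the 70 ms and 8192 figures are from this issue, the function names are made up for illustration, and the calculation ignores network receive time and any other per-frame work.

```cpp
// Figures from the issue: 128x64 panel, ~70 ms to draw one full frame
// through per-pixel drawPixel calls.
constexpr double kPixels          = 128.0 * 64.0;  // 8192
constexpr double kMeasuredFrameMs = 70.0;

// Average cost of one drawPixel call, in microseconds.
constexpr double per_call_us(double frame_ms, double pixels) {
    return frame_ms * 1000.0 / pixels;
}

// Upper bound on frame rate if drawing were the only per-frame cost.
constexpr double max_fps(double frame_ms) {
    return 1000.0 / frame_ms;
}

// Time available per pixel to hit a target frame rate, in microseconds.
constexpr double budget_us_per_pixel(double target_fps, double pixels) {
    return 1.0e6 / target_fps / pixels;
}

constexpr double kPerCallUs   = per_call_us(kMeasuredFrameMs, kPixels);  // about 8.5 us
constexpr double kCeilingFps  = max_fps(kMeasuredFrameMs);               // about 14.3 fps
constexpr double kBudget30Us  = budget_us_per_pixel(30.0, kPixels);      // about 4.1 us
```

A ceiling of roughly 14 fps from drawing alone sits inside the observed 13–17 fps range, and reaching 30 fps would require the per-pixel cost to drop by more than half before any other work is counted.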

Entry points for next attempt

  • First step: turn on verbose Core Debug Level and capture the HUB75 library boot log. That should confirm or rule out hypothesis 2 in one line.
  • Decide whether 30 fps is actually required. 15 fps is reachable today with the current architecture if tearing is fixed.
  • If 30 fps is required: fork the HUB75 library or write directly to the BCM bit-plane buffer instead of going through drawPixel.
  • Consider RGB565 transmission to cut bandwidth by a third (2 bytes per pixel instead of 3, so 16,384 B instead of 24,576 B per frame). Not the current bottleneck, but cheap once the CPU side is unblocked.
  • Consider UDP/DDP via a small Node.js bridge if WebSocket TCP overhead becomes the constraint.
  • Improve debugging setup: keep USB serial connected while the panel runs on external 5V (common GND), or pipe logs to the browser via /control.
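
For the RGB565 option in the list above, the conversion itself is small. The sketch below shows the standard 5-6-5 packing and an expansion back to 8 bits per channel using bit replication. The function and type names are illustrative, and whether the panel library would accept RGB565 directly or needs the expanded values is not something this issue establishes.

```cpp
#include <cstdint>

// Pack 8-bit-per-channel colour into RGB565: rrrrrggg gggbbbbb.
constexpr uint16_t pack_rgb565(uint8_t r, uint8_t g, uint8_t b) {
    return static_cast<uint16_t>(((r & 0xF8) << 8) | ((g & 0xFC) << 3) | (b >> 3));
}

struct Rgb888 {
    uint8_t r;
    uint8_t g;
    uint8_t b;
};

// Expand RGB565 back to 8 bits per channel. The top bits are replicated
// into the low bits so that full scale (31 or 63) maps to 255.
constexpr Rgb888 unpack_rgb565(uint16_t p) {
    const uint8_t r5 = static_cast<uint8_t>((p >> 11) & 0x1F);
    const uint8_t g6 = static_cast<uint8_t>((p >> 5) & 0x3F);
    const uint8_t b5 = static_cast<uint8_t>(p & 0x1F);
    return Rgb888{
        static_cast<uint8_t>((r5 << 3) | (r5 >> 2)),
        static_cast<uint8_t>((g6 << 2) | (g6 >> 4)),
        static_cast<uint8_t>((b5 << 3) | (b5 >> 2)),
    };
}

// Per-frame payload for the 128x64 panel at each pixel format.
constexpr unsigned kRgb888FrameBytes = 128 * 64 * 3;  // 24576
constexpr unsigned kRgb565FrameBytes = 128 * 64 * 2;  // 16384

static_assert(pack_rgb565(255, 255, 255) == 0xFFFF, "white packs to all ones");
static_assert(pack_rgb565(255, 0, 0) == 0xF800, "red occupies the top 5 bits");
```

The packing would run in the browser before sending and the expansion on the ESP32 after receiving; both are shown in C++ here only to keep the examples in one language.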

Related

  • mrcodetastic/ESP32-HUB75-MatrixPanel-DMA discussions #117, #258, #441
  • ESPAsyncWebServer is unsuitable for high-frequency large binary messages on this stack; Links2004 WebSockets is the working substitute.
