feat(web): add push-to-talk, VAD continuous listening, and voice settings #2
Open
P2Chill wants to merge 1807 commits into
57399d8 to f4e14b0
rand 0.10 moved random_range from the Rng trait to RngExt. Entire-Checkpoint: 17e9b54d784a
rand 0.10 moved RngCore into rand_core (re-exported as rand::Rng for fill_bytes) and moved random/random_range to the RngExt trait. Update all affected crates: oauth, vault, webhooks, browser, auth, cli, gateway, channels. Entire-Checkpoint: 999e29f811d7
The "Memory live" / "Memory frozen" button and its refresh companion cluttered the chat header for a setting most users configure once. Prompt memory status and the refresh action remain accessible via the "..." menu → Context (full context modal). Remove the toolbar HTML, bindPromptMemoryToolbar(), the per-session effect that polled chat.context on every switch, and the E2E test that exercised the now-removed toolbar buttons. The full context modal test already covers prompt memory display and refresh. Entire-Checkpoint: ad104e58319b
…ders local-llm (built-in llama.cpp with HuggingFace downloads) was offered but buried under "All providers" instead of showing in the recommended section. LM Studio wasn't in the default offered list at all. Both are now recommended and offered by default so users see local inference options prominently during onboarding. Entire-Checkpoint: 73c6362fea07
…ters The backend (auth_routes.rs) already enforces a 12-character minimum but all UI strings, placeholders, client-side validation, Swift bridge, macOS app, and docs still referenced 8 characters. Entire-Checkpoint: 3ff3aaf1c445
Entire-Checkpoint: fa2921ba12b0
Entire-Checkpoint: 7c8ff12599b3
Entire-Checkpoint: 5f8d1d7d2a14
slack-morphism <2.20 mapped its rustls-native-certs feature to tokio-tungstenite/rustls-native-certs, which only activates the certificate resolver — not the actual TLS stack. This caused Url(TlsFeatureNotEnabled) when connecting to wss://wss-primary.slack.com. Version 2.20 adds tokio-tungstenite/rustls-tls-native-roots, enabling the full rustls + tokio-rustls TLS implementation. Closes moltis-org#543 Entire-Checkpoint: 1648c51b4b8b
# Conflicts: # Cargo.toml
- Run `cargo update reqwest@0.12.28` so all workspace crates resolve to reqwest 0.13.2 instead of 0.12.28
- Add `query` feature which became opt-in in reqwest 0.13
- Third-party crates (async-openai, chromiumoxide, matrix-sdk, oauth2) still pull reqwest 0.12; teloxide-core still pulls 0.11. These cannot be resolved at the workspace level
Entire-Checkpoint: e7e417b27a80
GraphQL sessionKey was optional, causing session routing mismatches for custom clients that omit it — the chat service would silently fall back to the "main" session instead of erroring. Make sessionKey a required String on all chat mutations (send, abort, cancelQueued, clear, compact), queries (history, context, rawPrompt, fullContext), and the chatEvent subscription. Regenerate iOS GraphQL schema to match. Closes moltis-org#542 Entire-Checkpoint: d217a837cb32
- Drop chat events without sessionKey in payload instead of forwarding them to all subscribers (explicit match guard)
- Remove flaky 100ms timeout from subscription validation test
- Add required-sessionKey tests for all remaining operations: abort, cancelQueued, clear, compact, context, rawPrompt, fullContext
- Add test case for events without sessionKey being dropped
Entire-Checkpoint: 261bd146b2d8
WhatsApp updated their protobuf message schema. waproto 0.2 no longer covers the current wire format, so after Signal decryption the protobuf deserialization yields a Message with all fields None. The handler falls through to ChannelMessageKind::Other and replies with an error instead of routing to the LLM.
Bump all 6 crates (whatsapp-rust, wacore, wacore-binary, waproto, whatsapp-rust-tokio-transport, whatsapp-rust-ureq-http-client) from 0.2 to 0.5 and adapt to the new API:
- Add .with_runtime(TokioRuntime) to bot builder (new 4th type param)
- Implement new SignalStore::get_max_prekey_id() for both stores
- Implement new AppSyncStore::get_latest_sync_key_id() for both stores
- Implement 8 new ProtocolStore methods (TC tokens + sent message cache)
- Update SKDM methods: Vec<String> → Vec<Jid> signatures
- Add new sled trees (tc_tokens, sent_messages) and DashMap fields
- Disable default simd feature on wacore/wacore-binary (requires unstable core::simd::Select not available on current nightly)
Existing paired sessions may need re-pairing if wacore struct layouts changed between 0.2 and 0.5.
Closes moltis-org#534
Entire-Checkpoint: 65e290092e0b
Entire-Checkpoint: 2860ff44e09b
WhatsApp is a fully implemented channel but was excluded from the default offered list, requiring manual opt-in via moltis.toml. Now that the whatsapp-rust ecosystem is upgraded to 0.5 and working properly, include it by default so users see it in onboarding and the channels settings page without extra configuration.
Updated default lists in:
- Config schema (default_channels_offered)
- Config template (comment)
- Gateway state (hardcoded fallback)
- JS fallbacks (onboarding-view.js, page-channels.js)
Entire-Checkpoint: a27a36510a1a
…dering
Address PR review comments:
- Replace manual SystemTime::UNIX_EPOCH arithmetic with time::OffsetDateTime::now_utc().unix_timestamp() per CLAUDE.md rules
- Unify sent message timestamp type to i64 across both stores (sled_store was u64, memory_store was i64)
- Track latest sync key ID explicitly in MemoryStore instead of relying on non-deterministic DashMap iteration order
Entire-Checkpoint: ab25d949d314
The WhatsApp QR code was delivered exclusively via WebSocket events, which could be missed due to timing (event fires before subscription is fully active) or dropped if the client is slow. Users saw "Waiting for QR code..." indefinitely.
Two changes:
1. Expose QR code data in ChannelHealthSnapshot.extra so it's available via the channels.status RPC
2. Add polling fallback (every 2s) in both onboarding-view.js and page-channels.js that fetches QR data from channels.status when the WebSocket event hasn't arrived
Entire-Checkpoint: 1cf2bc56081a
Entire-Checkpoint: 5eb5e38f13db
…org#862) Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
chat-autoscroll: add missing setAutoScrollMode to the state.js E2E shim so the "always" mode test can set the scroll mode. projects: add a project via the UI before testing the edit form since the E2E env starts with no projects. Wait for WS connection before interacting.
- Add codeIndexEnabled field to Project type in iOS GraphQL schema - Remove unused network.client entitlement from macOS app
…oltis-org#845) Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Port push-to-talk, VAD continuous listening, and voice settings from the original JS implementation to the current TypeScript/Preact codebase. The original PR (moltis-org#303) targeted files that no longer exist after the JS-to-TS migration.
PTT: configurable hotkey (default F13), function keys work in inputs, BroadcastChannel tab coordination prevents dual-tab recording.
VAD: energy-based RMS detection with exponential sensitivity curve, 2.5s silence auto-send, 30s max recording safety valve, TTS mute/unmute with echo settle delay, AudioContext health monitoring, MediaStream track auto-reacquisition, vadTranscribing race-condition guard, EBML header validation, 15s fetch timeout.
Settings: PTT key picker (click-to-rebind), VAD sensitivity slider.
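For reviewers who want a concrete picture of the VAD decision logic described above, here is a minimal sketch, not the code in this PR. The 2.5s silence window and 30s cap come from the commit message; the threshold range (0.1 down to 0.001), the `thresholdFor` mapping, and the `VadTracker` class are hypothetical names and values chosen for illustration.

```typescript
// Values taken from the commit message.
const SILENCE_MS = 2500;        // auto-send after this much silence
const MAX_RECORDING_MS = 30000; // safety valve on recording length

/** Root-mean-square energy of one audio frame (samples in [-1, 1]). */
function rms(frame: Float32Array): number {
  if (frame.length === 0) return 0;
  let sum = 0;
  for (let i = 0; i < frame.length; i++) sum += frame[i] * frame[i];
  return Math.sqrt(sum / frame.length);
}

/**
 * Map a slider value in [0, 1] to an RMS threshold on an exponential curve.
 * ASSUMPTION: the endpoints 0.1 (least sensitive) and 0.001 (most sensitive)
 * are illustrative, not the values used in the PR.
 */
function thresholdFor(sensitivity: number): number {
  const s = Math.min(1, Math.max(0, sensitivity));
  const least = 0.1;
  const most = 0.001;
  return least * Math.pow(most / least, s);
}

type VadAction = "idle" | "recording" | "send";

/** Hypothetical tracker: feed it one frame per tick with a timestamp. */
class VadTracker {
  private threshold: number;
  private speechStartedAt: number | null = null;
  private lastSpeechAt = 0;

  constructor(threshold: number) {
    this.threshold = threshold;
  }

  update(frame: Float32Array, nowMs: number): VadAction {
    const speaking = rms(frame) >= this.threshold;
    if (this.speechStartedAt === null) {
      if (!speaking) return "idle";
      // First frame above the threshold starts a recording.
      this.speechStartedAt = nowMs;
      this.lastSpeechAt = nowMs;
      return "recording";
    }
    if (speaking) this.lastSpeechAt = nowMs;
    const silentTooLong = nowMs - this.lastSpeechAt >= SILENCE_MS;
    const recordingTooLong = nowMs - this.speechStartedAt >= MAX_RECORDING_MS;
    if (silentTooLong || recordingTooLong) {
      this.speechStartedAt = null; // reset so the next utterance starts fresh
      return "send";
    }
    return "recording";
  }
}
```

In a browser the frames would typically come from an AnalyserNode read once per animation frame; the tracker itself has no Web Audio dependency, which keeps the timing rules easy to unit test.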
penso added a commit that referenced this pull request Apr 24, 2026
ci: add Rust CI workflow for fmt, clippy, and build
Important: Review skipped
Too many files! This PR contains 283 files, which is 133 over the limit of 150.
Run configuration: Configuration used: Organization UI. Review profile: ASSERTIVE. Plan: Pro.
Files ignored due to path filters (17)
Files selected for processing (283)
- Extract PttKeyPicker component with useEffect cleanup to prevent stale keydown listener leaking after settings panel unmount (P1)
- Move clearTimeout into finally block so fetch timeout is always cleared on network errors (P2)
- Add .catch() to AudioContext.resume() to prevent unhandled promise rejection when context is closed concurrently (P2)
- Remove continuous console.debug VAD monitor logging from production code (P2)
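The clearTimeout-in-finally fix can be sketched roughly as follows. This is an illustration under assumptions, not the PR's code: `withTimeout`, the injectable `Timers` interface, `uploadAudio`, and its URL parameter are all hypothetical; only the 15s timeout and the idea of clearing the timer in `finally` come from the commit messages.

```typescript
type TimerHandle = ReturnType<typeof setTimeout>;

// Injectable timer API so the cleanup behaviour can be tested (hypothetical).
interface Timers {
  set(fn: () => void, ms: number): TimerHandle;
  clear(handle: TimerHandle): void;
}

const realTimers: Timers = {
  set: (fn, ms) => setTimeout(fn, ms),
  clear: (handle) => clearTimeout(handle),
};

/**
 * Run `task` with an abort signal that fires after `timeoutMs`.
 * The timer is cleared in `finally`, so it is released on success,
 * on a network error, and on abort alike.
 */
async function withTimeout<T>(
  task: (signal: AbortSignal) => Promise<T>,
  timeoutMs: number,
  timers: Timers = realTimers,
): Promise<T> {
  const controller = new AbortController();
  const handle = timers.set(() => controller.abort(), timeoutMs);
  try {
    return await task(controller.signal);
  } finally {
    timers.clear(handle);
  }
}

// Hypothetical caller: the endpoint and payload shape are placeholders.
function uploadAudio(url: string, audio: Blob): Promise<Response> {
  return withTimeout(
    (signal) => fetch(url, { method: "POST", body: audio, signal }),
    15000, // 15s fetch timeout, as stated in the commit message
  );
}
```

Before the fix, a rejected fetch would skip a clearTimeout placed after the await; placing it in `finally` removes that path.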
- Stop old vadStream tracks before replacing in vadReacquireStream to prevent microphone device leak on stream reacquisition (P1)
- Stop active VAD session when STT becomes unconfigured in checkSttStatus (P2)
- Replace hardcoded error strings with i18n keys in transcribeAudio: voiceNoSpeech, voiceTranscriptionError, voiceTranscriptionFailed, voiceUploadFailed, with en/fr/zh translations (P2)
- Add recordingCancelled flag to prevent cancelled audio from being transcribed: the browser flushes a final ondataavailable chunk after mediaRecorder.stop(), repopulating audioChunks after cancel (P1)
- Guard vadReacquireStream with vadReacquiring flag to prevent concurrent getUserMedia calls at 60fps when a mic track dies (P1)
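To make the recordingCancelled fix easier to follow, here is a small sketch of the pattern with the recorder abstracted away. `ChunkCollector` and its method names are hypothetical; the point it illustrates, that a data callback can still arrive after cancel and must be ignored, is the one described in the commit message.

```typescript
/** Hypothetical stand-in for the audioChunks + recordingCancelled state. */
class ChunkCollector {
  private chunks: Uint8Array[] = [];
  private cancelled = false;

  /** Begin a new recording: reset both the buffer and the flag. */
  start(): void {
    this.chunks = [];
    this.cancelled = false;
  }

  /** Wire this to the recorder's data callback. May fire after cancel(). */
  onData(chunk: Uint8Array): void {
    if (this.cancelled) return; // drop the final flush that follows stop()
    this.chunks.push(chunk);
  }

  /** User cancelled: discard audio and remember that we did. */
  cancel(): void {
    this.cancelled = true;
    this.chunks = [];
  }

  /** Wire this to the recorder's stop callback. Null means "do not transcribe". */
  finish(): Uint8Array[] | null {
    if (this.cancelled || this.chunks.length === 0) return null;
    const recorded = this.chunks;
    this.chunks = [];
    return recorded;
  }
}
```

Clearing the buffer alone is not enough, because the late chunk refills it; the flag is what prevents the cancelled audio from reaching transcription.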
- Fix PTT race condition: quick key tap could orphan a recording when getUserMedia resolves after keyup. Now sets recordingCancelled flag in onPttKeyUp when isStarting, and startRecording checks the flag after getUserMedia resolves to bail out cleanly (P1)
- Fix MediaStreamSourceNode leak in vadReacquireStream: disconnect old source node before connecting a new one on stream reacquisition. Track source node in vadSourceNode variable, clean up in stopVad (P2)
- Guard vadStartContinuousRecorder against being called while a recorder is already in "recording" state, preventing silent audio loss from overwriting an active recorder (P1)
- Fix onTtsPlay to null vadMediaRecorder before calling stop() via a local reference, closing the window where vadMonitorLoop could see null and start a concurrent recorder before onstop fires (P1)
Add vadStarting flag so a second click while getUserMedia is in-flight is ignored, preventing two parallel VAD sessions from being created with leaked streams and racing shared state.
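The in-flight guard described here (and the similar vadReacquiring guard) follows a common pattern, sketched below. `guardInFlight` is a hypothetical helper written for illustration; the PR itself uses plain module-level flags such as vadStarting.

```typescript
/**
 * Wrap an async start routine so that calls made while an earlier call is
 * still pending are ignored. Resolves to true if this call actually ran
 * `start`, false if it was skipped.
 */
function guardInFlight(start: () => Promise<void>): () => Promise<boolean> {
  let starting = false; // plays the role of vadStarting
  return async () => {
    if (starting) return false; // second click while the first is pending
    starting = true;
    try {
      await start();
      return true;
    } finally {
      starting = false; // always release, even if start() rejects
    }
  };
}
```

The flag is set synchronously before the first await, so two clicks in the same tick cannot both pass the check.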
Escape handler leak
- Add isRecording/isStarting guard to startVad so clicking the VAD button during an active toggle recording is ignored, preventing audioChunks corruption from concurrent recorders (P1)
- Extract anonymous Escape keydown and mic keydown handlers into named functions (onEscapeKeydown, onMicKeydown) so they can be properly removed in teardownVoiceInput, preventing listener accumulation across SPA mount cycles (P1)
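A minimal sketch of why named handlers matter for teardown, assuming a generic EventTarget rather than the real DOM wiring. `onEscapeKeydown` and `teardownVoiceInput` mirror names from the commit message; `initVoiceInput` and its parameters are hypothetical.

```typescript
/**
 * Register an Escape handler and return the matching teardown function.
 * removeEventListener only works when it receives the same function
 * reference that was passed to addEventListener, which is why the handler
 * is a named function rather than an inline arrow at the call site.
 */
function initVoiceInput(target: EventTarget, onCancel: () => void): () => void {
  function onEscapeKeydown(event: Event): void {
    // Read `key` defensively so this also works with plain Event objects.
    if ((event as Event & { key?: string }).key === "Escape") onCancel();
  }
  target.addEventListener("keydown", onEscapeKeydown);
  return function teardownVoiceInput(): void {
    target.removeEventListener("keydown", onEscapeKeydown);
  };
}
```

With an anonymous listener, each SPA mount would add another handler that can never be removed, so a single Escape press would fire the cancel logic once per past mount.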
Summary
Adds two new voice input modes alongside the existing toggle:
Changed files
- voice-input.js: Complete rewrite with PTT, VAD, tab coordination, health monitoring
- page-chat.js: VAD waveform button next to mic button
- page-settings.js: PTT key picker + VAD sensitivity slider
- components.css: VAD button CSS states (listening glow, speech pulse)
- input.css: Waveform icon SVG
- locales/en,fr,zh/chat.js: i18n keys
Test plan