Skip to content

feat: Choice UI#271

Draft
latekvo wants to merge 63 commits into
mainfrom
feat/propose-variants
Draft

feat: Choice UI#271
latekvo wants to merge 63 commits into
mainfrom
feat/propose-variants

Conversation

@latekvo

@latekvo latekvo commented May 28, 2026

Copy link
Copy Markdown
Member

Adds a UI for proposing and choosing between multiple agent-generated variants on a running app.

MCP tools (packages/tool-server/src/tools/variants/)

  • propose_variant — non-blocking; agent stages a variant for an on-screen element (text/label/identifier/role matcher).
  • await_user_selection — one blocking call; parks until the human submits, then returns the picks + free-form annotations + a global comment.

Preview UI (packages/ui)

  • Floating cards anchored to live element frames via the existing describe tree, with connector lines.
  • Comment-mode inspector: hover spotlight (blur+dim everything else), right-click any element to pin a comment.
  • Sticky padded hitbox + trajectory-aware dismissal so focus doesn't drop on cursor jitter.
  • Footbar [Complete] [Clear], confirm modal for destructive actions, green toast on submit.
  • Single source of feel constants in theme.css.

Electron host (packages/preview-window)

  • New workspace package. Frameless transparent BrowserWindow at fixed size; the squeeze open/close is a CSS transform: scaleY on <html>, GPU-composited — no OS-level resize.
  • Tool-server spawns it on await_user_selection park, reuses it across re-awaits, and asks it to animate-close on submit (1s grace so the toast is visible).
  • Stdio-line IPC (no extra ports). Ephemeral throughout — the renderer hits the tool-server's already-ephemeral /preview/* HTTP.

latekvo and others added 30 commits June 3, 2026 17:09
…alive for long-running tools, UI partition + tests
…light add-comment inspector; UI cleanup (no debug/hint, auto-select single sim)
…isted roots; stable cards (no focus-stealing rebuild); inspector gates device input; reset/leak hardening + tests
…way scroll

- Thumbnails crop to the matched element's frame (letterboxed, no
  distortion) instead of squeezing the whole device screenshot into a
  chip; falls back to a contained full image when the element is
  off-screen or the preview is a data: URI.
- .spot-hit opts back into pointer-events (it inherited none from the
  spotlight root), so hovering now sets vp.focus and the inspector
  actually un-blurs the focused element instead of dimming everything.
- Wheel input is trailing-debounced (mirrors radon-ide): one trackpad
  scrub -> one wheel command, instead of forwarding the whole macOS
  momentum flood (each a ~400ms edge swipe) and scrolling forever.
- paintSpot guards non-finite frames for parity with applyThumbCrop.
…tracking, typable comments

- Thumbnail crops are frozen at the element's first-matched frame
  (vp.thumbFrames, per id), so a static preview screenshot no longer
  shows the wrong region when the live screen scrolls.
- Bubbles (card + connector + pin) fade out when their element leaves
  the screen and pop back when it returns, with edge-graze hysteresis
  (GONE_GRACE_MS) so they don't strobe/teleport at the screen edge.
- Connectors/pins are redrawn every animation frame and eased toward
  the element's latest frame; describe polling is a single fast
  self-scheduling loop (separate from the variants poll, 4s-bounded),
  so the line follows the stream instead of jumping once per poll.
- Comment textareas are typable again: shouldForwardKey now excludes
  TEXTAREA/contentEditable (the key-forwarder was preventDefault-ing
  every keystroke); Escape no longer discards an in-progress comment.
- Bubbles spring with a small spawn velocity into element-anchored,
  de-overlapped homes (no more rigid column); blur + bubbles use
  bouncy spring transitions. Removed dead .offscreen connector CSS.
… connectors + propose-variants skill

- Bubbles: replace the fast/jiggly per-frame spring with a fixed-timestep
  time-based damped spring (w=4.2, z=0.9) so cards drift in like a cloud;
  gentler low-energy spawn velocities.
- Spotlight: replace the 4 overlapping blur+dim strips with ONE
  backdrop-filter layer masked by a clip-path even-odd donut (constant
  segment structure -> smooth transition), eliminating the compounding
  dark/bright overlap flicker.
- Connector: drive rec.anim with a fast zero-overshoot spring so the line
  glides continuously between describe samples; add an interruptible
  describe loop kicked by device interaction (sendCmd) on top of the
  fixed background cadence.
- Harden: clear currentUdid on stream teardown; only kick describe when
  something is anchored to the stream.
- Add argent-propose-variants SKILL.md (plan -> build -> navigate ->
  screenshot -> propose_variant -> repeat -> one await_user_selection)
  and its skill-routing entry.
…ect-fit previews, full-screen no-blur

- Variant card shows only name + summary + preview image; drop the raw
  code/filePath dump (and the now-dead .vcode style); cardSig no longer
  rebuilds the card on code/filePath-only changes.
- layoutCards distributes cards in two columns flanking the central
  stream (element-aligned when matched, even-spread + main-centre
  fallback otherwise); removes the !streamRect early-return that left
  every card piled at the (0,0) origin.
- Variant row stacks image-above-text (column) unless the cropped
  preview is >=1.2x taller than wide; the preview now fills the bubble
  (full content width in column, generous height in row) instead of a
  fixed 96px chip.
- .vcard-body reserves a stable scrollbar gutter so the row width is
  invariant — JS-sized previews can't overhang or oscillate.
- Comment mode with nothing hovered now leaves the whole screen
  un-blurred (the no-blur hole spans the stream) instead of collapsing
  to a centre point and blurring everything.
…ills, animated annotation popover

- Ring now shares .spot's exact clip-path transition (0.28s, same easing) so the blue frame and the no-blur hole move in lockstep (was 0.26s easeOutBack overshoot).
- pointerleave on the stream rect resets the spotlight to whole-screen no-blur instead of freezing on the last element (locked comment target excepted).
- pointermove only re-targets on an actual element hit; thin gaps between elements hold the last highlight instead of snapping full-screen and back.
- Off-screen variant cards now leave a minimal non-interactive top/bottom pill ('N choices off-screen') with scroll direction.
- Saved-comment pins: native confirm() replaced by a fade/slide-in popover above the dot (comment text + Edit + Remove); pins are spring-eased like connectors so they track every frame instead of snapping between describe polls.
- Comment + annotation-edit textareas: Cmd/Ctrl+Enter submits (IME-safe); plain Enter still newlines.
- Structural-pass reconcile + deterministic vpResetOverlay teardown so the popover can never wedge open across a round change or stream re-attach.
- Drop the grey variant summary from cards (removed .vrow .vs, the vs div, and v.summary from cardSig). Preview thumbnail + variant name remain; summary still stored server-side and delivered to the agent via await_user_selection.
- Bubbles are now throwable: the header-drag handler tracks an EMA of pointer velocity and, on release, hands it (clamped ±1500 px/s) to the existing BUB spring so the card flies briefly then damps to rest — same toss feel as a bubble spawning in.
…l untouched)

The preview UI's element tree (variant-card anchoring + comment spotlight) was the interactive-only iOS ax-service subset, so non-accessibility containers couldn't be anchored/highlighted and crops misaligned.

Add buildRnPreviewTree: walks the React fiber tree via the same makeComponentTreeScript inner logic as debugger-component-tree but WITHOUT the agent-readability pruning, adapting it into a nested DescribeNode tree (normalized frames; name/accLabel/testID/text -> role/label/identifier/value; nesting preserved through dropped wrappers via parentIdx). GET /preview/describe prefers it and falls back to the regular describe tool when no usable RN frames are available — never a regression. The describe tool itself is unchanged. 6 unit tests; full suite 674 green.
…ic measure)

The first cut routed through makeComponentTreeScript, whose own measurement path yields null rects on some apps (every node collapsed to screen-centre) even though the Fabric measure primitive works fine — so the preview tree came out empty and fell back to ax.

Replace it with a self-contained injected script (plain Runtime.evaluate) that walks the fiber tree and measures every host via nativeFabricUIManager.measure (page coords) + reads RN Text via accessibilityLabel and TextInput via value/placeholder. Proven live on the real app: 919 nodes, correct screen dims, real on-screen frames incl. non-accessibility container views (the exact thing the iOS ax tree lacks). describe tool still untouched; null-then-fallback on any failure (no regression, no 500); defense-in-depth against malformed payloads. 7 unit tests; suite 675 green; verification agent PASS.
…ibe tree

The RN fiber-tree path let the comment spotlight anchor to off-screen /
scrolled-away elements and only reliably resolved text nodes. Revert to the
describe-based approach.

main #197 made the `describe` tool's public output a token-efficient text
rendering (the JSON tree is dropped before reply). The preview UI reads
`j.tree`, so /preview/describe now calls the same per-platform adapter the
describe tool uses (describeIos / describeAndroid), minus the text formatter,
and returns the structured DescribeTreeData ({ tree, source }). The describe
tool itself is left untouched.

- delete tool-server/src/tools/describe/preview-rn-tree.ts + its test
- /preview/describe: resolveDevice -> describeIos|describeAndroid, json(tree)
 text

Adds the route's first coverage (none existed; the deleted fiber-tree test
only covered the removed builder). Mocks describeIos/describeAndroid and
asserts: iOS udid -> describeIos and the body is the structured
DescribeTreeData ({ tree, source }) the UI reads as `j.tree` and explicitly
NOT main #197's text `{ description }`; Android serial -> describeAndroid
(dispatch parity with the describe tool); should_restart forwarded; adapter
throw -> 500 { error }. This is the tripwire for re-introducing the exact
regression this revert fixes.
…ng…"

When more than one simulator is booted, refreshSimulators() only populated
the small toolbar dropdown; the main viewport stayed frozen on the initial
"Connecting…" placeholder, so it looked hung with no obvious way to choose.

Render a device chooser into the viewport whenever there are multiple (or
zero) choices and nothing is connected yet: booted devices as clickable
rows (name + runtime), non-booted ones listed disabled, plus a Refresh
button and a clear empty state. The toolbar dropdown stays as the in-stream
switcher; the picker is skipped when a stream is already live (so ⟳ while
connected never clobbers it). Initial placeholder reworded to the accurate
"Loading simulators…".
Every colour, pixel, radius, duration, easing, z-index and opacity now lives
in packages/ui/theme.css as a semantic, purpose-named token (named for what
it IS, not an abstract scale; shared only when the same design decision).
index.html's <style> block contains zero hardcoded visual values — only
var(--token) — and links the theme. Values are byte-identical to before:
this is a pure refactor, no visual change.

The motion/feel constants the physics & timing JS needs (spring ω/ζ, card
gone-grace, poll cadence, tree-staleness, flick cap, card width) are read
from theme.css once at boot via a small THEME bridge, with fallbacks equal
to the prior literals — so the script has no hardcoded look/feel either, and
a missing/unloaded theme.css cannot change behaviour. Protocol/behavioural
constants (HID keycodes, EMA smoothing, network timeouts, layout clamps) are
left in JS — they are not styles.

- packages/ui/theme.css: new single source of truth (:root tokens)
- index.html: <link> + full var(--token) rewrite + THEME bridge
- preview.ts: findUiHtml → findUiFile(name); add GET /theme.css (text/css)
- bundle-tools.cjs: ship index.html AND theme.css (prod parity)
…off, hover comments

- describe hit-test (vpNodeAtPoint): drop the text gate so icon-only
  pressables are selectable; innermost wins via smallest-area; on-screen
  guard mirrors vpMatchNode. Skip the describe ROOT node (the full-stream
  "AXGroup"): it contains every point, so otherwise it was the smallest
  match over gaps — whole screen as focus, no ease-off, swallowed nested
  pressables. Removes dead nodeText().
- theme.css: brand sky-blue accent #9bc7f0 (+ derived touch rgba/glow)
  and softer brand geometry (radius-card 14, button 8, input 7, item
  10, control 6, chip 4). Dark base kept — already on-brand.
- spotlight: losing a hovered element while still on the stream holds
  the focus box on the last element for a short grace
  (--spot-idle-dismiss-ms, 200ms) then releases; leaving the stream rect
  resets immediately. Shared armSpotIdle / clearSpotIdle.
- pending choices: split the off-screen reminder into two indicators —
  ▲/▼ pill on the stream edge stays a TRUE scroll-direction cue (only
  for seen-then-scrolled elements); a new non-directional "◌ N hidden"
  badge in the top toolbar surfaces never-matched proposals, so a
  pending choice whose element isn't on the device screen no longer
  misleadingly suggests "scroll down to find it".
- comment pins: open the detail popover on hover (no click) with a
  pin<->popover grace timer (--annpop-hover-grace-ms); editing/focus
  suppresses auto-close; Esc / outside-click still dismiss.
Before a stream connects, vp.tree is null and every proposal trivially
fails to match — surfacing 'N hidden' in that state suggested the user
had work they couldn't see when in fact we just hadn't started looking
yet. Gate the badge on streamRect the same way the directional ▲/▼
pills already are.
iOS AX labels for tab-bar buttons and many toolbar items embed each SF
Symbol glyph as a BMP private-use code point. The Map tab arrives as
'<U+F442>, <U+F443>, Map' but the proposer (LLM, doc snippet, human)
sees and writes 'Map', so strict equality on by:'label' silently
failed and the bubble stayed hidden.

vpNormLabel strips BMP private-use (U+E000-U+F8FF) and collapses the
leftover comma/whitespace runs so both forms normalise to 'map'. Apply
the same normalisation on both sides of the comparison, allow substring
on by:'label' so a clean needle still hits a wrapped element, and
guard the empty-needle case (substring of '' matches everything).
Two related bugs the user hit on a screen with two bottom-bar elements
matched (Map tab + Compare tab):

  1. Cards always rendered at the fixed --vcard-body-max-h: 320px
     regardless of variant count — a single-variant proposal got the
     same chunky body as a five-variant one.
  2. Stacks anchored to elements near the viewport edge ran off the
     bottom; the second card was invisible until the user scrolled the
     page (which the layout shouldn't require).

Drop the fixed body cap (theme.css → 'none') and do the sizing in
layoutCards each frame:

  • Phase 1 — clear inline maxHeight so each card reads its true
    natural height (head + body content + comment row).
  • Phase 2 — per column: anchor each card to its element and
    de-overlap. If the resulting stack overflows the bottom, shove the
    whole column up. If even the shoved stack doesn't fit, switch to
    compress mode: distribute the available height proportionally to
    each card's natural height, floored at MIN_CH for tappability.
  • Phase 3 — only the compressed cards get a body maxHeight (subtract
    the full chrome — head, comment row, border — not just the head),
    so the internal scrollbar appears strictly when there's no other
    room.

Result: cards always fit their variant count when there's space, and
the long-stack case scrolls inside each card instead of falling off
the viewport edge.
Three bugs the user hit:

  1. Dragging the inline comment textarea's CSS resize handle on one
     card mutated the column's natural-height sum, so my new auto-fit
     packer compressed the SIBLING cards to make room. The handle was
     never meant to be user-facing; drop it (resize: none).
  2. Each layoutCards call cleared body.style.maxHeight before
     measuring then re-set it. That round-trip mangles scrollTop and
     re-triggers the scrollbar appearance every poll, manifesting as
     'scrolling is glitchy' as the user scrubbed.
  3. spotlight idle-dismiss was a touch eager — bumped from 200 ms to
     260 ms (+30%) per direct ask.

Cleanups:
  • body.scrollHeight always reflects the natural content height
    regardless of an active max-height — read it directly instead of
    the clear-and-remeasure dance.
  • chrome (head + comment row + border) is invariant once a card is
    built, so cache it on rec.chromeH; refreshed only on a rebuild
    (cardSig change → fresh rec → null cache).
  • The final maxHeight write is diffed against the inline value so
    unchanged frames are no-ops — preserves scrollTop mid-scroll.
…side

A screen with every matched element on the right half of the device
(e.g. both Map and Compare tabs near the right of the bottom bar) used
to stack every card in the right column and leave the left empty.

Keep the element-anchored initial side, then do a cheap rebalance:
while |L − R| > 1, move the card on the larger side whose own
sideBias points least into that side — so a card pinned dead-centre
flips before one near the edge. Result: each column stays within one
card of the other, the per-element anchoring is preserved where it
mattered, and connector lines still reach the right place on the
device.
The header drag handler placed the card so the pointer sat at
(width/2, 12px) on the first move event, snapping the card under the
cursor before the user even moved. Capture the pointer offset at
pointerdown and apply that translation throughout, so the card stays put
and follows the pointer's delta.
Two unrelated issues uncovered during the variant-overflow test.

* `.vcard-body { scrollbar-gutter: stable }` was reserving ~15px on the
  right of every body so the row-hover background didn't extend to the
  card edge. Drop the gutter; include the row's current clientWidth in
  applyThumbCrop's cache key so a scrollbar appearing/disappearing
  re-fits the preview instead of overhanging.
* Dragging a card past the viewport edge left it stranded off-screen.
  On drop, clamp `home` to the closest in-bounds point and add an
  inward velocity kick — the spring then carries the card back with
  momentum rather than snapping. `layoutCards` also clamps a pinned
  card's pos when adopting it as home, so a layout pass mid-flight
  can't re-fix the home to an out-of-bounds value.
Three usability nits surfaced during testing:

* SF-Symbol private-use glyphs in iOS AX labels (tab bar items etc.)
  leaked stray commas into headers like `Comment on , , Favourites (6)`.
  Add a display-side `cleanLabel` that strips U+E000-U+F8FF and
  collapses comma/space runs while preserving case; use it in
  `nodeDesc` and the per-card comment placeholder.
* The floating Complete-selection footbar covered the bottom of any
  card whose comment box landed low, and could overlap the comment
  popover when a near-bottom element was selected. Clamp the column
  bottom in `layoutCards` to the footbar's top, and position the
  popover above the footbar based on its actual offsetHeight rather
  than a 180-px guess.
* Staging a variant left every other variant taking up the screen.
  Collapse non-staged rows to a single-line name when something is
  staged (still clickable to switch), and make `vpStage` toggle so
  re-clicking the staged row unstages and brings everyone back.
Mirrors the spotlight click. Trackpad two-finger taps are a natural pick
gesture and the previous handler only suppressed the native menu without
acting on it. Falls back to vpNodeAtPoint if no element is currently
focused so a right-click without a prior pointer move still works.
The index references theme.css with a relative URL. When the page is
served at /preview (no trailing slash), browsers resolve theme.css
against /, hitting /theme.css (404) instead of /preview/theme.css.
Canonicalise to the trailing-slash form so the relative sub-resource
loads regardless of which URL form is typed.
Two issues conspired so right-click looked indistinguishable from a
normal device tap:

- The stream's pointerdown handler didn't filter by mouse button, so
  button:2 sent a Down to the device just like button:0.
- The contextmenu handler short-circuited unless inspect mode was
  already active, leaving nothing in place of the native menu.

Filter pointerdown to button:0 so right- and middle-click no longer
drive the device. Make contextmenu always activate inspect mode and
pick the node under the cursor; when the describe tree isn't loaded
yet, queue the click and let the next describe tick complete the pick.
Replace the fixed 260ms lost-element grace with a velocity-aware rule:
hold the spotlight as long as the pointer is heading toward another
highlightable (regardless of travel time), and only arm a 300ms release
when the cursor is stopped or drifting at nothing reachable. A watchdog
re-evaluates after the velocity sample window so a flick that ends
mid-flight still decays into the stopped branch.

The four feel constants (idle, window, stopped-speed, reach) live in
theme.css next to the existing spotlight tokens.
The earlier loop walked newest → oldest and assigned `early` BEFORE the
break check, so it picked the first sample one step OUTSIDE the window
rather than the last one inside. With 50 ms-spaced samples and an
80 ms window, the velocity was averaged over 100 ms instead of 50 ms,
slightly over-smoothing direction changes. Walk oldest → newest and
take the first sample whose age is within the window.

Also refresh a stale comment in `vpNodeAtPoint` that still mentioned
the old "1s ease-off" release model.
latekvo and others added 19 commits June 3, 2026 17:09
Switch the accent-coloured hairline on .comment-pop and .ann-pop to
--color-border so the two comment-related modals sit on the surface
without the sky-blue ring around them. The accent-coloured span inside
the "Comment on <X>" heading is left alone — it's not part of the
outline.
The trajectory predictor was unforgiving on near-misses: a cursor heading
roughly toward a target but not perfectly intersecting its frame would
flip to "aimed at nothing" and arm the dismissal timer. Inflate the AABB
by `--spot-traj-hitbox-margin` (0.2 = 20% per side) inside the trajectory
ray test so anything within a fingertip of a real frame still counts as
aimed-at.

Only `rayHitsFrame` (called from `trajAimsAtTarget`) honours the margin;
`vpNodeAtPoint` keeps its strict containment so the inflated hitboxes
can't steal hover focus from the actually-hovered element. The margin
is purely about delaying dismissal while the cursor travels through
gaps — it never decides which element wins.
- Connected ws now clears the toolbar status (was 'live' + pulsing dot);
  error states still render their red dot and label.
- Footbar moves from centred-bottom to the bottom-right corner, stacks
  the 4-row comment textarea above the Complete-selection button, and
  drops the fixed 32px input height so rows determines it.
- Adds a Clear button next to Complete in the footbar (now in a flex
  row); the existing primary action is renamed from 'Complete selection'
  to 'Complete'.
- The new Clear and the existing annotation Remove buttons share a
  ghost-danger look — page-bg fill with a hairline outline, turning red
  border + red text on hover, never solid red.
- Both Remove and Clear now route through a shared confirm modal
  (#confirm-modal) before mutating state. Modal supports backdrop click
  and Escape to cancel; Confirm focuses by default for Enter-to-confirm.
- vpClear resets staged picks, per-element comments, free annotations,
  and the global comment textarea.
Adds packages/preview-window, an Electron host for the variant-selection
preview UI. The tool-server spawns it on demand when an
`await_user_selection` parks, reuses the same child across multiple
awaits in one round (sends a foreground IPC over stdin), and asks it to
play an animated close + exit when a selection is submitted.

- packages/preview-window: frameless BrowserWindow, height squeezes
  from 1px to the target on open and back on close via setBounds()
  with a cubic-out ease, centred so the iris stays anchored. URL is
  passed via ARGENT_PREVIEW_URL env so no extra ports are introduced.
- VariantProposalStore: two new events, awaitParked + selectionSubmitted,
  fired only on real park / successful submit (fast-path returns are
  silent). Covered by new tests.
- tool-server bootstrap: wires the manager to those events. URL derived
  from the actual listening port via server.address() — production
  launcher already uses findFreePort, so the whole chain stays ephemeral.
- IPC channel: line-delimited JSON over the child's stdin
  ({cmd:'foreground'|'close',url?}). The Electron child also exits when
  stdin closes so a tool-server crash doesn't strand a window.
- main.ts createWindow: await loadURL in a try/catch and `app.quit()` on
  rejection; otherwise an unreachable URL (tool-server gone, stale port)
  would strand an invisible BrowserWindow keeping the Electron event
  loop alive.
- main.ts foreground: attach a .catch to the reload loadURL so its
  rejection is logged instead of silently swallowed.
- preview-window.ts manager: clear `child` in the `error` handler too
  (not just `exit`). spawn ENOENT / EACCES arrives asynchronously after
  the handle has been assigned, so a follow-up ensureOpen would otherwise
  see `exitCode===null && !killed` and no-op against a dead handle.
Three small feel improvements to the variant-selection preview UI:

- Sticky padded focus. Hovering an element lets the cursor drift up to
  --spot-traj-hitbox-margin (20% per side) outside its real frame before
  the spotlight is released. Strict containment still wins so margins
  never steal focus from an obviously-hovered element. The same token
  drives the trajectory predictor's hitbox, so both forgiveness knobs
  move together. New helper pointInPaddedFrame() sits next to
  rayHitsFrame() and is the single source of the inflate maths.

- Green 'Selection sent' toast. On a successful Complete the renderer
  pops a centred green card (--color-ok, scaleY in via --ease-pop) over
  the existing footbar note. New --success-toast-show-ms token controls
  how long it lingers before auto-dismissing in browser mode.

- 1s grace before the Electron host closes. tool-server's
  selectionSubmitted listener now defers the close cmd by
  PREVIEW_CLOSE_DELAY_MS so the toast is fully visible before the
  window starts squeezing closed. The agent's await is resolved
  immediately as before — the delay is purely cosmetic.
Reworks the open/close animation to keep the OS window at a fixed
1200×820 and instead animate a CSS transform on <html> inside the
renderer:

- BrowserWindow is now transparent: true + hasShadow: false +
  backgroundColor: '#00000000'. The OS window is just a passthrough;
  what the user sees as 'the window' is whatever the renderer paints
  inside the squeezed area.
- Animation is a 320ms cubic-out transform: scaleY transition on
  <html>, GPU-composited. The previous setBounds-driven approach
  (whether manual ticks or Cocoa's animate:true) had to relayout the
  heavy preview UI on every frame, which is what surfaced the visible
  framerate. With a CSS transform the only per-frame work is a
  compositor matrix multiply.
- Close handshake is event-driven: executeJavaScript returns a Promise
  that resolves on the 'transitionend' event (with an 80ms safety
  timer), so app.quit() runs the instant the animation actually ends.
  No more 3–5px residual sitting around from a fixed setTimeout that
  outlived Cocoa's variable-duration native animation.
- Injected host CSS forces html + body to transparent and moves the
  --color-bg fill to #root. Without this the renderer's root
  background propagates to the page canvas (CSS Backgrounds L3), which
  paints the whole viewport dark grey even with transparent: true on
  the BrowserWindow — squeeze content sat on top of an unchanging dark
  rectangle. The host CSS runs before win.show() so the first painted
  frame is already collapsed + transparent, no flash.
…close

Three correctness fixes from a self-review of the previous two commits:

- Wrap every webContents.executeJavaScript call in runInRenderer() with
  a try/catch + stderr log. The three call sites (prepareSqueeze,
  squeezeIn, squeezeOut) are all invoked fire-and-forget via `void fn()`,
  so a renderer crash or a webContents destroyed mid-animation would
  otherwise leak as an unhandled rejection and take down the Electron
  main process. closeWithAnimation no longer needs its own try/catch
  because squeezeOut never rejects now.

- Cancel the post-submit close timer when a new await parks. Without
  this, if the agent submits and then propose_variant + await_user_selection
  again within PREVIEW_CLOSE_DELAY_MS, the previous round's pending close
  fires after the new window has already opened and squeezes it away
  under the user. Verified end-to-end: window survives a 200ms
  submit→re-await sequence.

- Replace a meaningless 'auto-dismisses after spotTraj* feel' comment in
  showSuccessToast with the actual token name (THEME.successToastShowMs).
Surfacing a failure mode that wasn't called out by name. The golden rule
already said 'one variant = one real screenshot', but it left room to
rationalize reusing the same file across two variants (or pushing ahead
with byte-identical captures when the variant didn't actually apply on
the device). Now spelt out:

- Golden rule explicitly forbids pointing two variants at the same path
  and degenerating to identical thumbnails.
- Step 2.4 'Screenshot' tells the agent to shasum-diff the new capture
  against the previous one when the screen looks unchanged, and to fix
  whatever's broken (variant didn't apply, bundle didn't reload, etc.)
  before proposing.
- Rules section gets a new 'Distinct screenshot per variant' bullet:
  if you can't produce visibly different captures (read-only app, AX
  broken, hot-reload dead), STOP and tell the user instead of staging
  duplicates.
When a variant is staged its siblings used to snap to the tightened
'one-line summary' layout via display:none on the thumb + a direct
:has() rule rewrite. Animatable now:

- .vrow transitions padding / gap / font-size / color over
  --dur-card-move + --ease-out.
- .thumb shrinks via width / height / border-width / opacity instead
  of display:none — display:none is not animatable.
- Dropped the flex-direction: row forcing in the :has() rule. Portrait
  rows that were stacked stay stacked through the collapse so there's
  no mid-animation direction flip.
- bundle-tools.cjs: bundles packages/preview-window/src/main.ts as
  dist/preview-window/main.cjs (CJS, electron externalised). Lands next
  to the tool-server bundle so the spawn helper can find it via
  __dirname without needing @argent/preview-window as a published peer.
- @swmansion/argent: declares electron as a runtime dependency so
  `npm install -g` pulls in the Electron binary alongside.
- preview-window.ts: resolveMainScript prefers the bundled
  __dirname/preview-window/main.cjs path (published bundle layout),
  falls back to require.resolve('@argent/preview-window/dist/main.js')
  for workspace ts-node runs.
- Bumps every workspace package from 0.8.1 to 0.9.1.
…o polish

- Gate propose_variant/await_user_selection behind the variant-selection flag
  (dynamic HTTP-layer gate via ToolDefinition.featureFlag, re-read per request
  so `argent enable` takes effect without a tool-server restart).
- Port the feature-flag system (flags.ts + enable/disable/flags CLI) from main
  and register the variant-selection flag; add flag + gate tests.
- Fix the preview window not launching in the published package: externalize
  electron in the tool-server esbuild bundle (was inlined, threw at eval).
- Make the frameless preview window draggable (toolbar drag region).
- Per-variant crop frame: propose_variant accepts variant.frame so each
  thumbnail crops to its own (re-laid-out) bounds.
- Stop surfacing the web /preview/ URL to the agent; correct the stream-error
  message to name the streaming cargo feature.
- Demo: disable collapse-on-select, non-selected image hiding, and card bounce.
- Bump all packages to 0.9.3; regenerate the lockfile.
- Broaden the variant-image serving allowlist to include /tmp in addition to
  os.tmpdir() + cwd. On macOS os.tmpdir() is a per-user /var/folders path, so
  agents that drop screenshots under /tmp (common) 404'd → "No preview". A
  sensitive non-temp path is still rejected (verified).
- Reinforce the skill: never hand-crop / re-encode / copy the screenshot to a
  custom folder (e.g. crop.py into /tmp/variants) — pass the raw full-screen
  path plus variant.frame; the preview window does the cropping.
tool-server and argent now import @argent/cli (feature-flag gate). The
root solution tsconfig drives 'tsc --build' order, and a clean CI build
compiled tool-server before argent-cli's declarations existed -> TS2307
'Cannot find module @argent/cli'. Move argent-tools-client and argent-cli
ahead of tool-server (preserving tools-client < argent-cli), and add the
matching project reference on tool-server. Clean 'tsc --build' is green.
0.9.3 was a temporary tag used for local debug installs; the feature
branch should not carry a version bump (that happens at release). Restore
all workspace packages to 0.9.0, matching origin/main, and regenerate the
lockfile (version-only changes).
The tool-server bundle externalizes electron (it can't be inlined — its
postinstall resolves a binary via a __dirname-relative path.txt), so
`require("electron")` must resolve at runtime. @swmansion/argent never
declared electron, so a clean `npm i -g` produced an install where the
preview window's `require("electron")` threw MODULE_NOT_FOUND — the
window silently failed and await_user_selection parked with nothing on
screen. Declare electron (^31.7.7, matching @argent/preview-window) as an
optionalDependency so it ships with the package while keeping argent
installable if electron's binary download fails.
…e close

- Open the window for a specific device: propose_variant takes an optional
  udid (stored per round, surfaced in the snapshot); index.ts appends it as
  ?udid to the preview URL and the UI connects straight to that device. No
  more simulator chooser when the agent already knows the target.
- Remove the top toolbar entirely: drop the simulator picker, the reload
  button, the "SIMULATOR" label and the "round N" badge. The frameless
  window now drags by its empty background (#main is the drag region; every
  interactive overlay opts out with -webkit-app-region: no-drag).
- Move "N hidden" and the status indicator into floating top-left pills
  (status auto-hides while idle).
- Make "Add comment" a larger floating pill and add a matching "Close"
  button. Close posts to a new POST /preview/close route -> store emits
  closeRequested -> the tool-server animates the window shut (reliable under
  the sandboxed BrowserWindow, unlike renderer window.close()).
- Document the udid param in the argent-propose-variants skill.
@latekvo latekvo force-pushed the feat/propose-variants branch from 66dd66d to 63222c4 Compare June 3, 2026 15:13
latekvo added 10 commits June 3, 2026 18:30
The rebase left the branch un-buildable: a committed merge-conflict marker in
tool-server/package.json (invalid JSON -> npm/CI fail), preview.ts calling
describeAndroid(udid) with one arg instead of (registry, udid) -> TS2554 plus a
test pinned to the wrong call, and unindented "version" lines across package
manifests (-> Format CI fail). Resolve the conflict (keep @argent/cli, dedupe
@argent/registry), restore the 2-arg describeAndroid call + its test, and
prettier-format the manifests.
…ose time

Subsequent variants were mis-cropped: when the agent didn't pass variant.frame,
every variant fell back to ONE frozen per-element frame (captured at first
match), so the first thumbnail looked right but the rest cropped to a stale
position once the element moved. propose_variant now describes the device at
propose time (the variant is on screen then) and matches the element, giving
each variant its own frame — the same describe->frame flow, run per variant.
Best-effort: a failed/empty describe leaves the frame undefined and the UI
falls back, so propose_variant never fails over it. Agent may still pass
variant.frame to override. Adds a server-side matcher mirroring the UI's
vpMatchNode, with unit tests.
Brings main up to 0.10.0 (#301) plus boot-device/android-binary/linux
and `argent server start` work. Only conflict was package-lock.json,
resolved by regenerating it against the merged manifests. Resolves the
PR's DIRTY merge state so pull_request CI can run.
…cards from the stream

Three fixes to the frameless preview window:

- Draggable by background. -webkit-app-region is union(drag) − union(no-drag),
  so the full-cover no-drag .cards layer was cancelling the whole region (only
  the top resize edge survived, which then dragged instead of resized). Replace
  #main's drag with a dedicated full-bleed drag LAYER inset by --resize-edge
  (so the window edges still resize) and pointer-events:none (clicks fall
  through); only the real .vcard chips opt out, not the .cards container. Also
  clear the lingering transform:scaleY(1) the squeeze-in leaves on <html> — a
  transformed ancestor breaks app-region hit-testing entirely.

- No more bouncing/resizing loop. .vcard-body's space-taking scrollbar toggled
  the body width, which busted applyThumbCrop's width-keyed cache → re-fit →
  re-toggle → forever. Hide the scrollbar (scrollbar-width:none) so the width
  is constant; the body still scrolls by wheel/trackpad and there's no gutter
  strip.

- Cards repel from the simulator stream. Each card's spring home is now clamped
  out of the streamed-phone rect (homeOffStream), sliding it past the nearer
  side — the same way homes are inset from the window edges — so a narrow
  window or a card dropped on the phone glides off it instead of covering it.
…put while dragging a card

- homeOffStream now keeps a card at its in-bounds home (overlapping the stream)
  when neither side of the phone has room, instead of shoving it off-screen —
  border-avoidance is the stronger pull of the two.
- Ignore stream pointerdowns while a card or variant chip is being dragged, so a
  drag that crosses or releases over the simulator never taps the device.
…bels, keep cards' relative position on resize

- Mute the stream via CSS pointer-events:none for the whole duration of a card/
  chip drag (a #root.drag-muting class toggled every frame), held one frame past
  release. The per-handler guard could still race; this can't — no pointer event
  reaches the stream at all, so a drag crossing or releasing over the phone
  never taps it.

- Disambiguate elements that share a label (e.g. a "Favourites" tab AND a
  "Favourites" header). vpMatchNode now takes an anchor — seeded from the
  propose-time frame and updated to the matched element each frame — and picks
  the nearest candidate within range. It follows the target as it scrolls, and
  when the target leaves the screen the bubble goes off-screen instead of
  snapping onto the other same-labeled element.

- Keep floating cards in their relative place across a window resize: their
  left/top is absolute px, so scale each card's live + home position by the
  size change before re-laying out, instead of letting them peg to the edge.
A scrollable child can teleport across the screen in a single frame on a
fast scroll or a pull-to-refresh and still be the same element, so the
per-frame distance gate wrongly dropped it. Switch the shared-label
matcher to gate candidates by SIZE — an element keeps its width/height as
it scrolls, while a same-labeled element of a different kind (a tab vs a
header) is a very different size. Position becomes a tie-breaker only when
two same-sized, same-labeled elements are on screen at once. When the real
target leaves the screen the remaining same-labeled element fails the size
gate, so the bubble still goes off-screen instead of snapping onto it.
…, collapse non-selected variants

Identity certainty (issue 1): size disambiguation alone re-homed the bubble
onto a same-text impostor once the real target left the screen (a 'Favourites'
header gone → its card snapped to the 'Favourites (5)' tab). Lock an identity
descriptor (role / a11y-id / label, captured from the first size-bootstrapped
match) and require every later frame to match it; when nothing does, the card
goes off-screen instead of choosing a most-likely stand-in. Label matching
tolerates a dynamic counter suffix.

Topmost inspector pick (issue 2): the element picker chose the smallest-area
node under the point, so an element painted ON TOP of a smaller one was
un-selectable. Walk paint order and keep the last hit → the topmost element
wins; children still beat containers and near-full-screen wrappers are filtered.

Collapse-on-select (issue 3): re-enable the collapse that was disabled to kill
an old bounce loop (now fixed via the hidden scrollbar). Picking a variant
forces every other row in that card to a side-by-side layout with the preview
shrunk to ~3 lines of text tall, animated by the thumb width/height transition.
The collapsed variant preview was hard-locked to 3 lines tall with its width
clamped independently, so the box aspect diverged from the crop and letterboxed
— reading as a re-crop. Treat 3-lines-tall and 50%-wide as MAX bounds and fit
the crop inside preserving aspect (box sized to the crop's ratio), so the
collapsed preview is just the expanded crop shrunk.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant