Feat/restore observability panel #591
- Header: add Observability button (pulse icon) that opens the existing OperationsPage (All Operations / Performance / Logs / Errors tabs). The entry point was orphaned after the v10 UI rewrite; the page itself and its PanelCenter wiring were already intact.
- PanelBottom: replace the fixed 200px bottom panel with a fully resizable, draggable panel (vertical drag handle on the top edge, height persisted via the layout store). Adds a maximise/restore toggle (80vh overlay). Height is stored alongside leftWidth/rightWidth and survives reloads.
- Transactions tab: wired from the existing /api/operations (publish/update ops that reached the chain phase), giving on-chain activity without any new backend routes. Expandable rows show tx hash, peer, phase waterfall.
- Gossip tab: live-filtered view of the node log showing only libp2p / gossipsub / peer / SWM lines. The keyword list is broad enough to catch relay, DHT and protocol events.
- Node Log tab: adds level filter (error/warn/info/debug), pause button, and auto-scroll that respects manual scroll position.

Note: all three tabs are currently backed by the local SQLite-backed /api/* endpoints. Once the OTEL telemetry stack is live, these tabs will be replaced by Tempo trace / Loki log streams at the fleet level.

Co-authored-by: Cursor <cursoragent@cursor.com>
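The resize and maximise behaviour described for PanelBottom can be sketched as two pure helpers. This is only an illustration: the minimum height, the state shape, and the function names are assumptions, since the commit message states only the old fixed height (200px) and the maximised height (80vh).

```typescript
// Assumed bounds: only the 80vh overlay height comes from the commit message.
const MIN_PANEL_HEIGHT_PX = 120; // hypothetical minimum
const MAX_PANEL_VH = 0.8;        // 80vh overlay

// Hypothetical slice of the layout store.
interface BottomPanelState {
  height: number;     // persisted height in pixels
  maximised: boolean; // true while the 80vh overlay is shown
}

/** Clamp a dragged height to [MIN_PANEL_HEIGHT_PX, 80% of the viewport]. */
function clampPanelHeight(requestedPx: number, viewportPx: number): number {
  const max = Math.floor(viewportPx * MAX_PANEL_VH);
  return Math.min(Math.max(Math.round(requestedPx), MIN_PANEL_HEIGHT_PX), max);
}

/** Height to render: the overlay height when maximised, else the stored one. */
function effectivePanelHeight(state: BottomPanelState, viewportPx: number): number {
  return state.maximised
    ? Math.floor(viewportPx * MAX_PANEL_VH)
    : clampPanelHeight(state.height, viewportPx);
}
```

Keeping the stored height untouched while maximised means the restore toggle can return to the previous size without extra state.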
OperationName in packages/core/src/logger.ts now covers:
publish, publishFromSWM, update, ka-update, query, resolve, connect,
sync, share, gossip, reconstruct, verify, init, system
Operations.tsx OP_TYPE_COLORS and OP_TYPE_DESCRIPTIONS were missing
the seven new ones (share, publishFromSWM, ka-update, reconstruct,
verify, init, resolve). All added with distinct colours and descriptions.
PanelBottom Transactions tab TX_OP_TYPES expanded from {publish, update}
to {publish, publishFromSWM, update, ka-update, reconstruct} — all op
types that can reach the chain phase and submit a tx.
Co-authored-by: Cursor <cursoragent@cursor.com>
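As a rough sketch of how the expanded TX_OP_TYPES list might be applied in the Transactions tab: the op-type names below come from the commit message, while the row shape, the `operation_name` field, and the helper name are assumptions for illustration.

```typescript
// Op types that can reach the chain phase, as listed in the commit message.
const TX_OP_TYPES: ReadonlyArray<string> = [
  'publish', 'publishFromSWM', 'update', 'ka-update', 'reconstruct',
];

// Hypothetical row shape; the real /api/operations response has more fields.
interface OperationRow {
  operation_id: string;
  operation_name: string; // assumed field name
}

/** Keep only operations whose type can submit a transaction. */
function selectTransactionOps(ops: OperationRow[]): OperationRow[] {
  return ops.filter((op) => TX_OP_TYPES.indexOf(op.operation_name) !== -1);
}
```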
- Strip ANSI escape sequences from all log lines (Node Log + Gossip) so raw color codes no longer appear as literal text
- Remove pause/play button from Node Log toolbar (simplified to filter + level select only)
- Extract useAutoScroll() hook shared by Node Log and Gossip
- Gossip tab now shows only real libp2p events (Connection opened/closed, ProtocolRouter timings, Circuit relay, GossipSub, FinalizationHandler) instead of leaking general DKGAgent structured log lines
- CSS: remove browser focus outlines on tab/toggle buttons, fix toolbar flex layout, add v10-log-level-select class, use pre-wrap + word-break on log lines so long lines wrap instead of overflowing, active tab now uses accent-blue underline

Co-authored-by: Cursor <cursoragent@cursor.com>
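A minimal sketch of the ANSI stripping step, assuming the log lines only carry CSI sequences (colour and style codes such as `ESC[31m`). The helper name and the regular expression are illustrative and not taken from the PR.

```typescript
// Matches CSI sequences: ESC, '[', parameter bytes (0x30-0x3F),
// intermediate bytes (0x20-0x2F), and one final byte (0x40-0x7E).
const ANSI_CSI = /\x1b\[[0-?]*[ -\/]*[@-~]/g;

/** Remove colour/style escape sequences so only the visible text remains. */
function stripAnsi(line: string): string {
  return line.replace(ANSI_CSI, '');
}
```

Other escape families (for example OSC hyperlinks) are not covered by this pattern and would need their own handling.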
The v10 UI rewrite left Operations.tsx without corresponding stylesheet
definitions for its legacy v9 class names. Added:
tab-group / tab-item — horizontal tab bar with accent-blue underline
input / select.input — themed form controls with custom select arrow
data-table — striped/hoverable table with uppercase headers
badge / badge-{success,error,warn,info} — coloured type/status pills
empty-state / --compact / --rich — centred placeholder layouts
page-section / page-title — page wrapper and heading
card-title — section heading inside cards
filters — flex filter bar
phase-bar-wrap/seg — inline phase progress bars
tx-link-icon — subtle tx hash link styling
Co-authored-by: Cursor <cursoragent@cursor.com>
…ph_*

Databases created before the v10 terminology rename still have `paranet_count` and `paranet_id` columns. INSERT statements targeting the new `contextGraph_*` names fail silently, preventing metrics and operations from being stored.

Adds schema version 14, which detects the old column names via `pragma table_info` and renames them in place.

Co-authored-by: Cursor <cursoragent@cursor.com>
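The detect-then-rename step could look roughly like the sketch below. It only builds the SQL statements from the column names that `PRAGMA table_info` reports; the function name and the table name used in the example are hypothetical, and how the statements are executed depends on the SQLite driver in use. `ALTER TABLE ... RENAME COLUMN` requires SQLite 3.25 or later.

```typescript
// Old -> new column names, as given in the PR description.
const COLUMN_RENAMES: { [oldName: string]: string } = {
  paranet_count: 'contextGraph_count',
  paranet_id: 'contextGraph_id',
};

/**
 * Build the rename statements for one table.
 * `existingColumns` holds the `name` field of each row returned by
 * `PRAGMA table_info(<table>)`.
 */
function buildRenameStatements(table: string, existingColumns: string[]): string[] {
  const statements: string[] = [];
  for (const oldName of Object.keys(COLUMN_RENAMES)) {
    const newName = COLUMN_RENAMES[oldName];
    const hasOld = existingColumns.indexOf(oldName) !== -1;
    const hasNew = existingColumns.indexOf(newName) !== -1;
    // Rename only when the old column exists and the new one does not, so
    // fresh and already-migrated databases produce no statements.
    if (hasOld && !hasNew) {
      statements.push(`ALTER TABLE "${table}" RENAME COLUMN "${oldName}" TO "${newName}"`);
    }
  }
  return statements;
}
```

Returning an empty list for fresh databases is what makes the migration safe to run unconditionally.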
Adds OperationTracker instrumentation for DKG events that were previously untracked: PROJECT_SYNCED (sync), GOSSIP_MESSAGE (gossip), KC_CONFIRMED (verify), and KA_UPDATED (ka-update). These are event-based records (work completed in the core before the event fires), so they capture occurrence + metadata rather than multi-phase timing. Co-authored-by: Cursor <cursoragent@cursor.com>
- Reduce SNAPSHOT_INTERVAL_MS from 120s to 30s for more responsive hardware metrics in the Observability panel.
- Handle 401 responses in useFetch by triggering a page reload so the server re-injects a fresh auth token after node restarts.

Co-authored-by: Cursor <cursoragent@cursor.com>
- Split the old "Performance" tab into dedicated "All Operations" and "Hardware" tabs. Operations tab shows operation stats charts + the full operations list; Hardware tab shows live stat cards and time-series graphs.
- Redesign MiniGantt component to display phase name pills with colored dots and durations inline (no hover required).
- Show "event-based" label for operations without phases instead of a bare dash.
- Fix header showing "**" instead of node name when agent identity has a placeholder name.

Co-authored-by: Cursor <cursoragent@cursor.com>
// Track sync completions
agent.eventBus.on(DKGEvent.PROJECT_SYNCED, (data: any) => {
  try {
    const ctx = createOperationContext("sync");
🔴 Bug: PROJECT_SYNCED is emitted only after catch-up has already finished, so creating a fresh sync context here and completing it immediately records every sync as ~0 ms. That will skew the new latency/success views instead of reflecting the real sync cost. Start/finish the tracked operation from the actual sync entrypoint, or store this as a separate event type rather than an Operation.
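One way to read the first suggestion is to wrap the sync entrypoint itself so the recorded duration spans the actual work. The sketch below uses a made-up, minimal tracker interface and an injected clock; it is not the real OperationTracker API, and it is synchronous for brevity, whereas a real sync path would presumably await a promise.

```typescript
// Hypothetical, minimal tracker surface; the real OperationTracker differs.
interface TimedTracker {
  start(name: string, startedAtMs: number): string; // returns an operation id
  complete(operationId: string, durationMs: number): void;
  fail(operationId: string, durationMs: number, error: string): void;
}

/**
 * Run `work` inside a tracked operation so that the duration covers the
 * work itself rather than the moment a completion event fires.
 */
function runTracked<T>(
  tracker: TimedTracker,
  name: string,
  now: () => number,
  work: () => T,
): T {
  const startedAt = now();
  const id = tracker.start(name, startedAt);
  try {
    const result = work();
    tracker.complete(id, now() - startedAt);
    return result;
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    tracker.fail(id, now() - startedAt, message);
    throw err;
  }
}
```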
});

// Track gossip messages
agent.eventBus.on(DKGEvent.GOSSIP_MESSAGE, (data: any) => {
🟡 Issue: GOSSIP_MESSAGE fires for every inbound GossipSub payload. Writing each one as a full operation row will grow operations at network-traffic rate and swamp the observability views on busy nodes. Consider aggregating/sampling gossip activity into metrics instead of tracker.start/complete per message.
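The aggregation idea could be sketched as an in-memory counter that is flushed on a timer and stored as a single metrics record per interval. The class name and the use of a `topic` key are assumptions; the actual GOSSIP_MESSAGE payload may not expose a topic under that name.

```typescript
/** Counts gossip messages per topic in memory and hands them out in batches. */
class GossipCounter {
  // Null-prototype object so arbitrary topic strings are safe as keys.
  private counts: { [topic: string]: number } = Object.create(null);

  /** Call once per inbound message instead of tracker.start/complete. */
  record(topic: string): void {
    this.counts[topic] = (this.counts[topic] || 0) + 1;
  }

  /** Return the counts accumulated since the last flush and reset them. */
  flush(): { [topic: string]: number } {
    const snapshot = this.counts;
    this.counts = Object.create(null);
    return snapshot;
  }
}
```

A caller would invoke `record` from the event handler and `flush` from a periodic job, so storage grows with the flush interval rather than with network traffic.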
const ctx = createOperationContext("ka-update");
tracker.start(ctx, {
  contextGraphId: data.contextGraphId,
  details: { kaUri: data.kaUri },
🔴 Bug: the current KA_UPDATED emitters publish fields like ual, batchId, and rootEntities rather than kaUri, so these new ka-update rows will always lose the asset identifier you're trying to surface. Record data.ual here or normalize the event payload before sending it to the tracker.
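A possible shape for the payload normalization mentioned in the comment is shown below. The field names `ual`, `batchId`, and `rootEntities` come from the review comment; their types, the interface, and the helper name are guesses for illustration.

```typescript
// Hypothetical payload shape; field types are assumptions.
interface KaUpdatedPayload {
  contextGraphId?: string;
  ual?: string;
  kaUri?: string; // kept as a fallback in case some emitter does send it
  batchId?: string | number;
  rootEntities?: string[];
}

/** Build tracker details that always carry an asset identifier when one exists. */
function kaUpdateDetails(data: KaUpdatedPayload): { kaUri: string | null; batchId: string | null } {
  return {
    kaUri: data.ual ?? data.kaUri ?? null,
    batchId: data.batchId !== undefined ? String(data.batchId) : null,
  };
}
```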
const [expanded, setExpanded] = useState<string | null>(null);

const load = useCallback(() => {
  fetchOperationsWithPhases({ limit: '100', periodMs: String(6 * 60 * 60_000) })
🟡 Issue: this calls fetchOperationsWithPhases from ui/api.ts directly instead of going through api-wrapper like the rest of the shell. In mock/offline mode the Transactions tab will fail while the other bottom-panel tabs still fall back cleanly. Route this through api.fetchOperationsWithPhases(...) for consistent behavior.
<span><b>ID:</b> <span style={{ fontFamily: 'var(--font-mono)' }}>{op.operation_id}</span></span>
{txHash && <span><b>Tx:</b> <span style={{ fontFamily: 'var(--font-mono)' }}>{txHash}</span></span>}
{op.peer_id && <span><b>Peer:</b> <span style={{ fontFamily: 'var(--font-mono)' }}>{shortId(op.peer_id)}</span></span>}
{op.error && <span style={{ color: '#ef4444' }}><b>Error:</b> {op.error}</span>}
🔴 Bug: /api/operations exposes failures as error_message, not error, so failed transactions in this new panel will render without any reason. Read op.error_message here (or normalize the API response shape before rendering).
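The second option in the comment, normalizing the response shape before rendering, might look like this small helper. The `error_message` field name is from the review comment; the interface and function name are illustrative.

```typescript
// Hypothetical subset of an /api/operations row.
interface RawOperation {
  operation_id: string;
  error_message?: string | null;
  error?: string | null; // tolerated in case another caller already mapped it
}

/** Return the failure reason regardless of which field carries it. */
function operationError(op: RawOperation): string | null {
  return op.error_message ?? op.error ?? null;
}
```

The component would then render `operationError(op)` instead of reading `op.error` directly.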
Summary
Restores the Observability page that was lost during the v10 merge, adds comprehensive operation tracking for all DKG event types, and includes a critical database migration fix for pre-existing nodes.
What's included
Observability UI (restored & enhanced)
Operation Tracking (daemon)
- `sync` (PROJECT_SYNCED), `gossip` (GOSSIP_MESSAGE), `verify` (KC_CONFIRMED), `ka-update` (KA_UPDATED)
- `query` (with parse/execute phases), `publish`, `connect`, `share` (with validate/store phases)
Database Migration (critical fix)
- `paranet_count` → `contextGraph_count` and `paranet_id` → `contextGraph_id`
- `pragma table_info` before rename (safe for fresh DBs)
Quality-of-life fixes
- … `**` from agent identity
Test plan
- `contextGraph_*` columns, Observability page loads correctly
- `parse` + `execute` phases visible in Operations tab
- `sync` operations appear
- … `**`