Add normalized skill_invocation session event kind #31

@samzong

Background

Recall now stores structured session_events from AI coding sessions. The current event contract is already useful for command/tool/file analytics, but skill usage is still inferred indirectly.

Current observed event kinds in the local index include:

  • command
  • file_read
  • file_write
  • search
  • session_meta
  • tool_call
  • tool_result
  • turn

Skill usage currently appears in two incomplete ways:

  1. High-confidence Claude Code events where name = 'Skill' and attrs_json.skill is present.
  2. Medium-confidence traces where an event references a path like ~/.agents/skills/<skill>/SKILL.md.

This makes it possible to approximate skill usage, but not to answer product questions reliably across adapters.

Problem

Recall cannot currently represent "a skill was invoked" as a first-class event.

As a result:

  • Skill usage rankings are adapter-specific and inconsistent.
  • Codex skill usage can only be inferred from file reads, command args, or session text.
  • Reading a skill file is conflated with actually invoking a skill.
  • tool_call analytics are polluted with Skill as a generic tool name.
  • It is hard to correlate skill usage with outcomes such as tests run, corrections, token use, or project/session patterns.

Example current ambiguity:

  • pre-ship may appear thousands of times via path-reference or file-read traces.
  • verification-before-completion may be the top explicit Claude Code Skill call.
  • These are different semantics, but Recall has no normalized event kind to distinguish them.

Proposal

Add a normalized event kind:

kind = skill_invocation

This event should represent evidence that a skill was intentionally invoked or selected for a session/task.

Proposed event contract

For session_events rows with kind = 'skill_invocation':

| Field | Proposed semantics |
| --- | --- |
| kind | Always skill_invocation |
| actor | Usually user, assistant, or system, depending on who initiated/selected the skill |
| name | Canonical skill name, e.g. pre-ship, codex-review, vibe-calibrate |
| target | Skill definition path when known, e.g. /Users/x/.agents/skills/pre-ship/SKILL.md |
| status | Optional: started, completed, failed, inferred, or source-native status if available |
| message_seq | User/assistant message that triggered the skill when known |
| summary | Short human-readable invocation summary, not the full skill body |
| source_event_id | Source-native call id / line id when available |
| attrs_json | Structured metadata for attribution and confidence |

Suggested attrs_json shape:

{
  "skill": "pre-ship",
  "trigger": "explicit_slash",
  "confidence": "high",
  "evidence": "user_message",
  "raw_name": "Skill",
  "args": "--committed"
}

Suggested fields:

| attr | Meaning |
| --- | --- |
| skill | Canonical skill name, duplicated from name for JSON consumers |
| trigger | explicit_slash, explicit_dollar, tool_call, skill_file_read, session_text, inferred |
| confidence | high, medium, low |
| evidence | Where the attribution came from: tool_event, user_message, assistant_message, file_path, command_target, etc. |
| raw_name | Source-native tool/function name, e.g. Skill |
| args | Optional skill invocation args, if source exposes them |
| path | Optional skill path, if known |
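
A minimal sketch of how an adapter might build the required part of attrs_json. Only the string values (explicit_slash, high, and so on) come from this proposal; the enum names, `json_escape`, and `skill_attrs_json` are hypothetical, and the hand-rolled serialization is used only to keep the sketch dependency-free.

```rust
/// Hypothetical enum; the variants mirror the `trigger` values listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SkillTrigger {
    ExplicitSlash,
    ExplicitDollar,
    ToolCall,
    SkillFileRead,
    SessionText,
    Inferred,
}

impl SkillTrigger {
    pub fn as_str(self) -> &'static str {
        match self {
            SkillTrigger::ExplicitSlash => "explicit_slash",
            SkillTrigger::ExplicitDollar => "explicit_dollar",
            SkillTrigger::ToolCall => "tool_call",
            SkillTrigger::SkillFileRead => "skill_file_read",
            SkillTrigger::SessionText => "session_text",
            SkillTrigger::Inferred => "inferred",
        }
    }
}

/// Hypothetical enum; the variants mirror the `confidence` values listed above.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SkillConfidence {
    High,
    Medium,
    Low,
}

impl SkillConfidence {
    pub fn as_str(self) -> &'static str {
        match self {
            SkillConfidence::High => "high",
            SkillConfidence::Medium => "medium",
            SkillConfidence::Low => "low",
        }
    }
}

/// Escapes a string for use inside a JSON string literal.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Builds an attrs_json string with the three required keys plus optional args.
pub fn skill_attrs_json(
    skill: &str,
    trigger: SkillTrigger,
    confidence: SkillConfidence,
    args: Option<&str>,
) -> String {
    let mut fields = vec![
        format!("\"skill\":\"{}\"", json_escape(skill)),
        format!("\"trigger\":\"{}\"", trigger.as_str()),
        format!("\"confidence\":\"{}\"", confidence.as_str()),
    ];
    if let Some(a) = args {
        fields.push(format!("\"args\":\"{}\"", json_escape(a)));
    }
    format!("{{{}}}", fields.join(","))
}
```

Keeping trigger and confidence as closed enums rather than free-form strings would make it harder for adapters to drift apart on spelling, which matters because analytics queries filter on these exact values.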

Attribution confidence rules

High confidence

Create skill_invocation with confidence = high when any of these are true:

  • Source-native event explicitly records a skill call, e.g. Claude Code name = 'Skill' and attrs_json.skill exists.
  • User explicitly invokes a skill by name, e.g. /skill:pre-ship, $vibe-calibrate, or equivalent known syntax.
  • A source has a dedicated skill/tool invocation envelope.

Medium confidence

Create skill_invocation with confidence = medium when:

  • The agent reads a canonical skill definition path such as ~/.agents/skills/<skill>/SKILL.md and the read occurs near the start of a task or shortly after a user request.
  • A command/tool target references a skill path in a way that strongly indicates loading that skill.

Low confidence

Do not create skill_invocation events by default for low-confidence pattern matches unless we explicitly decide to support inferred analytics.

Potential low-confidence signals:

  • Session text mentions a skill name without invocation syntax.
  • Workflow resembles a skill but no direct skill evidence exists.

If low-confidence inference is ever added, it should be clearly marked confidence = low and excluded from default rankings.
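
The rules above could be encoded roughly as follows. This is a sketch, not a settled design: `SkillEvidence`, `classify`, and `emit_confidence` are hypothetical names, and treating a SKILL.md read that is not near the start of a task as low confidence is an assumption the rules do not state explicitly.

```rust
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Confidence {
    High,
    Medium,
    Low,
}

/// Hypothetical evidence categories, one per rule in this section.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum SkillEvidence {
    /// Source-native skill call, e.g. Claude Code name = 'Skill' with attrs_json.skill.
    NativeSkillCall,
    /// User typed an explicit invocation such as /skill:pre-ship or $vibe-calibrate.
    ExplicitUserSyntax,
    /// Source has a dedicated skill/tool invocation envelope.
    DedicatedEnvelope,
    /// Agent read a canonical SKILL.md path.
    SkillFileRead { near_task_start: bool },
    /// Command/tool target references a skill path in a way that indicates loading it.
    CommandTargetsSkillPath,
    /// Session text mentions a skill name without invocation syntax.
    TextMention,
    /// Workflow resembles a skill but there is no direct evidence.
    WorkflowResemblance,
}

/// Maps a piece of evidence to its confidence level.
pub fn classify(evidence: SkillEvidence) -> Confidence {
    match evidence {
        SkillEvidence::NativeSkillCall
        | SkillEvidence::ExplicitUserSyntax
        | SkillEvidence::DedicatedEnvelope => Confidence::High,
        SkillEvidence::SkillFileRead { near_task_start: true }
        | SkillEvidence::CommandTargetsSkillPath => Confidence::Medium,
        // Assumption: a late or unanchored SKILL.md read is only a weak signal.
        SkillEvidence::SkillFileRead { near_task_start: false }
        | SkillEvidence::TextMention
        | SkillEvidence::WorkflowResemblance => Confidence::Low,
    }
}

/// Returns the confidence to emit, or None when no event should be created.
/// Low-confidence matches are dropped unless `allow_low` is set.
pub fn emit_confidence(evidence: SkillEvidence, allow_low: bool) -> Option<Confidence> {
    match classify(evidence) {
        Confidence::Low if !allow_low => None,
        c => Some(c),
    }
}
```

With `allow_low` defaulting to false in adapters, low-confidence inference stays opt-in, and any rows it does produce are still marked so default rankings can exclude them.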

Adapter notes

Claude Code

Source-native evidence already exists:

  • session_events.name = 'Skill'
  • attrs_json.skill = '<skill-name>'

The adapter can emit skill_invocation directly instead of only a generic tool_call, or emit both if backward compatibility is desired.

Codex

Codex currently needs inference from:

  • User syntax such as $vibe-calibrate if captured in indexed messages.
  • Reads of /Users/x/.agents/skills/<name>/SKILL.md.
  • Tool/command args that reference a skill path.

Codex should start with high-confidence explicit syntax and medium-confidence SKILL.md path reads.
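
As an illustration of the two Codex starting points, here is a sketch of syntax and path detection. The function names and the allowed character set for skill names (lowercase ASCII, digits, `-`, `_`) are assumptions, not something the Codex format guarantees.

```rust
/// Accepts a candidate skill name after trimming trailing punctuation.
/// Assumption: skill names start with a lowercase letter and contain only
/// lowercase ASCII letters, digits, '-' and '_'.
fn valid_skill_name(raw: &str) -> Option<String> {
    let name = raw.trim_end_matches(|c: char| !c.is_ascii_alphanumeric());
    let starts_ok = name.chars().next().map_or(false, |c| c.is_ascii_lowercase());
    let body_ok = name
        .chars()
        .all(|c| c.is_ascii_lowercase() || c.is_ascii_digit() || c == '-' || c == '_');
    if starts_ok && body_ok {
        Some(name.to_string())
    } else {
        None
    }
}

/// Finds the first explicit invocation in a user message.
/// Returns the skill name and the matching `trigger` value.
pub fn explicit_skill_from_message(msg: &str) -> Option<(String, &'static str)> {
    for token in msg.split_whitespace() {
        if let Some(rest) = token.strip_prefix("/skill:") {
            if let Some(name) = valid_skill_name(rest) {
                return Some((name, "explicit_slash"));
            }
        } else if let Some(rest) = token.strip_prefix('$') {
            if let Some(name) = valid_skill_name(rest) {
                return Some((name, "explicit_dollar"));
            }
        }
    }
    None
}

/// Extracts `<name>` from a path ending in `skills/<name>/SKILL.md`.
pub fn skill_from_path(path: &str) -> Option<String> {
    let parts: Vec<&str> = path.split('/').collect();
    let n = parts.len();
    if n >= 3 && parts[n - 1] == "SKILL.md" && parts[n - 3] == "skills" && !parts[n - 2].is_empty() {
        Some(parts[n - 2].to_string())
    } else {
        None
    }
}
```

The `$name` form will also match lowercase shell variables such as `$path`, so a real adapter would probably need to check candidates against the set of installed skills before emitting a high-confidence event.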

OpenCode and other adapters

If a source has a source-native skill/tool concept, map it directly.
Otherwise, support only path/syntax-based attribution.

Analytics enabled by this event

Once this exists, Recall can answer:

  • Top skills by invocation count.
  • Top skills by distinct sessions.
  • Top skills by project/source/time range.
  • Skill usage before/after corrections.
  • Skill usage correlated with command/test/build outcomes.
  • Skills that are frequently loaded but rarely explicitly invoked.
  • Skills that should be promoted, merged, deprecated, or converted into native Recall workflows.

Example future query:

SELECT
  name AS skill,
  COUNT(*) AS invocations,
  COUNT(DISTINCT session_id) AS sessions
FROM session_events
WHERE kind = 'skill_invocation'
  AND json_extract(attrs_json, '$.confidence') IN ('high', 'medium')
GROUP BY name
ORDER BY invocations DESC;
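
For consumers that work from the JSONL export rather than the SQLite index, the same ranking could be computed in memory. This is a sketch: `SkillRow` and `rank_skills` are hypothetical, and the row is assumed to have already been parsed down to the three fields the query uses.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical, already-parsed view of a skill_invocation row.
pub struct SkillRow<'a> {
    pub name: &'a str,
    pub session_id: &'a str,
    pub confidence: &'a str,
}

/// Returns (skill, invocations, distinct sessions), most-invoked first.
/// Rows whose confidence is not "high" or "medium" are ignored.
pub fn rank_skills(rows: &[SkillRow<'_>]) -> Vec<(String, usize, usize)> {
    let mut acc: BTreeMap<&str, (usize, BTreeSet<&str>)> = BTreeMap::new();
    for row in rows {
        if row.confidence != "high" && row.confidence != "medium" {
            continue;
        }
        let entry = acc.entry(row.name).or_insert_with(|| (0, BTreeSet::new()));
        entry.0 += 1;
        entry.1.insert(row.session_id);
    }
    let mut ranked: Vec<(String, usize, usize)> = acc
        .into_iter()
        .map(|(name, (count, sessions))| (name.to_string(), count, sessions.len()))
        .collect();
    // Sort by invocation count descending, then by name for a stable order.
    ranked.sort_by(|a, b| b.1.cmp(&a.1).then_with(|| a.0.cmp(&b.0)));
    ranked
}
```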

CLI / UX follow-up ideas

This issue only requires the event contract, but it should unblock later commands such as:

recall events --kind skill_invocation
recall stats skills
recall stats skills --source codex --time 30d
recall stats skills --json

Acceptance criteria

  • session_events.kind can contain skill_invocation rows.
  • At least one adapter emits high-confidence skill_invocation events from source-native skill evidence.
  • Codex detects at least explicit $skill-name and /skill:<name> syntax when present in messages, or the adapter documents why it cannot.
  • Skill file reads can be represented as medium-confidence attribution when implemented.
  • attrs_json includes at least skill, trigger, and confidence.
  • Export JSONL includes skill_invocation events without schema breakage.
  • Tests cover at least one high-confidence and one medium-confidence case.
  • Documentation describes confidence semantics so analytics consumers do not mix direct invocations with inferred references accidentally.

Non-goals

  • Do not classify arbitrary workflow similarity as skill usage by default.
  • Do not parse or copy full SKILL.md bodies into event rows.
  • Do not mutate skill files or instruction files.
  • Do not make skill_invocation depend on a specific agent vendor.

Implementation sketch

  1. Add a helper in src/adapters/events.rs, for example:
pub fn skill_invocation_event(
    context: EventContext,
    skill: String,
    trigger: SkillTrigger,
    confidence: SkillConfidence,
    target: Option<String>,
    args: Option<String>,
) -> RawSessionEvent
  2. Keep kind = 'skill_invocation', name = Some(skill.clone()), and structured metadata in attrs_json.
  3. Update Claude Code parsing to map native Skill calls.
  4. Add Codex path/syntax attribution as a second step.
  5. Add regression tests in adapter tests and export tests.
  6. Consider a later recall stats skills CLI after the event contract lands.
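
One possible body for the helper, to make the contract concrete. The `EventContext` and `RawSessionEvent` definitions here are stand-ins, since the real types in src/adapters/events.rs are not shown in this issue; the enums are trimmed to a few variants, and `attrs` is a plain map standing in for the serialized attrs_json.

```rust
use std::collections::BTreeMap;

// Stand-in types (assumptions); the real definitions live in the Recall codebase.
#[derive(Debug, Clone)]
pub struct EventContext {
    pub session_id: String,
    pub message_seq: Option<u64>,
}

#[derive(Debug, Clone, Copy)]
pub enum SkillTrigger {
    ExplicitSlash,
    ToolCall,
    SkillFileRead,
}

#[derive(Debug, Clone, Copy)]
pub enum SkillConfidence {
    High,
    Medium,
}

#[derive(Debug, Clone)]
pub struct RawSessionEvent {
    pub session_id: String,
    pub message_seq: Option<u64>,
    pub kind: &'static str,
    pub name: Option<String>,
    pub target: Option<String>,
    /// Stand-in for attrs_json before serialization.
    pub attrs: BTreeMap<&'static str, String>,
}

pub fn skill_invocation_event(
    context: EventContext,
    skill: String,
    trigger: SkillTrigger,
    confidence: SkillConfidence,
    target: Option<String>,
    args: Option<String>,
) -> RawSessionEvent {
    let trigger_str = match trigger {
        SkillTrigger::ExplicitSlash => "explicit_slash",
        SkillTrigger::ToolCall => "tool_call",
        SkillTrigger::SkillFileRead => "skill_file_read",
    };
    let confidence_str = match confidence {
        SkillConfidence::High => "high",
        SkillConfidence::Medium => "medium",
    };

    let mut attrs = BTreeMap::new();
    // `skill` is duplicated from `name` so JSON consumers need only attrs_json.
    attrs.insert("skill", skill.clone());
    attrs.insert("trigger", trigger_str.to_string());
    attrs.insert("confidence", confidence_str.to_string());
    if let Some(a) = args {
        attrs.insert("args", a);
    }
    if let Some(p) = &target {
        attrs.insert("path", p.clone());
    }

    RawSessionEvent {
        session_id: context.session_id,
        message_seq: context.message_seq,
        kind: "skill_invocation",
        name: Some(skill),
        target,
        attrs,
    }
}
```

Routing every adapter through a single constructor like this would keep the required keys (skill, trigger, confidence) present on every row, which is what the acceptance criteria ask for.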
