feat: AI-assisted CSV transaction import (human-in-the-loop) #9

@toolpathguy

Idea

Let the user import bank transactions from a CSV they export themselves, and have the AI turn arbitrary CSV layouts into hledger transactions. The user reviews and approves before anything is written.

This replaces an earlier "connect a bank via Stripe/Plaid" idea — the pivot keeps the project local and secure: no aggregator vendor, no stored bank credentials, no new auth/DB/webhook infrastructure. The user-supplied CSV is the only input; the only external call is to the Anthropic API.

Why AI instead of a fixed parser

Bank CSV formats vary wildly (column names, date formats, single vs split debit/credit columns, sign conventions). The AI maps an arbitrary layout to a normalized shape (date, payee, amount, inflow/outflow, suggested envelope) without a per-bank parser. That mapping is the feature's value-add.
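
To make the target of that mapping concrete, here is a minimal sketch of the normalized shape plus a runtime guard for the model's output. The field names (`amountCents`, `suggestedEnvelope`, and so on) and the ISO date convention are assumptions for illustration; the issue only lists the fields informally.

```typescript
// Hypothetical normalized shape; names are assumptions based on the list above.
type Direction = "inflow" | "outflow";

interface ProposedTransaction {
  date: string; // ISO calendar date, e.g. "2024-03-01"
  payee: string;
  amountCents: number; // non-negative integer; the sign lives in `direction`
  direction: Direction;
  suggestedEnvelope: string | null; // null when the model has no suggestion
}

// Model output is untrusted JSON, so check each row before it reaches staging.
function isProposedTransaction(value: unknown): value is ProposedTransaction {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.date === "string" &&
    /^\d{4}-\d{2}-\d{2}$/.test(v.date) &&
    typeof v.payee === "string" &&
    v.payee.trim().length > 0 &&
    typeof v.amountCents === "number" &&
    Number.isInteger(v.amountCents) &&
    v.amountCents >= 0 &&
    (v.direction === "inflow" || v.direction === "outflow") &&
    (v.suggestedEnvelope === null || typeof v.suggestedEnvelope === "string")
  );
}
```

Rows that fail the guard could be shown to the user as "could not parse" rather than dropped silently, which keeps the review step honest about what the model returned.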

Autonomy model — human-in-the-loop (HITL)

AI proposes transactions → user reviews/edits in a staging table (account, payee, envelope) → on approve, committed via the existing direct journal writer. Nothing is written without explicit approval.
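
The approval gate can be sketched as a small piece of state handling. This is an illustration under assumptions, not the project's code: `StagedRow`, `stage`, `review`, and `rowsToCommit` are hypothetical names, and the real staging table would carry more fields.

```typescript
type ReviewStatus = "pending" | "approved" | "rejected";

interface StagedRow {
  id: number;
  payee: string;
  account: string;
  envelope: string | null;
  status: ReviewStatus;
}

// Every AI proposal enters staging as "pending"; only a user action changes that.
function stage(proposals: Omit<StagedRow, "id" | "status">[]): StagedRow[] {
  return proposals.map((p, i) => ({ ...p, id: i, status: "pending" as const }));
}

// Apply a user edit (field change, approve, or reject) to one row, immutably.
function review(
  rows: StagedRow[],
  id: number,
  patch: Partial<Omit<StagedRow, "id">>,
): StagedRow[] {
  return rows.map((r) => (r.id === id ? { ...r, ...patch } : r));
}

// The commit step receives approved rows only; pending and rejected never pass.
function rowsToCommit(rows: StagedRow[]): StagedRow[] {
  return rows.filter((r) => r.status === "approved");
}
```

Filtering on the server as well as in the UI would mean a client bug cannot commit an unreviewed row.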

Architecture fit

The commit path already exists and fits cleanly — amounts are handled in integer cents at the journalWriter boundary, and each row becomes a balanced entry via appendTransaction. The hard parts are parsing, dedup, and review.
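
As an illustration of "integer cents in, balanced entry out", the sketch below builds two postings that sum to zero and renders them as hledger journal text. It does not call the real `journalWriter` or `appendTransaction`, whose signatures are not shown in this issue; `formatCents`, `balancedPostings`, `renderEntry`, and the account names are hypothetical.

```typescript
interface Posting {
  account: string;
  cents: number; // signed integer cents
}

// Format integer cents as a decimal string without floating-point arithmetic.
function formatCents(cents: number): string {
  const sign = cents < 0 ? "-" : "";
  const abs = Math.abs(cents);
  const whole = Math.floor(abs / 100);
  const frac = String(abs % 100).padStart(2, "0");
  return `${sign}${whole}.${frac}`;
}

// One imported row becomes two postings that cancel out.
function balancedPostings(
  bankAccount: string,
  counterAccount: string,
  amountCents: number,
  direction: "inflow" | "outflow",
): [Posting, Posting] {
  if (!Number.isInteger(amountCents) || amountCents < 0) {
    throw new RangeError("amountCents must be a non-negative integer");
  }
  const bankDelta = direction === "inflow" ? amountCents : -amountCents;
  return [
    { account: bankAccount, cents: bankDelta },
    { account: counterAccount, cents: -bankDelta },
  ];
}

// Render as journal text; hledger needs two or more spaces before the amount.
function renderEntry(date: string, payee: string, postings: Posting[]): string {
  const lines = postings.map((p) => `    ${p.account}  ${formatCents(p.cents)}`);
  return [`${date} ${payee}`, ...lines].join("\n");
}
```

For example, a 12.34 outflow from a hypothetical `assets:checking` to `expenses:groceries` yields a `-12.34` posting on the bank account and a `12.34` posting on the expense account.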

Layers

  • server/api/import/parse.post.ts — receives CSV text, calls Anthropic (structured output) to extract normalized transactions; returns proposals. Holds ANTHROPIC_API_KEY.
  • server/api/import/commit.post.ts — approved transactions → journalWriter (or reuse transactions.post.ts per row), with dedup.
  • server/utils/anthropic.ts — shared client (from feat: AI budgeting chat with human-in-the-loop envelope assignment #8).
  • composables/useImport.ts + a Nuxt UI upload + review table.
  • Dedup: guard against re-importing the same row (e.g. hash of date+amount+payee, checked against existing journal entries).
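
A minimal sketch of the dedup guard, assuming rows have already been normalized to an ISO date and integer cents. It uses a plain composite key where the bullet above suggests a hash; a digest of the same string could be substituted if a fixed-length key is wanted. `dedupKey` and `flagDuplicates` are hypothetical names.

```typescript
interface KeyedRow {
  date: string; // ISO date
  amountCents: number;
  payee: string;
}

// Normalize the payee so casing and spacing differences do not defeat the match.
function dedupKey(row: KeyedRow): string {
  const payee = row.payee.trim().toLowerCase().replace(/\s+/g, " ");
  return `${row.date}|${row.amountCents}|${payee}`;
}

// Mark each incoming row as a likely duplicate instead of dropping it.
function flagDuplicates<T extends KeyedRow>(
  incoming: T[],
  existing: KeyedRow[],
): Array<T & { duplicate: boolean }> {
  const seen = new Set(existing.map(dedupKey));
  return incoming.map((row) => {
    const key = dedupKey(row);
    const duplicate = seen.has(key);
    seen.add(key); // also catches repeats within the same upload
    return { ...row, duplicate };
  });
}
```

Flagging rather than deleting is a deliberate choice here: two genuine purchases with the same date, amount, and payee produce the same key, so the review table is the right place for the user to decide.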

Suggested scope

  • CSV upload UI + parse round-trip
  • server/api/import/parse.post.ts — AI extracts normalized transactions (structured output schema: date, payee, amount, direction, suggested envelope/account)
  • Review/staging table — user edits account + envelope + payee, approves/rejects per row
  • server/api/import/commit.post.ts — approved rows → journalWriter, balanced entries
  • Dedup against already-imported / existing journal transactions
  • Handle the messy cases: varied date formats, debit/credit-as-separate-columns, sign normalization, blank/uncategorized → land in Ready-to-Assign
  • Tests: parse normalization, dedup, and a check that nothing commits without approval
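
The messy cases in the list above can be pinned down with a deterministic reference, useful as a post-check on the AI's output or as a test oracle. This is a sketch under assumptions: all function names are hypothetical, amounts are assumed to use a dot as the decimal separator, and day/month order must be supplied by the caller because "03/01/2024" is ambiguous on its own.

```typescript
type DateOrder = "MDY" | "DMY";

// Convert "2024-3-1", "03/01/2024" or "1.3.2024" to ISO; null if invalid.
function normalizeDate(raw: string, order: DateOrder): string | null {
  const s = raw.trim();
  let y: number, m: number, d: number;
  const iso = /^(\d{4})-(\d{1,2})-(\d{1,2})$/.exec(s);
  const other = /^(\d{1,2})[\/.-](\d{1,2})[\/.-](\d{4})$/.exec(s);
  if (iso) {
    y = Number(iso[1]); m = Number(iso[2]); d = Number(iso[3]);
  } else if (other) {
    const first = Number(other[1]);
    const second = Number(other[2]);
    y = Number(other[3]);
    m = order === "MDY" ? first : second;
    d = order === "MDY" ? second : first;
  } else {
    return null;
  }
  // Round-trip through Date to reject impossible dates such as 30 February.
  const check = new Date(Date.UTC(y, m - 1, d));
  if (check.getUTCFullYear() !== y || check.getUTCMonth() !== m - 1 || check.getUTCDate() !== d) {
    return null;
  }
  return `${y}-${String(m).padStart(2, "0")}-${String(d).padStart(2, "0")}`;
}

// Parse "1,234.56", "$12.30", "(45.00)" or "-7" into signed integer cents.
function parseCents(raw: string): number | null {
  const s = raw.trim();
  if (s === "") return null;
  const negative = s.includes("-") || /^\(.*\)$/.test(s);
  const digits = s.replace(/[^0-9.]/g, "");
  const m = /^(\d+)(?:\.(\d{1,2}))?$/.exec(digits);
  if (!m) return null;
  const cents = Number(m[1]) * 100 + Number((m[2] ?? "").padEnd(2, "0"));
  return negative ? -cents : cents;
}

interface RawAmounts { amount?: string; debit?: string; credit?: string }

// Resolve single-column or split debit/credit layouts to magnitude + direction.
function resolveAmount(
  raw: RawAmounts,
  negativeMeansOutflow = true,
): { amountCents: number; direction: "inflow" | "outflow" } | null {
  const debit = parseCents(raw.debit ?? "");
  const credit = parseCents(raw.credit ?? "");
  if (debit !== null && debit !== 0) return { amountCents: Math.abs(debit), direction: "outflow" };
  if (credit !== null && credit !== 0) return { amountCents: Math.abs(credit), direction: "inflow" };
  const single = parseCents(raw.amount ?? "");
  if (single === null) return null;
  const isNegative = single < 0;
  return {
    amountCents: Math.abs(single),
    direction: isNegative === negativeMeansOutflow ? "outflow" : "inflow",
  };
}
```

The `negativeMeansOutflow` flag covers the sign-convention case: some exports (credit card statements, for instance) list charges as positive numbers, so the convention has to be decided per file rather than assumed.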

Risks / notes

  • Data egress: CSV rows (transaction descriptions, amounts) are sent to the Anthropic API. This is the one external data flow — acceptable for a single-user local app, but document it prominently so the user knows their transaction text leaves the machine.
  • No secrets stored: unlike a bank-aggregator approach, there are no access tokens or bank credentials to persist — only ANTHROPIC_API_KEY in env.
  • Categorization here overlaps with the tools from #8 (feat: AI budgeting chat with human-in-the-loop envelope assignment); the same envelope-aware AI can suggest categories.

Context

Captured from an ideas session; pivoted from "Stripe/Plaid bank connection" to CSV + AI to stay local and avoid stored credentials. Depends on #8 for the shared Anthropic client and the "propose → approve → journalWriter" HITL pattern — build #8 first.

Relates to #8.
