block append --file: markdown parser drops tables and crashes on common language aliases #48

@4ier

Summary

notion block append <page> --file <foo.md> uses a very primitive "split on blank lines, one block per section" parser that:

  1. Completely loses markdown tables — table rows become raw-text paragraph blocks (| a | b | appears literally on the page).
  2. Fails the whole append when a fenced code block uses a common alias like ```ts or ```sh — the Notion API rejects ts, and the one bad block aborts the entire file import.

This makes --file unsuitable for real markdown documents; I had to write a custom parser + raw /v1/blocks/{id}/children calls to upload a design doc that contained 15 tables.

Environment

$ notion --version
notion version dev

$ notion block append --help | grep -i file
--file string   Read content from a file (each double-newline-separated section becomes a block)

Installed binary: ~/.local/bin/notion (10 MB, built as dev). Built from source via go install.

Repro 1 — tables are flattened

Input /tmp/t.md:

# heading

A paragraph.

| col1 | col2 |
|------|------|
| a    | b    |
| c    | d    |

- item one
- item two

Command:

notion block append <page-id> --file /tmp/t.md

Expected: 1 heading + 1 paragraph + 1 table block (with 3 rows) + 2 bulleted_list_item = 5 blocks.

Actual (inspected via /v1/blocks/{id}/children):

total: 8
types: Counter({'paragraph': 5, 'bulleted_list_item': 2, 'heading_1': 1})
  heading_1: 'heading'
  paragraph: 'A paragraph.'
  paragraph: '| col1 | col2 |'
  paragraph: '|------|------|'
  paragraph: '| a    | b    |'
  paragraph: '| c    | d    |'
  bulleted_list_item: 'item one'
  bulleted_list_item: 'item two'

Every table row ends up as a paragraph showing raw pipe characters. No table / table_row blocks are created.
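
For comparison, here is a rough sketch of the block I would expect for the table above. It assumes the `table` / `table_row` payload shape as I understand it from the Notion block reference (rows nested under `table.children`, each cell a list of rich-text objects). The helper names (`split_row`, `is_delimiter_row`, `text_cell`, `table_to_block`) are made up for illustration and are not part of the CLI.

```python
def split_row(line):
    """Split '| a | b |' into ['a', 'b']. Escaped pipes are not handled."""
    s = line.strip()
    if s.startswith("|"):
        s = s[1:]
    if s.endswith("|"):
        s = s[:-1]
    return [cell.strip() for cell in s.split("|")]


def is_delimiter_row(line):
    """True for the '|---|:--:|' row that separates header from body."""
    cells = split_row(line)
    return all("-" in c and set(c) <= set("-:") for c in cells)


def text_cell(content):
    """One table cell: a list holding a single plain-text rich_text object."""
    return [{"type": "text", "text": {"content": content}}]


def table_to_block(lines):
    """Convert the lines of one pipe table into a Notion-style `table` block."""
    if len(lines) < 2 or not is_delimiter_row(lines[1]):
        raise ValueError("not a pipe table: missing delimiter row")
    # Header row plus body rows; the delimiter row carries no content.
    rows = [split_row(lines[0])] + [split_row(l) for l in lines[2:]]
    width = len(rows[0])
    children = []
    for row in rows:
        # Pad or truncate so every row has exactly `width` cells.
        cells = (row + [""] * width)[:width]
        children.append({
            "object": "block",
            "type": "table_row",
            "table_row": {"cells": [text_cell(c) for c in cells]},
        })
    return {
        "object": "block",
        "type": "table",
        "table": {
            "table_width": width,
            "has_column_header": True,
            "has_row_header": False,
            "children": children,
        },
    }
```

Fed the four table lines from `/tmp/t.md`, this yields one `table` block with `table_width` 2 and three `table_row` children, which is where the "5 blocks" in the expected count comes from.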

Repro 2 — code language alias crashes the whole append

Input /tmp/t2.md:

# heading

Some prose.

```ts
const x = 1;
```

- item

**Command**:

notion block append <page-id> --file /tmp/t2.md


**Actual**:

Error: append block: validation_error: body failed validation: body.children[6].code.language should be "abap", "abc", ..., "typescript", ..., instead was "ts".
→ Check your input format. Use --debug for request details


**Nothing** gets appended (whole request is rejected atomically). Common aliases users write constantly in markdown (`ts`, `js`, `py`, `sh`, `yml`, `rb`, `rs`) all hit this.

The help text and skill docs suggest normalization already happens ("'ts' / 'sh' / 'yml' etc. normalized"), but in this build it does not.
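
The normalization I had in mind is small. The following is only a sketch: the alias table covers the aliases named in this issue, `KNOWN` is a deliberately partial stand-in for Notion's accepted language list (the real list is much longer and should come from the API reference), and `normalize_language` is a hypothetical name.

```python
import warnings

# Aliases commonly written after the opening fence, mapped to Notion's names.
ALIASES = {
    "ts": "typescript",
    "js": "javascript",
    "py": "python",
    "sh": "shell",
    "bash": "shell",
    "yml": "yaml",
    "rb": "ruby",
    "rs": "rust",
    "md": "markdown",
}

# ASSUMPTION: partial subset for illustration only, not Notion's full list.
KNOWN = set(ALIASES.values()) | {"plain text"}


def normalize_language(info_string):
    """Map a fence info string (e.g. 'ts') to a language name Notion accepts."""
    words = (info_string or "").strip().lower().split()
    if not words:
        return "plain text"  # fence without a language tag
    tag = words[0]  # ignore anything after the first word of the info string
    name = ALIASES.get(tag, tag)
    if name not in KNOWN:
        warnings.warn(f"unknown code language {tag!r}, using 'plain text'")
        return "plain text"
    return name
```

Falling back to `plain text` with a warning keeps one unrecognized tag from rejecting the whole request.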

## Impact

- `--file` is essentially unusable for typical technical markdown (design docs, READMEs) because tables and fenced code with language aliases are both extremely common.
- Users have to fall back to `notion api PATCH /v1/blocks/<id>/children` and implement their own markdown parser — which defeats the purpose of the flag.

## Suggested fix

1. Use a real markdown AST parser (e.g. `goldmark` with the table extension) instead of the "blank-line split" heuristic.
2. Map `table` nodes (a GFM extension, not part of core CommonMark) to Notion `table` + `table_row` blocks (set `table_width`, `has_column_header: true`).
3. Normalize language aliases to Notion's accepted set before sending the request:
   - `ts` → `typescript`, `js` → `javascript`, `py` → `python`, `sh`/`bash` → `shell`, `yml` → `yaml`, `rb` → `ruby`, `rs` → `rust`, `md` → `markdown`, empty → `plain text`, unknown → `plain text` (with a warning).
4. Auto-split code blocks that exceed the 2000-char rich-text limit (keep same language, consecutive blocks).
5. For consistency with `set-markdown` (mentioned in docs), consider sharing the same parser between `block append --file` and a future `page set-markdown`.
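
Item 4 could look roughly like the sketch below. It assumes the 2000-character limit is counted the way Python's `len` counts (if the API counts differently, for example in UTF-16 units, the limit would need adjusting), and `split_text` / `code_blocks` are hypothetical helper names.

```python
def split_text(text, limit=2000):
    """Split text into chunks of at most `limit` chars, preferring line breaks."""
    chunks = []
    while len(text) > limit:
        # Last newline that still leaves the chunk within the limit.
        cut = text.rfind("\n", 0, limit + 1)
        if cut <= 0:
            # No usable newline: hard cut in the middle of a line.
            chunks.append(text[:limit])
            text = text[limit:]
        else:
            # Cut at the newline and drop it; the block boundary replaces it.
            chunks.append(text[:cut])
            text = text[cut + 1:]
    if text or not chunks:
        chunks.append(text)
    return chunks


def code_blocks(source, language, limit=2000):
    """Emit consecutive `code` blocks with the same language, one per chunk."""
    return [
        {
            "object": "block",
            "type": "code",
            "code": {
                "language": language,
                "rich_text": [{"type": "text", "text": {"content": chunk}}],
            },
        }
        for chunk in split_text(source, limit)
    ]
```

Cutting at line boundaries keeps each resulting block readable on its own. The hard cut only happens when a single line is longer than the limit.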

## Workaround I used

Wrote a ~200-line Python parser that:
- detects fenced code, tables, headings, lists, blockquotes, hrules in source order
- emits a Notion `children` array with proper `table`/`table_row`/`code`/heading/list/quote blocks
- splits code blocks and inline text > 2000 chars
- normalizes language aliases
- uses `notion api PATCH /v1/blocks/{page_id}/children` in batches of 80
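
The batching step is the simplest part. A minimal sketch, assuming the request body is a JSON object with a `children` array (my understanding is that the API caps this at 100 blocks per request, hence the margin at 80); `batch_payloads` is a hypothetical name:

```python
import json


def batch_payloads(children, size=80):
    """Yield one JSON request body per batch of at most `size` blocks."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(children), size):
        # Slicing preserves source order within and across batches.
        yield json.dumps({"children": children[start:start + size]})
```

Each yielded string is the body for one `PATCH /v1/blocks/{page_id}/children` call. The batches have to be sent in order so the blocks land on the page in source order.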

Happy to contribute the markdown→blocks mapping logic as a Go port if useful.

## Additional notes

- `page set-markdown`, `page markdown`, `page property`, `page archive`, `comment update/delete`, `file get` (all mentioned in the v0.7 skill/README) are **not present** in my `dev` build — see companion issue on docs vs implementation sync.
