
print the diff between right result and wrong result #141

Merged: bb7133 merged 2 commits into pingcap:master from hawkingrei:show_diff on Dec 5, 2025

@hawkingrei (Member) commented Dec 4, 2025

When your result set changes significantly, it's not easy to directly discern the differences from the logs. I've modified the log format so that it can print the diff.

before

image

after the improvement

image
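
Since the before/after screenshots are not reproduced here, the idea of embedding a diff in the error text can be illustrated with a minimal sketch. It is not the PR's implementation: the PR calls diffmatchpatch from sergi/go-diff, whereas this stand-in uses a small standard-library line diff based on a longest-common-subsequence table. The names lineDiff and resultMismatchError are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
)

// lineDiff is a hypothetical stand-in for a diff library. It compares two
// texts line by line and marks lines as common ("  "), only in expected
// ("- "), or only in actual ("+ ").
func lineDiff(expected, actual string) string {
	a := strings.Split(expected, "\n")
	b := strings.Split(actual, "\n")

	// lcs[i][j] is the length of the longest common subsequence of a[i:] and b[j:].
	lcs := make([][]int, len(a)+1)
	for i := range lcs {
		lcs[i] = make([]int, len(b)+1)
	}
	for i := len(a) - 1; i >= 0; i-- {
		for j := len(b) - 1; j >= 0; j-- {
			switch {
			case a[i] == b[j]:
				lcs[i][j] = lcs[i+1][j+1] + 1
			case lcs[i+1][j] >= lcs[i][j+1]:
				lcs[i][j] = lcs[i+1][j]
			default:
				lcs[i][j] = lcs[i][j+1]
			}
		}
	}

	// Walk the table from the start, emitting one marked line per step.
	var out strings.Builder
	i, j := 0, 0
	for i < len(a) && j < len(b) {
		switch {
		case a[i] == b[j]:
			out.WriteString("  " + a[i] + "\n")
			i++
			j++
		case lcs[i+1][j] >= lcs[i][j+1]:
			out.WriteString("- " + a[i] + "\n")
			i++
		default:
			out.WriteString("+ " + b[j] + "\n")
			j++
		}
	}
	for ; i < len(a); i++ {
		out.WriteString("- " + a[i] + "\n")
	}
	for ; j < len(b); j++ {
		out.WriteString("+ " + b[j] + "\n")
	}
	return out.String()
}

// resultMismatchError is a hypothetical error type that carries both result sets.
type resultMismatchError struct {
	expected, actual string
}

// Error renders the mismatch as a diff instead of dumping both results in full.
func (e *resultMismatchError) Error() string {
	return "result mismatch (- expected, + actual):\n" + lineDiff(e.expected, e.actual)
}

func main() {
	err := &resultMismatchError{
		expected: "id\tname\n1\talice\n2\tbob",
		actual:   "id\tname\n1\talice\n2\tcarol",
	}
	fmt.Println(err)
}
```

With a large result set, only the marked lines need to be read to find what changed, which is the readability gain the description is after.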

Signed-off-by: Weizhen Wang <wangweizhen@pingcap.com>
Signed-off-by: Weizhen Wang <wangweizhen@pingcap.com>
@YangKeao (Member) left a comment

LGTM


func (e *WrongResultError) Error() string {
    diff := diffmatchpatch.New()
    diffText := diff.DiffPrettyText(diff.DiffMain(e.actual, e.expected, false))
What's the effect of checklines = false?

Member Author

Actually, I gave it a try, but I couldn't see any difference.

Copilot:

In sergi/go-diff, the checklines parameter controls whether the diff algorithm first runs a fast, line-level diff and then refines the changed regions with a more precise, character-level diff.

Specifically:

  • DiffMain(text1, text2, checklines bool) passes checklines down to diffCompute:
    func (dmp *DiffMatchPatch) diffCompute(text1, text2 []rune, checklines bool, deadline time.Time) []Diff {
        …
        } else if checklines && len(text1) > 100 && len(text2) > 100 {
            return dmp.diffLineMode(text1, text2, deadline)
        }
        return dmp.diffBisect(text1, text2, deadline)
    }
  • When checklines == true and both texts are longer than 100 runes, it uses diffLineMode, which:
    • Does a quick diff at line granularity (treating each line as a token), then
    • “Rediffs” the changed line blocks at character level for better accuracy.
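
The "each line as a token" step can be sketched as follows. This is a simplified illustration under stated assumptions, not the library's code: the hypothetical helpers encodeLines and decodeLines map every distinct line to one rune, so that any character-level diff run over the encoded text is effectively a line-level diff.

```go
package main

import (
	"fmt"
	"strings"
)

// encodeLines (hypothetical helper) replaces every distinct line of the two
// texts with a single rune. Slot 0 of the table is left unused so that no
// line is encoded as rune 0.
func encodeLines(text1, text2 string) (enc1, enc2 []rune, table []string) {
	table = []string{""}
	index := map[string]rune{}

	encode := func(text string) []rune {
		var out []rune
		for _, line := range strings.SplitAfter(text, "\n") {
			if line == "" {
				continue // SplitAfter yields a trailing "" when text ends in "\n"
			}
			r, seen := index[line]
			if !seen {
				r = rune(len(table))
				index[line] = r
				table = append(table, line)
			}
			out = append(out, r)
		}
		return out
	}

	enc1 = encode(text1)
	enc2 = encode(text2)
	return enc1, enc2, table
}

// decodeLines (hypothetical helper) turns encoded runes back into the original lines.
func decodeLines(enc []rune, table []string) string {
	var sb strings.Builder
	for _, r := range enc {
		sb.WriteString(table[r])
	}
	return sb.String()
}

func main() {
	enc1, enc2, table := encodeLines("a\nb\nc\n", "a\nx\nc\n")
	fmt.Println(enc1, enc2)                // the two texts differ in one token only
	fmt.Printf("%q\n", decodeLines(enc2, table)) // round-trips to the second text
}
```

Because a changed line becomes a single differing token, the first pass is cheap; the character-level re-diff is then only needed inside the blocks that the token diff marks as changed.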

Effects in practice

  • checklines = false

    • Never enters line mode: unless one of the earlier branches elided above returns first, it goes straight to diffBisect (a Myers-style character-level diff).
    • Slower on large inputs, but tends to give minimal/optimal character-level diffs.
  • checklines = true

    • For long texts: runs the faster line-mode pre-pass, then refines.
    • This usually gives a significant speedup on big inputs, at the cost of possibly non-minimal diffs (more or differently grouped hunks than the theoretical minimum).
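
The guard quoted earlier also suggests why toggling the flag may show no visible change on small inputs: line mode is only considered when both texts exceed 100 runes. A minimal sketch of that condition alone, with hypothetical helper names lineModeEligible and sampleRows, and ignoring the earlier branches that the quoted snippet elides:

```go
package main

import (
	"fmt"
	"strings"
	"unicode/utf8"
)

// lineModeEligible (hypothetical) mirrors only the guard quoted in this
// thread: checklines must be set and both texts must be longer than 100 runes.
// It does not model the branches elided from the quoted diffCompute snippet.
func lineModeEligible(text1, text2 string, checklines bool) bool {
	return checklines &&
		utf8.RuneCountInString(text1) > 100 &&
		utf8.RuneCountInString(text2) > 100
}

// sampleRows (hypothetical) builds n tab-separated rows to act as a result set.
func sampleRows(n int) string {
	var sb strings.Builder
	for i := 0; i < n; i++ {
		fmt.Fprintf(&sb, "%d\trow-%d\n", i, i)
	}
	return sb.String()
}

func main() {
	small, large := sampleRows(2), sampleRows(50)
	fmt.Println(lineModeEligible(small, large, true))  // false: one text is too short
	fmt.Println(lineModeEligible(large, large, true))  // true: both texts exceed 100 runes
	fmt.Println(lineModeEligible(large, large, false)) // false: checklines is off
}
```

Under this reading, short expected/actual outputs would take the same path whether checklines is true or false, so identical diffs in a quick trial are not surprising.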

The tests (TestDiffMainWithCheckLines) confirm that:

  • For most cases, results with and without checklines match exactly.
  • It’s explicitly documented that the speedup “can produce non-minimal diffs” (comment on diffLineMode), and there’s a TODO about a failing test case, highlighting that behavior can differ.

@bb7133 (Member) left a comment

LGTM

@bb7133 bb7133 merged commit 12f3756 into pingcap:master Dec 5, 2025
2 checks passed
dveeden pushed a commit to dveeden/mysql-tester that referenced this pull request May 22, 2026