Re-key reverse symbol map by (nucleus, type)#201
Conversation
textToLatexSymbolNames is now a two-level dict keyed by nucleus and then boxed MTMathAtomType. latexSymbolNameForAtom: retries the (nucleus, Bin) cell when (nucleus, Un) misses, since MTMathList.finalized reclassifies leading Bin atoms to Un. This unblocks future PRs that register type-distinct symbols at the same nucleus. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
f' parses as f^{\prime}, y'' as y^{\prime\prime}, etc. Mirrors the
existing ^ branch in -buildInternal:stopChar:: allocates an empty Ord
when there is no host atom or the host already has a superscript,
greedily collects consecutive primes, and merges with a trailing ^
(TeX \futurelet behavior). +atomForCharacter:'s rejection of '
is preserved so +mathListForCharacters: (used by
+fractionWithNumeratorStr:) keeps its no-LaTeX semantics.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Registers \nleq, \ngeq, \nless, \ngtr, \nsubseteq, \nsupseteq, \nmid, \nparallel and 23 more negated-arrow / negated-relation symbols as kMTMathAtomRelation. Symbol-table only; no parser or layout changes. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
\boxplus, \boxminus, \boxtimes, \boxdot, \circledast, \circledcirc, \circleddash, \barwedge, \veebar, \triangleleft, \triangleright. All registered as kMTMathAtomBinaryOperator. Round-trip tests verify the Bin → Un retry in +latexSymbolNameForAtom: handles the start-of-list case. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
\rightleftharpoons, \leftrightharpoons, \upharpoonleft, \upharpoonright, \downharpoonleft, \downharpoonright, \rightharpoonup, \leftharpoonup, \rightharpoondown, \leftharpoondown, \hookleftarrow, \hookrightarrow, \twoheadleftarrow, \twoheadrightarrow, \rightarrowtail, \leftarrowtail registered as kMTMathAtomRelation. \restriction registered as an alias for \upharpoonright. Bundled in one commit so the alias is never broken on a transient build. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…p \square Adds 17 missing relations (\vdash, \dashv, \Subset, \Supset, \backsim, \backsimeq, \eqsim, \Bumpeq, \bumpeq, \therefore, \because, \multimap, \vartriangleleft, \vartriangleright, \trianglelefteq, \trianglerighteq, \triangleq) and 17 missing ordinaries (\complement, \Box, \Diamond, \lozenge, \blacklozenge, the four card suits, the Hebrew letters, and the triangle family). Removes the \square → +placeholder mis-wiring and registers \Box as the canonical Ord/U+25A1 entry. \square now resolves via an alias to \Box. +placeholder is unchanged. Adds the remaining Feature-6 aliases: \implies, \impliedby, \dotsc, \dotsb, \dotsm, \dotsi, \square, \vartriangle. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
kostub
left a comment
There was a problem hiding this comment.
Code Review — PR #201 (Re-key reverse symbol map by (nucleus, type))
All 169 tests pass (swift test). The core idea is sound and the implementation is clean. A few observations below, ranging from a correctness concern to nits.
Correctness: nsucceq / npreceq codepoints may be wrong
In iosMath/lib/MTMathAtomFactory.m:
@"nsucceq" : [MTMathAtom atomWithType:kMTMathAtomRelation value:@"⪱"],
@"npreceq" : [MTMathAtom atomWithType:kMTMathAtomRelation value:@"⪰"],The standard amssymb assignments for \npreceq / \nsucceq are typically U+22E0 (DOES NOT PRECEDE OR EQUAL, ⋠) and U+22E1 (DOES NOT SUCCEED OR EQUAL, ⋡). U+2AB0 and U+2AB1 are ⪰ / ⪱ (PRECEDES/SUCCEEDS ABOVE SINGLE-LINE NOT EQUAL TO) — which is what textsf{stmaryrd} uses. Most LaTeX distributions ship \npreceq = ⋠ (U+22E0) and \nsucceq = ⋡ (U+22E1). Worth double-checking against the specific font's glyph mapping before merging.
Design: Un→Bin fallback is one-directional only
In latexSymbolNameForAtom:, the fallback goes Un→Bin. But finalized also converts a trailing Bin to Un. If any downstream caller ever passes a post-finalized atom that was trailing (not leading), the same problem applies. The comment in the code mentions "leading/orphan", which covers the known cases today — but the asymmetry between what finalized can reclassify and what the fallback handles is worth noting as a known limitation if future atoms are added.
addLatexSymbol: overwrites without respect to length preference
The new textToLatexSymbolNames builder respects the "shorter/alphabetical" tie-break. addLatexSymbol: (the public API path) does not — it blindly overwrites inner[typeKey] with the new name if a prior entry already exists. This is the same behaviour as before the re-key, so not a regression, but it's now more visible since the inner dict is directly written. Could be worth adding the same guard here for consistency:
NSString* existing = inner[@(atom.type)];
if (!existing || name.length < existing.length ||
(name.length == existing.length && [name compare:existing] == NSOrderedAscending)) {
inner[@(atom.type)] = name;
}Nit: \square removed from the symbol table, not just re-routed
The diff removes @"square" from supportedLatexSymbols and adds it only to aliases. This means \square no longer appears in +supportedLatexSymbolNames, which changes the observable API contract slightly (a caller enumerating that list to populate a keyboard palette would lose square). That may be intentional — testSquareBoxParity confirms round-trips still work — but it is a behaviour change worth calling out explicitly in the PR description.
Nit: dotsb / dotsm both alias \cdots; dotsc / dotsi both alias \ldots
This is correct per the amsmath spec (\dotsb = binary-operator dots = \cdots, \dotsc = comma-separated dots = \ldots). No issue here — just confirming it's intentional rather than a copy-paste error.
Minor: testReverseMapTypeKeyedRoundTrip — \pm trailing space in expected value
@[ @"\\pm", @"\\pm " ],The trailing space in @"\\pm " is the correct output of mathListToString: (the serializer appends a space after every command name). This is expected and correct — just noting it for reviewers who might wonder if it's a typo.
Overall this is well-structured work. The main item to verify before merging is the nsucceq/npreceq codepoint question.
|
/review code |
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request expands iosMath's LaTeX compatibility by adding 91 new symbols, implementing prime shorthand in the parser, and refactoring the reverse symbol map to preserve atom types during round-trip serialization. Review feedback identifies several issues, including inconsistent tie-breaking logic in the runtime symbol registration API, incorrect Unicode codepoints for \nsucceq and \npreceq, and an incorrect alias for \dotsi. Additionally, \square is redundantly defined as both a canonical symbol and an alias, leading to incorrect expectations in the test suite.
| if (!existing || name.length < existing.length || | ||
| (name.length == existing.length && [name compare:existing] == NSOrderedAscending)) { | ||
| inner[typeKey] = name; | ||
| } |
There was a problem hiding this comment.
The tie-breaking logic here is inconsistent with the logic in textToLatexSymbolNames. If an existing symbol with the same name is re-added, this implementation will not update it, whereas textToLatexSymbolNames would. This can lead to unexpected behavior for clients using this runtime API.
The comparison should be != NSOrderedDescending to include NSOrderedSame and be consistent with the logic for building the initial reverse map.
if (!existing || name.length < existing.length ||
(name.length == existing.length && [name compare:existing] != NSOrderedDescending)) {
inner[typeKey] = name;
}| @"nsucceq" : [MTMathAtom atomWithType:kMTMathAtomRelation value:@"\u22E1"], | ||
| @"npreceq" : [MTMathAtom atomWithType:kMTMathAtomRelation value:@"\u22E0"], |
There was a problem hiding this comment.
The Unicode characters for \nsucceq and \npreceq are incorrect. According to the LLD and standard LaTeX, they should be U+2AB1 and U+2AB0 respectively. The current implementation uses U+22E1 and U+22E0, which correspond to \nsqsupseteq and \nsqsubseteq.
@"nsucceq" : [MTMathAtom atomWithType:kMTMathAtomRelation value:@"\u2AB1"],
@"npreceq" : [MTMathAtom atomWithType:kMTMathAtomRelation value:@"\u2AB0"],| @"dotsc" : @"ldots", | ||
| @"dotsb" : @"cdots", | ||
| @"dotsm" : @"cdots", | ||
| @"dotsi" : @"cdots", |
| // Missing ordinaries (logic / suits / Hebrew letters / amssymb) | ||
| @"complement" : [MTMathAtom atomWithType:kMTMathAtomOrdinary value:@"\u2201"], | ||
| @"Box" : [MTMathAtom atomWithType:kMTMathAtomOrdinary value:@"\u25A1"], | ||
| @"square" : [MTMathAtom atomWithType:kMTMathAtomOrdinary value:@"\u25A1"], |
| @[ @"nsucceq", @0x22E1 ], | ||
| @[ @"npreceq", @0x22E0 ], |
There was a problem hiding this comment.
| @[ @"dotsc", @"ldots", @"…", @"\\ldots " ], | ||
| @[ @"dotsb", @"cdots", @"⋯", @"\\cdots " ], | ||
| @[ @"dotsm", @"cdots", @"⋯", @"\\cdots " ], | ||
| @[ @"dotsi", @"cdots", @"⋯", @"\\cdots " ], |
|
|
||
| XCTAssertEqualObjects([MTMathListBuilder mathListToString:a], @"\\Box "); | ||
| XCTAssertEqualObjects([MTMathListBuilder mathListToString:b], @"\\Box "); | ||
| XCTAssertTrue([[MTMathAtomFactory supportedLatexSymbolNames] containsObject:@"square"]); |
There was a problem hiding this comment.
Summary
+textToLatexSymbolNamesfromnucleus → commandtonucleus → (type → command)+latexSymbolNameForAtom:soMTMathList.finalized's leading-Bin → Un reclassification doesn't break round-trip+addLatexSymbol:value:to write to the nested dicttestReverseMapTypeKeyedRoundTripcovering\\pm,\\cdot,\\leq,\\to,\\alphaand confirms behavior with and without finalize-induced Bin → Un transitionsNo new symbols and no public API changes. De-risks PRs that will add type-distinct entries at shared nuclei (triangle family, future
\\bigtriangleup).Test plan
swift teststill passestestReverseMapTypeKeyedRoundTripcovers Bin atoms at start-of-list and mid-list🤖 Generated with Claude Code