perf(syntax): replace HTML5NamedCharRefs table with browser-native textarea decoder #1
Merged
NullVoxPopuli-ai-agent merged 2 commits into merge-handlebars-parser-into-glimmer-syntax on Apr 20, 2026
…xtarea decoder

Drops the ~2200-entry HTML5NamedCharRefs lookup table in favour of a lazily-created <textarea> element for named entity decoding. The browser (or happy-dom/jsdom in SSR) already knows every entity natively, so no bundled table is needed.

Numeric refs (e.g. &#38;) are unaffected; they were already handled inline with String.fromCodePoint. A fast path skips the regex entirely for text nodes with no '&', which is the common case.

This removes simple-html-tokenizer as a runtime dependency of @glimmer/syntax, saving ~32 kB uncompressed from the browser bundle.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
Parallel update to the @glimmer-workspace/integration-tests suite. The same three-file pattern as the syntax test update:

- invalid-html-test.ts: all 16 syntaxErrorFor calls → parseErrorFor with the full source template and Peggy-derived span length. Rewrote the inline expectedError helper to take (source, line, col, spanLength) instead of (code, line, col).
- syntax/general-errors-test.ts: 19 syntaxErrorFor calls → parseErrorFor (path restriction errors, block-params errors, unquoted attribute errors).
- syntax/named-blocks-test.ts: 2 parse-error syntaxErrorFor calls → parseErrorFor. Remaining calls in that file are semantic compiler errors and stay as syntaxErrorFor.
- integration-tests/index.ts: re-exports parseErrorFor alongside syntaxErrorFor so test files can import it from the integration-tests package.

Files left as syntaxErrorFor (semantic compiler errors, unaffected by the Peggy format change):

- if-unless-test.ts
- yield-keywords-test.ts
- argument-less-helper-paren-less-invoke-test.ts
- strict-mode-test.ts
- dynamic-modifiers-test.ts

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
7bac568 into merge-handlebars-parser-into-glimmer-syntax
14 of 21 checks passed
Summary
- Drops the ~2200-entry HTML5NamedCharRefs lookup table from @glimmer/syntax in favour of a lazily-created <textarea> element for named entity decoding
- Removes simple-html-tokenizer as a runtime dependency of @glimmer/syntax, saving ~32 kB uncompressed from the browser bundle
- Numeric refs (e.g. &#38;) are unchanged; they were already handled inline with String.fromCodePoint
- A !text.includes('&') fast path skips the regex entirely for text nodes with no entities, which is the common case

Stacked on emberjs#21308. Addresses the entity-decoding discussion in emberjs#21308 (comment).
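As a rough illustration of the numeric-ref handling and the fast path described in the summary, here is a minimal sketch. The function name `decodeNumericRefs` and the regex are assumptions made for this example, not the code in the PR, and it skips the HTML spec's special cases (for instance remapping `&#0;`).

```typescript
// Hypothetical sketch: the name and regex are illustrative, not taken from @glimmer/syntax.
const NUMERIC_REF = /&#(?:[xX]([0-9a-fA-F]+)|(\d+));/g;

function decodeNumericRefs(text: string): string {
  // Fast path: most text nodes contain no '&', so skip the regex entirely.
  if (!text.includes('&')) return text;

  return text.replace(NUMERIC_REF, (match: string, hex?: string, dec?: string) => {
    const codePoint = hex !== undefined ? parseInt(hex, 16) : parseInt(dec ?? '', 10);
    // String.fromCodePoint throws a RangeError above U+10FFFF,
    // so out-of-range references are left untouched.
    return codePoint <= 0x10ffff ? String.fromCodePoint(codePoint) : match;
  });
}
```

Named references such as `&amp;` do not match this pattern and fall through unchanged; in the PR those are the ones handed to the `<textarea>` decoder.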
How it works
The <textarea> element uses the HTML RCDATA content model: it decodes character references but does not parse child elements, so there is no XSS surface. Unknown entities pass through as-is (matching the previous ?? match fallback). The element is created once and reused.

🤖 Generated with Claude Code
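The create-once-and-reuse behaviour can be sketched roughly as below. This is not the PR's implementation: `TextareaLike` and `createNamedEntityDecoder` are hypothetical names, and the element factory is passed in as a parameter purely so the sketch does not depend on a global `document`.

```typescript
// Hypothetical sketch: names are illustrative, not taken from @glimmer/syntax.
interface TextareaLike {
  innerHTML: string;
  readonly value: string;
}

function createNamedEntityDecoder(
  makeTextarea: () => TextareaLike
): (entity: string) => string {
  // Created on first use, then reused for every later call.
  let textarea: TextareaLike | undefined;

  return (entity: string): string => {
    if (textarea === undefined) {
      textarea = makeTextarea();
    }
    // In a real <textarea>, assigning innerHTML decodes character references
    // without turning markup into child elements (RCDATA).
    textarea.innerHTML = entity;
    // Reading value returns the decoded text; an unknown entity comes back unchanged.
    return textarea.value;
  };
}
```

In a browser, or under happy-dom/jsdom, the factory would presumably be `() => document.createElement('textarea')`, so that `decode('&amp;')` yields `'&'`. Nothing is created until the first named entity is actually encountered.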