perf(syntax): replace HTML5NamedCharRefs table with browser-native textarea decoder #1

Merged
NullVoxPopuli-ai-agent merged 2 commits into merge-handlebars-parser-into-glimmer-syntax from perf/textarea-entity-decoding
Apr 20, 2026

@NullVoxPopuli-ai-agent
Owner

Summary

  • Drops the ~2200-entry HTML5NamedCharRefs lookup table from @glimmer/syntax in favour of a lazily-created <textarea> element for named entity decoding
  • The browser (or happy-dom / jsdom in SSR) natively knows every HTML5 entity — no bundled table needed
  • Removes simple-html-tokenizer as a runtime dependency of @glimmer/syntax, saving ~32 kB uncompressed from the browser bundle
  • Numeric refs (&#38; / &#x26;) are unchanged — already handled inline with String.fromCodePoint
  • A !text.includes('&') fast path skips the regex entirely for text nodes with no entities, which is the common case

Stacked on emberjs#21308. Addresses the entity-decoding discussion in emberjs#21308 (comment).

How it works

let _entityDecoder: HTMLTextAreaElement | undefined;

function decodeEntities(text: string): string {
  if (!text.includes('&')) return text;
  return text.replace(ENTITY_RE, (match, _body, hex, dec, _name) => {
    if (hex !== undefined) return String.fromCodePoint(parseInt(hex, 16));
    if (dec !== undefined) return String.fromCodePoint(parseInt(dec, 10));
    _entityDecoder ??= document.createElement('textarea');
    _entityDecoder.innerHTML = match; // e.g. "&amp;" → value === "&"
    return _entityDecoder.value;
  });
}

The <textarea> element uses the HTML RCDATA content model — it decodes character references but does not parse child elements, so there is no XSS surface. Unknown entities pass through as-is (matching the previous ?? match fallback). The element is created once and reused.
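
The snippet above references ENTITY_RE without showing it. The following is a minimal sketch of how the three branches could fit together, not the PR's actual code. It assumes a pattern whose capture groups line up with the callback's (match, _body, hex, dec, _name) parameters. The names ENTITY_RE_SKETCH, NamedDecoder, decodeEntitiesSketch and fakeDecoder are hypothetical, and the named-entity lookup is passed in as a function so the sketch runs without a DOM.

```typescript
// Hypothetical pattern: the PR does not show ENTITY_RE. Groups are ordered as
// (body, hex digits, decimal digits, name) to mirror the callback parameters.
const ENTITY_RE_SKETCH = /&(#[xX]([0-9a-fA-F]+)|#([0-9]+)|([a-zA-Z][a-zA-Z0-9]*));/g;

// Hypothetical hook standing in for the <textarea> lookup.
type NamedDecoder = (entity: string) => string;

function decodeEntitiesSketch(text: string, decodeNamed: NamedDecoder): string {
  // Fast path: text without '&' cannot contain a character reference.
  if (!text.includes('&')) return text;
  return text.replace(
    ENTITY_RE_SKETCH,
    (match: string, _body: string, hex?: string, dec?: string) => {
      // Numeric references are decoded inline, hexadecimal first.
      if (hex !== undefined) return String.fromCodePoint(parseInt(hex, 16));
      if (dec !== undefined) return String.fromCodePoint(parseInt(dec, 10));
      // Anything else is a named reference; delegate to the supplied decoder.
      return decodeNamed(match);
    }
  );
}

// Stand-in decoder that knows one entity and returns unknown ones unchanged.
const fakeDecoder: NamedDecoder = (entity) => (entity === '&amp;' ? '&' : entity);

console.log(decodeEntitiesSketch('a &#38; b &#x26; c &amp; d &bogus; e', fakeDecoder));
```

In a browser, decodeNamed could wrap the textarea assignment shown in the PR snippet. Injecting it here only keeps the example independent of document.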

🤖 Generated with Claude Code

NullVoxPopuli and others added 2 commits April 19, 2026 23:43
…xtarea decoder

Drops the ~2200-entry HTML5NamedCharRefs lookup table in favour of a
lazily-created <textarea> element for named entity decoding. The browser
(or happy-dom/jsdom in SSR) already knows every entity natively — no
bundled table needed.

Numeric refs (&#38; / &#x26;) are unaffected; they were already handled
inline with String.fromCodePoint. A fast path skips the regex entirely
for text nodes with no '&', which is the common case.

This removes simple-html-tokenizer as a runtime dependency of
@glimmer/syntax, saving ~32 kB uncompressed from the browser bundle.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>

Parallel update to the @glimmer-workspace/integration-tests suite.
The same three-file pattern as the syntax test update:

- invalid-html-test.ts: all 16 syntaxErrorFor calls → parseErrorFor
  with the full source template and Peggy-derived span length.
  Rewrote the inline expectedError helper to take (source, line, col,
  spanLength) instead of (code, line, col).

- syntax/general-errors-test.ts: 19 syntaxErrorFor calls → parseErrorFor
  (path restriction errors, block-params errors, unquoted attribute errors).

- syntax/named-blocks-test.ts: 2 parse-error syntaxErrorFor calls →
  parseErrorFor. Remaining calls in that file are semantic compiler errors
  and stay as syntaxErrorFor.

- integration-tests/index.ts: re-exports parseErrorFor alongside
  syntaxErrorFor so test files can import it from the integration-tests
  package.

Files left as syntaxErrorFor (semantic compiler errors, unaffected by
the Peggy format change):
  - if-unless-test.ts
  - yield-keywords-test.ts
  - argument-less-helper-paren-less-invoke-test.ts
  - strict-mode-test.ts
  - dynamic-modifiers-test.ts

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@NullVoxPopuli-ai-agent NullVoxPopuli-ai-agent merged commit 7bac568 into merge-handlebars-parser-into-glimmer-syntax Apr 20, 2026
14 of 21 checks passed