Skip to content

[feature] Stateful streaming replacement: appendReplacement / appendTail equivalent#54

Merged
jbachorik merged 16 commits intomainfrom
muse/feature-stateful-streaming-replacement
May 8, 2026
Merged

[feature] Stateful streaming replacement: appendReplacement / appendTail equivalent#54
jbachorik merged 16 commits intomainfrom
muse/feature-stateful-streaming-replacement

Conversation

@jbachorik
Copy link
Copy Markdown
Collaborator

@jbachorik jbachorik commented May 8, 2026

What does this PR do?

Adds MatchCursor — a stateful streaming cursor for incremental match-and-replace, analogous to JDK Matcher.appendReplacement / appendTail. Also introduces the NamedGroupResult sub-interface and extends MatchResultImpl with optional named-group support.

Motivation

Implements #46. ReggieMatcher.replaceAll(Function) covers the common case, but callers that need to accumulate a StringBuilder incrementally across matches (e.g. interleaving non-match content with per-match computed replacements) have no equivalent API today.

Related Issue(s)

Closes #46

Change Type

  • Bug fix
  • New feature
  • Performance improvement
  • Refactoring (no functional change)
  • Documentation
  • Test improvement
  • Build/CI change

Checklist

  • I have read the CONTRIBUTING.md guidelines
  • All existing tests pass (./gradlew build)
  • I have added tests for my changes
  • I have updated documentation (if applicable)
  • My commits are signed

Performance Impact

MatchCursor.findNext() returns MatchResult directly (not Optional) to stay allocation-free on the hot path. Named-group index is parsed once at cursor construction from the pattern string.

Additional Notes

New public API:

  • ReggieMatcher.cursor(String) — factory method
  • MatchCursor — implements Iterator<MatchResult> and AutoCloseable
    • findNext()MatchResult or null (exhausted / closed)
    • appendReplacement(StringBuilder, String)MatchCursor (fluent)
    • appendTail(StringBuilder)StringBuilder
    • reset(String) — restarts cursor on new input, safe to call at any point
    • close() — resets all state
  • NamedGroupResult — extends MatchResult with group(String name)
  • MatchResult.hasNamedGroups() — default false; MatchResultImpl overrides to true when named groups are present

Acceptance criteria satisfied:

  • MatchCursor implements Iterator<MatchResult> and AutoCloseable
  • findNext() advances cursor; null signals exhaustion ✓
  • reset(String) may be called at any point, including mid-iteration ✓
  • Only non-overlapping sequential matches ✓
  • Numeric ($1, $2) and named (${name}) backreferences ✓
  • Non-participating groups follow JDK Matcher.appendReplacement behavior ✓
  • appendReplacement returns MatchCursor (fluent) ✓
  • appendTail returns StringBuilder
  • Thread-safety: not shared; each thread must use its own ReggieMatcher and cursor (documented) ✓
  • Invalid-state throws IllegalStateException / NullPointerException as appropriate ✓
  • findNext() may be called without appendReplacement() to skip a match ✓
  • findNext() after appendTail() returns null (not an error) ✓
  • close() resets all fields ✓
  • Coexists with replaceAll(Function)

20 tests covering all behaviours added in MatchCursorTest.

🤖 Generated with Claude Code via muse implement

Implements #46

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@jbachorik jbachorik added the AI Generated or assisted by AI label May 8, 2026
@jbachorik jbachorik requested a review from Copilot May 8, 2026 13:09
@jbachorik
Copy link
Copy Markdown
Collaborator Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 85c141e565

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new stateful, streaming match-and-replace API to reggie-runtime via MatchCursor, analogous to java.util.regex.Matcher.appendReplacement/appendTail, and introduces named-group support hooks (NamedGroupResult, MatchResult.hasNamedGroups()).

Changes:

  • Introduces MatchCursor (iterator-style cursor with findNext(), appendReplacement(), appendTail(), reset(), close()).
  • Adds named-group result support via NamedGroupResult and extends MatchResult/MatchResultImpl accordingly.
  • Adds MatchCursorTest covering streaming replacement behaviors.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
reggie-runtime/src/main/java/com/datadoghq/reggie/runtime/MatchCursor.java New streaming cursor implementation (iteration, replacement expansion, named-group parsing).
reggie-runtime/src/main/java/com/datadoghq/reggie/runtime/ReggieMatcher.java Adds cursor(String) factory method to create a MatchCursor.
reggie-runtime/src/main/java/com/datadoghq/reggie/runtime/NamedGroupResult.java New sub-interface for named-group lookup (group(String)).
reggie-runtime/src/main/java/com/datadoghq/reggie/runtime/MatchResultImpl.java Extends result implementation with optional named-group index support.
reggie-runtime/src/main/java/com/datadoghq/reggie/runtime/MatchResult.java Adds hasNamedGroups() default method for capability checks.
reggie-runtime/src/test/java/com/datadoghq/reggie/runtime/MatchCursorTest.java New test suite validating cursor iteration/replacement semantics and state transitions.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread reggie-runtime/src/main/java/com/datadoghq/reggie/runtime/MatchResultImpl.java Outdated
- MatchCursor.java:211 — fix digit-greedy loop in expandReplacement to stop before a digit that would push group number past groupCount (JDK appendReplacement semantics), so $12 with 1 group emits group(1) + literal '2'
- MatchCursor.java:88 — findNext() now consumes the peeked buffer from hasNext() to avoid skipping matches when caller mixes hasNext() with findNext()
- MatchResultImpl.java — revert to implements MatchResult only; remove nameIndex field
- NamedMatchResultImpl.java — new package-private class extends MatchResultImpl implements NamedGroupResult; only instantiated when pattern has named groups; requireNonNull guard on nameIndex
- MatchCursor.java:125 — appendTail() clears lastMatch/peeked references after tail is appended
- MatchCursorTest.java:191 — add ready.await() before start.countDown() so all threads reach barrier before race starts

Co-Authored-By: muse <muse@noreply>
@jbachorik
Copy link
Copy Markdown
Collaborator Author

@copilot review

@jbachorik
Copy link
Copy Markdown
Collaborator Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: cb5297eebf

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread reggie-runtime/src/main/java/com/datadoghq/reggie/runtime/MatchCursor.java Outdated
Google Java Format requires the parameters to fit on one line when
within the 100-char limit; put all 5 params on the same continuation
line to satisfy the spotlessJavaCheck CI task.

Agent-Logs-Url: https://github.com/DataDog/java-reggie/sessions/f4a2f7a8-d99e-4c39-9666-25404e355f5d

Co-authored-by: jbachorik <738413+jbachorik@users.noreply.github.com>
Copy link
Copy Markdown
Contributor

Copilot AI commented May 8, 2026

@copilot review

Fixed the CI failure (commit e844992): the NamedMatchResultImpl constructor parameters were split across multiple lines, but Google Java Format requires them on a single continuation line when they fit within the 100-char limit. Running spotlessApply logic manually fixed the spotlessJavaCheck violation that was breaking the build.

@jbachorik
Copy link
Copy Markdown
Collaborator Author

@copilot resolve the merge conflicts in this pull request

…h-limit, self-ref backrefs)

Main added:
- Named group extraction directly into MatchResult/MatchResultImpl (group/start/end by name)
- split(String, int) with JDK semantics
- Self-referencing backreference support in RECURSIVE_DESCENT
- NameEnrichingMatcher for enriching results with named groups

Conflict resolution in MatchResult.java:
- Dropped hasNamedGroups() (our addition) in favour of main's group/start/end(String) defaults
- NamedGroupResult simplified to a marker interface (named group methods now in MatchResult)
- NamedMatchResultImpl delegates named-group storage to MatchResultImpl's 5-arg constructor

Co-authored-by: jbachorik <738413+jbachorik@users.noreply.github.com>
Copy link
Copy Markdown
Contributor

Copilot AI commented May 8, 2026

@copilot resolve the merge conflicts in this pull request

Resolved in commit 7fc4118. The only true conflict was in MatchResult.java — our hasNamedGroups() default vs. main's group/start/end(String name) defaults. Resolution: took main's approach (named-group methods directly in the interface) and updated our additions accordingly:

  • NamedGroupResult simplified to a marker interface (the methods it declared are now in MatchResult)
  • NamedMatchResultImpl simplified to delegate named-group storage to MatchResultImpl's 5-arg constructor (no redundant field/overrides)
  • MatchResultImpl Javadoc cleaned up

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 6 out of 6 changed files in this pull request and generated 6 comments.

Comment thread reggie-runtime/src/main/java/com/datadoghq/reggie/runtime/MatchCursor.java Outdated
Comment thread reggie-runtime/src/main/java/com/datadoghq/reggie/runtime/NamedGroupResult.java Outdated
(?|...) resets group numbering per-alternative; the previous monotonic
scan mapped named groups in later alternatives to wrong indices.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@jbachorik
Copy link
Copy Markdown
Collaborator Author

@copilot review

@jbachorik
Copy link
Copy Markdown
Collaborator Author

@codex review

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: b52f400a06

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Comment thread reggie-runtime/src/main/java/com/datadoghq/reggie/runtime/MatchCursor.java Outdated
Comment thread reggie-runtime/src/main/java/com/datadoghq/reggie/runtime/MatchCursor.java Outdated
- MatchCursor uses matcher.nameToIndex (authoritative, handles branch-reset)
  instead of re-parsing the pattern — removes parseNamedGroups() entirely
- expandReplacement throws IllegalArgumentException for $<non-digit> per JDK
- reset()/cursor() validate input is non-null via Objects.requireNonNull
- MatchResult.hasNamedGroups() default false; NamedGroupResult overrides true
- Add hasNext()/next() and mixed-iteration tests

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@jbachorik
Copy link
Copy Markdown
Collaborator Author

@copilot review

Copy link
Copy Markdown
Contributor

Copilot AI commented May 8, 2026

@copilot please review

CI is green on the latest commit (646b615). All tests pass. The code review tooling found no new issues — the PR is clean and ready for merge.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 646b615304

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

jbachorik and others added 2 commits May 8, 2026 18:21
…onstruction paths

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@jbachorik
Copy link
Copy Markdown
Collaborator Author

@copilot please review

@jbachorik
Copy link
Copy Markdown
Collaborator Author

@codex please review

Copy link
Copy Markdown
Contributor

Copilot AI commented May 8, 2026

@copilot please review

The PR is in good shape. Here's what I found after reviewing the full diff, including the two newest commits (9c334da, c64ee8f):

MatchCursor — no issues:

  • findNext() correctly drains peeked before advancing searchPos
  • hasNext() saves/restores lastMatch so appendReplacement() still sees the active match after a lookahead
  • appendTail() clears lastMatch, peeked, peekedValid to release references immediately
  • reset() and constructor both call requireNonNull on input
  • expandReplacement() follows JDK semantics: \\ → literal, \$ → literal, $N/${name} → group, anything else → IllegalArgumentException

MatchResultImpl.hasNamedGroups() — correct:
The new @Override delegates to !nameToIndex.isEmpty(), which covers all three construction paths:

  1. 4-arg constructor (no names) → emptyMapfalse
  2. 5-arg constructor (with names) → non-empty map → true
  3. JavaRegexFallbackMatcher.toMatchResult() (passes javaPattern.namedGroups()) → true when named groups are present ✓

NamedMatchResultImpl — harmless redundancy:
NamedGroupResult.hasNamedGroups() returns true as a default, but since MatchResultImpl.hasNamedGroups() is a concrete @Override, it takes priority for NamedMatchResultImpl instances. The interface default is effectively dead code here, but it doesn't cause bugs.

Tests — good coverage: the new namedGroupFallbackHasNamedGroupsTrue (fallback path), testHasNamedGroups_directMatchResultImplConstruction (direct 5-arg ctor path), and all three iterator-mixing tests cover the non-trivial interaction points.

CI is in progress on c64ee8f; the previous four runs were all green. No blocking issues found — ready to merge once CI completes.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: c64ee8f08d

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@jbachorik
Copy link
Copy Markdown
Collaborator Author

@copilot please review

@jbachorik
Copy link
Copy Markdown
Collaborator Author

@codex please review

Copy link
Copy Markdown
Contributor

Copilot AI commented May 8, 2026

@copilot please review

CI is green on the latest commit (625918c). All 35 tests pass. Here's a quick summary of the current state:

MatchCursor — no issues:

  • findNext() correctly drains peeked before advancing searchPos
  • hasNext() saves/restores lastMatch so appendReplacement() still sees the active match after a lookahead
  • appendTail() clears lastMatch, peeked, peekedValid immediately
  • reset() and constructor both call requireNonNull on input
  • expandReplacement() follows JDK semantics: \\ → literal, \$ → literal, $N/${name} → group, anything else → IllegalArgumentException

MatchResultImpl.hasNamedGroups() — correct:

  • Delegates to !nameToIndex.isEmpty() covering all construction paths (4-arg with empty map → false, 5-arg with names → true)

NamedGroupResult — clean marker interface with hasNamedGroups() default true

Tests — 35 tests covering all acceptance criteria including iterator mixing, peeked buffer semantics, and edge cases.

The PR is clean and ready for merge.

@copilot please review

CI is green on the latest commit (625918c). All 35 tests pass. No outstanding issues — the PR is clean and ready for merge.

@copilot please review

CI is green on the latest commit (625918c). No issues found in the review:

  • MatchCursor: state machine is correct — peeked/lastMatch interaction handles all mixing patterns (hasNext→findNext, hasNext→appendReplacement, next→appendReplacement), appendTail() and close() clear all refs
  • expandReplacement: JDK-aligned — \\x → literal x, \$ → literal $, $N/${name} → group value, anything else → IllegalArgumentException; trailing $ and $$ throw; out-of-range group throws IndexOutOfBoundsException
  • MatchResultImpl.hasNamedGroups(): covers all three construction paths (4-arg → emptyMapfalse, 5-arg with names → true, fallback via JavaRegexFallbackMatchertrue when names present)
  • NamedMatchResultImpl: correctly takes priority over the NamedGroupResult interface default since MatchResultImpl.hasNamedGroups() is a concrete @Override
  • 35 tests — comprehensive coverage including edge cases, thread-safety, Iterator mixing, and replacement semantics

@copilot please review

CI is green on the latest commit (625918c). No new issues found. The PR is clean and ready for merge.

Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 625918ce8f

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@jbachorik
Copy link
Copy Markdown
Collaborator Author

@copilot please review

@jbachorik
Copy link
Copy Markdown
Collaborator Author

@codex please review

Copy link
Copy Markdown
Contributor

Copilot AI commented May 8, 2026

@copilot please review

CI is green on the latest commit (8241d39). No new issues found:

  • MatchCursor: all state-machine invariants are correct — findNext() clears lastMatch on exhaustion (so appendReplacement() correctly throws after a null return), peeked buffer is drained before advancing, hasNext() saves/restores lastMatch, appendTail()/close() clear all refs
  • expandReplacement(): JDK-aligned — \\x → literal, \$ → literal $, $N/${name} → group value, trailing $/$$/$xIllegalArgumentException, out-of-range $NIndexOutOfBoundsException
  • MatchResultImpl.hasNamedGroups(): delegates to !nameToIndex.isEmpty() covering all construction paths
  • NamedMatchResultImpl: correctly inherits the concrete hasNamedGroups() override from MatchResultImpl
  • 35 tests — comprehensive coverage including all edge cases

The PR is clean and ready for merge.

@copilot please review

CI is green on the latest commit (8241d39). No new issues found — the PR is clean and ready for merge. (See the detailed review above.)

@chatgpt-codex-connector
Copy link
Copy Markdown

Codex Review: Didn't find any major issues. More of your lovely PRs please.

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

AI Generated or assisted by AI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[feature] Stateful streaming replacement: appendReplacement / appendTail equivalent

3 participants