AbbreviationExtension corrupts emphasis (bold/italic) resolution on the same line #935

@Kryptos-FR

Description

When an abbreviation is defined and appears in the same paragraph as **bold** or *italic* markers, the emphasis is not resolved. The ** and * delimiters are rendered as literal text instead.

Minimal Reproducible Example

#:package Markdig@1.1.*

using Markdig;
using Markdig.Syntax;
using Markdig.Syntax.Inlines;
using Markdig.Extensions.Abbreviations;

var md = """
HTML supports **bold** and *italic*

*[HTML]: HyperText Markup Language
""";

var pipeline = new MarkdownPipelineBuilder().UseAbbreviations().Build();
var doc = Markdown.Parse(md, pipeline);

void DumpInline(Inline inline, int depth = 0)
{
    var indent = new string(' ', depth * 2);
    switch (inline)
    {
        case LiteralInline lit:       Console.WriteLine($"{indent}LiteralInline: \"{lit.Content}\""); break;
        case AbbreviationInline abbr: Console.WriteLine($"{indent}AbbreviationInline: \"{abbr.Abbreviation?.Label}\""); break;
        case EmphasisInline em:
            Console.WriteLine($"{indent}EmphasisInline (char='{em.DelimiterChar}' count={em.DelimiterCount}):");
            foreach (var child in em) DumpInline(child, depth + 1);
            break;
        default:
            Console.WriteLine($"{indent}{inline.GetType().Name}");
            if (inline is ContainerInline ci)
                foreach (var child in ci) DumpInline(child, depth + 1);
            break;
    }
}

foreach (var block in doc)
{
    Console.WriteLine($"[{block.GetType().Name}]");
    if (block is LeafBlock lb && lb.Inline != null)
        foreach (var inline in lb.Inline) DumpInline(inline);
}

Actual AST (with UseAbbreviations())

[ParagraphBlock]
ContainerInline
  AbbreviationInline: "HTML"
  LiteralInline: " supports "
  EmphasisDelimiterInline
    LiteralInline: "bold"
    EmphasisDelimiterInline
      LiteralInline: " and "
      EmphasisDelimiterInline
        LiteralInline: "italic"
        EmphasisDelimiterInline

All EmphasisDelimiterInline nodes remain unresolved; no EmphasisInline is ever created.

Expected AST (without UseAbbreviations())

[ParagraphBlock]
LiteralInline: "HTML supports "
EmphasisInline (char='*' count=2):
  LiteralInline: "bold"
LiteralInline: " and "
EmphasisInline (char='*' count=1):
  LiteralInline: "italic"

Root Cause Hypothesis

The AbbreviationParser hooks LiteralInlineParser.PostMatch and, when a match is found, replaces processor.Inline with a newly-created ContainerInline. When this fires on the first literal of a paragraph (e.g. "HTML supports "), the ContainerInline becomes the current parent context for subsequent inline parsing. The EmphasisDelimiterInline nodes that follow are then appended as children of that ContainerInline rather than as top-level siblings of the paragraph's inline chain. The emphasis post-processor, which resolves delimiter pairs within a shared parent, can no longer match the opening and closing **/* delimiters, so they remain unresolved.

The problem only manifests when the abbreviation appears before emphasis markers on the same line (i.e. the PostMatch fires and returns a ContainerInline before the emphasis delimiters have been parsed).
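
The comparison between the two ASTs above can also be reduced to a single number, which may be handy as a regression check. The following is a minimal sketch, not a definitive test: `CountUnresolved` is a hypothetical helper written for this report (it is not part of Markdig), and it assumes that any EmphasisDelimiterInline still present in the parsed document is a delimiter that was never resolved into an EmphasisInline.

```csharp
#:package Markdig@1.1.*

using System;
using System.Linq;
using Markdig;
using Markdig.Syntax;
using Markdig.Syntax.Inlines;

// Hypothetical helper (not a Markdig API): parses the input with the given
// pipeline and counts the EmphasisDelimiterInline nodes left in the AST.
// Assumption: resolved delimiters no longer appear as EmphasisDelimiterInline.
static int CountUnresolved(string markdown, MarkdownPipeline pipeline)
{
    var doc = Markdown.Parse(markdown, pipeline);
    return doc.Descendants<EmphasisDelimiterInline>().Count();
}

// Same input as the minimal reproducible example.
var md = "HTML supports **bold** and *italic*\n\n*[HTML]: HyperText Markup Language\n";

var plain = new MarkdownPipelineBuilder().Build();
var withAbbreviations = new MarkdownPipelineBuilder().UseAbbreviations().Build();

Console.WriteLine($"Without UseAbbreviations(): {CountUnresolved(md, plain)} unresolved delimiter(s)");
Console.WriteLine($"With UseAbbreviations():    {CountUnresolved(md, withAbbreviations)} unresolved delimiter(s)");
```

If the hypothesis holds, the second count should be non-zero on affected versions and should drop once the delimiters are resolved correctly. Moving the abbreviation after the emphasis markers in `md` would be one way to probe the ordering claim.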

Environment

  • Markdig 1.1.x
  • .NET 10
