Add Webcil support to R2RDump #127885

Open
davidwrighton wants to merge 13 commits into dotnet:main from davidwrighton:wasmR2RDump

Conversation

@davidwrighton
Member

Note

This PR was created with the assistance of GitHub Copilot.

Summary

Adds support for reading and dumping Webcil files in R2RDump, including a full WebAssembly bytecode disassembler.

Changes

  • Webcil support in R2RDump: Enable R2RDump to open and process Webcil (.wasm) assemblies.
  • IAssemblyMetadata.GetSectionData: Add GetSectionData to the metadata interface and implement across all types.
  • WASM bytecode disassembler: New WasmDisassembler class that decodes WASM binary instructions into WAT text format, covering:
    • All core instructions (control flow, parametric, variable, table, memory, numeric, conversion, sign-extension, reference types)
    • Complete SIMD instruction set (0xFD prefix, all 256 sub-opcodes)
    • Exception handling try_table instruction with all catch clause kinds
    • Bulk memory, table, and saturating truncation instructions (0xFC prefix)
    • GC/struct/array instructions (0xFB prefix)

davidwrighton and others added 6 commits May 6, 2026 12:59
R2RDump previously could not read Webcil files (the format used for
managed assemblies in WebAssembly environments). This adds a
WebcilImageReader that implements IBinaryImageReader for the Webcil
format, enabling R2RDump to dump headers, methods, and section
contents from Webcil-format R2R images.

Changes:
- New WebcilImageReader.cs implementing IBinaryImageReader
- ReadyToRunReader detects Webcil format (after MachO, before PE)
- DumpModel handles Webcil in reference assembly loading
- Program.cs maps OperatingSystem.Unknown to TargetOS.Linux for Webcil
- ReadyToRunMethod gracefully handles null PEReader (Webcil has no PE)
- ILCompiler.Reflection.ReadyToRun.csproj includes shared Webcil.cs

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace the PEReader ImageReader property with a GetSectionData(int rva)
method that returns a BlobReader. This decouples the interface from
PEReader, enabling non-PE formats (Webcil) to provide section data.

Implementations:
- StandaloneAssemblyMetadata: delegates to PEReader.GetSectionData
- ManifestAssemblyMetadata: same with null-guard
- WebcilAssemblyMetadata: resolves RVA via WebcilImageReader sections
- SimpleAssemblyMetadata (tests): delegates to PEReader.GetSectionData

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Implement a full WASM instruction disassembler that decodes WebAssembly
binary format into WAT-style text output. This enables the --disasm flag
in R2RDump to work with Webcil/WASM R2R images.

- Add WasmDisassembler.cs with complete opcode tables for all standard
  WASM instructions (control, parametric, variable, table, memory,
  numeric, conversion, sign-extension, reference types) plus 0xFC
  (bulk memory/saturating truncation), 0xFB (GC), and 0xFD (SIMD)
  prefixed opcodes
- Add WebcilImageReader.GetWasmFunctionBody() to parse the WASM module's
  type, function, and code sections to extract function info including
  type signature and local declarations
- Integrate into TextDumper.DumpWasmDisasm() to print parameters and
  locals with their local indices, result types, and disassembled
  instructions

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
WebcilAssemblyMetadata was not retaining a reference to the pinned
metadata byte array passed to its constructor. After
GetStandaloneAssemblyMetadata returned, the array could be collected
by the GC despite being allocated on the Pinned Object Heap, since
no live reference existed. This caused an AccessViolationException
when MetadataReader accessed the freed memory on larger files like
system.private.corelib.wasm.

Fix: store the metadata byte array in a field to keep it rooted for
the lifetime of the MetadataReader.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace the stub DecodeFDPrefixed() method with a complete implementation
of all WebAssembly SIMD instructions (0xFD prefix, sub-opcodes 0-255)
per the WebAssembly spec. This includes memory operations, lane
load/store, shuffle, splat, extract/replace lane, comparisons, bitwise
operations, arithmetic, and conversion instructions.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Implement opcode 0x1F (try_table) per the WebAssembly exception handling
spec. Decodes the block type and vector of catch clauses, supporting all
four catch clause kinds: catch, catch_ref, catch_all, catch_all_ref.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
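
The catch-clause decoding described in the commit above can be illustrated with a minimal sketch. It assumes the encoding from the WebAssembly exception handling proposal (kind byte 0x00 to 0x03, a tag index for `catch` and `catch_ref` only, then a label index). `TryTableSketch`, `DecodeCatchClauses`, and the printed text layout are hypothetical and are not taken from the PR's WasmDisassembler.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; not the PR's WasmDisassembler code.
public static class TryTableSketch
{
    // Decodes the catch-clause vector that follows a try_table block type.
    // 'offset' must point at the clause count and is advanced past the vector.
    public static List<string> DecodeCatchClauses(ReadOnlySpan<byte> code, ref int offset)
    {
        var clauses = new List<string>();
        uint count = ReadLebU32(code, ref offset);
        for (uint i = 0; i < count; i++)
        {
            if (offset >= code.Length)
                throw new BadImageFormatException("Truncated catch clause");

            byte kind = code[offset++];
            switch (kind)
            {
                case 0x00: // catch tag label
                case 0x01: // catch_ref tag label
                {
                    string name = kind == 0x00 ? "catch" : "catch_ref";
                    uint tag = ReadLebU32(code, ref offset);
                    uint label = ReadLebU32(code, ref offset);
                    clauses.Add($"({name} {tag} {label})");
                    break;
                }
                case 0x02: // catch_all label
                case 0x03: // catch_all_ref label
                {
                    string name = kind == 0x02 ? "catch_all" : "catch_all_ref";
                    uint label = ReadLebU32(code, ref offset);
                    clauses.Add($"({name} {label})");
                    break;
                }
                default:
                    throw new BadImageFormatException($"Unknown catch clause kind 0x{kind:X2}");
            }
        }
        return clauses;
    }

    // Compact unsigned LEB128 reader; excess high bits of a fifth byte are ignored here.
    private static uint ReadLebU32(ReadOnlySpan<byte> data, ref int offset)
    {
        uint result = 0;
        int shift = 0;
        while (true)
        {
            if (offset >= data.Length || shift > 28)
                throw new BadImageFormatException("Malformed LEB128 value");
            byte b = data[offset++];
            result |= (uint)(b & 0x7F) << shift;
            if ((b & 0x80) == 0)
                return result;
            shift += 7;
        }
    }
}
```

Only the two tag-carrying kinds read a tag index, so the clause length depends on the kind byte; an unknown kind is rejected rather than skipped because its length cannot be known.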
Copilot AI review requested due to automatic review settings May 6, 2026 20:44
@github-actions github-actions Bot added the area-crossgen2-coreclr only use for closed issues label May 6, 2026
@davidwrighton davidwrighton requested a review from adamperlin May 6, 2026 20:53
Contributor

Copilot AI left a comment

Pull request overview

This PR extends the CoreCLR R2RDump toolchain to recognize Webcil inputs (including WASM-wrapped Webcil) and adds a WebAssembly bytecode disassembler for dumping function bodies in a WAT-like textual form. It also evolves the metadata abstraction so method-body bytes can be retrieved without assuming a PE-backed PEReader.

Changes:

  • Add WebcilImageReader support to ReadyToRunReader initialization and R2RDump’s metadata-opening path.
  • Replace IAssemblyMetadata.ImageReader with IAssemblyMetadata.GetSectionData(int rva) and update method-body local signature decoding accordingly.
  • Add WasmDisassembler and integrate WASM disassembly printing into TextDumper for Webcil/WASM scenarios.

Reviewed changes

Copilot reviewed 12 out of 12 changed files in this pull request and generated 10 comments.

| File | Description |
| --- | --- |
| src/coreclr/tools/r2rdump/WasmDisassembler.cs | New WASM bytecode decoder/disassembler for dumping instructions. |
| src/coreclr/tools/r2rdump/TextDumper.cs | Emits WASM-specific disassembly and metadata (params/locals/results) for Webcil inputs. |
| src/coreclr/tools/r2rdump/Program.cs | Adds fallback handling for OperatingSystem.Unknown when producing TargetDetails. |
| src/coreclr/tools/r2rdump/DumpModel.cs | Detects Webcil inputs when opening reference assemblies for metadata resolution. |
| src/coreclr/tools/aot/ILCompiler.Reflection.ReadyToRun/WebcilImageReader.cs | New reader that parses Webcil (and WASM-wrapped Webcil) and exposes metadata/sections/function bodies. |
| src/coreclr/tools/aot/ILCompiler.Reflection.ReadyToRun/StandaloneAssemblyMetadata.cs | Implements GetSectionData via PEReader section access. |
| src/coreclr/tools/aot/ILCompiler.Reflection.ReadyToRun/ReadyToRunReader.cs | Detects Webcil images and uses WebcilImageReader as the CompositeReader. |
| src/coreclr/tools/aot/ILCompiler.Reflection.ReadyToRun/ReadyToRunMethod.cs | Switches local-signature decoding to use GetSectionData + MethodBodyBlock.Create. |
| src/coreclr/tools/aot/ILCompiler.Reflection.ReadyToRun/ManifestAssemblyMetadata.cs | Implements GetSectionData when backed by a PEReader. |
| src/coreclr/tools/aot/ILCompiler.Reflection.ReadyToRun/ILCompiler.Reflection.ReadyToRun.csproj | Links in shared Webcil definitions (Webcil.cs). |
| src/coreclr/tools/aot/ILCompiler.Reflection.ReadyToRun/IAssemblyMetadata.cs | Replaces PEReader exposure with GetSectionData(int rva). |
| src/coreclr/tools/aot/ILCompiler.ReadyToRun.Tests/TestCasesRunner/R2RResultChecker.cs | Updates test metadata wrapper to implement GetSectionData. |

Comment thread src/coreclr/tools/r2rdump/WasmDisassembler.cs
Comment thread src/coreclr/tools/r2rdump/WasmDisassembler.cs
Comment thread src/coreclr/tools/aot/ILCompiler.Reflection.ReadyToRun/WebcilImageReader.cs Outdated
Comment thread src/coreclr/tools/aot/ILCompiler.Reflection.ReadyToRun/WebcilImageReader.cs Outdated
Comment thread src/coreclr/tools/aot/ILCompiler.Reflection.ReadyToRun/WebcilImageReader.cs Outdated
Comment thread src/coreclr/tools/r2rdump/DumpModel.cs
Comment thread src/coreclr/tools/r2rdump/Program.cs Outdated
Comment thread src/coreclr/tools/aot/ILCompiler.Reflection.ReadyToRun/WebcilImageReader.cs Outdated
@jkotas jkotas added area-ReadyToRun arch-wasm WebAssembly architecture and removed area-crossgen2-coreclr only use for closed issues labels May 6, 2026
@dotnet-policy-service
Contributor

Tagging subscribers to 'arch-wasm': @lewing, @pavelsavara
See info in area-owners.md if you want to be subscribed.

Comment thread src/coreclr/tools/aot/ILCompiler.Reflection.ReadyToRun/WebcilImageReader.cs Outdated
Comment thread src/coreclr/tools/aot/ILCompiler.Reflection.ReadyToRun/WebcilImageReader.cs Outdated
…ping

- Use stream.ReadExactly instead of stream.Read for WASM detection
- Validate sectionEnd and bodyEnd against image bounds
- Use MetadataReaderProvider.FromMetadataImage for safe metadata lifetime
- Constrain GetSectionData BlobReader to section boundary (not EOF)
- Pin byte[] in DumpModel for Webcil reference assemblies
- Map OperatingSystem.Unknown to TargetOS.Unknown instead of Linux

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Comment thread src/coreclr/tools/aot/ILCompiler.Reflection.ReadyToRun/WebcilImageReader.cs Outdated
Comment thread src/coreclr/tools/r2rdump/WasmDisassembler.cs
davidwrighton and others added 5 commits May 7, 2026 14:07
…e unsafe code

- Replace Unsafe.As<byte[], ImmutableArray<byte>> with ImmutableCollectionsMarshal.AsImmutableArray
- Replace Unsafe.As<ImmutableArray<byte>, byte[]> with ImmutableCollectionsMarshal.AsArray
- Change WebcilImageReader to hold ImmutableArray<byte> internally
- Change WasmFunctionInfo.Image to ImmutableArray<byte>
- Refactor IAssemblyMetadata.GetSectionData to callback-based Action<BlobReader>
  to ensure pointer lifetime safety in WebcilAssemblyMetadata
- Replace unsafe fixed/MemoryCopy in TryReadHeader and ReadSections with
  BinaryPrimitives.ReadXxxLittleEndian
- Replace unsafe MetadataReader construction with MetadataReaderProvider.FromMetadataImage
- Update WasmDisassembler to use ImmutableArray<byte> and BinaryPrimitives

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
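
As a rough illustration of the safe replacements named in the commit above, the sketch below wraps a `byte[]` as an `ImmutableArray<byte>` without copying and reads a little-endian field through `BinaryPrimitives`. `ImmutableViewSketch` and its members are hypothetical names. `ImmutableCollectionsMarshal` is assumed to be available, which requires .NET 8 or a matching System.Collections.Immutable package.

```csharp
using System;
using System.Buffers.Binary;
using System.Collections.Immutable;
using System.Runtime.InteropServices;

// Hypothetical sketch; not the PR's WebcilImageReader code.
public static class ImmutableViewSketch
{
    // Wraps the array without copying. The caller must not mutate 'bytes'
    // afterwards, because the ImmutableArray shares the same storage.
    public static ImmutableArray<byte> Wrap(byte[] bytes)
        => ImmutableCollectionsMarshal.AsImmutableArray(bytes);

    // Returns the backing array of an ImmutableArray, again without copying.
    public static byte[]? Unwrap(ImmutableArray<byte> image)
        => ImmutableCollectionsMarshal.AsArray(image);

    // Reads a 32-bit little-endian value at 'offset' with no unsafe code.
    // Slice throws if fewer than 4 bytes remain.
    public static uint ReadUInt32At(ImmutableArray<byte> image, int offset)
        => BinaryPrimitives.ReadUInt32LittleEndian(image.AsSpan().Slice(offset, 4));
}
```

Compared with `Unsafe.As`, the marshal helpers make the shared-storage intent explicit, and the span-based read is bounds-checked.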
…lySpan

- Replace brute-force magic byte scanning with proper WASM section parsing
  to locate the Webcil payload in the data section (id=11)
- Consolidate ReadLebU32, ParseTypeSection, ParseFunctionSection, and
  SkipConstExpr to take ReadOnlySpan<byte> instead of separate byte[]/
  ImmutableArray<byte> overloads
- Remove last fixed/unsafe code from ReadSections (previous commit missed
  consolidating the ReadLebU32 overload)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
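
For reference, a span-based unsigned LEB128 reader in the spirit of the consolidated `ReadLebU32` might look like the sketch below. The signature mirrors the call sites quoted elsewhere in this PR (`ReadLebU32(imageSpan, ref offset)`). The body, its validation choices, and the `Leb128Sketch` class name are assumptions rather than the PR's implementation.

```csharp
using System;

// Hypothetical sketch; the PR's actual ReadLebU32 may differ.
public static class Leb128Sketch
{
    // Decodes an unsigned LEB128 value of at most 32 bits and advances
    // 'offset' past the bytes that were consumed.
    public static uint ReadLebU32(ReadOnlySpan<byte> data, ref int offset)
    {
        uint result = 0;
        for (int shift = 0; shift < 35; shift += 7)
        {
            if (offset >= data.Length)
                throw new BadImageFormatException("Truncated LEB128 value");

            byte b = data[offset++];

            // The fifth byte may only carry the top 4 bits of a 32-bit value
            // and must not have its continuation bit set.
            if (shift == 28 && (b & 0xF0) != 0)
                throw new BadImageFormatException("LEB128 value does not fit in 32 bits");

            result |= (uint)(b & 0x7F) << shift;
            if ((b & 0x80) == 0)
                return result;
        }

        // Not reachable: the fifth byte either terminates the value or throws above.
        throw new BadImageFormatException("LEB128 value is longer than 5 bytes");
    }
}
```

Because `byte[]` converts implicitly to `ReadOnlySpan<byte>` and `ImmutableArray<byte>` exposes `AsSpan()`, a single span overload can serve both kinds of caller.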
GetWasmFunctionBody now parses all function bodies on first call and
stores the results in a lazily-initialized array. Subsequent calls
are a simple index lookup.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…ions

Block-opening instructions (block, loop, if, try_table) were incrementing
indent before returning, causing them to display one level too deep.
The else instruction made no indent adjustment, displaying at body level
instead of aligning with its matching if/end.

Fix by introducing a postAdjust out parameter: DecodeInstruction now sets
indent to the display value for the current line, and postAdjust is applied
after printing to set the indent for subsequent lines.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
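
The display-indent and post-adjust split described in the commit above can be sketched on a flat list of mnemonics. `WasmIndentSketch.Format` is a hypothetical helper that works on strings rather than bytes, and the two-space indent width is an assumption. It only illustrates the rule: openers print at the current level and deepen the following lines, `else` prints one level out without changing the body level, and `end` prints one level out and stays there.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; not the PR's DecodeInstruction.
public static class WasmIndentSketch
{
    public static List<string> Format(IEnumerable<string> mnemonics)
    {
        var lines = new List<string>();
        int indent = 0;
        foreach (string op in mnemonics)
        {
            int display = indent; // indent used for the current line
            int postAdjust = 0;   // change applied after the line is emitted
            switch (op)
            {
                case "block":
                case "loop":
                case "if":
                case "try_table":
                    postAdjust = 1; // opener stays at the outer level, body goes deeper
                    break;
                case "else":
                    display = Math.Max(indent - 1, 0); // align with the matching if
                    break;
                case "end":
                    display = Math.Max(indent - 1, 0); // align with the opener
                    postAdjust = display - indent;     // following lines stay at that level
                    break;
            }

            lines.Add(new string(' ', display * 2) + op);
            indent += postAdjust;
        }
        return lines;
    }
}
```

For example, `block, i32.const, if, nop, else, nop, end, end` formats with `if`, `else`, and the first `end` all at the same depth, and the final `end` back at column zero.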
Copilot AI review requested due to automatic review settings May 8, 2026 00:30
Update GetSectionData signature from returning BlobReader to accepting
Action<BlobReader>, matching the interface change.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@davidwrighton davidwrighton requested a review from adamperlin May 8, 2026 00:37
Contributor

Copilot AI left a comment

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 8 comments.

Comment on lines +1 to 3
using System;
using System.Reflection.Metadata;
using System.Reflection.PortableExecutable;
Comment on lines 482 to 488
            byte[] imageBytes = File.ReadAllBytes(path);
            _peReader = new PEReader(new MemoryStream(imageBytes));
        }

        public PEReader ImageReader => _peReader;
        public void GetSectionData(int relativeVirtualAddress, Action<BlobReader> action) => action(_peReader.GetSectionData(relativeVirtualAddress).GetReader());

        public MetadataReader MetadataReader => _peReader.GetMetadataReader();
Comment on lines +980 to +986
    private string ReadHeapType()
    {
        byte b = _code[_offset];
        // Abstract heap types are encoded as single bytes
        switch (b)
        {
            case 0x73: _offset++; return "nofunc";
Comment on lines +170 to +177
    while (offset < imageSpan.Length)
    {
        byte sectionId = imageSpan[offset++];
        uint sectionSize = ReadLebU32(imageSpan, ref offset);
        int sectionEnd = offset + (int)sectionSize;

        if (sectionEnd > imageSpan.Length)
            throw new BadImageFormatException($"WASM section {sectionId} size extends beyond image boundary");
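
A self-contained sketch of this kind of section walk is shown below. It performs the bounds check against the remaining length in unsigned arithmetic, so an oversized `sectionSize` cannot wrap when cast to `int`. `WasmSectionSketch.FindSection` is a hypothetical helper, and whether the PR's loop needs the same guard depends on code not visible in this excerpt.

```csharp
using System;

// Hypothetical sketch; not the PR's WebcilImageReader code.
public static class WasmSectionSketch
{
    // Returns the contents of the first section with the given id,
    // or an empty span when the module has no such section.
    public static ReadOnlySpan<byte> FindSection(ReadOnlySpan<byte> image, byte wantedId)
    {
        // A WASM module starts with the magic bytes "\0asm" and a 4-byte version.
        if (image.Length < 8 || image[0] != 0x00 || image[1] != 0x61 || image[2] != 0x73 || image[3] != 0x6D)
            throw new BadImageFormatException("Not a WASM module");

        int offset = 8;
        while (offset < image.Length)
        {
            byte sectionId = image[offset++];
            uint sectionSize = ReadLebU32(image, ref offset);

            // Compare against the bytes that remain instead of computing
            // offset + size, which could overflow int for hostile input.
            if (sectionSize > (uint)(image.Length - offset))
                throw new BadImageFormatException($"WASM section {sectionId} extends beyond the image");

            if (sectionId == wantedId)
                return image.Slice(offset, (int)sectionSize);

            offset += (int)sectionSize;
        }
        return ReadOnlySpan<byte>.Empty;
    }

    // Compact unsigned LEB128 reader; excess high bits of a fifth byte are ignored here.
    private static uint ReadLebU32(ReadOnlySpan<byte> data, ref int offset)
    {
        uint result = 0;
        int shift = 0;
        while (true)
        {
            if (offset >= data.Length || shift > 28)
                throw new BadImageFormatException("Malformed LEB128 value");
            byte b = data[offset++];
            result |= (uint)(b & 0x7F) << shift;
            if ((b & 0x80) == 0)
                return result;
            shift += 7;
        }
    }
}
```

Calling `FindSection(image, 11)` would return the data section, which is where the commit notes say the Webcil payload is located.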
Comment on lines +493 to +497
    var sections = ImmutableArray.CreateBuilder<WebcilSectionHeader>(header.CoffSections);

    for (int i = 0; i < header.CoffSections; i++)
    {
        ReadOnlySpan<byte> span = image.AsSpan((int)(sectionDirectoryOffset + (i * SectionSize)));
Comment on lines +311 to +318
    public int GetOffset(int rva)
    {
        foreach (var section in _sections)
        {
            if ((uint)rva >= section.VirtualAddress && (uint)rva < section.VirtualAddress + section.VirtualSize)
            {
                uint offset = (uint)rva - section.VirtualAddress;
                if (offset >= section.SizeOfRawData)
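
The RVA-to-file-offset mapping in this excerpt can be sketched on its own as follows. `SectionSketch` and `RvaSketch` are hypothetical stand-ins for the PR's `WebcilSectionHeader` and reader. Returning -1 for an RVA that falls past the section's raw data is an assumption, since the excerpt is cut off before showing what the real code does in that case. The range test uses subtraction so `VirtualAddress + VirtualSize` cannot overflow.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical section shape; field names follow the excerpt above.
public readonly record struct SectionSketch(uint VirtualAddress, uint VirtualSize, uint PointerToRawData, uint SizeOfRawData);

// Hypothetical sketch; not the PR's GetOffset.
public static class RvaSketch
{
    // Returns the file offset for 'rva', or -1 when no section holds file data for it.
    public static int GetOffset(IReadOnlyList<SectionSketch> sections, int rva)
    {
        foreach (SectionSketch section in sections)
        {
            if ((uint)rva < section.VirtualAddress)
                continue;

            uint delta = (uint)rva - section.VirtualAddress;
            if (delta >= section.VirtualSize)
                continue;

            // Inside the virtual range but past the raw data: there are no
            // bytes in the file for this address (assumed behavior).
            if (delta >= section.SizeOfRawData)
                return -1;

            // 'checked' turns an out-of-range result into an exception.
            return checked((int)(section.PointerToRawData + delta));
        }
        return -1;
    }
}
```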
Comment on lines +543 to +544
// Each passive segment: kind=1(byte) + size(LEB128) + bytes
// The Webcil payload is in the second passive data segment.
Comment on lines +271 to +275
    uint paramCount = ReadLebU32(data, ref offset);
    byte[] paramTypes = new byte[paramCount];
    for (uint j = 0; j < paramCount; j++)
        paramTypes[j] = data[offset++];
    uint resultCount = ReadLebU32(data, ref offset);

Labels

arch-wasm WebAssembly architecture area-ReadyToRun

5 participants