Add Webcil support to R2RDump #127885
Open
davidwrighton wants to merge 13 commits into dotnet:main
R2RDump previously could not read Webcil files (the format used for managed assemblies in WebAssembly environments). This adds a WebcilImageReader that implements IBinaryImageReader for the Webcil format, enabling R2RDump to dump headers, methods, and section contents from Webcil-format R2R images.
Changes:
- New WebcilImageReader.cs implementing IBinaryImageReader
- ReadyToRunReader detects Webcil format (after MachO, before PE)
- DumpModel handles Webcil in reference assembly loading
- Program.cs maps OperatingSystem.Unknown to TargetOS.Linux for Webcil
- ReadyToRunMethod gracefully handles null PEReader (Webcil has no PE)
- ILCompiler.Reflection.ReadyToRun.csproj includes shared Webcil.cs
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Replace the PEReader ImageReader property with a GetSectionData(int rva) method that returns a BlobReader. This decouples the interface from PEReader, enabling non-PE formats (Webcil) to provide section data.
Implementations:
- StandaloneAssemblyMetadata: delegates to PEReader.GetSectionData
- ManifestAssemblyMetadata: same with null-guard
- WebcilAssemblyMetadata: resolves RVA via WebcilImageReader sections
- SimpleAssemblyMetadata (tests): delegates to PEReader.GetSectionData
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Implement a full WASM instruction disassembler that decodes WebAssembly binary format into WAT-style text output. This enables the --disasm flag in R2RDump to work with Webcil/WASM R2R images.
- Add WasmDisassembler.cs with complete opcode tables for all standard WASM instructions (control, parametric, variable, table, memory, numeric, conversion, sign-extension, reference types) plus 0xFC (bulk memory/saturating truncation), 0xFB (GC), and 0xFD (SIMD) prefixed opcodes
- Add WebcilImageReader.GetWasmFunctionBody() to parse the WASM module's type, function, and code sections to extract function info including type signature and local declarations
- Integrate into TextDumper.DumpWasmDisasm() to print parameters and locals with their local indices, result types, and disassembled instructions
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

WebcilAssemblyMetadata was not retaining a reference to the pinned metadata byte array passed to its constructor. After GetStandaloneAssemblyMetadata returned, the array could be collected by the GC despite being allocated on the Pinned Object Heap, since no live reference existed. This caused an AccessViolationException when MetadataReader accessed the freed memory on larger files like system.private.corelib.wasm.
Fix: store the metadata byte array in a field to keep it rooted for the lifetime of the MetadataReader.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Replace the stub DecodeFDPrefixed() method with a complete implementation of all WebAssembly SIMD instructions (0xFD prefix, sub-opcodes 0-255) per the WebAssembly spec. This includes memory operations, lane load/store, shuffle, splat, extract/replace lane, comparisons, bitwise operations, arithmetic, and conversion instructions.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Implement opcode 0x1F (try_table) per the WebAssembly exception handling spec. Decodes the block type and vector of catch clauses, supporting all four catch clause kinds: catch, catch_ref, catch_all, catch_all_ref.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Pull request overview
This PR extends the CoreCLR R2RDump toolchain to recognize Webcil inputs (including WASM-wrapped Webcil) and adds a WebAssembly bytecode disassembler for dumping function bodies in a WAT-like textual form. It also evolves the metadata abstraction so method-body bytes can be retrieved without assuming a PE-backed PEReader.
Changes:
- Add WebcilImageReader support to ReadyToRunReader initialization and R2RDump’s metadata-opening path.
- Replace IAssemblyMetadata.ImageReader with IAssemblyMetadata.GetSectionData(int rva) and update method-body local signature decoding accordingly.
- Add WasmDisassembler and integrate WASM disassembly printing into TextDumper for Webcil/WASM scenarios.
Reviewed changes
Copilot reviewed 12 out of 12 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| src/coreclr/tools/r2rdump/WasmDisassembler.cs | New WASM bytecode decoder/disassembler for dumping instructions. |
| src/coreclr/tools/r2rdump/TextDumper.cs | Emits WASM-specific disassembly and metadata (params/locals/results) for Webcil inputs. |
| src/coreclr/tools/r2rdump/Program.cs | Adds fallback handling for OperatingSystem.Unknown when producing TargetDetails. |
| src/coreclr/tools/r2rdump/DumpModel.cs | Detects Webcil inputs when opening reference assemblies for metadata resolution. |
| src/coreclr/tools/aot/ILCompiler.Reflection.ReadyToRun/WebcilImageReader.cs | New reader that parses Webcil (and WASM-wrapped Webcil) and exposes metadata/sections/function bodies. |
| src/coreclr/tools/aot/ILCompiler.Reflection.ReadyToRun/StandaloneAssemblyMetadata.cs | Implements GetSectionData via PEReader section access. |
| src/coreclr/tools/aot/ILCompiler.Reflection.ReadyToRun/ReadyToRunReader.cs | Detects Webcil images and uses WebcilImageReader as the CompositeReader. |
| src/coreclr/tools/aot/ILCompiler.Reflection.ReadyToRun/ReadyToRunMethod.cs | Switches local-signature decoding to use GetSectionData + MethodBodyBlock.Create. |
| src/coreclr/tools/aot/ILCompiler.Reflection.ReadyToRun/ManifestAssemblyMetadata.cs | Implements GetSectionData when backed by a PEReader. |
| src/coreclr/tools/aot/ILCompiler.Reflection.ReadyToRun/ILCompiler.Reflection.ReadyToRun.csproj | Links in shared Webcil definitions (Webcil.cs). |
| src/coreclr/tools/aot/ILCompiler.Reflection.ReadyToRun/IAssemblyMetadata.cs | Replaces PEReader exposure with GetSectionData(int rva). |
| src/coreclr/tools/aot/ILCompiler.ReadyToRun.Tests/TestCasesRunner/R2RResultChecker.cs | Updates test metadata wrapper to implement GetSectionData. |
adamperlin reviewed May 6, 2026
Tagging subscribers to 'arch-wasm': @lewing, @pavelsavara
…ping
- Use stream.ReadExactly instead of stream.Read for WASM detection
- Validate sectionEnd and bodyEnd against image bounds
- Use MetadataReaderProvider.FromMetadataImage for safe metadata lifetime
- Constrain GetSectionData BlobReader to section boundary (not EOF)
- Pin byte[] in DumpModel for Webcil reference assemblies
- Map OperatingSystem.Unknown to TargetOS.Unknown instead of Linux
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
pavelsavara reviewed May 7, 2026
…e unsafe code
- Replace Unsafe.As<byte[], ImmutableArray<byte>> with ImmutableCollectionsMarshal.AsImmutableArray
- Replace Unsafe.As<ImmutableArray<byte>, byte[]> with ImmutableCollectionsMarshal.AsArray
- Change WebcilImageReader to hold ImmutableArray<byte> internally
- Change WasmFunctionInfo.Image to ImmutableArray<byte>
- Refactor IAssemblyMetadata.GetSectionData to callback-based Action<BlobReader> to ensure pointer lifetime safety in WebcilAssemblyMetadata
- Replace unsafe fixed/MemoryCopy in TryReadHeader and ReadSections with BinaryPrimitives.ReadXxxLittleEndian
- Replace unsafe MetadataReader construction with MetadataReaderProvider.FromMetadataImage
- Update WasmDisassembler to use ImmutableArray<byte> and BinaryPrimitives
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
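As a rough illustration of swapping `fixed`/`MemoryCopy` header reads for `BinaryPrimitives`, here is a minimal sketch. The 12-byte `DemoHeader` layout and its field names are made up for this example and are not the real Webcil header; only the `BinaryPrimitives.ReadUInt16LittleEndian` and `ReadUInt32LittleEndian` calls are real APIs.

```csharp
using System;
using System.Buffers.Binary;

// Illustrative only: a hypothetical 12-byte header, NOT the actual Webcil layout.
public readonly struct DemoHeader
{
    public const int Size = 12;

    public readonly uint Magic;          // bytes 0..3
    public readonly ushort VersionMajor; // bytes 4..5
    public readonly ushort SectionCount; // bytes 6..7
    public readonly uint DirectoryRva;   // bytes 8..11

    private DemoHeader(uint magic, ushort versionMajor, ushort sectionCount, uint directoryRva)
    {
        Magic = magic;
        VersionMajor = versionMajor;
        SectionCount = sectionCount;
        DirectoryRva = directoryRva;
    }

    // Reads each field explicitly as little-endian; no pointers or pinning needed.
    public static bool TryRead(ReadOnlySpan<byte> data, out DemoHeader header)
    {
        if (data.Length < Size)
        {
            header = default;
            return false;
        }

        header = new DemoHeader(
            BinaryPrimitives.ReadUInt32LittleEndian(data),
            BinaryPrimitives.ReadUInt16LittleEndian(data.Slice(4)),
            BinaryPrimitives.ReadUInt16LittleEndian(data.Slice(6)),
            BinaryPrimitives.ReadUInt32LittleEndian(data.Slice(8)));
        return true;
    }
}
```

Reading field by field also makes the result independent of host endianness and struct padding, which a raw memory copy into a struct does not guarantee.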
…lySpan
- Replace brute-force magic byte scanning with proper WASM section parsing to locate the Webcil payload in the data section (id=11)
- Consolidate ReadLebU32, ParseTypeSection, ParseFunctionSection, and SkipConstExpr to take ReadOnlySpan<byte> instead of separate byte[]/ImmutableArray<byte> overloads
- Remove last fixed/unsafe code from ReadSections (previous commit missed consolidating the ReadLebU32 overload)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
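For readers unfamiliar with the section walk this commit describes, a minimal sketch follows. It assumes the standard WASM module layout: an 8-byte header, then sections that each start with a one-byte id and an unsigned LEB128 size. `WasmSectionSketch` and `FindSection` are hypothetical names, and the body of `ReadLebU32` here is an assumption rather than the PR's implementation (for brevity it does not reject unused high bits in the fifth byte).

```csharp
using System;

// Hypothetical helper, not taken from the PR.
public static class WasmSectionSketch
{
    // Decodes an unsigned LEB128 value of at most five bytes and advances offset.
    public static uint ReadLebU32(ReadOnlySpan<byte> data, ref int offset)
    {
        uint result = 0;
        for (int shift = 0; shift < 35; shift += 7)
        {
            if (offset >= data.Length)
                throw new BadImageFormatException("Truncated LEB128 value");

            byte b = data[offset++];
            result |= (uint)(b & 0x7F) << shift;
            if ((b & 0x80) == 0)
                return result;
        }
        throw new BadImageFormatException("LEB128 value does not fit in 32 bits");
    }

    // Walks top-level sections and returns the payload of the first section
    // with the requested id, or an empty span if there is none.
    public static ReadOnlySpan<byte> FindSection(ReadOnlySpan<byte> image, byte wantedId)
    {
        int offset = 8; // 4-byte magic "\0asm" + 4-byte version
        while (offset < image.Length)
        {
            byte id = image[offset++];
            uint size = ReadLebU32(image, ref offset);

            // Compare against the remaining byte count so the check cannot overflow.
            if (size > (uint)(image.Length - offset))
                throw new BadImageFormatException($"WASM section {id} extends beyond image boundary");

            if (id == wantedId)
                return image.Slice(offset, (int)size);

            offset += (int)size;
        }
        return ReadOnlySpan<byte>.Empty;
    }
}
```

Calling `FindSection(image, 11)` would return the data section payload, which is where the commit says the Webcil payload is located.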
GetWasmFunctionBody now parses all function bodies on first call and stores the results in a lazily-initialized array. Subsequent calls are a simple index lookup.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
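The parse-once, look-up-afterwards pattern described here can be sketched as below. `LazyBodyTableSketch`, its `string` payload, and the `ParsePasses` counter are hypothetical stand-ins used only to make the behavior observable; the PR's actual types differ. The sketch is not thread-safe.

```csharp
using System;

// Hypothetical stand-in for the cache behind GetWasmFunctionBody.
public sealed class LazyBodyTableSketch
{
    private readonly Func<string[]> _parseAll; // parses every body in one pass
    private string[] _bodies;                  // null until the first lookup

    public int ParsePasses { get; private set; }

    public LazyBodyTableSketch(Func<string[]> parseAll) => _parseAll = parseAll;

    public string GetBody(int index)
    {
        if (_bodies is null)
        {
            _bodies = _parseAll();
            ParsePasses++;
        }

        // Out-of-range indices yield null instead of throwing.
        return (uint)index < (uint)_bodies.Length ? _bodies[index] : null;
    }
}
```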
…ions
Block-opening instructions (block, loop, if, try_table) were incrementing indent before returning, causing them to display one level too deep. The else instruction made no indent adjustment, displaying at body level instead of aligning with its matching if/end.
Fix by introducing a postAdjust out parameter: DecodeInstruction now sets indent to the display value for the current line, and postAdjust is applied after printing to set the indent for subsequent lines.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
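The display-indent plus post-adjust scheme can be illustrated with a small sketch that works on mnemonics only. `IndentSketch` and `DisplayIndent` are hypothetical names, and the real DecodeInstruction decodes bytes rather than taking strings; this only models the indentation rule described above.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical model of the indentation rule; not the PR's DecodeInstruction.
public static class IndentSketch
{
    // Returns the indent to use for the current line; postAdjust is added
    // afterwards to obtain the indent for the lines that follow.
    private static int DisplayIndent(string mnemonic, int indent, out int postAdjust)
    {
        switch (mnemonic)
        {
            case "block":
            case "loop":
            case "if":
            case "try_table":
                postAdjust = 1;                  // body is one level deeper
                return indent;                   // opener stays at the current level
            case "else":
                postAdjust = 1;                  // else-body is indented again
                return Math.Max(indent - 1, 0);  // align with the matching if
            case "end":
                postAdjust = 0;
                return Math.Max(indent - 1, 0);  // align with the block opener
            default:
                postAdjust = 0;
                return indent;
        }
    }

    public static string Format(IEnumerable<string> mnemonics)
    {
        var sb = new StringBuilder();
        int indent = 0;
        foreach (string m in mnemonics)
        {
            indent = DisplayIndent(m, indent, out int postAdjust);
            sb.Append(' ', indent * 2).Append(m).Append('\n');
            indent += postAdjust;
        }
        return sb.ToString();
    }
}
```

With this rule, `if`, `else`, and `end` of the same construct print at the same column, and their bodies print one level deeper.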
Update GetSectionData signature from returning BlobReader to accepting Action<BlobReader>, matching the interface change.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Comment on lines +1 to 3

    using System;
    using System.Reflection.Metadata;
    using System.Reflection.PortableExecutable;
Comment on lines 482 to 488

            byte[] imageBytes = File.ReadAllBytes(path);
            _peReader = new PEReader(new MemoryStream(imageBytes));
        }

        public PEReader ImageReader => _peReader;
        public void GetSectionData(int relativeVirtualAddress, Action<BlobReader> action) => action(_peReader.GetSectionData(relativeVirtualAddress).GetReader());

        public MetadataReader MetadataReader => _peReader.GetMetadataReader();
Comment on lines +980 to +986

    private string ReadHeapType()
    {
        byte b = _code[_offset];
        // Abstract heap types are encoded as single bytes
        switch (b)
        {
            case 0x73: _offset++; return "nofunc";
Comment on lines +170 to +177

    while (offset < imageSpan.Length)
    {
        byte sectionId = imageSpan[offset++];
        uint sectionSize = ReadLebU32(imageSpan, ref offset);
        int sectionEnd = offset + (int)sectionSize;

        if (sectionEnd > imageSpan.Length)
            throw new BadImageFormatException($"WASM section {sectionId} size extends beyond image boundary");
Comment on lines +493 to +497

    var sections = ImmutableArray.CreateBuilder<WebcilSectionHeader>(header.CoffSections);

    for (int i = 0; i < header.CoffSections; i++)
    {
        ReadOnlySpan<byte> span = image.AsSpan((int)(sectionDirectoryOffset + (i * SectionSize)));
Comment on lines +311 to +318

    public int GetOffset(int rva)
    {
        foreach (var section in _sections)
        {
            if ((uint)rva >= section.VirtualAddress && (uint)rva < section.VirtualAddress + section.VirtualSize)
            {
                uint offset = (uint)rva - section.VirtualAddress;
                if (offset >= section.SizeOfRawData)
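The excerpt above is cut off, so the following is only a sketch of one way an RVA-to-file-offset lookup could be completed, not the PR's code. `SectionSketch`, `RvaSketch`, and `TryGetOffset` are hypothetical, as is the `PointerToRawData` field, which does not appear in the excerpt. Treating an RVA that falls past `SizeOfRawData` as unmapped is likewise an assumption. The range test subtracts first so that `VirtualAddress + VirtualSize` is never computed and cannot wrap.

```csharp
using System;

// Hypothetical stand-in for a section header; only the fields used here.
public readonly struct SectionSketch
{
    public readonly uint VirtualAddress;
    public readonly uint VirtualSize;
    public readonly uint PointerToRawData; // assumed field, not shown in the excerpt
    public readonly uint SizeOfRawData;

    public SectionSketch(uint virtualAddress, uint virtualSize, uint pointerToRawData, uint sizeOfRawData)
    {
        VirtualAddress = virtualAddress;
        VirtualSize = virtualSize;
        PointerToRawData = pointerToRawData;
        SizeOfRawData = sizeOfRawData;
    }
}

public static class RvaSketch
{
    // Maps an RVA to a file offset, or returns false when no section backs it.
    public static bool TryGetOffset(ReadOnlySpan<SectionSketch> sections, int rva, out int fileOffset)
    {
        foreach (SectionSketch s in sections)
        {
            if ((uint)rva < s.VirtualAddress)
                continue;

            // Distance into the section; both limits are checked on the delta.
            uint delta = (uint)rva - s.VirtualAddress;
            if (delta >= s.VirtualSize || delta >= s.SizeOfRawData)
                continue;

            ulong offset = (ulong)s.PointerToRawData + delta;
            if (offset > int.MaxValue)
                break;

            fileOffset = (int)offset;
            return true;
        }

        fileOffset = -1;
        return false;
    }
}
```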
Comment on lines +543 to +544

    // Each passive segment: kind=1(byte) + size(LEB128) + bytes
    // The Webcil payload is in the second passive data segment.
Comment on lines +271 to +275

    uint paramCount = ReadLebU32(data, ref offset);
    byte[] paramTypes = new byte[paramCount];
    for (uint j = 0; j < paramCount; j++)
        paramTypes[j] = data[offset++];
    uint resultCount = ReadLebU32(data, ref offset);
Note
This PR was created with the assistance of GitHub Copilot.
Summary
Adds support for reading and dumping Webcil files in R2RDump, including a full WebAssembly bytecode disassembler.
Changes
- … GetSectionData to the metadata interface and implement across all types.
- … WasmDisassembler class that decodes WASM binary instructions into WAT text format, covering:
  - … try_table instruction with all catch clause kinds