Fix PS 1.4 instruction-size table and bound the bytecode walk#137
Open
BinqAdams wants to merge 1 commit intoNVIDIAGameWorks:mainfrom
Open
Fix PS 1.4 instruction-size table and bound the bytecode walk#137BinqAdams wants to merge 1 commit intoNVIDIAGameWorks:mainfrom
BinqAdams wants to merge 1 commit intoNVIDIAGameWorks:mainfrom
Conversation
PS 1.4 reuses opcodes 0x40 and 0x42 with explicit dst+src token pairs (3 dwords each) where PS 1.0-1.3 had implicit src derived from the dst register index (2 dwords each): Opcode Old table PS 1.0-1.3 PS 1.4 ------ --------- ---------- ------ 0x40 return 2 D3DSIO_TEXCOORD (2 dw) D3DSIO_TEXCRD (3 dw) 0x41 return 2 D3DSIO_TEXKILL (2 dw) D3DSIO_TEXKILL (2 dw) 0x42 return 2 D3DSIO_TEX (2 dw) D3DSIO_TEXLD (3 dw) The static SM 1.x table in CommonShader::getShaderInstructionSize was authored for PS 1.0-1.3 / VS 1.x and gives the wrong size for PS 1.4. The parser advances by 2 instead of 3, lands mid-instruction, treats a register-arg token as the next opcode, misaligns indefinitely, and walks past the actual D3DSIO_END token off the end of the bytecode buffer. Without PageHeap the over-read silently corrupts the heap freelist and surfaces as a delayed RtlpAllocateHeap AV; with PageHeap it AVs immediately on the guard page. Two changes in d3d9_commonshader.h: 1. PS 1.4 special case in getShaderInstructionSize. For m_majorVersion == 1 && m_minorVersion == 4, opcodes 0x40 and 0x42 return 3 instead of 2. Opcode 0x41 (TEXKILL) is unchanged at 2 dwords across PS versions; 0x43-0x4F are deprecated/reserved in PS 1.4 and shouldn't appear. Other opcodes fall through to the existing PS 1.0-1.3 / VS table. 2. Defense-in-depth bound in getShaderByteSize. 16K-dword (64 KB) cap on the walk. Real D3D9 shaders are at most a few KB; if the walk exceeds the cap without finding D3DSIO_END the bytecode is malformed or the instruction-size table doesn't recognize a token in this shader variant. Logs via bridge_util::Logger::err and returns the cap-bounded size so the caller's memcpy stays bounded - the server will reject the shader, but the client survives instead of walking off the end of the heap.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Fix a heap-corruption bug in
CommonShader::getShaderByteSizetriggered by any D3D9 application that creates Pixel Shader 1.4 shaders, and add a defense-in-depth bound on the bytecode walk.Motivation
PS 1.4 reuses opcodes
0x40and0x42with explicit dst+src token pairs (3 dwords each) where PS 1.0–1.3 had implicit src derived from the dst register index (2 dwords each):0x40return 2D3DSIO_TEXCOORD(2 dw)D3DSIO_TEXCRD(3 dw)0x41return 2D3DSIO_TEXKILL(2 dw)D3DSIO_TEXKILL(2 dw)0x42return 2D3DSIO_TEX(2 dw)D3DSIO_TEXLD(3 dw)The static SM 1.x table in
CommonShader::getShaderInstructionSize(bridge/src/client/d3d9_commonshader.h) was authored for PS 1.0–1.3 / VS 1.x and gives the wrong size for PS 1.4. The parser advances by 2 instead of 3, lands mid-instruction, treats a register-arg token as the next opcode, misaligns indefinitely, and walks past the actualD3DSIO_ENDtoken off the end of the bytecode buffer. Without PageHeap the over-read silently corrupts the heap freelist and surfaces as a delayedRtlpAllocateHeapAV; with PageHeap (gflags /full) it AVs immediately on the guard page.What Changed
Two changes in
bridge/src/client/d3d9_commonshader.h:getShaderInstructionSize. Form_majorVersion == 1 && m_minorVersion == 4, opcodes0x40and0x42return 3 instead of 2. Opcode0x41(TEXKILL) is unchanged at 2 dwords across PS versions;0x43–0x4Fare deprecated/reserved in PS 1.4 and shouldn't appear. Other opcodes fall through to the existing PS 1.0–1.3 / VS table.getShaderByteSize. 16K-dword (64 KB) cap on the walk. Real D3D9 shaders are at most a few KB; if the walk exceeds the cap without findingD3DSIO_ENDthe bytecode is malformed or the instruction-size table doesn't recognize a token in this shader variant. Logs viabridge_util::Logger::errand returns the cap-bounded size so the caller'smemcpystays bounded — the server will reject the shader, but the client survives instead of walking off the end of the heap.Net diff:
bridge/src/client/d3d9_commonshader.honly, +59 / -3.Testing
Reproducer: any D3D9 application that creates a PS 1.4 shader containing
D3DSIO_TEXCRD(0x40) orD3DSIO_TEXLD(0x42). With the old table,CreatePixelShadercorrupts heap metadata adjacent to the bytecode buffer. Under PageHeap (gflags /full), the over-read AVs immediately on the guard page following the buffer; without PageHeap, the corruption surfaces delayed inRtlpAllocateHeapafter thousands of unrelated allocations.With this fix the PS 1.4 walk terminates correctly at
D3DSIO_ENDand the buffer-boundedmemcpyno longer over-reads. The 16K-dword cap activates on malformed bytecode and on shader variants the table doesn't yet handle: in those cases the log line names the failing function and the server-side compilation fails cleanly.