Uninitialized device-memory read passed into CUB radix sort #7

@TarekIbnZiad

Environment

  • cuJSON commit: 2ac7d3dcd7ad1ff64ebdb14022bf94c59b3b4953 (branch master)
  • OS: Ubuntu 22.04.5 LTS
  • GPU: NVIDIA A40 (Ampere, sm_86)
  • NVIDIA driver: 590.48.01
  • CUDA toolkit: 13.1 (nvcc V13.1.115)

Component

cuJSON sort integration (parse_standard_json.cu:1584-1611, parse_json_lines.cu:1070-1093). Reproduces through all four entry points: cuJSONIterator, cuJSONLinesIterator, parse_json_lines, and parse_standard_json.

Severity

Medium (CWE-908, Use of Uninitialized Resource).

Description

cuJSON passes partially initialized key/value buffers to the Thrust/CUB sorting stage. While sorting structural positions, cub::DeviceRadixSortSingleTileKernel loads the uninitialized bytes at the tail of those buffers.

Root cause

The buffers handed to the radix sort are sized for a padded element count, but only the populated structural/pair elements are initialized. CUB sorts the full allocated range, so it loads the uninitialized tail. This reproduces across all four entry points because they share the same sort integration.

Call path:

harness LLVMFuzzerTestOneInput
  -> parse_standard_json / parse_json_lines / iterator entry points
  -> CUB DeviceRadixSortSingleTileKernel   (uninitialized read of sort buffer tail)
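
The pattern described under "Root cause" can be sketched in isolation. This is a hypothetical reduction, not code taken from cuJSON: the function name sort_full_allocation, the buffer names, and the sizes (5 populated elements in an 8-element allocation) are all assumptions chosen for illustration. It uses thrust::sort_by_key on raw device pointers, which for 32-bit integer keys is expected to dispatch to CUB's radix sort.

```cuda
#include <algorithm>
#include <cstdint>
#include <cstdio>
#include <numeric>
#include <vector>
#include <cuda_runtime.h>
#include <thrust/execution_policy.h>
#include <thrust/sort.h>

// Hypothetical reduction of the reported pattern (not cuJSON code):
// only keys.size() elements are written, but the sort covers `padded` elements.
// Assumes padded >= keys.size(). Error checking is omitted for brevity.
std::vector<uint32_t> sort_full_allocation(const std::vector<uint32_t>& keys,
                                           size_t padded) {
    const size_t populated = keys.size();
    std::vector<uint32_t> vals(populated);
    std::iota(vals.begin(), vals.end(), 0u);  // payload: original index

    uint32_t *d_keys = nullptr, *d_vals = nullptr;
    cudaMalloc(&d_keys, padded * sizeof(uint32_t));
    cudaMalloc(&d_vals, padded * sizeof(uint32_t));

    // Only the populated prefix is initialized; the tail stays indeterminate.
    cudaMemcpy(d_keys, keys.data(), populated * sizeof(uint32_t),
               cudaMemcpyHostToDevice);
    cudaMemcpy(d_vals, vals.data(), populated * sizeof(uint32_t),
               cudaMemcpyHostToDevice);

    // Flawed: the range end is the allocation size, not the populated count,
    // so the sort reads the uninitialized tail of both buffers.
    thrust::sort_by_key(thrust::device, d_keys, d_keys + padded, d_vals);

    std::vector<uint32_t> out(padded);
    cudaMemcpy(out.data(), d_keys, padded * sizeof(uint32_t),
               cudaMemcpyDeviceToHost);
    cudaFree(d_keys);
    cudaFree(d_vals);
    return out;
}

int main() {
    // The printed values depend on whatever the tail happened to contain.
    const std::vector<uint32_t> out = sort_full_allocation({42, 7, 19, 3, 25}, 8);
    for (uint32_t k : out) std::printf("%u ", k);
    std::printf("\n");
    return 0;
}
```

The result is always sorted, but the three tail values are arbitrary and can land anywhere among the real keys. Running a build of this sketch under compute-sanitizer --tool=initcheck would be expected to report uninitialized reads similar to those in the evidence section, though the exact kernel names and counts may differ.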

Impact

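The sort operates on indeterminate tail values, so the sorted structural positions can differ between runs. This can corrupt the parse tree in all affected entry points (standard JSON, JSON Lines, and the iterators). Malformed inputs can produce nondeterministic parse results and device reads that compute-sanitizer flags.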

Reproduction

CUDA_HOME=${CUDA_HOME:-/usr/local/cuda}
nvcc -std=c++17 -arch=native -O2 -g -lineinfo \
  -I<cujson-src> -I${CUDA_HOME}/include \
  llm_harness_cuJSONIterator.cu harness_afl_main.cpp \
  --compiler-bindir g++-13 -L${CUDA_HOME}/lib64 -lcudart \
  -o /tmp/cujson_cub_sort_initcheck.bin

compute-sanitizer --tool=initcheck --report-api-errors=no --error-exitcode 99 \
  /tmp/cujson_cub_sort_initcheck.bin trigger_cuJSONIterator.bin

Sanitizer evidence

========= Uninitialized __global__ memory read of size 4 bytes
========= Uninitialized __global__ memory read of size 1 bytes  (x6)
=========   Uninitialized access to 0x7f3cf6400480 on access by cudaMemcpy source
=========   Uninitialized access to 0x1938000420 on access by cudaMemcpy source
========= ERROR SUMMARY: 9 errors

Suggested fix

Track the exact initialized element count passed to the sorting kernels, zero-fill padded storage, and avoid sorting beyond the populated structural/pair arrays.
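
A minimal sketch of what this could look like, assuming 32-bit keys and values and calling cub::DeviceRadixSort::SortPairs directly. The helper name sort_populated_prefix, the SortedPairs struct, and the buffer layout are hypothetical and do not reflect cuJSON's actual code. The sketch zero-fills the padded storage and passes only the populated count as num_items.

```cuda
#include <cstdint>
#include <cstdio>
#include <cstdlib>
#include <vector>
#include <cuda_runtime.h>
#include <cub/cub.cuh>

#define CUDA_CHECK(call)                                               \
    do {                                                               \
        cudaError_t err_ = (call);                                     \
        if (err_ != cudaSuccess) {                                     \
            std::fprintf(stderr, "%s failed: %s\n", #call,             \
                         cudaGetErrorString(err_));                    \
            std::exit(1);                                              \
        }                                                              \
    } while (0)

struct SortedPairs { std::vector<uint32_t> keys, vals; };

// Hypothetical helper (not cuJSON code). Assumes keys.size() == vals.size()
// and padded >= keys.size(). Returns the full padded output buffers.
SortedPairs sort_populated_prefix(const std::vector<uint32_t>& keys,
                                  const std::vector<uint32_t>& vals,
                                  size_t padded) {
    const int count = static_cast<int>(keys.size());  // exact initialized count
    const size_t bytes = padded * sizeof(uint32_t);
    const size_t used = static_cast<size_t>(count) * sizeof(uint32_t);

    // Order: keys_in, keys_out, vals_in, vals_out.
    uint32_t* buf[4] = {nullptr, nullptr, nullptr, nullptr};
    for (auto& p : buf) {
        CUDA_CHECK(cudaMalloc(&p, bytes));
        CUDA_CHECK(cudaMemset(p, 0, bytes));  // zero-fill padded storage
    }
    CUDA_CHECK(cudaMemcpy(buf[0], keys.data(), used, cudaMemcpyHostToDevice));
    CUDA_CHECK(cudaMemcpy(buf[2], vals.data(), used, cudaMemcpyHostToDevice));

    // First call only queries the temporary storage size.
    void* d_temp = nullptr;
    size_t temp_bytes = 0;
    CUDA_CHECK(cub::DeviceRadixSort::SortPairs(d_temp, temp_bytes, buf[0], buf[1],
                                               buf[2], buf[3], count));
    CUDA_CHECK(cudaMalloc(&d_temp, temp_bytes));
    // Second call sorts exactly `count` pairs; the padded tail is never read.
    CUDA_CHECK(cub::DeviceRadixSort::SortPairs(d_temp, temp_bytes, buf[0], buf[1],
                                               buf[2], buf[3], count));

    SortedPairs out{std::vector<uint32_t>(padded), std::vector<uint32_t>(padded)};
    CUDA_CHECK(cudaMemcpy(out.keys.data(), buf[1], bytes, cudaMemcpyDeviceToHost));
    CUDA_CHECK(cudaMemcpy(out.vals.data(), buf[3], bytes, cudaMemcpyDeviceToHost));

    CUDA_CHECK(cudaFree(d_temp));
    for (auto& p : buf) CUDA_CHECK(cudaFree(p));
    return out;
}

int main() {
    const SortedPairs r =
        sort_populated_prefix({42, 7, 19, 3, 25}, {0, 1, 2, 3, 4}, 8);
    for (size_t i = 0; i < r.keys.size(); ++i)
        std::printf("key=%u val=%u\n", r.keys[i], r.vals[i]);
    return 0;
}
```

Either measure alone should address the sort-side reads in this sketch: limiting num_items keeps the kernel off the tail, and the zero-fill guards any later copy that still covers the full allocation. Whether both are needed in cuJSON depends on how the downstream stages size their reads, which this sketch does not model.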

Attachments

The following files from this finding are attached:

  • llm_harness_cuJSONIterator.cu
  • harness_afl_main.cpp
  • trigger_cuJSONIterator.bin
  • initcheck.log

Archive.zip
