Skip to content

Data race on shared_error in validate_expand_MathAPI_new2 / validate_expand #1

@TarekIbnZiad

Description

@TarekIbnZiad

Data race on shared_error in validate_expand_MathAPI_new2 / validate_expand

Environment

  • cuJSON commit: 2ac7d3dcd7ad1ff64ebdb14022bf94c59b3b4953 (branch master)
  • OS: Ubuntu 22.04.5 LTS
  • GPU: NVIDIA A40 (Ampere, sm_86)
  • NVIDIA driver: 590.48.01
  • CUDA toolkit: 13.1 (nvcc V13.1.115)

Component

cuJSON — parse_standard_json.cu, parse_json_lines.cu

Severity

Medium (CWE-362, Concurrent Execution Using Shared Resource with Improper Synchronization) — correctness issue with potential security implications.

Description

The bracket-validation kernels validate_expand_MathAPI_new2 (parse_standard_json.cu) and validate_expand (parse_json_lines.cu) contain a shared-memory data race on the shared_error flag. The code uses a non-atomic shared_error |= 1 instead of atomicOr(&shared_error, 1u), creating a read/write race between threads in the same block.

The bug is deterministic: any JSON/JSONL input with bracket structures triggers it, and deeper nesting increases the number of concurrent threads and the probability of an observable effect.

Root cause

Both kernels use a shared-memory error flag:

__shared__ uint32_t shared_error;
// ...
shared_error |= 1; // non-atomic read-modify-write

Multiple threads in the same warp/block execute shared_error |= 1 concurrently. |= is a non-atomic read-modify-write (read, OR, write back). For two concurrent threads:

  1. Thread A reads shared_error = 0
  2. Thread B reads shared_error = 0
  3. Thread A writes shared_error = 1
  4. Thread B writes shared_error = 1 (lost update)

For |= 1 the lost update is benign (same value), but it is still undefined behavior in the CUDA memory model. More importantly, the race also exists between the error-setting writes and subsequent reads of shared_error (read at offset +0x260 vs write at +0x3d0), so a thread can observe stale state and miss an error condition. The earlier kernel variant validate_expand_MathAPI_new correctly used atomicOr(&shared_error, 1).

Affected locations: validate_expand_MathAPI_new2 (parse_standard_json.cu, ~lines 1491–1542) and validate_expand (parse_json_lines.cu, ~line 1009).

Impact

  • Correctness: a thread can read shared_error before another thread's write is visible, potentially missing a bracket-pairing error and allowing malformed JSON to pass validation.
  • Security: if pairError is not set when it should be, the parser continues with incorrectly paired brackets. Downstream iterators that trust pair_pos validity can then produce incorrect results or trigger out-of-bounds accesses.

Reproduction

Build the harness and run racecheck:

CUDA_HOME=${CUDA_HOME:-/usr/local/cuda}
nvcc -std=c++17 -arch=native -O2 -g -lineinfo \
  -I<cujson-src> -I${CUDA_HOME}/include \
  llm_harness_parse_standard_json.cu harness_afl_main.cpp \
  --compiler-bindir g++-13 -L${CUDA_HOME}/lib64 -lcudart \
  -o /tmp/cujson_validate_expand_racecheck.bin

compute-sanitizer --tool=racecheck --report-api-errors=no --error-exitcode 99 \
  /tmp/cujson_validate_expand_racecheck.bin trigger_input.bin

trigger_input.bin is 201 bytes: 100 nested [ brackets, then 1, then 100 ] brackets ([[[...[1]...]]]).

Expected: no race conditions. Actual: a racecheck hazard on shared_error.

Sanitizer evidence

parse_standard_json (racecheck):

Warning: Race between Read at validate_expand_MathAPI_new2+0x260
  and Write at validate_expand_MathAPI_new2+0x3d0 [4 hazards]
  and Write at validate_expand_MathAPI_new2+0x820 [4 hazards]

Error: Race between Write at validate_expand_MathAPI_new2+0x3d0
  and Read at validate_expand_MathAPI_new2+0x770 [4 hazards]

RACECHECK SUMMARY: 2 hazards (1 error, 1 warning)

parse_json_lines (racecheck):

Error: Race between Read at validate_expand+0x260
  and Write at validate_expand+0x3d0 [12 hazards]

RACECHECK SUMMARY: 1 hazard (1 error, 0 warnings)

Cross-tool notes: iGUARD does not flag this (it targets global-memory cross-SM races, not shared-memory intra-block RMW); memcheck is N/A (race, not OOB); synccheck reports 0 errors.

Suggested fix

Replace the non-atomic updates with atomicOr at all sites that set shared_error:

// validate_expand_MathAPI_new2 (parse_standard_json.cu, ~5 locations):
atomicOr(&shared_error, 1u);

// validate_expand (parse_json_lines.cu, ~1 location):
atomicOr(&shared_error, 1u);

Attachments

The following files from this finding are attached:

  • llm_harness_parse_standard_json.cu
  • harness_afl_main.cpp
  • trigger_input.bin
  • racecheck.log

Archive.zip

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions