Skip to content

Feat/shrink cpu byte alu#644

Open
jotabulacios wants to merge 22 commits into
mainfrom
feat/shrink-cpu-byte-alu
Open

Feat/shrink cpu byte alu#644
jotabulacios wants to merge 22 commits into
mainfrom
feat/shrink-cpu-byte-alu

Conversation

@jotabulacios
Copy link
Copy Markdown
Collaborator

@jotabulacios jotabulacios commented Jun 2, 2026

This PR implements the CPU rework specified in #624.

The CPU table is reduced from ~76 to ~39 columns by moving word
instructions to a dedicated table and collapsing the per-chip ALU/memory
buses into two shared buses.

Summary

  • Dense packed_decode.
    The per-opcode one-hot ALU selectors and the *_ext_bit / arg1
    columns were removed.
    Each CPU row now carries a single packed_decode field containing
    flags, register indices, and the alu_flags / mem_flags bytes. This
    value is checked against the DECODE table.

  • Unified buses.
    The per-chip Lt / Mul / Dvrm / Shift buses, together with the
    load/store path, were collapsed into two shared buses:

    • ALU[out; in1, in2, flags]
    • MEMORY[out; timestamp, address, value, flags]

    The CPU now dispatches every arithmetic, comparison, and shift operation
    through ALU, and every memory operation through MEMORY.

  • BYTE_ALU lookup.
    Bitwise AND / OR / XOR now use a single BYTE_ALU[opsel, X, Y] -> out lookup served by the BITWISE table.

  • Word instructions delegated to CPU32.
    *W instructions are delegated to a new CPU32 table, which has its
    own decode and 32-bit execution logic.
    In the main CPU table, these rows now only delegate execution, advance
    the PC, and send the corresponding CPU32 lookup.

  • New chips.
    Added the following chips:

    • CPU32: handles word instructions.
    • EQ: handles BEQ / BNE.
    • BYTEWISE: handles AND / OR / XOR through BYTE_ALU.
    • STORE: split from MEMW; MEMORY now dispatches between LOAD and
      STORE.

Comment thread prover/src/constraints/cpu.rs Outdated
@diegokingston
Copy link
Copy Markdown
Collaborator

/bench 5

@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 2, 2026

Benchmark — fib_iterative_8M (median of 3)

Table parallelism: auto (cores / 3)

Metric main PR Δ
Peak heap 51805 MB 35320 MB -16485 MB (-31.8%) 🟢
Prove time 24.644s 18.588s -6.056s (-24.6%) 🟢

🎉 Improvement detected — heap or time decreased by more than 5%.

✅ Low variance (time: 2.3%, heap: 0.5%)

Commit: 0a10f13 · Baseline: cached · Runner: self-hosted bench

@jotabulacios jotabulacios marked this pull request as ready for review June 2, 2026 19:14
@github-actions
Copy link
Copy Markdown

github-actions Bot commented Jun 2, 2026

Codex Code Review

Found one issue.

High - CPU32 padding rows can emit disconnected bus operations

prover/src/tables/cpu32.rs uses READ_REGISTER1, READ_REGISTER2, WRITE_REGISTER, and ALU directly as bus multiplicities, but the constraints only bit-check those flags and MU; they never enforce flag => MU at prover/src/tables/cpu32.rs. Since padded CPU32 rows have MU = 0, a malicious witness can set one of these flags on a row that is not connected to a main CPU CPU32 delegation or DECODE lookup, then emit MEMW/ALU interactions from an otherwise inactive row.

This can inject extra register reads/writes through MEMW without a corresponding executed *W instruction, weakening execution soundness. Add constraints like (1 - MU) * flag = 0 for READ_REGISTER1, READ_REGISTER2, WRITE_REGISTER, ALU, ADD, and SUB or make all CPU32 bus multiplicities gated by MU.

Verification note: I tried to run the focused prover tests, but the container’s Rust toolchain failed before build because /home/runner/.rustup/tmp is read-only.

Comment thread prover/src/constraints/cpu.rs Outdated
let carry = match self.carry_idx {
0 => self.compute_carry_0(step),
1 => self.compute_carry_1(step),
_ => panic!("Invalid carry index"),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Low — use unreachable! instead of panic!

The _ => panic!("Invalid carry index") arm is dead code: carry_idx is already validated as <= 1 by the assert! in new. Using panic! here signals a recoverable runtime error; unreachable! communicates the correct intent (this is a logic bug, not a runtime condition) and may allow the compiler to eliminate the arm entirely in release builds.

Suggested change
_ => panic!("Invalid carry index"),
_ => unreachable!("carry_idx validated <= 1 at construction"),

Comment thread prover/src/tables/eq.rs
alu_op::AND => self.a & self.b,
alu_op::OR => self.a | self.b,
alu_op::XOR => self.a ^ self.b,
other => panic!("BYTEWISE only handles AND/OR/XOR, got opcode {other}"),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Low — panic! in trace-generation path aborts proof creation ungracefully

This panic! (and the identical one at line 81) fires if the trace builder passes an unexpected opcode. Since this is inside trace generation (not constraint evaluation), a panic would crash the prover process rather than returning a recoverable error.

Consider returning Result<_, ProverError> from compute_res / collect_bitwise_ops and propagating the error up through the trace builder, so a malformed trace produces a proper error instead of an abort.

Comment thread prover/src/constraints/cpu.rs
@claude
Copy link
Copy Markdown

claude Bot commented Jun 2, 2026

Code Review — Feat/shrink-cpu-byte-alu (#644)

This is a significant, well-structured CPU table rework. The reduction from ~76 to ~39 columns, the unified ALU/MEMORY buses, and the delegation of *W instructions to CPU32 are all sound architectural choices. The new chips (EQ, BYTEWISE, CPU32, STORE) are clean and well-documented.

Constraint Completeness

The overall constraint set is correct and complete for valid traces:

  • All 12 IS_BIT flags cover the top-level flag columns.
  • The six word_instr · {MEMORY,BRANCH,ECALL,...} = 0 decode mutexes are present.
  • ADD/SUB fast-path templates, the unified ALU/MEMORY bus fingerprints, and the DECODE/BRANCH/ECALL lookups are correctly wired.

Issues Found

Severity Location Issue
Medium constraints/cpu.rs:413 BranchRvdConstraint omits carry from low→high word for JAL/JALR return address — not cryptographically enforced
Low constraints/cpu.rs:46 BRANCH_COND removed from IS_BIT enforcement; soundness is compositional and fragile under future refactors
Low tables/eq.rs:317 eq column has no explicit IS_BIT constraint; soundness depends on the ZERO bus receiver's output domain
Low constraints/cpu.rs:546 _ => panic!(...) should be unreachable!() — it is dead code guarded by an assert! at construction
Low tables/bytewise.rs:69,81 panic! in trace-generation path aborts the prover ungracefully on invalid opcode; should return Result

Inline comments posted for each issue.

Minor Notes

  • MemFlagsBitConstraint (non-MEMORY rows) correctly covers mem_flags as JALR. The dependency chain is non-obvious but sound — a brief cross-reference comment pointing to the mutex would help future readers.
  • NextPcAddConstraint's compute_carry_1 correctly propagates carry_0 for next_pc, but BranchRvdConstraint doesn't mirror this for rvd (the medium issue above).

@jotabulacios
Copy link
Copy Markdown
Collaborator Author

/bench

@jotabulacios
Copy link
Copy Markdown
Collaborator Author

/claude

@jotabulacios
Copy link
Copy Markdown
Collaborator Author

/bench

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants