
feat: Zero-Allocation Sequential Scan Optimization#21

Merged
poyrazK merged 9 commits into main from perf/zero-allocation-scan
Apr 11, 2026

Conversation

@poyrazK
Owner

@poyrazK poyrazK commented Apr 10, 2026

Summary

This PR implements a zero-allocation sequential scan path using a TupleView architecture, closing the performance gap between CloudSQL and SQLite.

Key Changes:

  • TupleView: Lightweight view into pinned BufferPool pages, eliminating row-level allocations.
  • Fast-Path MVCC: Bypasses complex visibility checks for non-transactional scans.
  • Iterator Caching: Caches PageHeader during page transitions to minimize memcpy.
  • Operator Integration: Pass-through support for the new path in Project, Filter, and Limit operators.
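The TupleView idea in the first bullet can be sketched as follows. This is a minimal illustration, not the actual CloudSQL type: it assumes a flat payload of fixed-width int32 columns, whereas the real payload format and `common::Value` decoding are richer.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

// Simplified sketch of the TupleView idea: the view borrows a pointer into a
// pinned page instead of copying the row out per tuple.
struct TupleView {
    const uint8_t* payload_data = nullptr;  // borrowed; valid only while the page stays pinned
    size_t num_columns = 0;
    uint64_t xmin = 0, xmax = 0;            // MVCC bounds carried alongside the payload

    // Lazy decode: nothing is deserialized until a column is actually read.
    int32_t get_value(size_t col_index) const {
        assert(payload_data && col_index < num_columns);
        int32_t v;
        std::memcpy(&v, payload_data + col_index * sizeof(int32_t), sizeof(v));
        return v;
    }

    // Escape hatch: copy the row out when it must outlive the pinned page.
    std::vector<int32_t> materialize() const {
        std::vector<int32_t> out(num_columns);
        for (size_t i = 0; i < num_columns; ++i) out[i] = get_value(i);
        return out;
    }
};

// Build a view over an in-memory "page" without allocating per row.
inline TupleView view_row(const uint8_t* page, size_t row, size_t num_columns) {
    TupleView v;
    v.payload_data = page + row * num_columns * sizeof(int32_t);
    v.num_columns = num_columns;
    return v;
}
```

The speedup comes from the fact that iterating N rows performs zero heap allocations; only columns the query actually touches are decoded.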

Performance Impact:

  • Scan Throughput: Increased from 5.5M rows/s to 181.4M rows/s (~33x speedup).
  • SQLite Comparison: CloudSQL is now 9x faster than SQLite in raw scanning.

Summary by CodeRabbit

  • New Features

    • Major sequential-scan throughput improvements via a new zero-allocation tuple-view scan path, faster non-transactional scan fast-path, and lightweight view-based projection/filter support.
  • Documentation

    • Added a Performance section and detailed benchmark report describing ~9x sequential-scan improvement vs SQLite and updated roadmap for analytical optimizations.
  • Tests

    • New tests validating view-based iteration, projection/filter behavior, and deleted-row skipping.

poyrazK added 3 commits April 10, 2026 23:04
- Added TupleView for direct page access in HeapTable.
- Implemented Iterator::next_view with PageHeader caching.
- Optimized SeqScanOperator with no_txn_ fast-path MVCC.
- Added next_view pass-through for Project, Filter, and Limit operators.
Integrated validation logic and items_per_second tracking to compare with SQLite.
- Updated README.md with 181M rows/s scan performance.
- Updated SQLite comparison docs with detailed analysis and future roadmap.
- Updated Phase 8 baseline.
@coderabbitai

coderabbitai bot commented Apr 10, 2026

📝 Walkthrough

Walkthrough

This PR introduces a zero-allocation, page-resident tuple view (HeapTable::TupleView) and a parallel iterator API (Iterator::next_view) plus executor Operator::next_view plumbing, implements view-based scan paths and iterator page-caching optimizations, updates benchmarks to exercise the new path, and adds related performance documentation.

Changes

Cohort / File(s) Summary
Documentation
README.md, docs/performance/SQLITE_COMPARISON.md, docs/phases/PHASE_8_ANALYTICS.md
Added a "Performance" section and benchmark table to README; updated sequential-scan benchmark numbers, analysis, and roadmap in performance docs.
Storage headers
include/storage/heap_table.hpp
Added public HeapTable::TupleView (payload pointers, schema, MVCC bounds, optional column_mapping), get_value()/materialize() declarations, and Iterator::next_view() declaration plus iterator caching fields.
Executor headers
include/executor/operator.hpp
Added virtual Operator::next_view(TupleView&) and declarations for overrides in SeqScan/Filter/Project/Limit; added SeqScanOperator::no_txn_ and ProjectOperator projection bookkeeping fields.
Storage implementation
src/storage/heap_table.cpp
Implemented TupleView::get_value() and materialize(), Iterator::next_view() (zero-copy page-resident views), iterator page/header caching, and moved cached state in iterator move operations.
Executor implementation
src/executor/operator.cpp
SeqScanOperator::open() sets no_txn_; next() gains no-txn fast-path (xmax==0); added next_view() implementations for SeqScan/Filter/Project/Limit and simple-projection detection/column mapping in Project::init().
Benchmarks
benchmarks/sqlite_comparison_bench.cpp
Replaced original scan benchmark with BM_CloudSQL_ScanMaterialized; added BM_CloudSQL_ScanView which iterates via next_view() and validates row counts with runtime checks and error paths.
Tests
tests/cloudSQL_tests.cpp
Added TupleView/next_view unit tests exercising scan, deleted-row skipping, projection mapping, computed-projection behavior, schema population, and filter operator view-path.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant SeqScan as SeqScanOperator
    participant Iter as HeapTable::Iterator
    participant Page as PinnedPage
    participant View as TupleView

    Client->>SeqScan: open()  (sets no_txn_ if no txn)
    Client->>SeqScan: next_view(out_view)
    SeqScan->>Iter: next_view(out_view)
    Iter->>Iter: ensure cached_buffer_ & cached_header_
    Iter->>Page: pin/read page & locate record
    Page-->>Iter: payload_data + payload_len + mvcc
    Iter->>View: populate TupleView (payload, schema, xmin/xmax)
    Iter-->>SeqScan: return true/false
    SeqScan->>SeqScan: apply visibility (no_txn_ fast-path or MVCC)
    alt column accessed
        SeqScan->>View: get_value(col_index)
        View->>View: lazy-deserialize column from payload_data
        View-->>SeqScan: common::Value
    end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~80 minutes

Suggested labels

documentation, enhancement

Poem

🐰 I hop through pinned pages, a nimble little scout,
I peek a TupleView where the bytes are tucked about.
No copies weigh my paws, just lazy-deserialize delight,
Rows stream like carrots—crisp, fast, and light! 🥕✨

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage (⚠️ Warning): docstring coverage is 40.54%, below the required threshold of 80.00%. Resolution: write docstrings for the functions that are missing them.
✅ Passed checks (2 passed)
  • Description Check (✅ Passed): check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check (✅ Passed): the title 'feat: Zero-Allocation Sequential Scan Optimization' directly and accurately describes the main change: a zero-allocation sequential scan path built on the TupleView architecture.



@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

🧹 Nitpick comments (2)
include/storage/heap_table.hpp (1)

150-155: Document the TupleView lifetime explicitly.

out_view points into the iterator's currently pinned page, so it becomes invalid as soon as the iterator advances to another page, closes, or is destroyed. That contract is important enough to spell out here; otherwise callers can accidentally retain a dangling view.

📝 Suggested doc tweak
         /**
-         * `@brief` Phase 1 optimization: Yields a zero-allocation TupleView
+         * `@brief` Phase 1 optimization: Yields a zero-allocation TupleView
          * `@param`[out] out_view The view struct to populate
+         * `@note` The returned view is only valid until the next iterator advance,
+         *       page transition, iterator close, or iterator destruction.
          * `@return` true if a record was successfully retrieved, false on EOF
          */
         bool next_view(TupleView& out_view);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/storage/heap_table.hpp` around lines 150 - 155, Update the
next_view(TupleView& out_view) documentation to explicitly state TupleView’s
lifetime: explain that out_view points into the iterator’s currently pinned page
and therefore becomes invalid as soon as the iterator advances to a different
page, is closed, or is destroyed; mention callers must copy data out of the
TupleView if they need it beyond the iterator’s current position and reference
the TupleView type and next_view method in the comment so readers can locate the
contract easily.
benchmarks/sqlite_comparison_bench.cpp (1)

237-247: Validate at least one decoded field.

This only proves that next_view() produced num_rows iterations. It will not catch broken decoding, schema mismatches, or pass-through operators returning the wrong columns. Consider pausing timing and checking one or two view.get_value(...) results so the benchmark cannot ratify a semantically broken fast path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@benchmarks/sqlite_comparison_bench.cpp` around lines 237 - 247, The loop only
counts iterations; update the benchmark to pause timing after the loop (use
state.PauseTiming()), decode at least one field from a TupleView and verify its
value (use HeapTable::TupleView and view.get_value(column_index) or equivalent)
to detect broken decoding/schema mismatches, then resume timing if needed
(state.ResumeTiming()) or call state.SkipWithError on mismatch; ensure you
reference root->next_view(view) and perform a simple non-empty or expected-value
check on view.get_value(...) before accepting the result.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/performance/SQLITE_COMPARISON.md`:
- Around line 31-33: The docs overstate TupleView's skipped-column behavior;
update the second bullet in docs/performance/SQLITE_COMPARISON.md to reflect
that TupleView provides lazy deserialization but does not yet avoid scanning
preceding fields when accessing a later column: mention TupleView and that
current heap_table.cpp payload walking (see TupleView behavior in
heap_table.cpp) still advances from column 0 up to col_index, so skipped later
columns may still incur cost from scanning earlier fields; reword to something
like “Lazy deserialization: values are decoded only when accessed, reducing work
for read columns, but TupleView currently still walks prior fields up to
col_index so later-column access still pays the cost of preceding fields.”

In `@src/executor/operator.cpp`:
- Around line 928-931: ProjectOperator::next_view currently returns the child
view unchanged (child_->next_view(out_view)), which breaks projection semantics;
change it to fetch the child's view into a temporary view, then build a
projected TupleView matching output_schema() by mapping/reordering columns and
evaluating any projection_expressions_ or computed expressions (use whatever
expression-evaluation helper the class has or add a
ProjectOperator::apply_projection(const HeapTable::TupleView& src,
HeapTable::TupleView& dst) helper), and return that projected view via out_view;
ensure you do not expose fields outside the advertised output_schema().
- Around line 933-945: FilterOperator::next_view currently returns the first
child view unconditionally, ignoring condition_, so change the loop to evaluate
the predicate and only return true when the condition passes: after
child_->next_view(out_view) attempt to evaluate condition_ against out_view
(materialize the TupleView first if the predicate requires materialized values),
and if the predicate returns true then return true, otherwise continue the while
loop until child_ is exhausted and return false; ensure you use the existing
condition_ evaluation API (or call TupleView::materialize()/equivalent) rather
than skipping the check.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 49bc0cc7-45ec-4956-a6f0-628e3bf0f9d4

📥 Commits

Reviewing files that changed from the base of the PR and between 9fb125f and fe0a76f.

📒 Files selected for processing (8)
  • README.md
  • benchmarks/sqlite_comparison_bench.cpp
  • docs/performance/SQLITE_COMPARISON.md
  • docs/phases/PHASE_8_ANALYTICS.md
  • include/executor/operator.hpp
  • include/storage/heap_table.hpp
  • src/executor/operator.cpp
  • src/storage/heap_table.cpp


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/executor/operator.cpp`:
- Around line 349-358: The SELECT "*" wildcard branch only populates
column_mapping_ but leaves schema_ as the original "*" expression, causing
materialize() to output a single column; update the same branch (where you
detect parser::ColumnExpr with name "*") to also clear or replace schema_ by
appending the child_schema.columns() entries (copying or moving the child column
descriptors into schema_) so schema_ mirrors column_mapping_ and the expanded
columns; ensure this change is applied in the same block that sets has_star and
populates column_mapping_ so downstream materialize() and any consumers see the
full expanded schema.
- Around line 967-980: ProjectOperator::next_view currently consumes a child row
when is_simple_projection_ is false and returns false, which callers treat as
EOF; fix by materializing the computed projection into operator-owned scratch
storage and returning true (so out_view points at the materialized tuple) or
alternatively prevent the executor from calling next_view when
is_simple_projection_ is false. Implement the first option by adding
operator-owned buffer/scratch storage, evaluate computed expressions into that
storage when child_->next_view(out_view) yields a row, set out_view to reference
the materialized tuple (and still set column_mapping_/schema_ as needed), and
return true; ensure you only return false for real EOF/errors. Reference:
ProjectOperator::next_view, is_simple_projection_, child_, out_view,
column_mapping_, schema_, and the operator-owned scratch buffer you add.
- Line 52: The fast-path visibility check currently sets no_txn_ based solely on
get_txn() == nullptr which is too broad; change the logic to require an explicit
non-MVCC/benchmark mode flag instead of treating a null txn as permission to
skip MVCC checks. Add or use an existing runtime flag (e.g., is_benchmark_mode()
or enable_non_mvcc_fastpath) and update the assignment in operator.cpp (where
no_txn_ is set) and the other affected blocks (the regions you flagged at lines
~64-70 and ~100-129) to set no_txn_ only when get_txn() == nullptr AND the
explicit non-MVCC flag is true; otherwise retain normal MVCC visibility checks.
Ensure any helper functions that branch on no_txn_ respect this new condition
and add a short comment referencing why the optimization is gated.

In `@src/storage/heap_table.cpp`:
- Around line 58-113: TupleView::get_value currently walks the serialized
payload by logical col_index and ignores any projection mapping; update it to
consult TupleView::column_mapping (or out_view.column_mapping) and
TupleView::table_schema to translate the requested logical index to the physical
column index before decoding the payload, returning null on out-of-range or
missing mapping. Also ensure Iterator::next_view initializes the view's
table_schema (anchor to the physical schema) when producing remapped/projected
TupleViews and that ProjectOperator::next_view sets out_view.column_mapping and
out_view.schema already do — use those to look up the physical index inside
get_value (reference symbols: TupleView::get_value, TupleView::column_mapping,
TupleView::table_schema, ProjectOperator::next_view, Iterator::next_view,
out_view.schema, out_view.column_mapping).

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 7b219ca3-6f2e-4e2b-846e-834c91ac51b6

📥 Commits

Reviewing files that changed from the base of the PR and between fe0a76f and 5d93bc7.

📒 Files selected for processing (6)
  • benchmarks/sqlite_comparison_bench.cpp
  • docs/performance/SQLITE_COMPARISON.md
  • include/executor/operator.hpp
  • include/storage/heap_table.hpp
  • src/executor/operator.cpp
  • src/storage/heap_table.cpp
🚧 Files skipped from review as they are similar to previous changes (2)
  • docs/performance/SQLITE_COMPARISON.md
  • benchmarks/sqlite_comparison_bench.cpp

bool SeqScanOperator::open() {
    set_state(ExecState::Open);
    iterator_ = std::make_unique<storage::HeapTable::Iterator>(table_->scan(get_memory_resource()));
    no_txn_ = (get_txn() == nullptr);

⚠️ Potential issue | 🟠 Major

txn == nullptr is too broad for the MVCC fast path.

This bypass treats any tuple with xmax == 0 as visible. Tuples created by an in-flight or aborted transaction also satisfy that, so null-transaction scans can now surface rows whose xmin is not visible under the normal snapshot rules. Please gate this optimization behind an explicit non-MVCC/benchmark mode rather than get_txn() == nullptr.

Also applies to: 64-70, 100-129
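The gating the reviewer asks for can be sketched like this. All names here (ScanContext, enable_non_mvcc_fastpath, is_visible_mvcc) are illustrative stand-ins, not the project's actual API; the point is that the xmax == 0 shortcut requires an explicit opt-in rather than merely a null transaction.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scan context: the fast path must be opted into explicitly.
struct ScanContext {
    bool has_txn = false;
    bool enable_non_mvcc_fastpath = false;  // explicit opt-in, off by default
};

// Placeholder for the full snapshot-based visibility check.
inline bool is_visible_mvcc(uint64_t xmin, uint64_t xmax, uint64_t snapshot_xid) {
    // A tuple created at or before the snapshot and not yet deleted is visible.
    return xmin <= snapshot_xid && (xmax == 0 || xmax > snapshot_xid);
}

inline bool tuple_visible(const ScanContext& ctx, uint64_t xmin, uint64_t xmax,
                          uint64_t snapshot_xid) {
    // Fast path only when BOTH hold: no transaction AND the explicit
    // non-MVCC mode is enabled. Otherwise fall back to full MVCC checks.
    if (!ctx.has_txn && ctx.enable_non_mvcc_fastpath) {
        return xmax == 0;  // benchmark/bulk-load shortcut
    }
    return is_visible_mvcc(xmin, xmax, snapshot_xid);
}
```

With this gating, a scan with no transaction but no opt-in still rejects tuples whose xmin postdates the snapshot, closing the in-flight/aborted-writer hole described above.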

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/executor/operator.cpp` at line 52, The fast-path visibility check
currently sets no_txn_ based solely on get_txn() == nullptr which is too broad;
change the logic to require an explicit non-MVCC/benchmark mode flag instead of
treating a null txn as permission to skip MVCC checks. Add or use an existing
runtime flag (e.g., is_benchmark_mode() or enable_non_mvcc_fastpath) and update
the assignment in operator.cpp (where no_txn_ is set) and the other affected
blocks (the regions you flagged at lines ~64-70 and ~100-129) to set no_txn_
only when get_txn() == nullptr AND the explicit non-MVCC flag is true; otherwise
retain normal MVCC visibility checks. Ensure any helper functions that branch on
no_txn_ respect this new condition and add a short comment referencing why the
optimization is gated.

Comment on lines +349 to +358
// Check if we have a single "*" column and expand it
bool has_star = false;
if (columns_.size() == 1 && columns_[0]->type() == parser::ExprType::Column) {
    const auto* c_expr = static_cast<const parser::ColumnExpr*>(columns_[0].get());
    if (c_expr->name() == "*") {
        has_star = true;
        for (size_t i = 0; i < child_schema.columns().size(); ++i) {
            column_mapping_.push_back(i);
        }
    }

⚠️ Potential issue | 🔴 Critical

Expand SELECT * into schema_ as well as column_mapping_.

The wildcard case only fills column_mapping_. schema_ was already built in the constructor from the raw expression list, so Line 972 still publishes a single "*" column. Downstream materialize() will therefore emit one value instead of the full child row.

Suggested fix
     if (columns_.size() == 1 && columns_[0]->type() == parser::ExprType::Column) {
         const auto* c_expr = static_cast<const parser::ColumnExpr*>(columns_[0].get());
         if (c_expr->name() == "*") {
             has_star = true;
+            schema_ = Schema{};
             for (size_t i = 0; i < child_schema.columns().size(); ++i) {
                 column_mapping_.push_back(i);
+                schema_.add_column(child_schema.get_column(i));
             }
         }
     }

Also applies to: 967-973

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/executor/operator.cpp` around lines 349 - 358, The SELECT "*" wildcard
branch only populates column_mapping_ but leaves schema_ as the original "*"
expression, causing materialize() to output a single column; update the same
branch (where you detect parser::ColumnExpr with name "*") to also clear or
replace schema_ by appending the child_schema.columns() entries (copying or
moving the child column descriptors into schema_) so schema_ mirrors
column_mapping_ and the expanded columns; ensure this change is applied in the
same block that sets has_star and populates column_mapping_ so downstream
materialize() and any consumers see the full expanded schema.

Comment on lines +967 to +980
bool ProjectOperator::next_view(storage::HeapTable::TupleView& out_view) {
    if (!child_) return false;
    if (child_->next_view(out_view)) {
        if (is_simple_projection_) {
            out_view.column_mapping = &column_mapping_;
            out_view.schema = &schema_;
            return true;
        } else {
            // Fallback: This is not optimal but satisfies the semantics.
            // Future work: Batch materialization or local buffer.
            // For now, we don't return true for computed stuff in next_view
            // to avoid exposing raw data incorrectly.
            return false;
        }

⚠️ Potential issue | 🔴 Critical

Don’t signal EOF for non-simple projections.

When is_simple_projection_ is false, this consumes one child row and returns false immediately. Callers interpret that as end-of-stream, so computed projections such as SELECT a + 1 ... produce zero rows on the next_view() path instead of falling back to materialization.

Either materialize into operator-owned scratch storage here, or make the executor stop using next_view() for this operator when is_simple_projection_ is false.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/executor/operator.cpp` around lines 967 - 980, ProjectOperator::next_view
currently consumes a child row when is_simple_projection_ is false and returns
false, which callers treat as EOF; fix by materializing the computed projection
into operator-owned scratch storage and returning true (so out_view points at
the materialized tuple) or alternatively prevent the executor from calling
next_view when is_simple_projection_ is false. Implement the first option by
adding operator-owned buffer/scratch storage, evaluate computed expressions into
that storage when child_->next_view(out_view) yields a row, set out_view to
reference the materialized tuple (and still set column_mapping_/schema_ as
needed), and return true; ensure you only return false for real EOF/errors.
Reference: ProjectOperator::next_view, is_simple_projection_, child_, out_view,
column_mapping_, schema_, and the operator-owned scratch buffer you add.

@poyrazK
Owner Author

poyrazK commented Apr 11, 2026

@CodeRabbit review

@coderabbitai

coderabbitai bot commented Apr 11, 2026

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@poyrazK
Owner Author

poyrazK commented Apr 11, 2026

@copilot can you find problems with this PR, if it has any?

Contributor

Copilot AI commented Apr 11, 2026

@copilot can you find problems with this PR, if it has any?

Here's a review of the problems found in this PR:


🔴 Critical / Correctness Bugs

1. get_value() ignores column_mapping (wrong data for non-identity projections)

TupleView::get_value(col_index) walks the physical payload to the raw byte position col_index, completely ignoring the column_mapping field. When ProjectOperator::next_view applies a projection (e.g. SELECT b, a FROM t), it sets column_mapping = {1, 0} and schema to the projected schema — but get_value(0) still returns physical column 0, not physical column column_mapping[0]. The column_mapping field is populated but never read inside get_value.

For SELECT * the mapping happens to be identity {0,1,2…} so the benchmark passes, masking the bug.

// get_value should be:
size_t physical_idx = (column_mapping && col_index < column_mapping->size())
                        ? (*column_mapping)[col_index] : col_index;
// then walk to physical_idx in the payload
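A runnable sketch of that remapping rule, using a plain int array as a stand-in for the serialized payload (the real get_value walks variable-width fields):

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative view: translates the caller's logical column index through
// column_mapping before touching the physical row layout.
struct MappedView {
    const int* physical = nullptr;                        // physical row layout
    size_t physical_count = 0;
    const std::vector<size_t>* column_mapping = nullptr;  // logical -> physical

    int get_value(size_t col_index) const {
        size_t physical_idx =
            (column_mapping && col_index < column_mapping->size())
                ? (*column_mapping)[col_index]
                : col_index;                              // identity when unmapped
        assert(physical_idx < physical_count);
        return physical[physical_idx];
    }
};
```

With identity mappings (SELECT *) the behavior is unchanged, which is exactly why the benchmark passes while reordered projections like SELECT b, a return wrong data without the translation.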

2. table_schema is declared but never set or used

TupleView has a table_schema field documented as "Physical schema of payload_data", but HeapTable::Iterator::next_view never initializes it (remains nullptr). The get_value and materialize methods use schema (which becomes the projected schema after a projection) to bound-check column access, but payload_data always has the full physical layout. This is inconsistent and will cause incorrect bounds checks for subset projections.


3. ProjectOperator::next_view silently drops all rows for non-simple projections

if (child_->next_view(out_view)) {
    if (is_simple_projection_) { return true; }
    else {
        return false;  // ← row was consumed from child, but caller sees "no more rows"
    }
}

When is_simple_projection_ is false, the child row is consumed but false is returned. On the next call, the next child row is consumed and dropped. The result is that any query with a computed expression accessed via next_view returns 0 rows — a silent correctness failure.


🟡 Performance / Design Issues

4. FilterOperator::next_view materializes every row — defeats zero-allocation goal

executor::Tuple t = out_view.materialize(get_memory_resource());  // heap alloc per row
if (condition_->evaluate(&t, &child_schema, get_params()).as_bool()) { ... }

Filtered queries using next_view allocate a full Tuple per row, making the "zero-allocation" scan path strictly more complex with no allocation benefit for filtered queries. The comment acknowledges this as "not optimal" but there is no fallback to next().


5. Dead/unreachable else branch in SeqScanOperator::next_view

no_txn_ is set to (get_txn() == nullptr). If no_txn_ is false, then get_txn() != nullptr, meaning txn is always non-null inside the MVCC block. The else { visible = (out_view.xmax == 0); } branch is unreachable dead code — identical logic is already handled by the no_txn_ early-return above.


6. Significant code duplication between next_view and next_meta in HeapTable::Iterator

The full page-navigation and slot-scanning logic (~60 lines) is copy-pasted into next_view from next_meta. Any future fix to one path (e.g. page corruption handling) won't automatically apply to the other.


🟠 Testing Gap

7. No unit tests for the next_view path

The PR adds a benchmark (BM_CloudSQL_ScanView) but no correctness tests for next_view behavior with: active transactions, deleted tuples (xmax ≠ 0), non-identity projections, filtered scans, or limit/offset combinations.


Summary table:

# Severity Issue
1 🔴 Critical get_value ignores column_mapping — wrong data for non-SELECT * queries
2 🔴 Critical table_schema never set — inconsistent schema for bounds checking
3 🔴 Critical Non-simple projections via next_view silently return 0 rows
4 🟡 Medium FilterOperator::next_view allocates per row — undermines the optimization
5 🟡 Minor Unreachable dead code in SeqScanOperator::next_view
6 🟡 Minor Large code duplication between next_view and next_meta
7 🟠 Medium No unit tests for the new next_view execution path


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

♻️ Duplicate comments (3)
src/executor/operator.cpp (2)

344-353: ⚠️ Potential issue | 🔴 Critical

SELECT * still doesn't expand schema_.

The wildcard branch fills column_mapping_ but leaves schema_ as the prebuilt "*" schema, so projected schema metadata can diverge from emitted columns.

Proposed fix
     if (columns_.size() == 1 && columns_[0]->type() == parser::ExprType::Column) {
         const auto* c_expr = static_cast<const parser::ColumnExpr*>(columns_[0].get());
         if (c_expr->name() == "*") {
             has_star = true;
+            schema_ = Schema{};
             for (size_t i = 0; i < child_schema.columns().size(); ++i) {
                 column_mapping_.push_back(i);
+                schema_.add_column(child_schema.get_column(i));
             }
         }
     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/executor/operator.cpp` around lines 344 - 353, The SELECT * branch
currently sets column_mapping_ but leaves schema_ as the original "*" schema
causing metadata mismatch; update the branch that detects has_star (the code
using columns_, column_mapping_, and child_schema) to also set schema_ to
child_schema (or a copy of child_schema.columns()) when expanding the wildcard
so the projected operator's schema_ matches the emitted columns and preserve any
existing logic for column_mapping_ and subsequent processing.

52-52: ⚠️ Potential issue | 🟠 Major

no_txn_ is still too broad for MVCC bypass.

Using get_txn() == nullptr as the only gate means the scan can expose tuples from in-flight/aborted txns (xmin unchecked) as long as xmax == 0. Please gate this behind an explicit non-MVCC mode flag.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/executor/operator.cpp` at line 52, The assignment to no_txn_ should not
rely solely on get_txn() == nullptr because that allows MVCC bypass; change it
to require an explicit non-MVCC mode flag (e.g., non_mvcc_mode_ or a passed-in
option) so no_txn_ is true only when both get_txn() == nullptr AND the operator
is running in non-MVCC mode; update the code that sets no_txn_ (the line
referencing no_txn_ and get_txn()) to check the explicit flag, and if such a
flag/member does not yet exist add a clear boolean (e.g., non_mvcc_mode_) to the
class/ctor and use it to gate MVCC bypass (ensure callers/constructors are
updated to set this flag appropriately).
src/storage/heap_table.cpp (1)

58-60: ⚠️ Potential issue | 🟠 Major

Guard payload_data before decoding.

TupleView::get_value() only checks schema; dereferencing payload_data[cursor] can crash for an uninitialized/invalid view.

Proposed fix
 common::Value HeapTable::TupleView::get_value(size_t col_index) const {
-    if (!schema) return common::Value::make_null();
+    if (!schema || !payload_data) return common::Value::make_null();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/storage/heap_table.cpp` around lines 58 - 60,
HeapTable::TupleView::get_value currently only checks schema and then
dereferences payload_data[cursor], which can crash if payload_data is
null/uninitialized; update get_value to guard that payload_data is valid (e.g.,
check payload_data is non-null and that cursor < payload_size or equivalent)
before attempting to read/decode, return common::Value::make_null() when the
view is invalid, and keep all existing decoding logic (using payload_data and
cursor) only after those checks so you never dereference payload_data when the
view is invalid.
🧹 Nitpick comments (2)
src/executor/operator.cpp (1)

985-986: FilterOperator::next_view currently re-materializes every row.

This is correct, but it weakens the zero-allocation objective on filtered scans. Consider evaluating predicates directly against TupleView (or reusing a recyclable scratch tuple) for this path.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/executor/operator.cpp` around lines 985 - 986, FilterOperator::next_view
is materializing every Tuple via out_view.materialize(get_memory_resource())
before calling condition_->evaluate, which defeats zero-allocation goals; change
the path so predicates are evaluated directly against a TupleView (out_view) or
use a recyclable scratch Tuple allocated once and reused instead of
materializing per-row. Update FilterOperator::next_view to call
condition_->evaluate with a TupleView overload (or add such an overload to the
predicate API) or introduce a member scratch Tuple (e.g.,
reusable_scratch_tuple) and only materialize into it when required, reusing the
same memory_resource/tuple across iterations; ensure condition_->evaluate,
out_view, and get_params references are adjusted accordingly (or add a small
adapter that converts TupleView to the expected evaluate input) to avoid per-row
allocations from out_view.materialize.
tests/cloudSQL_tests.cpp (1)

1201-1203: Strengthen computed projection validation.

EXPECT_GT(t.get(0).to_int64(), 5) is too permissive and can hide ordering/value bugs. Please assert exact emitted values (6, then 7) to make this test regression-resistant.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/cloudSQL_tests.cpp` around lines 1201 - 1203, The test currently uses a
permissive EXPECT_GT on t.get(0).to_int64(), which can hide ordering/value
regressions; change the assertions to exact equality for the computed projection
by replacing EXPECT_GT(t.get(0).to_int64(), 5) with
EXPECT_EQ(t.get(0).to_int64(), 6) and add a second assertion
EXPECT_EQ(t.get(1).to_int64(), 7) (use the existing test result container t and
its get(index).to_int64() calls) so the test verifies the first emitted value is
exactly 6 and the second is exactly 7.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: eddbe258-1e9c-4a6c-a133-5585964bc129

📥 Commits

Reviewing files that changed from the base of the PR and between 5d93bc7 and 05df7e7.

📒 Files selected for processing (3)
  • src/executor/operator.cpp
  • src/storage/heap_table.cpp
  • tests/cloudSQL_tests.cpp

Comment on lines +815 to +833
                const size_t record_len = static_cast<size_t>(tuple_data_len);
                if (record_len < 18) { // 2 len + 8 xmin + 8 xmax
                    std::cerr << "next_view failed: record_len < 18, it is " << record_len << "\n";
                    table_.bpm_.unpin_page_by_id(table_.file_id_, current_page_num_, false);
                    current_page_ = nullptr;
                    cached_buffer_ = nullptr;
                    return false;
                }

                // Read MVCC Header
                std::memcpy(&out_view.xmin, data + 2, 8);
                std::memcpy(&out_view.xmax, data + 10, 8);

                out_view.table_schema = &table_.schema_;
                out_view.schema = &table_.schema_;
                out_view.column_mapping = nullptr;
                out_view.payload_data = data + 18;
                out_view.payload_len = record_len - 18;


⚠️ Potential issue | 🔴 Critical

Validate record bounds before exposing payload_data.

record_len is trusted from tuple bytes and then used to set out_view.payload_len without checking page boundaries. A malformed record can cause out-of-bounds reads later in get_value().

Proposed fix
                 const size_t record_len = static_cast<size_t>(tuple_data_len);
                 if (record_len < 18) {  // 2 len + 8 xmin + 8 xmax
                     std::cerr << "next_view failed: record_len < 18, it is " << record_len << "\n";
                     table_.bpm_.unpin_page_by_id(table_.file_id_, current_page_num_, false);
                     current_page_ = nullptr;
                     cached_buffer_ = nullptr;
                     return false;
                 }
+                if (offset + record_len > Page::PAGE_SIZE) {
+                    table_.bpm_.unpin_page_by_id(table_.file_id_, current_page_num_, false);
+                    current_page_ = nullptr;
+                    cached_buffer_ = nullptr;
+                    return false;
+                }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/storage/heap_table.cpp` around lines 815 - 833, The code exposes
out_view.payload_data/ payload_len in next_view without verifying the record
fits inside the page bounds, allowing malformed tuple_len to cause OOB reads in
get_value(); fix next_view by validating that record_len >= 18 and that (data +
18 + (record_len - 18)) does not exceed the page buffer end before assigning
out_view.payload_data and payload_len — if the check fails unpin the page (use
table_.bpm_.unpin_page_by_id with current_page_num_), null out
current_page_/cached_buffer_ and return false; reference the next_view function,
variables record_len/tuple_data_len, data,
out_view.payload_data/out_view.payload_len, current_page_num_,
table_.bpm_.unpin_page_by_id, and get_value() for locating the change.

Comment on lines +852 to +859
executor::Tuple HeapTable::TupleView::materialize(std::pmr::memory_resource* mr) const {
    if (!mr) mr = std::pmr::get_default_resource();
    // Use the same logical_count logic as get_value so that SELECT * views
    // (which have column_mapping with more entries than schema->column_count())
    // are materialized correctly.
    const size_t num_cols = (column_mapping && !column_mapping->empty())
                                ? column_mapping->size()
                                : schema->columns().size();


⚠️ Potential issue | 🟡 Minor

Avoid null-schema dereference in materialize().

When column_mapping is absent, schema->columns().size() is used unconditionally. A default/invalid view can crash here.

Proposed fix
 executor::Tuple HeapTable::TupleView::materialize(std::pmr::memory_resource* mr) const {
     if (!mr) mr = std::pmr::get_default_resource();
+    if (!schema && (!column_mapping || column_mapping->empty())) {
+        return executor::Tuple(std::pmr::vector<common::Value>(mr));
+    }
     const size_t num_cols = (column_mapping && !column_mapping->empty()) ? column_mapping->size()
-                                                                         : schema->columns().size();
+                                                                         : schema->columns().size();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/storage/heap_table.cpp` around lines 852 - 859,
HeapTable::TupleView::materialize currently dereferences schema when
column_mapping is null which can crash for default/invalid views; update
materialize to guard against a null schema by checking if schema is non-null
before accessing schema->columns().size() (or use schema->column_count()/an
explicit 0 fallback), e.g. compute num_cols as column_mapping->size() if
present, otherwise if schema != nullptr use schema->columns().size(), else treat
num_cols as 0 (or early-return an empty Tuple) so no null-schema dereference
occurs; keep the same logical_count behavior as get_value and adjust any
downstream logic that assumes num_cols accordingly.

@poyrazK poyrazK merged commit 98a22b5 into main Apr 11, 2026
8 checks passed
