Skip to content

Improve Stability and Achieve Zero-Warning Build#1

Merged
poyrazK merged 89 commits intomainfrom
feature/recovery-e2e-tests
Feb 26, 2026
Merged

Improve Stability and Achieve Zero-Warning Build#1
poyrazK merged 89 commits intomainfrom
feature/recovery-e2e-tests

Conversation

@poyrazK
Copy link
Copy Markdown
Owner

@poyrazK poyrazK commented Feb 24, 2026

This PR addresses several technical debt items and stability issues:

  • Achieving a zero-warning build with -Werror and STRICT_LINT=ON.
  • Fixes server shutdown hangs by implementing a select-based accept loop with timeouts.
  • Resolves PostgreSQL wire protocol parsing issues and include-cleaner warnings using proper IWYU pragmas.
  • Fixes potential crashes in HashJoinOperator (unchecked optional access).
  • Standardizes test macros and migrates network tests for better coverage.
  • Improves transaction manager const-correctness and snapshot isolation logic.

Summary by CodeRabbit

  • New Features

    • In-memory buffer pool and Page support for stronger persistence and caching.
    • Crash-recovery and logging with a recovery workflow.
    • Improved command-line startup, diagnostics, and safer server lifecycle handling.
  • Bug Fixes

    • Fixed tuple handling in join operations to ensure correct query results.
  • Tests

    • Added buffer-pool and recovery unit tests, end-to-end Python test, and a test-runner script.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Feb 24, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

Replaces direct StorageManager usage with a BufferPoolManager and Page abstraction; adds LRUReplacer and RecoveryManager (ARIES-style stubs) with LogManager integration; rewires server/main, executor, transaction, and storage components to use the buffer pool; adds buffer-pool, recovery, and e2e tests and CI/formatting tweaks.

Changes

Cohort / File(s) Summary
Build / CI
CMakeLists.txt, .github/workflows/ci.yml
Added USE_SANITIZER option; unified formatter invocation; CI now runs clang-format -i and fails on diffs.
New Buffer Pool API
include/storage/page.hpp, include/storage/buffer_pool_manager.hpp, include/storage/lru_replacer.hpp
Added Page (PAGE_SIZE, locks, metadata), BufferPoolManager public API (fetch/unpin/new_page/flush/delete/open/close/flush_all) and LRUReplacer declarations.
Buffer Pool Implementation
src/storage/buffer_pool_manager.cpp, src/storage/lru_replacer.cpp
Implemented buffer-pool logic: page caching, eviction (LRU), pin/unpin, new_page, flush/delete, WAL-aware flush coordination, and thread-safety.
Storage-backed Tables & Indexes
include/storage/heap_table.hpp, include/storage/btree_index.hpp, src/storage/heap_table.cpp, src/storage/btree_index.cpp
Switched HeapTable/BTreeIndex to use BufferPoolManager and Page::PAGE_SIZE; migrated read/write/fetch/new_page/unpin flows and updated constructor signatures.
Core API surface changes
include/executor/query_executor.hpp, src/executor/query_executor.cpp, include/network/server.hpp, src/network/server.cpp, include/transaction/transaction_manager.hpp, src/transaction/transaction_manager.cpp
Constructors/factory signatures and members now accept storage::BufferPoolManager& (renamed bpm_) instead of StorageManager&; call sites updated.
Recovery subsystem
include/recovery/recovery_manager.hpp, src/recovery/recovery_manager.cpp, src/recovery/log_manager.cpp, include/recovery/log_record.hpp, src/recovery/log_record.cpp
Added RecoveryManager header+impl (analyze/redo/undo stubs); minor LogManager robustness message and formatting edits; recovery ties bpm_, catalog, and LogManager.
Server & startup wiring
src/main.cpp, src/network/server.cpp
Startup now constructs disk StorageManager, BufferPoolManager, LogManager, runs recovery and log-flush thread, and creates Server with bpm; signal handling and server instance access made thread-safe.
Tests: buffer-pool, recovery, e2e & updates
tests/buffer_pool_tests.cpp, tests/recovery_manager_tests.cpp, tests/e2e/e2e_test.py, tests/run_test.sh, tests/* (many modified)
Added buffer-pool and recovery unit tests, Python e2e client, shell test runner; updated many tests to create StorageManager + BufferPoolManager(pool_size, disk_manager) and adjusted assertions/formatting.
Test utilities
tests/test_utils.hpp
Replaced macro-heavy asserts with function-based expect_* helpers and updated RUN_TEST / EXPECT_* wrappers to throw on failure and standardize reporting.
Executor / Operators
src/executor/operator.cpp
Minor behavior-preserving adjustments (tuple value access) and formatting changes.
Formatting / minor refactors
many headers and sources (parser, token, statement, catalog, storage/storage_manager.cpp, etc.)
Widespread stylistic/formatting edits, small parameter renames (e.g., ColumnInfo ctor args), whitespace-only changes across many files.

Sequence Diagram(s)

sequenceDiagram
    participant App as Application (main)
    participant DM as StorageManager (Disk)
    participant BPM as BufferPoolManager
    participant LM as LogManager
    participant RM as RecoveryManager
    participant TM as TransactionManager
    participant Server as Server
    participant Client as Client

    App->>DM: create disk manager (files)
    App->>LM: init LogManager(file)
    App->>BPM: BufferPoolManager(pool_size, disk_manager, log_manager?)
    App->>RM: RecoveryManager(bpm, catalog, log_manager)
    RM->>LM: read logs (analyze / redo / undo)
    RM->>BPM: fetch_page / write_page during recovery
    App->>TM: TransactionManager(..., bpm)
    App->>Server: Server::create(port, catalog, bpm)
    Client->>Server: SQL requests
    Server->>BPM: fetch_page / new_page / unpin_page
    Server->>TM: acquire/release locks, coordinate transactions
    BPM->>LM: flush WAL when writing dirty pages (if needed)
Loading

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Poem

🐰 I hop through pages, cache aglow,

I keep LRU crumbs in tidy row,
I replay logs when lights go dim,
Pin, unpin, flush — a merry hymn,
Hooray for buffers, recovery, and vim!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 20.09% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title 'Improve Stability and Achieve Zero-Warning Build' directly aligns with the main objectives: fixing stability issues, implementing recovery components, and achieving a zero-warning build with -Werror. It clearly summarizes the primary purpose without being vague.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feature/recovery-e2e-tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

♻️ Duplicate comments (1)
src/storage/btree_index.cpp (1)

131-133: Duplicate: verify index drop really deletes the file.

drop() now calls bpm_.close_file(...). Please ensure this actually removes the index file; otherwise, stale files may remain on disk. (Same concern as heap table drop.)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/storage/btree_index.cpp` around lines 131 - 133, BTreeIndex::drop
currently just calls bpm_.close_file(filename_), which may not remove the
on-disk index; modify BTreeIndex::drop to ensure the underlying file is actually
deleted: check whether bpm_ exposes a delete/remove API (e.g., bpm_.delete_file
or bpm_.remove_file) and call that after closing, or if not, use
std::filesystem::remove(filename_) to unlink the file; ensure you check and
return the combined success (close + delete) and log or propagate errors so
callers can detect a failed physical deletion.
🧹 Nitpick comments (1)
src/network/server.cpp (1)

332-363: Consider gating or removing debug prints in production path.

The connection-level std::cout traces can spam logs and slow tests; a logger or debug flag would keep noise down.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/network/server.cpp` around lines 332 - 363, The code in the
connection-handling block spams std::cout (e.g., the prints around recv,
"Handling new connection", "recv1/recv2 got ...", and "Failed to read rest of
packet") which should be gated or replaced with a proper logger/debug flag;
update the connection path to use the project logging facility (or wrap these
prints behind a verbose/debug boolean) so normal production/tests don't emit
these messages, keeping behavior of functions like ProtocolReader::read_int32,
executor::QueryExecutor construction, and client_fd/recv error handling
unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/network/server.cpp`:
- Around line 152-160: Run clang-format on src/network/server.cpp to fix
formatting issues reported by CI; reformat the file (including the block
containing the bind(...) and listen(...) calls and surrounding code) so it
complies with the project's clang-format configuration, then stage the updated
file and push the change; ensure the functions and symbols like bind, listen,
and the surrounding socket handling code are still logically unchanged after
formatting.
- Around line 233-251: Server::wait currently checks accept_thread_.joinable()
under thread_mutex_ then calls accept_thread_.join(), which races with stop()
that might join or move the thread; to fix, under the scoped lock move
accept_thread_ into a local std::thread variable (e.g., local_thread =
std::move(accept_thread_)) and clear should_join there, then release the lock
and call local_thread.join() if local_thread.joinable(); this avoids double-join
on accept_thread_ while preserving synchronization with thread_mutex_ and
stop().

In `@src/storage/heap_table.cpp`:
- Around line 109-115: The file src/storage/heap_table.cpp has clang-format
style violations; run clang-format over the file (or your repo's format command)
and commit the formatted changes so spacing/indentation around the buffer
declaration, while loop, read_page(page_num, buffer.data()),
std::memset(buffer.data(), 0, Page::PAGE_SIZE) and PageHeader header{} are
fixed; ensure the formatting follows the project's clang-format rules and that
no other style changes are left uncommitted.
- Around line 422-424: HeapTable::drop currently only calls
bpm_.close_file(filename_) which closes the buffer/file tracking but does not
remove the heap file from disk; change drop() to first ensure the file is closed
via bpm_.close_file(filename_) (if not already), then delete the physical file
using a file-removal API such as std::filesystem::remove(filename_) (or
std::remove) and return success based on the deletion result, handling and
logging any failure; update references to filename_ and bpm_.close_file
accordingly and ensure any exceptions from std::filesystem::remove are caught or
translated into a boolean return.

In `@tests/cloudSQL_tests.cpp`:
- Around line 268-269: Several BufferPoolManager initialization lines are
mis-indented causing clang-format failures; for each block that creates a
StorageManager followed by a BufferPoolManager (e.g., the occurrences that
create StorageManager("./test_data") and BufferPoolManager sm(128,
disk_manager)), fix the indentation to match the surrounding block style or run
clang-format across tests/cloudSQL_tests.cpp so the BufferPoolManager lines are
indented consistently with the StorageManager lines; ensure you update all
listed occurrences (around the StorageManager/BufferPoolManager pairs at the
comment: 275-276, 288-289, 313-314, 327-328, 348-349, 389-390, 434-435, 465-466,
488-489, 518-519, 543-544, 569-570, 597-598, 624-625, 661-662, 704-705, 742-743,
801-802) so clang-format CI passes.

---

Duplicate comments:
In `@src/storage/btree_index.cpp`:
- Around line 131-133: BTreeIndex::drop currently just calls
bpm_.close_file(filename_), which may not remove the on-disk index; modify
BTreeIndex::drop to ensure the underlying file is actually deleted: check
whether bpm_ exposes a delete/remove API (e.g., bpm_.delete_file or
bpm_.remove_file) and call that after closing, or if not, use
std::filesystem::remove(filename_) to unlink the file; ensure you check and
return the combined success (close + delete) and log or propagate errors so
callers can detect a failed physical deletion.

---

Nitpick comments:
In `@src/network/server.cpp`:
- Around line 332-363: The code in the connection-handling block spams std::cout
(e.g., the prints around recv, "Handling new connection", "recv1/recv2 got ...",
and "Failed to read rest of packet") which should be gated or replaced with a
proper logger/debug flag; update the connection path to use the project logging
facility (or wrap these prints behind a verbose/debug boolean) so normal
production/tests don't emit these messages, keeping behavior of functions like
ProtocolReader::read_int32, executor::QueryExecutor construction, and
client_fd/recv error handling unchanged.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b3ae64d and 5c059a1.

📒 Files selected for processing (16)
  • CMakeLists.txt
  • include/executor/query_executor.hpp
  • include/network/server.hpp
  • include/storage/btree_index.hpp
  • include/storage/heap_table.hpp
  • include/transaction/transaction_manager.hpp
  • src/executor/operator.cpp
  • src/executor/query_executor.cpp
  • src/main.cpp
  • src/network/server.cpp
  • src/storage/btree_index.cpp
  • src/storage/heap_table.cpp
  • src/transaction/transaction_manager.cpp
  • tests/cloudSQL_tests.cpp
  • tests/server_tests.cpp
  • tests/transaction_manager_tests.cpp

Comment on lines 233 to 251
void Server::wait() {
const std::scoped_lock<std::mutex> lock(thread_mutex_);
if (accept_thread_.joinable()) {
std::thread::id accept_id;
bool should_join = false;

{
const std::scoped_lock<std::mutex> lock(thread_mutex_);
if (accept_thread_.joinable()) {
accept_id = accept_thread_.get_id();
should_join = true;
}
}

if (should_join) {
// Warning: This could be dangerous if stop() changes accept_thread_ concurrently,
// but typically only one thread will join anyway.
// A safer way is to just do it directly.
// Let's just avoid holding the mutex during join().
accept_thread_.join();
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Fix possible double-join race in wait().

wait() can observe joinable() under lock, then stop() joins concurrently, making the subsequent join throw. Move the thread out under lock and join the local copy.

✅ Suggested fix
 void Server::wait() {
-    std::thread::id accept_id;
-    bool should_join = false;
-
-    {
-        const std::scoped_lock<std::mutex> lock(thread_mutex_);
-        if (accept_thread_.joinable()) {
-            accept_id = accept_thread_.get_id();
-            should_join = true;
-        }
-    }
-
-    if (should_join) {
-        // Warning: This could be dangerous if stop() changes accept_thread_ concurrently,
-        // but typically only one thread will join anyway.
-        // A safer way is to just do it directly.
-        // Let's just avoid holding the mutex during join().
-        accept_thread_.join();
-    }
+    std::thread accept_thread;
+    {
+        const std::scoped_lock<std::mutex> lock(thread_mutex_);
+        accept_thread = std::move(accept_thread_);
+    }
+    if (accept_thread.joinable()) {
+        accept_thread.join();
+    }
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
void Server::wait() {
const std::scoped_lock<std::mutex> lock(thread_mutex_);
if (accept_thread_.joinable()) {
std::thread::id accept_id;
bool should_join = false;
{
const std::scoped_lock<std::mutex> lock(thread_mutex_);
if (accept_thread_.joinable()) {
accept_id = accept_thread_.get_id();
should_join = true;
}
}
if (should_join) {
// Warning: This could be dangerous if stop() changes accept_thread_ concurrently,
// but typically only one thread will join anyway.
// A safer way is to just do it directly.
// Let's just avoid holding the mutex during join().
accept_thread_.join();
}
void Server::wait() {
std::thread accept_thread;
{
const std::scoped_lock<std::mutex> lock(thread_mutex_);
accept_thread = std::move(accept_thread_);
}
if (accept_thread.joinable()) {
accept_thread.join();
}
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/network/server.cpp` around lines 233 - 251, Server::wait currently checks
accept_thread_.joinable() under thread_mutex_ then calls accept_thread_.join(),
which races with stop() that might join or move the thread; to fix, under the
scoped lock move accept_thread_ into a local std::thread variable (e.g.,
local_thread = std::move(accept_thread_)) and clear should_join there, then
release the lock and call local_thread.join() if local_thread.joinable(); this
avoids double-join on accept_thread_ while preserving synchronization with
thread_mutex_ and stop().

Comment on lines 422 to 424
bool HeapTable::drop() {
return storage_manager_.close_file(filename_);
return bpm_.close_file(filename_);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

git ls-files | grep -E "(buffer_pool|disk_manager)" | head -20

Repository: poyrazK/cloudSQL

Length of output: 42


🏁 Script executed:

rg -l "class BufferPoolManager" --type cpp --type hpp

Repository: poyrazK/cloudSQL

Length of output: 87


🏁 Script executed:

rg -l "class DiskManager" --type cpp --type hpp

Repository: poyrazK/cloudSQL

Length of output: 87


🏁 Script executed:

rg -n "close_file|drop_file|delete_file|remove_file" -B2 -A5 --type cpp --type hpp

Repository: poyrazK/cloudSQL

Length of output: 87


🏁 Script executed:

fd -type f "buffer_pool_manager|disk_manager" . | head -20

Repository: poyrazK/cloudSQL

Length of output: 230


🏁 Script executed:

fd -type f "\.(cpp|h|hpp)$" | xargs rg -l "class BufferPoolManager" 2>/dev/null | head -10

Repository: poyrazK/cloudSQL

Length of output: 230


🏁 Script executed:

find . -name "*buffer_pool*" -o -name "*disk_manager*" | head -20

Repository: poyrazK/cloudSQL

Length of output: 42


🏁 Script executed:

rg "class BufferPoolManager|close_file|delete_file|remove_file" -l

Repository: poyrazK/cloudSQL

Length of output: 182


🏁 Script executed:

find . -name "*.h" -o -name "*.hpp" -o -name "*.cpp" | xargs rg "class BufferPoolManager" 2>/dev/null

Repository: poyrazK/cloudSQL

Length of output: 44


🏁 Script executed:

find . -name "*.h" -o -name "*.hpp" -o -name "*.cpp" | xargs rg "class DiskManager" 2>/dev/null

Repository: poyrazK/cloudSQL

Length of output: 44


🏁 Script executed:

find . -type d -name "include" -o -name "src" | head -5

Repository: poyrazK/cloudSQL

Length of output: 75


🏁 Script executed:

ls -la include/ src/ 2>/dev/null | head -30

Repository: poyrazK/cloudSQL

Length of output: 1287


🏁 Script executed:

ls -la include/storage/ src/storage/

Repository: poyrazK/cloudSQL

Length of output: 727


🏁 Script executed:

grep -r "class BufferPoolManager\|class DiskManager" include/storage src/storage

Repository: poyrazK/cloudSQL

Length of output: 42


🏁 Script executed:

grep -r "close_file\|delete_file\|remove_file" include/storage src/storage

Repository: poyrazK/cloudSQL

Length of output: 433


🏁 Script executed:

cat -n src/storage/storage_manager.cpp

Repository: poyrazK/cloudSQL

Length of output: 5750


🏁 Script executed:

cat -n include/storage/storage_manager.hpp

Repository: poyrazK/cloudSQL

Length of output: 3297


🏁 Script executed:

sed -n '420,430p' src/storage/heap_table.cpp

Repository: poyrazK/cloudSQL

Length of output: 296


Switch drop() to actually delete the file, not just close it.

StorageManager::close_file() only closes the file handle and removes it from the internal tracking map—it does not delete the file from disk. The heap file will persist and can reappear on the next open(). Replace the call with a file deletion API (e.g., std::filesystem::remove() or std::remove()).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/storage/heap_table.cpp` around lines 422 - 424, HeapTable::drop currently
only calls bpm_.close_file(filename_) which closes the buffer/file tracking but
does not remove the heap file from disk; change drop() to first ensure the file
is closed via bpm_.close_file(filename_) (if not already), then delete the
physical file using a file-removal API such as
std::filesystem::remove(filename_) (or std::remove) and return success based on
the deletion result, handling and logging any failure; update references to
filename_ and bpm_.close_file accordingly and ensure any exceptions from
std::filesystem::remove are caught or translated into a boolean return.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
tests/server_tests.cpp (1)

329-330: ⚠️ Potential issue | 🟡 Minor

Misleading "All tests passed" message when failures occur.

Line 330 unconditionally prints "All tests passed" before the failure check on line 331. When tests_failed > 0, this message is incorrect.

Suggested fix
     std::cout << "========================\n";
-    std::cout << "All tests passed: " << tests_passed << "\n";
-    if (tests_failed > 0) {
-        std::cout << "Tests failed: " << tests_failed << "\n";
-        return 1;
-    }
-    return 0;
+    std::cout << "Results: " << tests_passed << " passed, " << tests_failed << " failed\n";
+    return (tests_failed > 0) ? 1 : 0;
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/server_tests.cpp` around lines 329 - 330, The current code always
prints "All tests passed: " with tests_passed even when failures exist; update
the printing logic in tests/server_tests.cpp so that you check the tests_failed
variable before printing — if tests_failed == 0 print "All tests passed: " with
tests_passed, otherwise print a summary showing both tests_passed and
tests_failed (or a clear failure message). Locate the output around the
variables tests_passed and tests_failed where std::cout is used and adjust the
conditional to ensure the "All tests passed" message is emitted only when
tests_failed == 0.
♻️ Duplicate comments (1)
src/network/server.cpp (1)

233-238: ⚠️ Potential issue | 🔴 Critical

Deadlock: wait() joins accept_thread_ while holding thread_mutex_.

accept_connections() (line 303) acquires thread_mutex_ on every accepted connection. If wait() holds thread_mutex_ and blocks on accept_thread_.join(), the accept loop will deadlock trying to re-acquire the same mutex. The past review's suggested fix (move the thread out under lock, join outside) also resolves this.

Proposed fix
 void Server::wait() {
-    const std::scoped_lock<std::mutex> lock(thread_mutex_);
-    if (accept_thread_.joinable()) {
-        accept_thread_.join();
+    std::thread local_thread;
+    {
+        const std::scoped_lock<std::mutex> lock(thread_mutex_);
+        local_thread = std::move(accept_thread_);
+    }
+    if (local_thread.joinable()) {
+        local_thread.join();
     }
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/network/server.cpp` around lines 233 - 238, Server::wait() currently
holds thread_mutex_ while calling accept_thread_.join(), causing a deadlock with
accept_connections() which also locks thread_mutex_; fix by only holding the
lock to transfer/move accept_thread_ out of the member, then release the lock
and call join on the moved local thread. Concretely, in Server::wait() acquire
the mutex, std::move accept_thread_ into a local std::thread (or swap), let the
lock go out of scope, then if the local thread.joinable() call join(); this uses
the existing accept_thread_ and thread_mutex_ symbols and avoids joining while
holding the mutex.
🧹 Nitpick comments (3)
include/storage/page.hpp (2)

24-24: Redundant reset_memory() call — data_ is already value-initialized.

Line 61 declares data_ with {} which zero-initializes the array. The constructor then calls reset_memory() which fills with zero again. One of these is redundant.

Also applies to: 59-59

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/storage/page.hpp` at line 24, The Page() constructor redundantly
calls reset_memory() even though the member array data_ is value-initialized
with {}; remove the extra initialization by deleting the reset_memory() call
from the Page() constructor (leave data_ declared with {}), or alternatively
remove the {} and keep reset_memory()—choose one consistent approach; update the
Page() constructor (symbol: Page()) and the member declaration (symbol: data_)
so initialization happens only once.

41-41: get_file_name() returns std::string by value — prefer const std::string&.

This accessor is likely called frequently (e.g., during page lookups/eviction). Returning by const reference avoids unnecessary heap allocations.

Suggested fix
-    [[nodiscard]] std::string get_file_name() const { return file_name_; }
+    [[nodiscard]] const std::string& get_file_name() const { return file_name_; }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/storage/page.hpp` at line 41, The accessor get_file_name() currently
returns std::string by value causing allocations; change its signature to return
a const std::string& (e.g., [[nodiscard]] const std::string& get_file_name()
const) so callers receive a reference to the member file_name_ without copying,
and keep the function const and nodiscard attributes; ensure file_name_ remains
a std::string data member so the returned reference is valid for the object's
lifetime.
tests/cloudSQL_tests.cpp (1)

361-444: Network tests are disabled with #if 0.

These tests (NetworkTest_Handshake, NetworkTest_MultiClient) are compiled out. If they were migrated to tests/server_tests.cpp, this dead code should be removed rather than left behind.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/cloudSQL_tests.cpp` around lines 361 - 444, The two disabled tests
enclosed by the `#if` 0/#endif block (NetworkTest_Handshake and
NetworkTest_MultiClient) are dead code and should be removed; delete the entire
`#if` 0 ... `#endif` block that contains these tests, and if their active
equivalents were migrated (e.g., to server tests), ensure there are no duplicate
test definitions by verifying NetworkTest_Handshake and NetworkTest_MultiClient
exist only in the intended test file and not left behind as commented-out code;
remove the `#if` 0/#endif markers and the enclosed test functions to clean up the
codebase.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@include/storage/page.hpp`:
- Around line 49-54: The Page currently keeps LSN in lsn_ separate from data_,
so update the flush/read paths to persist the LSN: before
BufferPoolManager::flush_page() calls StorageManager::write_page(), serialize
get_lsn() into the agreed bytes at the start of the Page::data_ buffer (e.g.,
first 4 bytes) so the on-disk page contains the LSN, and likewise when pages are
read back from StorageManager::read_page() deserialize those bytes into
Page::set_lsn(); modify BufferPoolManager::flush_page() and the page load path
to perform this packing/unpacking, ensuring get_lsn()/set_lsn() remain
authoritative and consistent with data_ prior to any disk I/O.

---

Outside diff comments:
In `@tests/server_tests.cpp`:
- Around line 329-330: The current code always prints "All tests passed: " with
tests_passed even when failures exist; update the printing logic in
tests/server_tests.cpp so that you check the tests_failed variable before
printing — if tests_failed == 0 print "All tests passed: " with tests_passed,
otherwise print a summary showing both tests_passed and tests_failed (or a clear
failure message). Locate the output around the variables tests_passed and
tests_failed where std::cout is used and adjust the conditional to ensure the
"All tests passed" message is emitted only when tests_failed == 0.

---

Duplicate comments:
In `@src/network/server.cpp`:
- Around line 233-238: Server::wait() currently holds thread_mutex_ while
calling accept_thread_.join(), causing a deadlock with accept_connections()
which also locks thread_mutex_; fix by only holding the lock to transfer/move
accept_thread_ out of the member, then release the lock and call join on the
moved local thread. Concretely, in Server::wait() acquire the mutex, std::move
accept_thread_ into a local std::thread (or swap), let the lock go out of scope,
then if the local thread.joinable() call join(); this uses the existing
accept_thread_ and thread_mutex_ symbols and avoids joining while holding the
mutex.

---

Nitpick comments:
In `@include/storage/page.hpp`:
- Line 24: The Page() constructor redundantly calls reset_memory() even though
the member array data_ is value-initialized with {}; remove the extra
initialization by deleting the reset_memory() call from the Page() constructor
(leave data_ declared with {}), or alternatively remove the {} and keep
reset_memory()—choose one consistent approach; update the Page() constructor
(symbol: Page()) and the member declaration (symbol: data_) so initialization
happens only once.
- Line 41: The accessor get_file_name() currently returns std::string by value
causing allocations; change its signature to return a const std::string& (e.g.,
[[nodiscard]] const std::string& get_file_name() const) so callers receive a
reference to the member file_name_ without copying, and keep the function const
and nodiscard attributes; ensure file_name_ remains a std::string data member so
the returned reference is valid for the object's lifetime.

In `@tests/cloudSQL_tests.cpp`:
- Around line 361-444: The two disabled tests enclosed by the `#if` 0/#endif block
(NetworkTest_Handshake and NetworkTest_MultiClient) are dead code and should be
removed; delete the entire `#if` 0 ... `#endif` block that contains these tests, and
if their active equivalents were migrated (e.g., to server tests), ensure there
are no duplicate test definitions by verifying NetworkTest_Handshake and
NetworkTest_MultiClient exist only in the intended test file and not left behind
as commented-out code; remove the `#if` 0/#endif markers and the enclosed test
functions to clean up the codebase.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5c059a1 and 1bb00ba.

📒 Files selected for processing (5)
  • include/storage/page.hpp
  • src/network/server.cpp
  • tests/cloudSQL_tests.cpp
  • tests/server_tests.cpp
  • tests/transaction_manager_tests.cpp

Comment on lines +49 to +54
// Get Page LSN (for recovery/WAL)
// Assuming page layout reserves first 8 bytes (or 4 bytes) for PageLSN depending on implementation
// We will store LSN explicitly here to avoid parsing the header if not strictly needed,
// but typically it's written in the first 4 bytes of data_.
[[nodiscard]] int32_t get_lsn() const { return lsn_; }
void set_lsn(int32_t lsn) { lsn_ = lsn; }
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, find all relevant files
find . -type f -name "*.hpp" -o -name "*.cpp" | grep -E "(buffer_pool|page)" | head -20

Repository: poyrazK/cloudSQL

Length of output: 86


🏁 Script executed:

# Search for BufferPoolManager class and flush-related methods
rg -n 'class BufferPoolManager' --type=cpp -A 20

Repository: poyrazK/cloudSQL

Length of output: 1135


🏁 Script executed:

# Search for flush, write, or persistence operations related to pages and LSN
rg -n 'flush|writePage|write_page' --type=cpp -C 3

Repository: poyrazK/cloudSQL

Length of output: 16323


🏁 Script executed:

# Check page.hpp for LSN comments and member declarations
fd page.hpp -x cat -n {}

Repository: poyrazK/cloudSQL

Length of output: 2789


🏁 Script executed:

# Search for any LSN serialization or deserialization logic
rg -n 'lsn_.*=|memcpy.*lsn|lsn.*data' --type=cpp -C 2

Repository: poyrazK/cloudSQL

Length of output: 6912


🏁 Script executed:

# Find BufferPoolManager definition
find . -type f \( -name "*.hpp" -o -name "*.cpp" \) | xargs grep -l "class BufferPoolManager"

Repository: poyrazK/cloudSQL

Length of output: 86


🏁 Script executed:

# Search for BufferPoolManager in header files
fd -e hpp -e cpp | xargs grep -n "BufferPoolManager" | head -30

Repository: poyrazK/cloudSQL

Length of output: 2713


🏁 Script executed:

# Look for buffer_pool or bpm files
fd buffer_pool -e hpp -e cpp

Repository: poyrazK/cloudSQL

Length of output: 42


🏁 Script executed:

# Search for BufferPoolManager class definition (not just references)
rg -n "class BufferPoolManager" --type=cpp --type=hpp

Repository: poyrazK/cloudSQL

Length of output: 87


🏁 Script executed:

# List all files in include/storage
ls -la include/storage/

Repository: poyrazK/cloudSQL

Length of output: 433


🏁 Script executed:

# Check if BufferPoolManager is defined in page.hpp or another file
rg -n "^class BufferPoolManager|^struct BufferPoolManager" --type=cpp --type=hpp -B 2

Repository: poyrazK/cloudSQL

Length of output: 87


🏁 Script executed:

# Search for BufferPoolManager class definition without file type
rg -n "class BufferPoolManager" -B 2 -A 30

Repository: poyrazK/cloudSQL

Length of output: 1204


🏁 Script executed:

# List all header and cpp files
find include src -type f -name "*.hpp" -o -name "*.cpp" | sort

Repository: poyrazK/cloudSQL

Length of output: 1150


🏁 Script executed:

# Search in storage_manager.hpp for BufferPoolManager
rg -n "BufferPoolManager" include/storage/storage_manager.hpp -B 3 -A 20

Repository: poyrazK/cloudSQL

Length of output: 42


🏁 Script executed:

# Check if BufferPoolManager is defined inline or in a separate location
rg -n "BufferPoolManager.*{" -A 50

Repository: poyrazK/cloudSQL

Length of output: 163


🏁 Script executed:

# List what's in storage_manager.hpp
cat include/storage/storage_manager.hpp

Repository: poyrazK/cloudSQL

Length of output: 2639


🏁 Script executed:

# Search for BufferPoolManager in all files
rg "BufferPoolManager" -l

Repository: poyrazK/cloudSQL

Length of output: 461


🏁 Script executed:

# Check if BufferPoolManager might be a typedef or alias
rg "BufferPoolManager\s*="

Repository: poyrazK/cloudSQL

Length of output: 42


🏁 Script executed:

# Search more broadly for "Buffer" or "Pool" in filenames
find . -type f -name "*buffer*" -o -name "*pool*"

Repository: poyrazK/cloudSQL

Length of output: 42


🏁 Script executed:

# Check main.cpp where BufferPoolManager is instantiated
cat src/main.cpp | grep -A 20 "BufferPoolManager"

Repository: poyrazK/cloudSQL

Length of output: 963


🏁 Script executed:

# Search for the full definition with line numbers
rg -n "BufferPoolManager" src/main.cpp -B 5 -A 10

Repository: poyrazK/cloudSQL

Length of output: 818


🏁 Script executed:

# Check if there's a header file we missed
find . -type f \( -name "*.h" -o -name "*.hpp" \)

Repository: poyrazK/cloudSQL

Length of output: 764


🏁 Script executed:

# Search all cpp files for BufferPoolManager class definition
rg "class BufferPoolManager" --type-list

Repository: poyrazK/cloudSQL

Length of output: 449


🏁 Script executed:

# Try a simpler search - look for where BufferPoolManager is defined
rg "class BufferPoolManager"

Repository: poyrazK/cloudSQL

Length of output: 120


🏁 Script executed:

# Check if BufferPoolManager is perhaps an alias for StorageManager
rg "using BufferPoolManager"

Repository: poyrazK/cloudSQL

Length of output: 42


🏁 Script executed:

# Search in all text for BufferPoolManager definition
grep -r "class BufferPoolManager" .

Repository: poyrazK/cloudSQL

Length of output: 122


🏁 Script executed:

# Look for any file containing "BufferPoolManager" content
find . -type f -exec grep -l "BufferPoolManager" {} \;

Repository: poyrazK/cloudSQL

Length of output: 489


🏁 Script executed:

# Check all files in include/storage more carefully
ls -la include/storage/
cat include/storage/*.hpp

Repository: poyrazK/cloudSQL

Length of output: 15134


🏁 Script executed:

# Check if there are any files without standard extensions
find include/storage -type f

Repository: poyrazK/cloudSQL

Length of output: 183


🏁 Script executed:

# Try to find buffer_pool_manager specifically
find . -iname "*buffer*"

Repository: poyrazK/cloudSQL

Length of output: 42


🏁 Script executed:

# Found it! Check buffer_pool_manager.hpp which is referenced but not in ls output
cat include/storage/buffer_pool_manager.hpp

Repository: poyrazK/cloudSQL

Length of output: 131


🏁 Script executed:

# Also search for its implementation
find . -name "*buffer_pool_manager*"

Repository: poyrazK/cloudSQL

Length of output: 42


🏁 Script executed:

# Search for all includes of buffer_pool_manager
rg "buffer_pool_manager" -n

Repository: poyrazK/cloudSQL

Length of output: 1034


🏁 Script executed:

# Check what files reference BufferPoolManager
rg "include.*buffer" -n

Repository: poyrazK/cloudSQL

Length of output: 1034


🏁 Script executed:

# Look for any errors or TODOs about BufferPoolManager
rg "BufferPoolManager|buffer.pool" -n -B 2 -A 2

Repository: poyrazK/cloudSQL

Length of output: 24131


🏁 Script executed:

# Check btree_index.hpp line 14 more carefully
head -20 include/storage/btree_index.hpp

Repository: poyrazK/cloudSQL

Length of output: 430


🏁 Script executed:

# The header file is referenced but doesn't exist. Let's check if there's a cpp file
find . -name "*buffer_pool*"

Repository: poyrazK/cloudSQL

Length of output: 42


🏁 Script executed:

# Check CMakeLists to see how the project is being compiled
cat CMakeLists.txt | grep -i buffer -B 3 -A 3

Repository: poyrazK/cloudSQL

Length of output: 447


🏁 Script executed:

# See what tests reference buffer_pool_tests
cat tests/buffer_pool_tests.cpp 2>/dev/null | head -100

Repository: poyrazK/cloudSQL

Length of output: 42


🏁 Script executed:

# Check if buffer_pool_tests.cpp exists
ls -la tests/buffer_pool_tests.cpp

Repository: poyrazK/cloudSQL

Length of output: 134


🏁 Script executed:

# List all test files
ls tests/*.cpp

Repository: poyrazK/cloudSQL

Length of output: 223


🏁 Script executed:

# Verify that buffer_pool_manager.hpp is truly missing
ls -la include/storage/buffer_pool_manager.hpp

Repository: poyrazK/cloudSQL

Length of output: 146


🏁 Script executed:

# Check if there's a src/storage/buffer_pool_manager.cpp
ls -la src/storage/

Repository: poyrazK/cloudSQL

Length of output: 379


BufferPoolManager must serialize page LSN to persist it with page data.

The Page class stores lsn_ as a separate member, not as part of data_. Your comment correctly notes that LSNs are "typically written in the first 4 bytes of data_," but this implementation keeps them separate. When BufferPoolManager flushes dirty pages to disk, it will write only the data_ buffer—the lsn_ value will remain in memory and be lost on crash. Recovery will lack page LSNs needed to check durability of modifications.

Ensure BufferPoolManager.flush_page() serializes the current lsn_ value into data_ before calling StorageManager.write_page(), and deserializes it back on read, or document why a different recovery strategy is used.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/storage/page.hpp` around lines 49 - 54, The Page currently keeps LSN
in lsn_ separate from data_, so update the flush/read paths to persist the LSN:
before BufferPoolManager::flush_page() calls StorageManager::write_page(),
serialize get_lsn() into the agreed bytes at the start of the Page::data_ buffer
(e.g., first 4 bytes) so the on-disk page contains the LSN, and likewise when
pages are read back from StorageManager::read_page() deserialize those bytes
into Page::set_lsn(); modify BufferPoolManager::flush_page() and the page load
path to perform this packing/unpacking, ensuring get_lsn()/set_lsn() remain
authoritative and consistent with data_ prior to any disk I/O.

- Fixed corrupted CMakeLists.txt with proper formatting and structure.
- Refactored test macros in test_utils.hpp to use type-safe inline functions.
- Eliminated cppcoreguidelines-avoid-do-while warnings in tests.
- Added LRUReplacer to manage page eviction policies.
- Adheres to project standards with clean abstractions and no-warning build.
- Added BufferPoolManager to manage in-memory caching of disk pages.
- Integrated with LRUReplacer for efficient page eviction.
- Resolved type mismatches between storage manager and transaction manager.
- Implemented RecoveryManager stubs for Analysis, Redo, and Undo phases.
- Adhered to the Rule of Five for resource management.
- Resolved static-member warnings by adding temporary stub usage.
- Updated LogRecord to support various operation types (Insert, Delete, Update).
- Ensured consistency with the WAL format.
…integration

- Updated Transaction and TransactionManager headers to use BufferPoolManager.
- Enhanced snapshot logic for better MVCC isolation.
- Refactored Page, HeapTable, and BTreeIndex headers to support BufferPoolManager.
- Improved abstraction for disk-backed data structures.
- Updated HeapTable and BTreeIndex to use BufferPoolManager for page access.
- Refactored StorageManager to handle low-level I/O through BufferPoolManager.
- Resolved various linting warnings related to pointer arithmetic and type safety.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant