
chore: migrate to GoogleTest and restore coverage#3

Merged
poyrazK merged 12 commits into main from chore/gtest-migration-v2
Feb 28, 2026

Conversation

@poyrazK
Owner

@poyrazK poyrazK commented Feb 27, 2026

Restoring all tests and migrating to GTest.

Summary by CodeRabbit

  • Bug Fixes

    • Prevent duplicate table and index creation with explicit errors.
  • Refactor

    • Simplified transaction lifecycle and public API; streamlined commit/abort and lock release flows.
    • Overhauled buffer pool and LRU replacer internals for safer memory and concurrency behavior.
    • Redesigned server protocol handling and connection lifecycle for a leaner query flow.
  • Tests

    • Expanded unit and integration tests (buffer pool, LRU replacer, recovery/logging, transactions, statements, server).
  • Chores

    • CI test targets added and coverage filtering updated.

@coderabbitai

coderabbitai bot commented Feb 27, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

Refactors buffer pool storage/locking, adds storage page allocation APIs, renames transaction lock accessors and overhauls TransactionManager lifecycle (begin/commit/abort), tightens catalog duplicate checks, simplifies PostgreSQL server wire protocol, and expands CI/tests with recovery and buffer-pool test targets.

Changes

Cohort / File(s) Summary
CI & Coverage
​.github/workflows/ci.yml
Added recovery_manager_tests and buffer_pool_tests to CI test targets; updated LCOV filter to exclude */_deps/*.
Buffer Pool (API + Impl)
include/storage/buffer_pool_manager.hpp, src/storage/buffer_pool_manager.cpp
Switched internal page storage from std::vector<Page> to std::unique_ptr<Page[]>; added get_log_manager() accessor; reworked fetch/unpin/new/delete/flush flows, moved to std::scoped_lock, removed WAL/LSN pre-eviction coupling, adjusted free_list_/replacer usage and lifecycle.
LRU Replacer
src/storage/lru_replacer.cpp
Replaced std::lock_guard with std::scoped_lock; adjusted victim/pin/unpin behavior and changed the signature from victim(uint32_t* const) to victim(uint32_t*) (removed the top-level const on the parameter).
Storage Manager
include/storage/storage_manager.hpp, src/storage/storage_manager.cpp
Added allocate_page(const std::string&) implementation returning page index based on file size; added a stub deallocate_page(...) no-op.
Transaction API
include/transaction/transaction.hpp
Renamed lock accessors to get_shared_lock_set() and get_exclusive_lock_set(); they now return the sets by value (copied) instead of by const reference.
TransactionManager (API & Impl)
include/transaction/transaction_manager.hpp, src/transaction/transaction_manager.cpp
Made constructor public/explicit, added public begin/commit/abort/get_transaction; removed undo logic, simplified begin/commit/abort flows and active/completed tracking; added default destructor and disabled copy/move.
Catalog
src/catalog/catalog.cpp
Added runtime guards to throw std::runtime_error on duplicate table/index creation; fixed index insertion path and added <stdexcept> include.
Server (network)
src/network/server.cpp
Major refactor of PostgreSQL wire protocol handling and connection loop: streamlined startup/SSL denial path, simplified message I/O and query loop, reduced result/message complexity and internal helpers.
Tests (new/updated)
tests/buffer_pool_tests.cpp, tests/recovery_tests.cpp, tests/cloudSQL_tests.cpp, tests/cloudSQL_tests.cpp.old, tests/server_tests.cpp, tests/statement_tests.cpp, tests/lock_manager_tests.cpp
Added/expanded buffer-pool and recovery tests; reorganized/renamed large CloudSQL test suites; updated server and statement tests; removed minor comment edits in lock tests.

Sequence Diagram(s)

sequenceDiagram
  participant Client
  participant TxMgr as TransactionManager
  participant LogMgr as LogManager
  participant LockMgr as LockManager

  Client->>TxMgr: commit(txn)
  TxMgr->>LogMgr: append(COMMIT, prev_lsn)
  LogMgr-->>TxMgr: lsn_assigned
  TxMgr->>LogMgr: flush()
  LogMgr-->>TxMgr: flush_ack
  TxMgr->>LockMgr: release_shared_and_exclusive(txn.lock_sets)
  LockMgr-->>TxMgr: released
  TxMgr-->>Client: commit_complete

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~50 minutes

Poem

🐰 I hopped through pages, pointers bright,
Logs ticked LSNs beneath the night,
Locks I nudged and set them free,
Tests now hum in harmony,
A little hop — the DB's delight!

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage — ⚠️ Warning: docstring coverage is 19.54%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (2 passed)

  • Description Check — ✅ Passed: check skipped because CodeRabbit’s high-level summary is enabled.
  • Title Check — ✅ Passed: the title 'chore: migrate to GoogleTest and restore coverage' accurately describes the primary changes in the PR: migration of the tests to the GoogleTest framework (evidenced by new test files using GTest macros) and restoration of test coverage by reintroducing comprehensive test suites.

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 8

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/network/server.cpp (1)

171-180: ⚠️ Potential issue | 🔴 Critical

stop() can hang because worker threads are joined before client sockets are closed.

At Line 171, workers may block in recv() forever while sockets are still open; joins occur before Line 178 closes descriptors.

🛠️ Suggested fix
-    for (auto& worker : workers) {
-        if (worker.joinable()) {
-            worker.join();
-        }
-    }
-
-    // Now close the fds
-    for (const int fd : fds) {
-        static_cast<void>(close(fd));
-    }
+    // Close/shutdown sockets first to unblock worker recv()
+    for (const int fd : fds) {
+        static_cast<void>(shutdown(fd, SHUT_RDWR));
+        static_cast<void>(close(fd));
+    }
+
+    for (auto& worker : workers) {
+        if (worker.joinable()) {
+            worker.join();
+        }
+    }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/network/server.cpp` around lines 171 - 180, In stop() the worker threads
(workers) can block in recv() because client sockets (fds) are still open;
change the shutdown sequence to first close or shutdown all client descriptors
(iterate over fds and call shutdown(fd, SHUT_RDWR) and then close(fd)) to
unblock any blocking recv() calls, then join the worker threads
(workers.joinable()/join()). Update the logic around the fds and workers in
stop() so descriptors are closed/shutdown before joining threads to avoid
hanging; reference the stop() function, the workers container, the fds
container, and any recv() usage in worker loops.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/network/server.cpp`:
- Around line 349-363: The handler currently skips sending RowDescription ('T')
and DataRow ('D') when res.success() is true; implement sending of a
RowDescription message followed by one DataRow per row in res.rows() (or at
least one DataRow for the prototype) before emitting CommandComplete ('C') and
ReadyForQuery ('Z'); locate the block that checks res.success() in server.cpp
(uses res.rows(), res.row_count(), client_fd and send()) and construct proper
PostgreSQL-style messages: emit a 'T' message describing column count and column
metadata, then for each row emit a 'D' message with the column values
(null/length-prefixed strings as appropriate), and only after those send the
existing CommandComplete and ReadyForQuery so tests in tests/server_tests.cpp
receive T and D before C/Z.
- Around line 298-311: The code fails to validate startup packets and does not
confirm full reads or length bounds; update the handling around
ProtocolReader::read_int32(buffer.data() + HEADER_SIZE) and the SSL branch so
that non-SSL startup codes are explicitly rejected, and all recv_all calls are
checked for full success. Specifically, after reading the initial code, validate
that the next packet is a well-formed startup (reject unexpected codes and close
client_fd via close when invalid), and in the SSL branch verify len =
ProtocolReader::read_int32(buffer.data()) satisfies len >= HEADER_SIZE and len
<= buffer.size() before calling recv_all, then ensure recv_all returns exactly
(len - HEADER_SIZE) and handle short/failed reads by closing client_fd and
returning; apply the same strict read/length checks to any subsequent packet
parsing to avoid buffer misuse.
- Around line 329-339: Validate the incoming message length returned by
ProtocolReader::read_int32 before allocating or reading payloads: in the code
path handling type == 'Q', ensure len is >= 4 (so payload_len = len - 4 is
non-negative) and also enforce a reasonable upper bound to avoid huge
allocations; if validation fails, close/abort the connection instead of
proceeding. Use recv_all(client_fd, sql_buf.data(), payload_len) return value to
check full receipt and construct the SQL string with the explicit sized
constructor (std::string(sql_buf.data(), payload_len)) rather than relying on
NUL-termination. Ensure error paths for recv_all propagate/handle failures
cleanly.

In `@src/storage/buffer_pool_manager.cpp`:
- Around line 116-129: The code currently calls
storage_manager_.allocate_page(file_name) before securing a frame, which can
allocate a disk page that is leaked if no frame is available; move the
frame-reservation logic (checking free_list_ and calling
replacer_.victim(&frame_id)) before calling storage_manager_.allocate_page, then
call allocate_page only after a frame_id is successfully obtained, assign
*page_id = target_page_id (if page_id non-null) and build the key with
make_page_key; additionally, if you must keep the current order, ensure that
when victim() fails you properly release any reserved frame (e.g., push frame_id
back onto free_list_) and avoid returning nullptr after already allocating a
disk page.
- Line 30: The destructor BufferPoolManager::~BufferPoolManager is currently =
default and thus skips flushing dirty frames; replace the default destructor
with an explicit implementation in class BufferPoolManager that iterates cached
frames on destruction and performs a final durability flush for any dirty pages
(use the existing methods such as FlushPage, FlushAll, or WriteBackPerFrame as
available), acquiring/holding necessary locks (e.g., page table or frame
mutexes) to safely check and write back each dirty frame before releasing
resources, and ensure errors during flush are logged/handled but do not leak
resources.

In `@src/storage/lru_replacer.cpp`:
- Around line 39-52: LRUReplacer::unpin currently silently drops inserts when
lru_list_.size() >= capacity_; update the implementation and API to make this
explicit: change unpin to return bool (true if inserted, false if dropped),
update references to check the return value, and update the function docstring
to state the capacity-bound behavior and that lru_map_ and lru_list_ are
unchanged when capacity is reached; ensure lru_list_, lru_map_, and capacity_
are used as the condition check and add/return paths accordingly, and add or
adjust unit tests to cover the boundary case where lru_list_.size() == capacity_
so the new return value is validated.

In `@src/storage/storage_manager.cpp`:
- Around line 167-180: StorageManager::allocate_page currently returns 0 on
open_file failure (ambiguous with a valid page 0) and doesn't check
file->tellg() for failure; change the signature to return
std::optional<uint32_t> and include <optional>, return std::nullopt when
open_file(filename) fails, and after file->seekg(0, std::ios::end) check if
file->tellg() returns std::streampos(-1) (or file->fail()) and return
std::nullopt on error; otherwise compute pages =
static_cast<uint32_t>(static_cast<uint64_t>(size) / PAGE_SIZE) and return that
value—refer to StorageManager::allocate_page, open_files_, open_file,
file->seekg and file->tellg in your edits.

In `@tests/cloudSQL_tests.cpp`:
- Around line 325-337: The GROUP BY test is nondeterministic because the query
"SELECT cat, COUNT(val), SUM(val) FROM agg_test GROUP BY cat" has no ordering;
modify the test to produce deterministic ordering—either add an ORDER BY clause
to the parsed SQL (e.g., "ORDER BY cat") before creating the Lexer/Parser or,
after exec.execute(*stmt) is called, sort the returned result rows from
res.rows() (or index into them by locating the row where
get(0).to_string()=="A") before asserting values; update the code around
Lexer/Parser.parse_statement and the use of exec.execute and res.rows()
accordingly so the assertion for row 0 reliably checks the 'A' row.


ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2cdf09d and d7e39e5.

📒 Files selected for processing (16)
  • .github/workflows/ci.yml
  • include/storage/buffer_pool_manager.hpp
  • include/storage/storage_manager.hpp
  • include/transaction/transaction.hpp
  • src/catalog/catalog.cpp
  • src/network/server.cpp
  • src/storage/buffer_pool_manager.cpp
  • src/storage/lru_replacer.cpp
  • src/storage/storage_manager.cpp
  • src/transaction/transaction_manager.cpp
  • tests/buffer_pool_tests.cpp
  • tests/cloudSQL_tests.cpp
  • tests/cloudSQL_tests.cpp.old
  • tests/recovery_tests.cpp
  • tests/server_tests.cpp
  • tests/statement_tests.cpp

Comment on lines 298 to 311
const uint32_t code = ProtocolReader::read_int32(buffer.data() + HEADER_SIZE);
if (code == 80877103) { // SSL Request
const char n_response = 'N';
static_cast<void>(send(client_fd, &n_response, 1, 0));

// Expect startup packet next
n = recv_all(client_fd, buffer.data(), HEADER_SIZE);
if (n < static_cast<ssize_t>(HEADER_SIZE)) {
static_cast<void>(close(client_fd));
return;
}
len = ProtocolReader::read_int32(buffer.data());
if (len < HEADER_SIZE || len > buffer.size()) {
return;
}
n = recv_all(client_fd, std::next(buffer.data(), static_cast<std::ptrdiff_t>(HEADER_SIZE)),
len - HEADER_SIZE);
if (n < static_cast<ssize_t>(len - HEADER_SIZE)) {
return;
}
protocol = ProtocolReader::read_int32(
std::next(buffer.data(), static_cast<std::ptrdiff_t>(HEADER_SIZE)));
}

if (protocol != static_cast<uint32_t>(PROTOCOL_VERSION_3)) {
return;
static_cast<void>(recv_all(client_fd, buffer.data() + HEADER_SIZE, len - HEADER_SIZE));
}

⚠️ Potential issue | 🔴 Critical

Handshake validation is incomplete and allows invalid startup packets/buffer misuse.

At Line 299, non-SSL startup codes are not rejected; and in the SSL branch (Lines 309-310), the second packet length is consumed without re-validating bounds or checking full read success.

🛠️ Suggested fix
     const uint32_t code = ProtocolReader::read_int32(buffer.data() + HEADER_SIZE);
     if (code == 80877103) {  // SSL Request
         const char n_response = 'N';
         static_cast<void>(send(client_fd, &n_response, 1, 0));

         // Expect startup packet next
         n = recv_all(client_fd, buffer.data(), HEADER_SIZE);
         if (n < static_cast<ssize_t>(HEADER_SIZE)) {
             static_cast<void>(close(client_fd));
             return;
         }
         len = ProtocolReader::read_int32(buffer.data());
-        static_cast<void>(recv_all(client_fd, buffer.data() + HEADER_SIZE, len - HEADER_SIZE));
+        if (len > buffer.size() || len < HEADER_SIZE) {
+            static_cast<void>(close(client_fd));
+            return;
+        }
+        n = recv_all(client_fd, buffer.data() + HEADER_SIZE, len - HEADER_SIZE);
+        if (n < static_cast<ssize_t>(len - HEADER_SIZE) ||
+            ProtocolReader::read_int32(buffer.data() + HEADER_SIZE) != 196608U) {
+            static_cast<void>(close(client_fd));
+            return;
+        }
+    } else if (code != 196608U) {
+        static_cast<void>(close(client_fd));
+        return;
     }

Comment on lines +329 to 339
n = recv_all(client_fd, buffer.data(), 4);
if (n < 4) {
break;
}
len = ProtocolReader::read_int32(buffer.data());
if (len < HEADER_SIZE) {
break;
}

std::vector<char> body(len - HEADER_SIZE);
if (len > HEADER_SIZE) {
n = recv_all(client_fd, body.data(), len - HEADER_SIZE);
if (n < static_cast<ssize_t>(len - HEADER_SIZE)) {
break;
}
}

if (type == 'Q') { /* Simple Query */
const std::string sql(body.data());
static_cast<void>(stats_.queries_executed.fetch_add(1));
if (type == 'Q') {
std::vector<char> sql_buf(len - 4);
static_cast<void>(recv_all(client_fd, sql_buf.data(), len - 4));
const std::string sql(sql_buf.data());


⚠️ Potential issue | 🔴 Critical

Query message length is not validated before allocation/read, enabling malformed-packet crashes.

At Line 333, len is trusted directly. For type == 'Q', len < 4 underflows len - 4, and std::string sql(sql_buf.data()) at Line 338 assumes NUL termination from untrusted input.

🛠️ Suggested fix
         len = ProtocolReader::read_int32(buffer.data());

         if (type == 'Q') {
+            if (len < 5 || len > PKT_BUF_SIZE) {  // payload + trailing '\0'
+                break;
+            }
             std::vector<char> sql_buf(len - 4);
-            static_cast<void>(recv_all(client_fd, sql_buf.data(), len - 4));
-            const std::string sql(sql_buf.data());
+            n = recv_all(client_fd, sql_buf.data(), len - 4);
+            if (n < static_cast<ssize_t>(len - 4) || sql_buf.back() != '\0') {
+                break;
+            }
+            const std::string sql(sql_buf.data(), sql_buf.size() - 1);

Comment on lines +349 to 363
if (res.success()) {
// Row Description (T)
if (!res.rows().empty()) {
// Simplified: send one RowDescription and then DataRows
// For this prototype, we just send CommandComplete
}

/* 3. Send CommandComplete ('C') */
std::vector<char> complete = {'C'};
const std::string msg = (stmt->type() == parser::StmtType::Select)
? "SELECT " + std::to_string(result.row_count())
: "OK";
ProtocolWriter::append_int32(complete,
static_cast<uint32_t>(4 + msg.size() + 1));
ProtocolWriter::append_string(complete, msg);
static_cast<void>(send(client_fd, complete.data(), complete.size(), 0));
// Command Complete (C)
const std::string tag = "SELECT " + std::to_string(res.row_count());
const uint32_t tag_len = htonl(static_cast<uint32_t>(tag.size() + 4 + 1));
const char c_type = 'C';
static_cast<void>(send(client_fd, &c_type, 1, 0));
static_cast<void>(send(client_fd, &tag_len, 4, 0));
static_cast<void>(send(client_fd, tag.c_str(), tag.size() + 1, 0));
} else {

⚠️ Potential issue | 🟠 Major

Successful SELECT responses skip RowDescription/DataRow, breaking protocol expectations.

At Line 350, row metadata/data sending is left unimplemented, yet clients/tests (see tests/server_tests.cpp, Line 136 and Line 148) expect T and D messages before C/Z.


BufferPoolManager::~BufferPoolManager() {
flush_all_pages();
}
BufferPoolManager::~BufferPoolManager() = default;

⚠️ Potential issue | 🔴 Critical

Destructor drops dirty pages by skipping final flush.

At Line 30, defaulting the destructor removes the last-chance durability flush for dirty frames still resident in memory.

🛠️ Suggested fix
-BufferPoolManager::~BufferPoolManager() = default;
+BufferPoolManager::~BufferPoolManager() {
+    flush_all_pages();
+}

Comment on lines +116 to 129
const uint32_t target_page_id = storage_manager_.allocate_page(file_name);
if (page_id != nullptr) {
// NOLINTNEXTLINE(cppcoreguidelines-pro-type-const-cast)
const_cast<uint32_t&>(*page_id) = target_page_id;
}
const std::string key = make_page_key(file_name, target_page_id);

uint32_t frame_id = 0;
if (!free_list_.empty()) {
frame_id = free_list_.front();
free_list_.pop_front();
} else {
if (!replacer_.victim(&frame_id)) {
return nullptr;
}
Page* const victim_page = &pages_[frame_id];
if (victim_page->is_dirty_) {
if (log_manager_ != nullptr && victim_page->lsn_ != -1) {
if (victim_page->lsn_ > log_manager_->get_persistent_lsn()) {
log_manager_->flush(true);
}
}
storage_manager_.write_page(victim_page->file_name_, victim_page->page_id_,
victim_page->get_data());
}
page_table_.erase(make_page_key(victim_page->file_name_, victim_page->page_id_));
frame_id = free_list_.back();
free_list_.pop_back();
} else if (!replacer_.victim(&frame_id)) {
return nullptr;
}

⚠️ Potential issue | 🔴 Critical

new_page allocates disk pages before frame reservation, causing leaked/orphan pages on failure.

At Line 116, allocate_page() is called before checking whether a frame is available. If no free/victim frame exists (Line 127), this returns nullptr after allocating a page id.

🛠️ Suggested fix
-    const uint32_t target_page_id = storage_manager_.allocate_page(file_name);
-    if (page_id != nullptr) {
-        // NOLINTNEXTLINE(cppcoreguidelines-pro-type-const-cast)
-        const_cast<uint32_t&>(*page_id) = target_page_id;
-    }
-    const std::string key = make_page_key(file_name, target_page_id);
-
     uint32_t frame_id = 0;
     if (!free_list_.empty()) {
         frame_id = free_list_.back();
         free_list_.pop_back();
     } else if (!replacer_.victim(&frame_id)) {
         return nullptr;
     }
+
+    const uint32_t target_page_id = storage_manager_.allocate_page(file_name);
+    if (page_id != nullptr) {
+        // NOLINTNEXTLINE(cppcoreguidelines-pro-type-const-cast)
+        const_cast<uint32_t&>(*page_id) = target_page_id;
+    }
+    const std::string key = make_page_key(file_name, target_page_id);

Comment on lines 39 to 52
void LRUReplacer::unpin(uint32_t frame_id) {
const std::lock_guard<std::mutex> lock(latch_);
if (lru_map_.find(frame_id) != lru_map_.end()) {
// Already in the replacer's candidate list
const std::scoped_lock<std::mutex> lock(latch_);

if (lru_map_.count(frame_id) != 0) {
return;
}

if (lru_list_.size() >= capacity_) {
return;
}

// Add to the front of the list (most recently used)
lru_list_.push_front(frame_id);
lru_map_[frame_id] = lru_list_.begin();

// Enforce capacity constraint (shouldn't happen strictly if used with BufferPool properly, but
// safe)
if (lru_list_.size() > capacity_) {
const uint32_t lru_frame = lru_list_.back();
static_cast<void>(lru_map_.erase(lru_frame));
lru_list_.pop_back();
}
}

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Scripts executed (repository: poyrazK/cloudSQL):

  • find . -name "*.cpp" -o -name "*.h" | grep -i replacer | head -20
  • rg -n "class LRUReplacer" --type cpp -A 20
  • rg -n "LRUReplacer::" --type cpp | grep -E "(constructor|LRUReplacer\()" | head -10
  • rg -n "void unpin" --type cpp -B 5 -A 15
  • cat -n include/storage/lru_replacer.hpp | sed -n '39,70p'
  • rg -n "unpin" --type cpp -l | head -5
  • find . -name "*test*.cpp" -o -name "*test*.cc" | xargs grep -l "unpin" 2>/dev/null | head -5
  • cat -n src/storage/lru_replacer.cpp
  • cat -n tests/buffer_pool_tests.cpp | head -100
  • rg -n "unpin" tests/buffer_pool_tests.cpp -B 3 -A 5
  • cat -n src/storage/buffer_pool_manager.cpp | head -80
  • rg -n "unpin_page" src/storage/buffer_pool_manager.cpp -B 2 -A 8 | head -40
  • rg -n "unpin" include/storage/ -B 3 -A 3
  • cat -n src/storage/buffer_pool_manager.cpp | sed -n '70,95p'
  • rg -n "replacer_\.unpin" src/storage/buffer_pool_manager.cpp -B 5 -A 5
  • rg -n "replacer_" src/storage/buffer_pool_manager.cpp | head -10
  • rg -n "new_page\|fetch_page" src/storage/buffer_pool_manager.cpp -B 2 -A 8 | head -60

Length of output: 42


🏁 Script executed:

# Verify if replacer can ever have more than capacity frames in practice
rg -n "pin_count" src/storage/buffer_pool_manager.cpp -B 2 -A 2

Repository: poyrazK/cloudSQL

Length of output: 1291


🏁 Script executed:

# Trace through the test case to verify the size never exceeds capacity
cat -n tests/buffer_pool_tests.cpp | sed -n '25,60p'

Repository: poyrazK/cloudSQL

Length of output: 1191


🏁 Script executed:

# Check the logic: a frame can only be unpinned if it was pinned first
# Let's see if there's any way to unpin a frame that wasn't pinned
rg -n "unpin" src/storage/buffer_pool_manager.cpp -B 5 -A 3

Repository: poyrazK/cloudSQL

Length of output: 538


🏁 Script executed:

# Verify: in normal operation, replacer size should never exceed capacity
# Let's check if there are any other unpin calls besides the one in buffer_pool_manager
rg -n "\.unpin\(" src/ --type cpp

Repository: poyrazK/cloudSQL

Length of output: 133


The silent drop at capacity is real but undocumented—consider making this explicit.

The capacity check on line 46 silently prevents frames from being added once lru_list_.size() >= capacity_. While this behavior is intentional, it's:

  1. Not documented: The unpin() API comment doesn't mention that unpinning can silently fail when at capacity.
  2. Not tested: The test suite avoids exercising this condition; there's no test case verifying behavior when unpin() is called at capacity.
  3. Silently hides issues: If capacity doesn't match the actual pool size, this silent drop masks the mismatch rather than surfacing it.

In normal BufferPoolManager operation, this should rarely occur since capacity equals pool_size, but the lack of documentation and error indication makes this fragile. Consider either:

  • Documenting this behavior explicitly in the function's docstring, or
  • Returning a bool to indicate success/failure (matching the semantic of unpin_page() in BufferPoolManager), or
  • Adding a test case that validates the boundary condition.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/storage/lru_replacer.cpp` around lines 39 - 52, LRUReplacer::unpin
currently silently drops inserts when lru_list_.size() >= capacity_; update the
implementation and API to make this explicit: change unpin to return bool (true
if inserted, false if dropped), update references to check the return value, and
update the function docstring to state the capacity-bound behavior and that
lru_map_ and lru_list_ are unchanged when capacity is reached; ensure lru_list_,
lru_map_, and capacity_ are used as the condition check and add/return paths
accordingly, and add or adjust unit tests to cover the boundary case where
lru_list_.size() == capacity_ so the new return value is validated.
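A minimal sketch of the bool-returning `unpin()` described in the prompt above. The member names (`lru_list_`, `lru_map_`, `capacity_`, `latch_`) mirror the reviewed class, but the surrounding class shape is a reconstructed assumption, not the actual header:

```cpp
#include <cassert>
#include <cstdint>
#include <list>
#include <mutex>
#include <unordered_map>

// Hypothetical capacity-bounded replacer whose unpin() reports drops
// instead of silently ignoring them.
class BoundedLruReplacer {
public:
    explicit BoundedLruReplacer(size_t capacity) : capacity_(capacity) {}

    // Returns true if the frame was inserted as an eviction candidate,
    // false if it was already present or the replacer is at capacity.
    // lru_list_ and lru_map_ are left unchanged on the false path.
    bool unpin(uint32_t frame_id) {
        const std::scoped_lock<std::mutex> lock(latch_);
        if (lru_map_.count(frame_id) != 0) {
            return false;  // already a candidate
        }
        if (lru_list_.size() >= capacity_) {
            return false;  // capacity bound: caller can now observe the drop
        }
        lru_list_.push_front(frame_id);
        lru_map_[frame_id] = lru_list_.begin();
        return true;
    }

    size_t size() const {
        const std::scoped_lock<std::mutex> lock(latch_);
        return lru_list_.size();
    }

private:
    size_t capacity_;
    mutable std::mutex latch_;
    std::list<uint32_t> lru_list_;
    std::unordered_map<uint32_t, std::list<uint32_t>::iterator> lru_map_;
};
```

The boundary case the review asks to test (`size() == capacity_`) becomes a one-line assertion on the return value.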

Comment on lines +167 to +180
uint32_t StorageManager::allocate_page(const std::string& filename) {
    if (open_files_.find(filename) == open_files_.end()) {
        if (!open_file(filename)) {
            return 0;
        }
    }

    auto& file = open_files_[filename];
    file->clear();
    file->seekg(0, std::ios::end);
    const std::streampos size = file->tellg();

    return static_cast<uint32_t>(static_cast<uint64_t>(size) / PAGE_SIZE);
}

⚠️ Potential issue | 🟠 Major

Ambiguous error return and missing tellg() failure check.

Two issues:

  1. Returning 0 on open_file failure (line 170) is ambiguous since 0 is also the valid page index for an empty file.
  2. tellg() can return -1 on failure, which would produce an incorrect page count when cast and divided.

Consider returning std::optional<uint32_t> or using a sentinel like UINT32_MAX for errors, and add a tellg() failure check.

🛠️ Proposed fix
 uint32_t StorageManager::allocate_page(const std::string& filename) {
     if (open_files_.find(filename) == open_files_.end()) {
         if (!open_file(filename)) {
-            return 0;
+            return UINT32_MAX;  // Or use std::optional<uint32_t>
         }
     }

     auto& file = open_files_[filename];
     file->clear();
     file->seekg(0, std::ios::end);
     const std::streampos size = file->tellg();
+
+    if (size == std::streampos(-1)) {
+        return UINT32_MAX;  // tellg() failed
+    }

     return static_cast<uint32_t>(static_cast<uint64_t>(size) / PAGE_SIZE);
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
 uint32_t StorageManager::allocate_page(const std::string& filename) {
     if (open_files_.find(filename) == open_files_.end()) {
         if (!open_file(filename)) {
-            return 0;
+            return UINT32_MAX;  // Or use std::optional<uint32_t>
         }
     }
 
     auto& file = open_files_[filename];
     file->clear();
     file->seekg(0, std::ios::end);
     const std::streampos size = file->tellg();
+
+    if (size == std::streampos(-1)) {
+        return UINT32_MAX;  // tellg() failed
+    }
 
     return static_cast<uint32_t>(static_cast<uint64_t>(size) / PAGE_SIZE);
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/storage/storage_manager.cpp` around lines 167 - 180,
StorageManager::allocate_page currently returns 0 on open_file failure
(ambiguous with a valid page 0) and doesn't check file->tellg() for failure;
change the signature to return std::optional<uint32_t> and include <optional>,
return std::nullopt when open_file(filename) fails, and after file->seekg(0,
std::ios::end) check if file->tellg() returns std::streampos(-1) (or
file->fail()) and return std::nullopt on error; otherwise compute pages =
static_cast<uint32_t>(static_cast<uint64_t>(size) / PAGE_SIZE) and return that
value—refer to StorageManager::allocate_page, open_files_, open_file,
file->seekg and file->tellg in your edits.
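A self-contained sketch of the `std::optional<uint32_t>` variant the prompt describes. It opens the file directly instead of going through the real `open_files_` cache, and `PAGE_SIZE` is a stand-in constant, so treat it as an illustration of the error-handling shape only:

```cpp
#include <cassert>
#include <cstdint>
#include <fstream>
#include <optional>
#include <string>
#include <vector>

constexpr uint32_t PAGE_SIZE = 4096;  // assumption; the real value lives in the storage headers

// Returns the next page index, or std::nullopt if the file cannot be
// opened or tellg() fails -- no ambiguity with the valid page index 0.
std::optional<uint32_t> allocate_page(const std::string& filename) {
    std::fstream file(filename, std::ios::in | std::ios::binary);
    if (!file.is_open()) {
        return std::nullopt;  // open failure, distinct from "empty file"
    }
    file.clear();
    file.seekg(0, std::ios::end);
    const std::streampos size = file.tellg();
    if (size == std::streampos(-1)) {
        return std::nullopt;  // tellg() failed
    }
    return static_cast<uint32_t>(static_cast<uint64_t>(size) / PAGE_SIZE);
}
```

Callers then distinguish "file is empty, next page is 0" from "I/O error" by checking `has_value()`.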

Comment on lines +325 to +337
    auto lex =
        std::make_unique<Lexer>("SELECT cat, COUNT(val), SUM(val) FROM agg_test GROUP BY cat");
    auto stmt = Parser(std::move(lex)).parse_statement();
    ASSERT_NE(stmt, nullptr);

    const auto res = exec.execute(*stmt);
    EXPECT_TRUE(res.success());

    EXPECT_EQ(res.row_count(), 2U);
    /* Row 0: 'A', 2, 30 */
    EXPECT_STREQ(res.rows()[0].get(0).to_string().c_str(), "A");
    EXPECT_STREQ(res.rows()[0].get(1).to_string().c_str(), "2");
    EXPECT_STREQ(res.rows()[0].get(2).to_string().c_str(), "30");

⚠️ Potential issue | 🟠 Major

Make GROUP BY assertion deterministic to avoid flaky test failures.

At Line 325, the query groups by cat but does not order results; asserting rows()[0] == 'A' is nondeterministic.

✅ Suggested fix
-    auto lex =
-        std::make_unique<Lexer>("SELECT cat, COUNT(val), SUM(val) FROM agg_test GROUP BY cat");
+    auto lex = std::make_unique<Lexer>(
+        "SELECT cat, COUNT(val), SUM(val) FROM agg_test GROUP BY cat ORDER BY cat");
📝 Committable suggestion


Suggested change
-    auto lex =
-        std::make_unique<Lexer>("SELECT cat, COUNT(val), SUM(val) FROM agg_test GROUP BY cat");
+    auto lex = std::make_unique<Lexer>(
+        "SELECT cat, COUNT(val), SUM(val) FROM agg_test GROUP BY cat ORDER BY cat");
     auto stmt = Parser(std::move(lex)).parse_statement();
     ASSERT_NE(stmt, nullptr);
 
     const auto res = exec.execute(*stmt);
     EXPECT_TRUE(res.success());
 
     EXPECT_EQ(res.row_count(), 2U);
     /* Row 0: 'A', 2, 30 */
     EXPECT_STREQ(res.rows()[0].get(0).to_string().c_str(), "A");
     EXPECT_STREQ(res.rows()[0].get(1).to_string().c_str(), "2");
     EXPECT_STREQ(res.rows()[0].get(2).to_string().c_str(), "30");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/cloudSQL_tests.cpp` around lines 325 - 337, The GROUP BY test is
nondeterministic because the query "SELECT cat, COUNT(val), SUM(val) FROM
agg_test GROUP BY cat" has no ordering; modify the test to produce deterministic
ordering—either add an ORDER BY clause to the parsed SQL (e.g., "ORDER BY cat")
before creating the Lexer/Parser or, after exec.execute(*stmt) is called, sort
the returned result rows from res.rows() (or index into them by locating the row
where get(0).to_string()=="A") before asserting values; update the code around
Lexer/Parser.parse_statement and the use of exec.execute and res.rows()
accordingly so the assertion for row 0 reliably checks the 'A' row.
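The sort-before-assert option can be sketched independently of the real `ResultSet` API; rows are modeled here as plain string vectors, which is an assumption for illustration:

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

using Row = std::vector<std::string>;

// Sort rows by their first column so group order is deterministic
// regardless of the hash-map iteration order inside the executor.
std::vector<Row> sorted_by_first_column(std::vector<Row> rows) {
    std::sort(rows.begin(), rows.end(),
              [](const Row& a, const Row& b) { return a[0] < b[0]; });
    return rows;
}
```

After sorting, the assertion on row 0 always sees the 'A' group even if the executor emits 'B' first.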


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/network/server.cpp (1)

164-180: ⚠️ Potential issue | 🟡 Minor

Potential double-close of client file descriptors.

handle_connection() removes the fd from client_fds_ and closes it at lines 390-397. However, stop() swaps out client_fds_ at line 168, then joins workers at lines 171-175, and finally closes all fds at lines 178-180.

If a worker completes and removes its fd from client_fds_ before the swap, everything is fine. But if the swap happens while a worker is between removing its fd and closing it, the fd could still be in the swapped-out list and get closed twice, or the worker could close an fd that was already closed by stop().

Consider either:

  1. Having handle_connection only remove the fd (not close it), letting stop() handle all closes, or
  2. Having stop() signal workers to exit without closing fds from the main thread.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/network/server.cpp` around lines 164 - 180, The code risks double-closing
client fds because both handle_connection() and stop() may close the same fd;
fix by choosing one owner of close responsibility—simplest: modify
handle_connection to remove the fd from client_fds_ but never call close(fd)
(leave closure to stop()), and ensure handle_connection no longer attempts to
close or re-close that fd; keep stop()’s logic that swaps client_fds_ under
thread_mutex_, joins worker_threads_, then iterates fds to close each; update
uses of client_fds_, worker_threads_, thread_mutex_, and the close call sites
accordingly so only stop() performs the actual close after joining workers.
🧹 Nitpick comments (2)
tests/statement_tests.cpp (2)

10-10: Unused include.

<stdexcept> is included but no exception types or throw statements are used in this file. Consider removing it.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/statement_tests.cpp` at line 10, Remove the unused include directive
'<stdexcept>' from the file (it's not used anywhere in
tests/statement_tests.cpp); locate the include line containing stdexcept and
delete it so only necessary headers remain, and run the tests/compile to confirm
nothing else depends on it.

115-119: Consider adding assertions for statement structure.

The substring checks correctly handle non-deterministic map iteration, but verifying the overall statement structure would strengthen the test.

💡 Suggested additional assertions
     // Map iteration order is unspecified, so check substrings
     const std::string sql = stmt->to_string();
+    EXPECT_NE(sql.find("UPDATE products SET"), std::string::npos);
     EXPECT_NE(sql.find("price = 100"), std::string::npos);
     EXPECT_NE(sql.find("stock = 50"), std::string::npos);
+    EXPECT_NE(sql.find("WHERE id = 1"), std::string::npos);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/statement_tests.cpp` around lines 115 - 119, Add stronger structural
assertions around stmt->to_string(): in addition to the existing substring
checks for "price = 100" and "stock = 50", assert that the SQL contains the
expected statement keyword(s) (e.g., "UPDATE" and "SET") and that the two
assignments appear together in the SET clause in either order (use a regex or
two alternatives to accept "price = 100, stock = 50" or "stock = 50, price =
100"). This keeps map-order tolerance while ensuring the overall statement
structure is correct.
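The either-order check the prompt suggests fits in one regex with an alternation over the two assignment orders; the expected SQL strings below are taken from the test, the helper itself is illustrative:

```cpp
#include <cassert>
#include <regex>
#include <string>

// Accepts both SET-clause orderings while still pinning down the
// UPDATE/SET/WHERE structure of the statement.
bool set_clause_matches(const std::string& sql) {
    static const std::regex re(
        R"(UPDATE products SET (price = 100, stock = 50|stock = 50, price = 100) WHERE id = 1)");
    return std::regex_search(sql, re);
}
```

This keeps the map-order tolerance of the substring checks but fails if an assignment goes missing or lands outside the SET clause.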
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@src/network/server.cpp`:
- Around line 357-363: The command tag is hardcoded as "SELECT " +
std::to_string(res.row_count()) which mislabels all successful statements;
update the code that constructs tag (currently using tag, res.row_count(), and
sent via send(client_fd,...)) to branch on the actual statement/result type
(inspect the parsed statement object e.g. stmt or metadata on res) and build the
proper PostgreSQL command tag strings (e.g. "SELECT <rows>", "INSERT 0
<rows_or_oid>", "UPDATE <rows>", "DELETE <rows>", "CREATE TABLE" etc.), then
send that constructed tag as before; ensure you use the correct values from res
(row count, inserted OID if available) and preserve the existing send(client_fd,
...) flow.
- Around line 240-259: The loop in accept_connections reads listen_fd_ under
state_mutex_ then calls select() on the local fd outside the lock, allowing
stop() to close listen_fd_ concurrently; fix by re-checking the fd and running_
state immediately after select returns: after select() returns, acquire
state_mutex_ and verify running_ is true and that listen_fd_ == fd and fd >= 0,
if not skip handling and continue; additionally handle select() errors (check
res < 0 for errno == EINTR and retry, treat EBADF by continuing or logging) so
accept_connections, stop(), listen_fd_, and state_mutex_ coordinate safely (or
replace this wake mechanism with a self-pipe/eventfd if you prefer to avoid the
race entirely).
- Around line 365-381: Build a proper PostgreSQL ErrorResponse payload instead
of sending the raw error string: assemble a payload string that contains
field-type bytes and null-terminated field values (e.g., 'S' + "ERROR\0", 'M' +
err + '\0', 'C' + "XX000\0"), terminate the whole payload with an extra '\0',
compute the length as 4 + payload.size(), htonl that into e_len, then send the
message type 'E', the 4-byte length, and the payload (replace the current uses
of err, e_len, e_type and the three send(...) calls with this new payload
construction and sends) so clients can parse the ErrorResponse per the
PostgreSQL protocol.

---

Outside diff comments:
In `@src/network/server.cpp`:
- Around line 164-180: The code risks double-closing client fds because both
handle_connection() and stop() may close the same fd; fix by choosing one owner
of close responsibility—simplest: modify handle_connection to remove the fd from
client_fds_ but never call close(fd) (leave closure to stop()), and ensure
handle_connection no longer attempts to close or re-close that fd; keep stop()’s
logic that swaps client_fds_ under thread_mutex_, joins worker_threads_, then
iterates fds to close each; update uses of client_fds_, worker_threads_,
thread_mutex_, and the close call sites accordingly so only stop() performs the
actual close after joining workers.

---

Nitpick comments:
In `@tests/statement_tests.cpp`:
- Line 10: Remove the unused include directive '<stdexcept>' from the file (it's
not used anywhere in tests/statement_tests.cpp); locate the include line
containing stdexcept and delete it so only necessary headers remain, and run the
tests/compile to confirm nothing else depends on it.
- Around line 115-119: Add stronger structural assertions around
stmt->to_string(): in addition to the existing substring checks for "price =
100" and "stock = 50", assert that the SQL contains the expected statement
keyword(s) (e.g., "UPDATE" and "SET") and that the two assignments appear
together in the SET clause in either order (use a regex or two alternatives to
accept "price = 100, stock = 50" or "stock = 50, price = 100"). This keeps
map-order tolerance while ensuring the overall statement structure is correct.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d7e39e5 and a2f9261.

📒 Files selected for processing (4)
  • include/storage/buffer_pool_manager.hpp
  • src/network/server.cpp
  • tests/statement_tests.cpp
  • tests/test_utils.hpp
✅ Files skipped from review due to trivial changes (1)
  • tests/test_utils.hpp
🚧 Files skipped from review as they are similar to previous changes (1)
  • include/storage/buffer_pool_manager.hpp

Comment on lines +240 to +259
 while (true) {
     int fd = -1;
     {
         const std::scoped_lock<std::mutex> lock(state_mutex_);
         if (!running_) {
             break;
         }
         fd = listen_fd_;
     }
 
     /* Use select with timeout to allow periodic is_running() check */
     fd_set read_fds;
     FD_ZERO(&read_fds);
     FD_SET(fd, &read_fds);
     struct timeval timeout {
-        0, SELECT_TIMEOUT_USEC
+        SELECT_TIMEOUT_SEC, 0
     };
 
     const int res = select(fd + 1, &read_fds, nullptr, nullptr, &timeout);
     if (res <= 0) {
-        continue; /* Timeout or error */
+        continue;

⚠️ Potential issue | 🟡 Minor

Race condition between accept_connections and stop() on listen_fd_.

The loop reads listen_fd_ into local fd under the lock (line 247), then uses fd for FD_SET and select() outside the lock (lines 252-257). If stop() executes between releasing state_mutex_ and calling select(), it will set listen_fd_ = -1 and close the fd. The select() call then operates on a closed/invalid fd, which could cause undefined behavior or errors.

Consider either:

  1. Using a self-pipe or eventfd to wake the select loop when stopping, or
  2. Checking that fd is still valid after select() returns, or
  3. Accepting that select() on a closed fd returns an error and handling it gracefully.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/network/server.cpp` around lines 240 - 259, The loop in
accept_connections reads listen_fd_ under state_mutex_ then calls select() on
the local fd outside the lock, allowing stop() to close listen_fd_ concurrently;
fix by re-checking the fd and running_ state immediately after select returns:
after select() returns, acquire state_mutex_ and verify running_ is true and
that listen_fd_ == fd and fd >= 0, if not skip handling and continue;
additionally handle select() errors (check res < 0 for errno == EINTR and retry,
treat EBADF by continuing or logging) so accept_connections, stop(), listen_fd_,
and state_mutex_ coordinate safely (or replace this wake mechanism with a
self-pipe/eventfd if you prefer to avoid the race entirely).
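The self-pipe alternative mentioned above can be demonstrated in isolation (POSIX only): `stop()` writes one byte to a pipe that the accept loop also watches with `select()`, so the loop wakes without ever racing on a closed listen fd. The demo below is a sketch under those assumptions, not the server's actual code:

```cpp
#include <cassert>
#include <sys/select.h>
#include <thread>
#include <unistd.h>

// Returns true if a write to the self-pipe wakes a blocking select().
bool self_pipe_wakeup_demo() {
    int pipe_fds[2] = {-1, -1};
    if (pipe(pipe_fds) != 0) {
        return false;
    }

    bool woke = false;
    std::thread loop([&] {
        fd_set read_fds;
        FD_ZERO(&read_fds);
        FD_SET(pipe_fds[0], &read_fds);  // the real loop would also FD_SET listen_fd_
        // No timeout needed: select blocks until stop() writes to the pipe.
        const int res = select(pipe_fds[0] + 1, &read_fds, nullptr, nullptr, nullptr);
        woke = (res > 0) && FD_ISSET(pipe_fds[0], &read_fds);
    });

    const char byte = 'x';
    static_cast<void>(write(pipe_fds[1], &byte, 1));  // the stop() side
    loop.join();
    close(pipe_fds[0]);
    close(pipe_fds[1]);
    return woke;
}
```

With this in place the periodic-timeout poll and the "is the fd still valid?" re-check both become unnecessary.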

Comment on lines +357 to +363
    // Command Complete (C)
    const std::string tag = "SELECT " + std::to_string(res.row_count());
    const uint32_t tag_len = htonl(static_cast<uint32_t>(tag.size() + 4 + 1));
    const char c_type = 'C';
    static_cast<void>(send(client_fd, &c_type, 1, 0));
    static_cast<void>(send(client_fd, &tag_len, 4, 0));
    static_cast<void>(send(client_fd, tag.c_str(), tag.size() + 1, 0));

⚠️ Potential issue | 🟠 Major

Command tag always uses "SELECT" regardless of actual command type.

The tag is hardcoded as "SELECT " + row_count for all successful queries. For INSERT, UPDATE, DELETE, CREATE TABLE, etc., PostgreSQL expects different tags like INSERT 0 1, UPDATE 5, CREATE TABLE, etc.

🛠️ Suggested approach

Consider checking the statement type from the parsed stmt or the result to construct the appropriate command tag:

std::string tag;
// Determine tag based on statement type or result
if (/* is SELECT */) {
    tag = "SELECT " + std::to_string(res.row_count());
} else if (/* is INSERT */) {
    tag = "INSERT 0 " + std::to_string(res.row_count());
} else if (/* is CREATE TABLE */) {
    tag = "CREATE TABLE";
}
// ... etc
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/network/server.cpp` around lines 357 - 363, The command tag is hardcoded
as "SELECT " + std::to_string(res.row_count()) which mislabels all successful
statements; update the code that constructs tag (currently using tag,
res.row_count(), and sent via send(client_fd,...)) to branch on the actual
statement/result type (inspect the parsed statement object e.g. stmt or metadata
on res) and build the proper PostgreSQL command tag strings (e.g. "SELECT
<rows>", "INSERT 0 <rows_or_oid>", "UPDATE <rows>", "DELETE <rows>", "CREATE
TABLE" etc.), then send that constructed tag as before; ensure you use the
correct values from res (row count, inserted OID if available) and preserve the
existing send(client_fd, ...) flow.
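The tag-selection branch can be factored into a small helper. The `StatementType` enum is hypothetical (the real server would derive it from the parsed statement), but the tag strings follow PostgreSQL's documented CommandComplete formats:

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Assumed statement classification; not part of the reviewed codebase.
enum class StatementType { Select, Insert, Update, Delete, CreateTable };

// Builds the CommandComplete tag for a successful statement.
// For INSERT the leading 0 is the OID field, which modern servers
// always send as 0.
std::string command_tag(StatementType type, size_t rows) {
    switch (type) {
        case StatementType::Select:      return "SELECT " + std::to_string(rows);
        case StatementType::Insert:      return "INSERT 0 " + std::to_string(rows);
        case StatementType::Update:      return "UPDATE " + std::to_string(rows);
        case StatementType::Delete:      return "DELETE " + std::to_string(rows);
        case StatementType::CreateTable: return "CREATE TABLE";
    }
    return "";
}
```

The existing `send(client_fd, ...)` flow stays unchanged; only the `tag` construction branches.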

Comment on lines +365 to 381
                     // Error Response (E)
                     const std::string err = res.error();
                     const uint32_t e_len = htonl(static_cast<uint32_t>(err.size() + 4 + 1));
                     const char e_type = 'E';
                     static_cast<void>(send(client_fd, &e_type, 1, 0));
                     static_cast<void>(send(client_fd, &e_len, 4, 0));
                     static_cast<void>(send(client_fd, err.c_str(), err.size() + 1, 0));
                 }
             }
-        } catch (...) { /* Handle parsing/exec errors */
-            std::vector<char> complete = {'C'};
-            ProtocolWriter::append_int32(complete, ERROR_MSG_LEN);
-            ProtocolWriter::append_string(complete, "ERROR");
-            static_cast<void>(send(client_fd, complete.data(), complete.size(), 0));
+        } catch (const std::exception& e) {
+            const std::string err = e.what();
+            const uint32_t e_len = htonl(static_cast<uint32_t>(err.size() + 4 + 1));
+            const char e_type = 'E';
+            static_cast<void>(send(client_fd, &e_type, 1, 0));
+            static_cast<void>(send(client_fd, &e_len, 4, 0));
+            static_cast<void>(send(client_fd, err.c_str(), err.size() + 1, 0));
         }

⚠️ Potential issue | 🟠 Major

ErrorResponse format does not conform to PostgreSQL protocol.

PostgreSQL's ErrorResponse message requires field type codes (e.g., 'S' for severity, 'M' for message, 'C' for SQLSTATE code) before each field value, terminated by a null byte for each field and a final null byte. The current implementation sends only a raw error string, which clients will not parse correctly.

🛠️ Suggested fix for proper ErrorResponse format
                     } else {
                         // Error Response (E)
                         const std::string err = res.error();
-                        const uint32_t e_len = htonl(static_cast<uint32_t>(err.size() + 4 + 1));
-                        const char e_type = 'E';
-                        static_cast<void>(send(client_fd, &e_type, 1, 0));
-                        static_cast<void>(send(client_fd, &e_len, 4, 0));
-                        static_cast<void>(send(client_fd, err.c_str(), err.size() + 1, 0));
+                        // Build proper ErrorResponse: 'S' severity + 'M' message + 'C' code + terminator
+                        std::vector<char> err_msg;
+                        err_msg.push_back('S'); // Severity field
+                        const char* severity = "ERROR";
+                        err_msg.insert(err_msg.end(), severity, severity + 6);
+                        err_msg.push_back('M'); // Message field
+                        err_msg.insert(err_msg.end(), err.begin(), err.end());
+                        err_msg.push_back('\0');
+                        err_msg.push_back('C'); // SQLSTATE code field
+                        const char* sqlstate = "42000"; // Generic syntax error
+                        err_msg.insert(err_msg.end(), sqlstate, sqlstate + 6);
+                        err_msg.push_back('\0'); // Terminator
+                        
+                        const char e_type = 'E';
+                        uint32_t e_len = htonl(static_cast<uint32_t>(err_msg.size() + 4));
+                        static_cast<void>(send(client_fd, &e_type, 1, 0));
+                        static_cast<void>(send(client_fd, &e_len, 4, 0));
+                        static_cast<void>(send(client_fd, err_msg.data(), err_msg.size(), 0));
                     }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/network/server.cpp` around lines 365 - 381, Build a proper PostgreSQL
ErrorResponse payload instead of sending the raw error string: assemble a
payload string that contains field-type bytes and null-terminated field values
(e.g., 'S' + "ERROR\0", 'M' + err + '\0', 'C' + "XX000\0"), terminate the whole
payload with an extra '\0', compute the length as 4 + payload.size(), htonl that
into e_len, then send the message type 'E', the 4-byte length, and the payload
(replace the current uses of err, e_len, e_type and the three send(...) calls
with this new payload construction and sends) so clients can parse the
ErrorResponse per the PostgreSQL protocol.
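The field-structured payload described above can be built and verified in isolation. The SQLSTATE "XX000" (internal_error) is an assumption carried over from the prompt; the builder itself just follows the ErrorResponse framing (type byte, 4-byte big-endian length covering itself, then NUL-terminated fields ended by an extra NUL):

```cpp
#include <arpa/inet.h>
#include <cassert>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Assembles a complete ErrorResponse message ('E' + length + fields).
std::vector<char> build_error_response(const std::string& message) {
    std::vector<char> payload;
    auto field = [&payload](char code, const std::string& value) {
        payload.push_back(code);                                  // field type byte
        payload.insert(payload.end(), value.begin(), value.end());
        payload.push_back('\0');                                  // field terminator
    };
    field('S', "ERROR");   // severity
    field('C', "XX000");   // SQLSTATE code (assumed generic internal error)
    field('M', message);   // human-readable message
    payload.push_back('\0');  // end of field list

    std::vector<char> msg;
    msg.push_back('E');  // message type
    const uint32_t len = htonl(static_cast<uint32_t>(payload.size() + 4));
    const char* len_bytes = reinterpret_cast<const char*>(&len);
    msg.insert(msg.end(), len_bytes, len_bytes + 4);
    msg.insert(msg.end(), payload.begin(), payload.end());
    return msg;
}
```

A client parser can then walk the fields by type byte instead of guessing where the raw string ends.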


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 3

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/network/server.cpp (1)

174-183: ⚠️ Potential issue | 🔴 Critical

stop() can hang by joining workers before unblocking their recv().

At Lines 174-183, worker threads are joined before client sockets are closed/shutdown. If a worker is blocked in recv, shutdown may block indefinitely.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/network/server.cpp` around lines 174 - 183, In stop(), joining worker
threads before unblocking their blocked recv() can hang; change the shutdown
sequence in stop() to first iterate over the client fds (fds) and call
shutdown(fd, SHUT_RDWR) (or at least shutdown for read) and then close(fd) to
unblock any blocking recv() in worker threads, and only after all sockets are
shutdown/closed call join() on each thread in workers (check worker.joinable()).
Reference symbols: stop(), workers, fds, recv(), shutdown(), close(),
worker.joinable().
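The shutdown-before-join ordering can be demonstrated with a socketpair standing in for an accepted client connection (POSIX only; this is a sketch, not the server's code). `shutdown()` wakes the blocked `recv()`, which then reports an orderly EOF, so the subsequent `join()` cannot hang:

```cpp
#include <cassert>
#include <sys/socket.h>
#include <sys/types.h>
#include <thread>
#include <unistd.h>

// Returns true if shutdown() unblocked the worker's recv() with EOF.
bool shutdown_then_join_demo() {
    int sv[2] = {-1, -1};
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) {
        return false;
    }

    ssize_t n = -1;
    std::thread worker([&] {
        char buf[16];
        n = recv(sv[0], buf, sizeof(buf), 0);  // blocks until shutdown
    });

    shutdown(sv[0], SHUT_RDWR);  // unblock recv() BEFORE joining
    worker.join();               // now guaranteed to return
    close(sv[0]);
    close(sv[1]);
    return n == 0;  // recv reports orderly EOF after shutdown
}
```

Reversing the order (join first, shutdown after) is exactly the hang the review describes.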
tests/cloudSQL_tests.cpp (1)

277-278: ⚠️ Potential issue | 🔴 Critical

Tests still call a removed TransactionManager constructor overload.

At Line 277, TransactionManager tm(lm, *catalog, sm); uses a 3-argument signature that no longer exists. This is compile-blocking and the same pattern repeats in later tests (e.g., Lines 309, 334, 367, 394, 422, 454, 484, 523, 567, 611, 669).

🛠️ Suggested fix pattern
-    TransactionManager tm(lm, *catalog, sm);
+    TransactionManager tm(lm);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/cloudSQL_tests.cpp` around lines 277 - 278, Tests call a removed
TransactionManager constructor overload; find every instantiation like
"TransactionManager tm(lm, *catalog, sm);" (also in the other failing test
spots) and update them to call the current TransactionManager constructor
signature declared in the TransactionManager class (open the TransactionManager
header to see required parameters) — pass the additional required argument(s) or
default/config object the new ctor needs so the tests compile (keep the
QueryExecutor usage "QueryExecutor exec(*catalog, sm, lm, tm);" unchanged).
♻️ Duplicate comments (7)
src/storage/buffer_pool_manager.cpp (2)

31-31: ⚠️ Potential issue | 🔴 Critical

Destructor still skips a final dirty-page flush.

At Line 31, default destruction bypasses flush_all_pages(), so dirty frames can be dropped on shutdown.

🛠️ Suggested fix
-BufferPoolManager::~BufferPoolManager() = default;
+BufferPoolManager::~BufferPoolManager() {
+    flush_all_pages();
+}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/storage/buffer_pool_manager.cpp` at line 31, The BufferPoolManager
destructor currently uses the default and skips flushing dirty pages; modify
BufferPoolManager::~BufferPoolManager() to explicitly call
this->flush_all_pages() (or flush_all_pages()) and handle/report errors as
appropriate before allowing destruction to proceed (ensure any necessary locks
are acquired/released consistent with the class's shutdown semantics) so all
dirty frames are written out prior to object teardown.
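A minimal RAII sketch of the destructor fix: instead of `= default`, the destructor runs the flush before teardown. The `flushed` flag is a stand-in observable for `flush_all_pages()` actually executing; the class is a toy, not the real BufferPoolManager:

```cpp
#include <cassert>

struct ToyBufferPool {
    bool* flushed;
    explicit ToyBufferPool(bool* f) : flushed(f) {}

    void flush_all_pages() { *flushed = true; }  // stand-in for writing dirty frames

    // Instead of "= default": guarantee the final flush on destruction.
    ~ToyBufferPool() { flush_all_pages(); }
};
```

In the real class the destructor body would also need to respect the latch/shutdown ordering, since flushing takes the pool's internal locks.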

117-130: ⚠️ Potential issue | 🔴 Critical

new_page still allocates on disk before frame reservation.

At Line 117, disk page allocation happens before free/victim frame acquisition. If Line 128 fails, nullptr is returned after allocating a page id, leaving an orphan page.

🛠️ Suggested fix
-    const uint32_t target_page_id = storage_manager_.allocate_page(file_name);
-    if (page_id != nullptr) {
-        // NOLINTNEXTLINE(cppcoreguidelines-pro-type-const-cast)
-        const_cast<uint32_t&>(*page_id) = target_page_id;
-    }
-    const std::string key = make_page_key(file_name, target_page_id);
-
     uint32_t frame_id = 0;
     if (!free_list_.empty()) {
         frame_id = free_list_.back();
         free_list_.pop_back();
     } else if (!replacer_.victim(&frame_id)) {
         return nullptr;
     }
+
+    const uint32_t target_page_id = storage_manager_.allocate_page(file_name);
+    if (page_id != nullptr) {
+        // NOLINTNEXTLINE(cppcoreguidelines-pro-type-const-cast)
+        const_cast<uint32_t&>(*page_id) = target_page_id;
+    }
+    const std::string key = make_page_key(file_name, target_page_id);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/storage/buffer_pool_manager.cpp` around lines 117 - 130, The function
allocates the disk page (storage_manager_.allocate_page) before reserving a
buffer frame, which can leave an orphan page if frame acquisition fails; change
the order so you first obtain a frame_id from free_list_ or replacer_.victim
(using free_list_, replacer_.victim and the local frame_id) and only call
storage_manager_.allocate_page(file_name) after a frame is successfully
reserved; then set *page_id, construct the key with make_page_key, and proceed —
if frame acquisition fails, return nullptr without calling
storage_manager_.allocate_page.
src/network/server.cpp (4)

338-341: ⚠️ Potential issue | 🔴 Critical

Query payload length is not validated before allocation/read.

At Line 339, len - 4 is used directly. If len < 4, this underflows and can trigger oversized allocation/reads. Line 340 also ignores short reads, and Line 341 assumes trusted NUL termination.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/network/server.cpp` around lines 338 - 341, Validate the incoming query
payload length before allocating/read: ensure 'len' is >= 4 and <= a defined
maximum payload size, return or close the connection on invalid lengths; then
allocate 'sql_buf' with that validated size, call recv_all(client_fd,
sql_buf.data(), payload_len) and check its return value equals payload_len
(handle short/failed reads), and construct the SQL string using the vector size
(e.g., std::string(sql_buf.data(), payload_len)) instead of relying on NUL
termination; update the handling around the 'type == 'Q'' branch and
functions/variables recv_all, client_fd, sql_buf, len, and std::string sql
accordingly to propagate errors.

368-374: ⚠️ Potential issue | 🟠 Major

Error responses are not PostgreSQL-wire compliant.

Lines 370-374 and 379-383 send raw error text only. ErrorResponse should include typed fields (e.g., severity/message/code) with proper null-terminated field framing.

Also applies to: 377-383

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/network/server.cpp` around lines 368 - 374, The current sends write raw
error text instead of a PostgreSQL ErrorResponse with typed, null-terminated
fields; replace the raw sends in the error-path (the block using err, e_len,
e_type and the send(...) calls) with a properly framed ErrorResponse: build a
payload string of one or more field entries each starting with the single-byte
field type (e.g., 'S' for severity, 'M' for message, 'C' for SQLSTATE code)
followed by the null-terminated text, end the payload with an additional '\0',
compute the packet length as htonl(static_cast<uint32_t>(payload.size() + 4)),
send the 'E' message type byte, send that length, then send the payload;
populate at minimum the 'M' field from res.error() and provide a sensible
severity/code (or map from res) so the receiver sees typed fields rather than
raw text.

301-314: ⚠️ Potential issue | 🔴 Critical

Startup/SSL handshake validation is still incomplete.

Line 301 accepts non-SSL startup codes implicitly, and Lines 312-314 read the second packet without validating len bounds or full read success. This leaves malformed-packet and buffer-safety gaps.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/network/server.cpp` around lines 301 - 314, The code currently only
special-cases the SSL request value (80877103) and then reads a subsequent
packet without validating the returned length or verifying recv_all succeeded;
update the ProtocolReader::read_int32 handling so that after detecting SSL
Request (constant 80877103) you explicitly validate the next recv_all return
value and the startup packet length (len) before using it: ensure len >=
HEADER_SIZE and len <= buffer.size(), check that recv_all(client_fd,
buffer.data() + HEADER_SIZE, len - HEADER_SIZE) returns exactly (len -
HEADER_SIZE), and on any failure close client_fd and abort processing; also
handle the non-SSL startup path explicitly (i.e., treat any other code as a
normal startup packet or reject unexpected codes) to avoid silently accepting
malformed packets.

353-367: ⚠️ Potential issue | 🟠 Major

Successful query responses are protocol-incomplete and mis-tagged.

Lines 354-358 skip RowDescription/DataRow, and Line 361 hardcodes "SELECT <n>" for all successful statements. That breaks protocol expectations for non-SELECT commands and row-bearing responses.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/network/server.cpp` around lines 353 - 367, The success branch in the
response writer (the res.success() block) omits proper RowDescription/DataRow
frames for queries that return rows and always emits a hardcoded CommandComplete
tag "SELECT <n>", which mislabels non-SELECT commands; update the logic in the
res.success() branch to: inspect res.rows() and if non-empty emit a correct
RowDescription (T) followed by DataRow (D) messages for each row using the
existing send(...) calls, otherwise build the CommandComplete tag dynamically
from the executed statement type (e.g., use res.command_type() or similar) and
row_count to produce "INSERT <n>", "UPDATE <n>", "DELETE <n>", or "SELECT <n>"
as appropriate, and then send the proper 'C' message (replace the current tag
variable generation) so clients receive protocol-complete responses.
tests/cloudSQL_tests.cpp (1)

345-357: ⚠️ Potential issue | 🟠 Major

GROUP BY assertion is nondeterministic without ordering.

At Line 346, query result order is unspecified, but Lines 355-357 assume row 0 is category A. This can flake.

🛠️ Suggested fix
-    auto lex =
-        std::make_unique<Lexer>("SELECT cat, COUNT(val), SUM(val) FROM agg_test GROUP BY cat");
+    auto lex = std::make_unique<Lexer>(
+        "SELECT cat, COUNT(val), SUM(val) FROM agg_test GROUP BY cat ORDER BY cat");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/cloudSQL_tests.cpp` around lines 345 - 357, The test assumes a
deterministic GROUP BY order but the SQL result is unordered; update the test to
make ordering deterministic by changing the parsed query passed to Lexer/Parser
in this block (the string used to construct Lexer for Parser::parse_statement)
to include an ORDER BY (e.g., ORDER BY cat) or alternatively sort/lookup rows
from exec.execute(*stmt) before asserting; reference the
Lexer/Parser.parse_statement instantiation and the exec.execute call and adjust
either the SQL string or the assertions to be order-independent so the expected
category 'A' check cannot flake.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Outside diff comments:
In `@src/network/server.cpp`:
- Around line 174-183: In stop(), joining worker threads before unblocking their
blocked recv() can hang; change the shutdown sequence in stop() to first iterate
over the client fds (fds) and call shutdown(fd, SHUT_RDWR) (or at least shutdown
for read) and then close(fd) to unblock any blocking recv() in worker threads,
and only after all sockets are shutdown/closed call join() on each thread in
workers (check worker.joinable()). Reference symbols: stop(), workers, fds,
recv(), shutdown(), close(), worker.joinable().


ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a2f9261 and abf46d8.

📒 Files selected for processing (9)
  • include/storage/storage_manager.hpp
  • include/transaction/transaction_manager.hpp
  • src/network/server.cpp
  • src/storage/buffer_pool_manager.cpp
  • src/storage/storage_manager.cpp
  • src/transaction/transaction_manager.cpp
  • tests/cloudSQL_tests.cpp
  • tests/lock_manager_tests.cpp
  • tests/server_tests.cpp
💤 Files with no reviewable changes (1)
  • tests/lock_manager_tests.cpp
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/storage/storage_manager.cpp
  • tests/server_tests.cpp

Comment on lines +83 to +87
/**
* @brief Deallocate a page (stub for future use)
*/
static void deallocate_page(const std::string& filename, uint32_t page_num);


⚠️ Potential issue | 🟠 Major

deallocate_page currently exposes a misleading contract.

At Line 86, deallocate_page is static void, while the current implementation is a stub/no-op (src/storage/storage_manager.cpp Lines 184-187). Callers cannot detect failure, and persistent pages are never actually reclaimed.

💡 Suggested API direction
-    static void deallocate_page(const std::string& filename, uint32_t page_num);
+    bool deallocate_page(const std::string& filename, uint32_t page_num);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
/**
* @brief Deallocate a page (stub for future use)
*/
static void deallocate_page(const std::string& filename, uint32_t page_num);
/**
* @brief Deallocate a page (stub for future use)
*/
static bool deallocate_page(const std::string& filename, uint32_t page_num);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/storage/storage_manager.hpp` around lines 83 - 87, The
deallocate_page declaration exposes a misleading void/no-op contract; change the
API and implementation so callers can detect failure and persistent pages are
actually reclaimed: update the declaration of deallocate_page in
storage_manager.hpp (and corresponding implementation in
src/storage/storage_manager.cpp) from static void deallocate_page(const
std::string& filename, uint32_t page_num) to return a status (e.g., bool or an
error code/Status type) and implement real deallocation logic that
removes/reclaims the page and returns success/failure; update all callers of
StorageManager::deallocate_page to check the returned status and handle/report
errors accordingly.

Comment on lines +61 to +66
page->page_id_ = page_id;
page->file_name_ = file_name;
page->pin_count_ = 1;
page->is_dirty_ = false;
storage_manager_.read_page(file_name, page_id, page->get_data());


⚠️ Potential issue | 🔴 Critical

Handle read_page failure before returning a fetched frame.

At Line 65, read_page(...) return value is ignored. If disk read fails, the method still returns a page pointer with freshly published metadata/page-table state, which can propagate corrupted behavior.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/storage/buffer_pool_manager.cpp` around lines 61 - 66, The disk read
return value from storage_manager_.read_page is ignored; update the FetchPage
path in BufferPoolManager to check the result of
storage_manager_.read_page(file_name, page_id, page->get_data()) and handle
failures by undoing the partial state (reset page->page_id_, page->file_name_,
page->pin_count_, page->is_dirty_ as appropriate), remove any page_table_ entry
pointing to the frame, release any locks held on the frame, and return nullptr
(or an error) instead of returning a live page pointer; ensure this cleanup
references the same frame pointer and the page_table_ and does not leak the
frame or leave inconsistent metadata.

Comment on lines 38 to 44
void TransactionManager::commit(Transaction* txn) {
if (txn == nullptr) {
return;
}

if (log_manager_ != nullptr) {
recovery::LogRecord log(txn->get_id(), txn->get_prev_lsn(),
recovery::LogRecordType::COMMIT);
static_cast<void>(log_manager_->append_log_record(log));
recovery::LogRecord record(txn->get_id(), txn->get_prev_lsn(),
recovery::LogRecordType::COMMIT);
const recovery::lsn_t lsn = log_manager_->append_log_record(record);
txn->set_prev_lsn(lsn);
log_manager_->flush(true);

⚠️ Potential issue | 🟠 Major

Guard commit/abort against null or non-active transactions.

txn is dereferenced without validation, and Lines 60/92 use active_transactions_[txn->get_id()], which default-inserts when missing. This can mask logic errors and move null entries into completed_transactions_.

🛠️ Suggested guard pattern
 void TransactionManager::commit(Transaction* txn) {
+    if (txn == nullptr) {
+        return;
+    }
+    std::unique_ptr<Transaction> finished_txn;
+    {
+        const std::scoped_lock<std::mutex> lock(manager_latch_);
+        const auto it = active_transactions_.find(txn->get_id());
+        if (it == active_transactions_.end()) {
+            return;
+        }
+        finished_txn = std::move(it->second);
+        active_transactions_.erase(it);
+    }
+    txn = finished_txn.get();
     ...
-    completed_transactions_.push_back(std::move(active_transactions_[txn->get_id()]));
-    static_cast<void>(active_transactions_.erase(txn->get_id()));
+    completed_transactions_.push_back(std::move(finished_txn));
 }

Also applies to: 58-61, 70-75, 90-93

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/transaction/transaction_manager.cpp` around lines 38 - 44, The
commit/abort paths dereference txn and use active_transactions_[txn->get_id()]
which can default-insert and accept null/non-active txns; add a guard at the
start of TransactionManager::commit and TransactionManager::abort that (1)
checks txn is non-null, (2) verifies the transaction id exists and is active by
using active_transactions_.find(txn->get_id()) (not operator[]) and confirming
the state, and (3) early-return or error-log/throw if the check fails; only then
proceed to use txn->get_id(), txn->get_prev_lsn(), txn->set_prev_lsn(), move
entries to completed_transactions_, and call
log_manager_->append_log_record/flush.


@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🧹 Nitpick comments (1)
include/transaction/transaction_manager.hpp (1)

75-76: Consider bounding or periodically pruning completed_transactions_.

The completed_transactions_ deque retains finished transactions indefinitely. In a long-running server, this could lead to unbounded memory growth. Consider adding a maximum size limit or a cleanup mechanism that prunes old entries.

💡 Example approach
// In commit() or abort(), after moving to completed_transactions_:
constexpr size_t MAX_COMPLETED = 1000;
if (completed_transactions_.size() > MAX_COMPLETED) {
    completed_transactions_.pop_front();
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@include/transaction/transaction_manager.hpp` around lines 75 - 76, The deque
completed_transactions_ currently retains finished Transaction objects forever;
add a bounded size or periodic pruning to avoid unbounded memory growth by
trimming old entries when transactions are moved into completed_transactions_.
Modify the places that push into completed_transactions_ (e.g., commit() and
abort() or whatever methods move a Transaction into completed_transactions_) to
enforce a cap like MAX_COMPLETED (or call a pruneCompletedTransactions() helper)
that pops from the front until size <= MAX_COMPLETED, or implement a periodic
cleanup routine that removes entries older than a TTL; ensure the logic
references completed_transactions_, Transaction, and the commit()/abort() code
paths so old transactions are dropped safely.

ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between abf46d8 and ba8d181.

📒 Files selected for processing (3)
  • include/transaction/transaction_manager.hpp
  • src/network/server.cpp
  • tests/cloudSQL_tests.cpp

Comment on lines +754 to +758
/* Duplicate index */
const oid_t tid = catalog->create_table("idx_fail", cols);
static_cast<void>(catalog->create_index("my_idx", tid, {0}, IndexType::BTree, true));
EXPECT_THROW(catalog->create_index("my_idx", tid, {0}, IndexType::BTree, true),
std::exception);

⚠️ Potential issue | 🟡 Minor

Pipeline failure: uncommitted changes detected around EXPECT_THROW usage.

The CI reports git diff --exit-code failed, indicating uncommitted modifications in this file at or near the EXPECT_THROW statements. Ensure all changes are committed before merging.

🧰 Tools
🪛 GitHub Actions: cloudSQL CI

[error] 754-756: Command 'git diff --exit-code' returned exit code 1, indicating uncommitted changes in tests/cloudSQL_tests.cpp (diff shows updated EXPECT_THROW usage).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/cloudSQL_tests.cpp` around lines 754 - 758, There are uncommitted edits
around the EXPECT_THROW assertion for duplicate index creation in the test using
catalog->create_index and the const oid_t tid variable; ensure you stage and
commit the intended changes to tests/cloudSQL_tests.cpp containing the
EXPECT_THROW(catalog->create_index(...)) block so CI no longer fails on git diff
--exit-code, or revert unintended local edits before pushing; verify the test
uses the correct catalog->create_index signature and that the duplicate-index
assertion remains committed.


@coderabbitai coderabbitai bot left a comment


♻️ Duplicate comments (1)
tests/cloudSQL_tests.cpp (1)

347-359: ⚠️ Potential issue | 🟠 Major

Make this aggregate assertion deterministic.

This still depends on implicit row order (GROUP BY without ordering), so rows()[0] == "A" can flake.

✅ Suggested fix
-    auto lex =
-        std::make_unique<Lexer>("SELECT cat, COUNT(val), SUM(val) FROM agg_test GROUP BY cat");
+    auto lex = std::make_unique<Lexer>(
+        "SELECT cat, COUNT(val), SUM(val) FROM agg_test GROUP BY cat ORDER BY cat");
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/cloudSQL_tests.cpp` around lines 347 - 359, The test assumes a
deterministic row order from "SELECT cat, COUNT(val), SUM(val) FROM agg_test
GROUP BY cat"; change the test to make ordering deterministic by adding an
explicit ORDER BY to the parsed SQL (e.g., "… GROUP BY cat ORDER BY cat") or,
alternatively, keep the current query but change the assertions after
exec.execute(*stmt) to locate the row for cat == "A" (scan res.rows() for the
row where get(0).to_string() == "A") and then assert its COUNT and SUM values;
update references around Lexer/Parser usage and the checks on res.rows()
accordingly.
🧹 Nitpick comments (2)
tests/cloudSQL_tests.cpp (2)

431-446: Strengthen transaction test with a pre-commit visibility check.

The test verifies visibility after COMMIT, but not invisibility before COMMIT; adding that assertion makes isolation validation explicit.

🧪 Suggested assertion addition
     static_cast<void>(qexec1.execute(*Parser(std::make_unique<Lexer>("BEGIN")).parse_statement()));
     static_cast<void>(
         qexec1.execute(*Parser(std::make_unique<Lexer>("INSERT INTO txn_test VALUES (1, 100)"))
                             .parse_statement()));

     QueryExecutor qexec2(*catalog, sm, lm, tm);
+
+    const auto res_precommit =
+        qexec2.execute(*Parser(std::make_unique<Lexer>("SELECT val FROM txn_test WHERE id = 1"))
+                            .parse_statement());
+    EXPECT_EQ(res_precommit.row_count(), 0U);

     const auto res_commit =
         qexec1.execute(*Parser(std::make_unique<Lexer>("COMMIT")).parse_statement());
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/cloudSQL_tests.cpp` around lines 431 - 446, Add a pre-commit visibility
assertion to ensure the uncommitted INSERT by qexec1 is not visible to qexec2:
after executing the INSERT with qexec1 (the Parser/Lexer call that inserts into
txn_test) run qexec2.execute with the same SELECT statement used later and
assert that its row_count() is 0 (or that the select result indicates no rows)
before performing the COMMIT and checking post-commit visibility; update the
test around QueryExecutor qexec1/qexec2, the insert execute call, and the SELECT
execute call to include this additional assertion.

185-208: Harden test cleanup against early assertion exits.

Cleanup runs only at the end of the test body; if an assertion fails earlier, files are left behind and can contaminate later tests.

♻️ Suggested refactor (RAII cleanup)
 TEST(CloudSQLTests, StoragePersistence) {
     const std::string filename = "persist_test";
     const std::string filepath = "./test_data/" + filename + ".heap";
-    static_cast<void>(std::remove(filepath.c_str()));
+    struct ScopedFileDelete {
+        explicit ScopedFileDelete(std::string p) : path(std::move(p)) {}
+        ~ScopedFileDelete() { static_cast<void>(std::remove(path.c_str())); }
+        std::string path;
+    } cleanup(filepath);
+    static_cast<void>(std::remove(filepath.c_str()));  // pre-test cleanup
@@
-    static_cast<void>(std::remove(filepath.c_str()));
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/cloudSQL_tests.cpp` around lines 185-208, the test currently only
removes the temp file at the end of the TEST body (using std::remove on
filepath), so an early assertion failure can leave files behind. Fix this by
introducing RAII cleanup: create and instantiate a small helper (e.g.,
FileRemover) at the top of TEST(CloudSQLTests, StoragePersistence) that takes
filepath (constructed from filename) and removes the file in its destructor, or
convert the test into a test fixture with SetUp/TearDown that removes filepath.
Then remove the two inline std::remove calls and rely on the RAII/fixture to
always delete the file even if assertions fail.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Duplicate comments:
In `@tests/cloudSQL_tests.cpp`:
- Around lines 347-359: The test assumes a deterministic row order from "SELECT
cat, COUNT(val), SUM(val) FROM agg_test GROUP BY cat". Make the ordering
deterministic by adding an explicit ORDER BY to the parsed SQL (e.g.,
"… GROUP BY cat ORDER BY cat") or, alternatively, keep the current query but
change the assertions after exec.execute(*stmt) to locate the row for cat == "A"
(scan res.rows() for the row where get(0).to_string() == "A") and then assert
its COUNT and SUM values; update references around Lexer/Parser usage and the
checks on res.rows() accordingly.


ℹ️ Review info

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ba8d181 and 17801ab.

📒 Files selected for processing (1)
  • tests/cloudSQL_tests.cpp
