Skip to content

test: add unit tests for operators, storage, and distributed#23

Merged
poyrazK merged 8 commits intomainfrom
test/coverage-add-operators
Apr 11, 2026
Merged

test: add unit tests for operators, storage, and distributed#23
poyrazK merged 8 commits intomainfrom
test/coverage-add-operators

Conversation

@poyrazK
Copy link
Copy Markdown
Owner

@poyrazK poyrazK commented Apr 11, 2026

Summary

Add comprehensive unit tests for previously untested components:

Test File Tests Description
raft_manager_tests.cpp 8 Multi-Raft group management
raft_protocol_tests.cpp 8 RaftGroup protocol implementation
columnar_table_tests.cpp 11 Columnar storage lifecycle
storage_manager_tests.cpp 16 Low-level storage I/O
rpc_server_tests.cpp 12 RPC server lifecycle and handlers
operator_tests.cpp 20 Volcano-style execution operators

Test Plan

  • All 23 tests pass locally (ctest --output-on-failure)
  • CI passes on GitHub Actions

Summary by CodeRabbit

  • Tests

    • Added comprehensive unit test suites covering Raft consensus management, Raft protocol behavior, columnar table storage, query execution operators, RPC server communication, and persistent storage management. Tests validate lifecycle operations, data handling, replication, and edge case scenarios.
  • Chores

    • Updated build configuration to register new test targets.

@coderabbitai
Copy link
Copy Markdown

coderabbitai bot commented Apr 11, 2026

Warning

Rate limit exceeded

@poyrazK has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 20 minutes and 2 seconds before requesting another review.

Your organization is not enrolled in usage-based pricing. Contact your admin to enable usage-based pricing to continue reviews beyond the rate limit, or try again in 20 minutes and 2 seconds.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.

🚦 How do rate limits work?

CodeRabbit enforces hourly rate limits for each developer per organization.

Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout.

Please see our FAQ for further information.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c60e2490-2706-48e0-a411-351a0c4a9b09

📥 Commits

Reviewing files that changed from the base of the PR and between 3e8117a and 4dd759a.

📒 Files selected for processing (2)
  • tests/raft_manager_tests.cpp
  • tests/storage_manager_tests.cpp
📝 Walkthrough

Walkthrough

This pull request adds six new GoogleTest test files covering Raft replication management, columnar storage operations, RPC server functionality, and query execution operators, along with CMake configuration updates to register these test executables.

Changes

Cohort / File(s) Summary
CMake Test Configuration
CMakeLists.txt
Added six add_cloudsql_test(...) entries to register new test executables: raft_manager_tests, raft_protocol_tests, columnar_table_tests, storage_manager_tests, rpc_server_tests, and operator_tests.
Raft System Tests
tests/raft_manager_tests.cpp, tests/raft_protocol_tests.cpp
Introduced unit tests for RaftManager multi-group management (group creation, retrieval, lifecycle) and RaftGroup replication protocol (leadership establishment, state machine application, group state persistence).
Storage & Data Tests
tests/columnar_table_tests.cpp, tests/storage_manager_tests.cpp
Added test coverage for columnar table creation, appending VectorBatch instances, reading persisted data across data types and null handling; and storage manager file operations including page-level read/write, file allocation, and directory management.
Network & Execution Tests
tests/rpc_server_tests.cpp, tests/operator_tests.cpp
Introduced RPC server lifecycle and handler management tests (repeated start/stop, handler registration, concurrent client handling); and Volcano-style operator tests covering BufferScan, Limit, Filter, Project, Sort, Aggregate, HashJoin, and multi-operator pipelines.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Suggested labels

enhancement

Poem

🐰 Tests sprouted like carrots in spring—
Six new suites to make our SQL sing!
Raft groups, storage, and operators aligned,
Network and schemas in harmonious mind.
Each test a burrow of confidence deep! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 12.50% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The title accurately describes the main objective of the PR: adding comprehensive unit tests for operators, storage, and distributed components across six new test files.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch test/coverage-add-operators

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link
Copy Markdown

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

🧹 Nitpick comments (2)
tests/operator_tests.cpp (1)

578-588: OperatorTypeEnum test is tautological.

Every assertion compares a symbol to itself, so this test will never catch regressions. Validate concrete values (or serialized mappings) instead.

Suggested fix
 TEST_F(OperatorTests, OperatorTypeEnum) {
-    EXPECT_EQ(OperatorType::SeqScan, OperatorType::SeqScan);
-    EXPECT_EQ(OperatorType::IndexScan, OperatorType::IndexScan);
-    EXPECT_EQ(OperatorType::Filter, OperatorType::Filter);
-    EXPECT_EQ(OperatorType::Project, OperatorType::Project);
-    EXPECT_EQ(OperatorType::HashJoin, OperatorType::HashJoin);
-    EXPECT_EQ(OperatorType::Sort, OperatorType::Sort);
-    EXPECT_EQ(OperatorType::Aggregate, OperatorType::Aggregate);
-    EXPECT_EQ(OperatorType::Limit, OperatorType::Limit);
-    EXPECT_EQ(OperatorType::BufferScan, OperatorType::BufferScan);
+    EXPECT_EQ(static_cast<uint8_t>(OperatorType::SeqScan), 0);
+    EXPECT_EQ(static_cast<uint8_t>(OperatorType::IndexScan), 1);
+    EXPECT_EQ(static_cast<uint8_t>(OperatorType::Filter), 2);
+    EXPECT_EQ(static_cast<uint8_t>(OperatorType::Project), 3);
+    EXPECT_EQ(static_cast<uint8_t>(OperatorType::HashJoin), 4);
+    EXPECT_EQ(static_cast<uint8_t>(OperatorType::Sort), 5);
+    EXPECT_EQ(static_cast<uint8_t>(OperatorType::Aggregate), 6);
+    EXPECT_EQ(static_cast<uint8_t>(OperatorType::Limit), 7);
+    EXPECT_EQ(static_cast<uint8_t>(OperatorType::BufferScan), 8);
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/operator_tests.cpp` around lines 578 - 588, The test OperatorTypeEnum
is tautological because it compares each OperatorType symbol to itself; update
TEST_F(OperatorTests, OperatorTypeEnum) to assert against concrete
representations instead: either compare each enum to its expected underlying
integer value (cast to int) or validate the enum-to-string/serialization mapping
(e.g., using any existing toString/OperatorTypeToString or operator<< for
OperatorType) so the test fails on accidental reordering or remapping of enum
values; locate the enum OperatorType and the test in tests/operator_tests.cpp
and replace the self-equalities with assertions that check the actual expected
int or string for each symbol.
tests/raft_protocol_tests.cpp (1)

90-103: Several tests don’t assert observable behavior.

StatePersistence, LoadStateNonExistent, StopWithoutStart, and SetStateMachine currently pass even if behavior regresses, as long as they don’t crash.

Also applies to: 105-108, 124-127, 129-133

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/raft_protocol_tests.cpp` around lines 90 - 103, Tests like
StatePersistence, LoadStateNonExistent, StopWithoutStart, and SetStateMachine
currently only check for absence of crashes; add explicit assertions to verify
observable behavior: for StatePersistence assert that after creating
local_group, writing state (via RaftGroup::apply or similar) and stopping, the
loaded_group reads back the persisted state (use
RaftGroup::get_last_applied_index, get_state, or the state machine's persisted
value); for LoadStateNonExistent assert that constructing a group for a
non-existent id yields a defined initial state (e.g., last_applied == 0 or state
== Follower); for StopWithoutStart assert that calling stop() without start()
leaves the group's state unchanged or sets an explicit stopped flag (assert via
RaftGroup::is_running or similar); for SetStateMachine assert that
set_state_machine correctly registers the machine by invoking a known transition
or reading state from the state machine (use state_machine_->last_committed or
get_state()); update tests at the referenced ranges (90-103, 105-108, 124-127,
129-133) to include these checks and use RaftGroup methods (start, stop,
set_state_machine, apply, get_last_applied_index, is_running, get_state) and the
state_machine_ helper to validate concrete outcomes rather than only absence of
crashes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/columnar_table_tests.cpp`:
- Around line 238-246: The test OpenWithoutCreate can fail due to leftover files
for the table name "col_test_missing"; before constructing ColumnarTable in this
test, delete any existing artifacts for that table (e.g., call
std::filesystem::remove_all or your test harness's cleanup helper for the table
name) so ColumnarTable(name, *sm_, schema) runs against a clean state; reference
the test function OpenWithoutCreate and the local variable name
("col_test_missing") and perform the cleanup prior to creating the
Schema/ColumnarTable instance.

In `@tests/raft_manager_tests.cpp`:
- Around line 33-35: The test fixture should fail fast if RpcServer::start()
fails: after creating rpc_ with std::make_unique<RpcServer>(TEST_PORT) call
RpcServer::start(), check its return status or ensure it started successfully
(e.g., assert or throw) inside SetUp() before constructing manager_; if start()
returns a bool or throws, use EXPECT_TRUE/ASSERT_TRUE or catch and FAIL() so
that manager_ = std::make_unique<RaftManager>("node1", *cm_, *rpc_); is only
reached when rpc_ is running.

In `@tests/raft_protocol_tests.cpp`:
- Around line 29-34: The MockStateMachine's applied_entries_ vector is accessed
concurrently (apply() on Raft worker threads and tests reading
applied_entries_), causing a data race; make applied_entries_ thread-safe by
adding a std::mutex member to MockStateMachine, lock it inside apply() when
mutating applied_entries_, and provide a const-safe accessor method (e.g.,
size() or entries()) that locks the mutex before reading so tests call that
instead of accessing applied_entries_ directly; update tests to use the new
accessor(s) rather than reading applied_entries_ directly.
- Around line 43-45: Add a failing assertion immediately after calling
rpc_->start() to ensure the RPC server was started successfully (e.g., assert
that rpc_->start() succeeded or did not throw); place this check directly after
the rpc_->start() call in the fixture setup that constructs rpc_ (references:
rpc_, RpcServer, start(), TEST_PORT, state_machine_) so the test fails fast
instead of continuing with an invalid fixture.

In `@tests/rpc_server_tests.cpp`:
- Around line 84-93: The test HandlerOverride currently only checks that a
handler was retrieved; modify it to invoke the retrieved handler and assert it
reflects the second override: after calling
server_->set_handler(RpcType::Heartbeat, handler1) and then
server_->set_handler(RpcType::Heartbeat, handler2), obtain retrieved =
server_->get_handler(RpcType::Heartbeat) and call retrieved(...) with
appropriate dummy RpcHeader and payload, then assert call_count equals the
expected value produced by handler2 (i.e., increment by 10) to confirm the
override took effect.
- Around line 145-181: The test races on call_count because server callbacks may
run concurrently; change the plain int call_count to a thread-safe counter
(e.g., std::atomic<int> call_count{0}) and update the handler lambda (used in
server_->set_handler for RpcType::Heartbeat) to increment it atomically
(call_count.fetch_add(1, std::memory_order_relaxed) or ++call_count). Add the
necessary include (<atomic>) and keep the rest of the test logic the same so
EXPECT_EQ reads the atomic's value.
- Around line 107-137: The test races on the plain bool named called used across
threads in the heartbeat handler and the polling loop; change its type to
std::atomic<bool> (e.g., std::atomic<bool> called{false}) and `#include` <atomic>,
keep the lambda capture [&called] but ensure you use called.store(true) in the
handler and called.load() in the polling loop (or rely on implicit atomic bool
operators), so reads/writes are atomic and the race is eliminated.

In `@tests/storage_manager_tests.cpp`:
- Around line 113-116: The test calls write_page(filename, 0, write_buf) with
write_buf sized 512 which is smaller than StorageManager::PAGE_SIZE, risking a
buffer overflow; fix by allocating a buffer with size StorageManager::PAGE_SIZE
(or use a std::vector<uint8_t> of that size) and initialize it (e.g., std::fill)
before passing it to StorageManager::write_page so the full page is backed by
valid memory; update the variable (write_buf) and its initialization accordingly
in the tests/storage_manager_tests.cpp test that invokes write_page.

---

Nitpick comments:
In `@tests/operator_tests.cpp`:
- Around line 578-588: The test OperatorTypeEnum is tautological because it
compares each OperatorType symbol to itself; update TEST_F(OperatorTests,
OperatorTypeEnum) to assert against concrete representations instead: either
compare each enum to its expected underlying integer value (cast to int) or
validate the enum-to-string/serialization mapping (e.g., using any existing
toString/OperatorTypeToString or operator<< for OperatorType) so the test fails
on accidental reordering or remapping of enum values; locate the enum
OperatorType and the test in tests/operator_tests.cpp and replace the
self-equalities with assertions that check the actual expected int or string for
each symbol.

In `@tests/raft_protocol_tests.cpp`:
- Around line 90-103: Tests like StatePersistence, LoadStateNonExistent,
StopWithoutStart, and SetStateMachine currently only check for absence of
crashes; add explicit assertions to verify observable behavior: for
StatePersistence assert that after creating local_group, writing state (via
RaftGroup::apply or similar) and stopping, the loaded_group reads back the
persisted state (use RaftGroup::get_last_applied_index, get_state, or the state
machine's persisted value); for LoadStateNonExistent assert that constructing a
group for a non-existent id yields a defined initial state (e.g., last_applied
== 0 or state == Follower); for StopWithoutStart assert that calling stop()
without start() leaves the group's state unchanged or sets an explicit stopped
flag (assert via RaftGroup::is_running or similar); for SetStateMachine assert
that set_state_machine correctly registers the machine by invoking a known
transition or reading state from the state machine (use
state_machine_->last_committed or get_state()); update tests at the referenced
ranges (90-103, 105-108, 124-127, 129-133) to include these checks and use
RaftGroup methods (start, stop, set_state_machine, apply,
get_last_applied_index, is_running, get_state) and the state_machine_ helper to
validate concrete outcomes rather than only absence of crashes.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 0cab81d2-1ef3-4692-94af-ae870109134d

📥 Commits

Reviewing files that changed from the base of the PR and between 98a22b5 and 3e8117a.

📒 Files selected for processing (7)
  • CMakeLists.txt
  • tests/columnar_table_tests.cpp
  • tests/operator_tests.cpp
  • tests/raft_manager_tests.cpp
  • tests/raft_protocol_tests.cpp
  • tests/rpc_server_tests.cpp
  • tests/storage_manager_tests.cpp

Comment on lines +238 to +246
TEST_F(ColumnarTableTests, OpenWithoutCreate) {
const std::string name = "col_test_missing";

Schema schema;
schema.add_column("id", common::ValueType::TYPE_INT64);

ColumnarTable table(name, *sm_, schema);
ASSERT_FALSE(table.open());
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

OpenWithoutCreate can be flaky due to stale files.

This test doesn’t remove prior artifacts for col_test_missing, so Line 245 can intermittently fail on reruns.

Suggested fix
 TEST_F(ColumnarTableTests, OpenWithoutCreate) {
     const std::string name = "col_test_missing";
+    cleanup_table(name);

     Schema schema;
     schema.add_column("id", common::ValueType::TYPE_INT64);

     ColumnarTable table(name, *sm_, schema);
     ASSERT_FALSE(table.open());
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
TEST_F(ColumnarTableTests, OpenWithoutCreate) {
const std::string name = "col_test_missing";
Schema schema;
schema.add_column("id", common::ValueType::TYPE_INT64);
ColumnarTable table(name, *sm_, schema);
ASSERT_FALSE(table.open());
}
TEST_F(ColumnarTableTests, OpenWithoutCreate) {
const std::string name = "col_test_missing";
cleanup_table(name);
Schema schema;
schema.add_column("id", common::ValueType::TYPE_INT64);
ColumnarTable table(name, *sm_, schema);
ASSERT_FALSE(table.open());
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/columnar_table_tests.cpp` around lines 238 - 246, The test
OpenWithoutCreate can fail due to leftover files for the table name
"col_test_missing"; before constructing ColumnarTable in this test, delete any
existing artifacts for that table (e.g., call std::filesystem::remove_all or
your test harness's cleanup helper for the table name) so ColumnarTable(name,
*sm_, schema) runs against a clean state; reference the test function
OpenWithoutCreate and the local variable name ("col_test_missing") and perform
the cleanup prior to creating the Schema/ColumnarTable instance.

Comment on lines +29 to +34
class MockStateMachine : public RaftStateMachine {
public:
void apply(const LogEntry& entry) override { applied_entries_.push_back(entry); }

std::vector<LogEntry> applied_entries_;
};
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

MockStateMachine::applied_entries_ is not thread-safe.

apply() may run on Raft worker threads while tests read applied_entries_.size() (Line 87), which creates a race on std::vector.

Suggested fix
 class MockStateMachine : public RaftStateMachine {
    public:
-    void apply(const LogEntry& entry) override { applied_entries_.push_back(entry); }
-
-    std::vector<LogEntry> applied_entries_;
+    void apply(const LogEntry& entry) override {
+        std::lock_guard<std::mutex> lock(mu_);
+        applied_entries_.push_back(entry);
+    }
+    size_t applied_count() const {
+        std::lock_guard<std::mutex> lock(mu_);
+        return applied_entries_.size();
+    }
+
+   private:
+    mutable std::mutex mu_;
+    std::vector<LogEntry> applied_entries_;
 };
-    EXPECT_EQ(state_machine_->applied_entries_.size(), 0);
+    EXPECT_EQ(state_machine_->applied_count(), 0U);

Also applies to: 87-87

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/raft_protocol_tests.cpp` around lines 29 - 34, The MockStateMachine's
applied_entries_ vector is accessed concurrently (apply() on Raft worker threads
and tests reading applied_entries_), causing a data race; make applied_entries_
thread-safe by adding a std::mutex member to MockStateMachine, lock it inside
apply() when mutating applied_entries_, and provide a const-safe accessor method
(e.g., size() or entries()) that locks the mutex before reading so tests call
that instead of accessing applied_entries_ directly; update tests to use the new
accessor(s) rather than reading applied_entries_ directly.

Comment on lines +43 to +45
rpc_ = std::make_unique<RpcServer>(TEST_PORT);
rpc_->start();
state_machine_ = std::make_unique<MockStateMachine>();
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Fail fast if RPC server setup fails.

Add an assertion on Line 44 to avoid cascading failures from invalid fixture setup.

Suggested fix
         rpc_ = std::make_unique<RpcServer>(TEST_PORT);
-        rpc_->start();
+        ASSERT_TRUE(rpc_->start());
         state_machine_ = std::make_unique<MockStateMachine>();
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/raft_protocol_tests.cpp` around lines 43 - 45, Add a failing assertion
immediately after calling rpc_->start() to ensure the RPC server was started
successfully (e.g., assert that rpc_->start() succeeded or did not throw); place
this check directly after the rpc_->start() call in the fixture setup that
constructs rpc_ (references: rpc_, RpcServer, start(), TEST_PORT,
state_machine_) so the test fails fast instead of continuing with an invalid
fixture.

Comment on lines +84 to +93
TEST_F(RpcServerTests, HandlerOverride) {
int call_count = 0;
auto handler1 = [&](const RpcHeader&, const std::vector<uint8_t>&, int) { call_count++; };
auto handler2 = [&](const RpcHeader&, const std::vector<uint8_t>&, int) { call_count += 10; };

server_->set_handler(RpcType::Heartbeat, handler1);
server_->set_handler(RpcType::Heartbeat, handler2);
auto retrieved = server_->get_handler(RpcType::Heartbeat);
ASSERT_NE(retrieved, nullptr);
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

HandlerOverride doesn’t verify override behavior yet.

It only checks non-null retrieval. Trigger the returned handler and assert it reflects handler2.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/rpc_server_tests.cpp` around lines 84 - 93, The test HandlerOverride
currently only checks that a handler was retrieved; modify it to invoke the
retrieved handler and assert it reflects the second override: after calling
server_->set_handler(RpcType::Heartbeat, handler1) and then
server_->set_handler(RpcType::Heartbeat, handler2), obtain retrieved =
server_->get_handler(RpcType::Heartbeat) and call retrieved(...) with
appropriate dummy RpcHeader and payload, then assert call_count equals the
expected value produced by handler2 (i.e., increment by 10) to confirm the
override took effect.

Comment on lines +107 to +137
bool called = false;
server_->set_handler(RpcType::Heartbeat,
[&called](const RpcHeader& h, const std::vector<uint8_t>& p, int fd) {
called = true;
EXPECT_EQ(p.size(), 0U);
});

// Connect and send RPC with zero payload
int fd = socket(AF_INET, SOCK_STREAM, 0);
ASSERT_GE(fd, 0);

sockaddr_in addr{};
addr.sin_family = AF_INET;
addr.sin_port = htons(port_);
inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);

ASSERT_EQ(connect(fd, (sockaddr*)&addr, sizeof(addr)), 0);

// Send header
RpcHeader hdr;
hdr.type = RpcType::Heartbeat;
hdr.payload_len = 0;
char h_buf[RpcHeader::HEADER_SIZE];
hdr.encode(h_buf);
send(fd, h_buf, RpcHeader::HEADER_SIZE, 0);

// Give time for the server to process and call the handler
for (int i = 0; i < 10 && !called; ++i) {
std::this_thread::sleep_for(std::chrono::milliseconds(50));
}
EXPECT_TRUE(called);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

called is raced across threads in ZeroPayloadHandler.

The handler writes called while the test thread reads it in the polling loop; use std::atomic<bool>.

Suggested fix
-    bool called = false;
+    std::atomic<bool> called{false};
     server_->set_handler(RpcType::Heartbeat,
                          [&called](const RpcHeader& h, const std::vector<uint8_t>& p, int fd) {
-                             called = true;
+                             called.store(true, std::memory_order_relaxed);
                              EXPECT_EQ(p.size(), 0U);
                          });
@@
-    for (int i = 0; i < 10 && !called; ++i) {
+    for (int i = 0; i < 10 && !called.load(std::memory_order_relaxed); ++i) {
         std::this_thread::sleep_for(std::chrono::milliseconds(50));
     }
-    EXPECT_TRUE(called);
+    EXPECT_TRUE(called.load(std::memory_order_relaxed));
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
bool called = false;
server_->set_handler(RpcType::Heartbeat,
[&called](const RpcHeader& h, const std::vector<uint8_t>& p, int fd) {
called = true;
EXPECT_EQ(p.size(), 0U);
});
// Connect and send RPC with zero payload
int fd = socket(AF_INET, SOCK_STREAM, 0);
ASSERT_GE(fd, 0);
sockaddr_in addr{};
addr.sin_family = AF_INET;
addr.sin_port = htons(port_);
inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);
ASSERT_EQ(connect(fd, (sockaddr*)&addr, sizeof(addr)), 0);
// Send header
RpcHeader hdr;
hdr.type = RpcType::Heartbeat;
hdr.payload_len = 0;
char h_buf[RpcHeader::HEADER_SIZE];
hdr.encode(h_buf);
send(fd, h_buf, RpcHeader::HEADER_SIZE, 0);
// Give time for the server to process and call the handler
for (int i = 0; i < 10 && !called; ++i) {
std::this_thread::sleep_for(std::chrono::milliseconds(50));
}
EXPECT_TRUE(called);
std::atomic<bool> called{false};
server_->set_handler(RpcType::Heartbeat,
[&called](const RpcHeader& h, const std::vector<uint8_t>& p, int fd) {
called.store(true, std::memory_order_relaxed);
EXPECT_EQ(p.size(), 0U);
});
// Connect and send RPC with zero payload
int fd = socket(AF_INET, SOCK_STREAM, 0);
ASSERT_GE(fd, 0);
sockaddr_in addr{};
addr.sin_family = AF_INET;
addr.sin_port = htons(port_);
inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);
ASSERT_EQ(connect(fd, (sockaddr*)&addr, sizeof(addr)), 0);
// Send header
RpcHeader hdr;
hdr.type = RpcType::Heartbeat;
hdr.payload_len = 0;
char h_buf[RpcHeader::HEADER_SIZE];
hdr.encode(h_buf);
send(fd, h_buf, RpcHeader::HEADER_SIZE, 0);
// Give time for the server to process and call the handler
for (int i = 0; i < 10 && !called.load(std::memory_order_relaxed); ++i) {
std::this_thread::sleep_for(std::chrono::milliseconds(50));
}
EXPECT_TRUE(called.load(std::memory_order_relaxed));
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/rpc_server_tests.cpp` around lines 107 - 137, The test races on the
plain bool named called used across threads in the heartbeat handler and the
polling loop; change its type to std::atomic<bool> (e.g., std::atomic<bool>
called{false}) and `#include` <atomic>, keep the lambda capture [&called] but
ensure you use called.store(true) in the handler and called.load() in the
polling loop (or rely on implicit atomic bool operators), so reads/writes are
atomic and the race is eliminated.

Comment on lines +145 to +181
int call_count = 0;
server_->set_handler(
RpcType::Heartbeat,
[&call_count](const RpcHeader&, const std::vector<uint8_t>&, int) { call_count++; });

std::vector<int> fds;
for (int i = 0; i < 5; ++i) {
int fd = socket(AF_INET, SOCK_STREAM, 0);
sockaddr_in addr{};
addr.sin_family = AF_INET;
addr.sin_port = htons(port_);
inet_pton(AF_INET, "127.0.0.1", &addr.sin_addr);
if (connect(fd, (sockaddr*)&addr, sizeof(addr)) == 0) {
fds.push_back(fd);
}
}

// Send RPCs
for (int fd : fds) {
RpcHeader hdr;
hdr.type = RpcType::Heartbeat;
hdr.payload_len = 0;
char h_buf[RpcHeader::HEADER_SIZE];
hdr.encode(h_buf);
send(fd, h_buf, RpcHeader::HEADER_SIZE, 0);
}

// Give time for the server to process all 5
for (int i = 0; i < 20 && call_count < 5; ++i) {
std::this_thread::sleep_for(std::chrono::milliseconds(50));
}

for (int fd : fds) {
close(fd);
}

EXPECT_EQ(call_count, 5);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

call_count is raced in MultipleConnections.

Callbacks may run concurrently, so incrementing a plain int is undefined behavior.

Suggested fix
-    int call_count = 0;
+    std::atomic<int> call_count{0};
     server_->set_handler(
         RpcType::Heartbeat,
-        [&call_count](const RpcHeader&, const std::vector<uint8_t>&, int) { call_count++; });
+        [&call_count](const RpcHeader&, const std::vector<uint8_t>&, int) {
+            call_count.fetch_add(1, std::memory_order_relaxed);
+        });
@@
-    for (int i = 0; i < 20 && call_count < 5; ++i) {
+    for (int i = 0; i < 20 && call_count.load(std::memory_order_relaxed) < 5; ++i) {
         std::this_thread::sleep_for(std::chrono::milliseconds(50));
     }
@@
-    EXPECT_EQ(call_count, 5);
+    EXPECT_EQ(call_count.load(std::memory_order_relaxed), 5);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/rpc_server_tests.cpp` around lines 145 - 181, The test races on
call_count because server callbacks may run concurrently; change the plain int
call_count to a thread-safe counter (e.g., std::atomic<int> call_count{0}) and
update the handler lambda (used in server_->set_handler for RpcType::Heartbeat)
to increment it atomically (call_count.fetch_add(1, std::memory_order_relaxed)
or ++call_count). Add the necessary include (<atomic>) and keep the rest of the
test logic the same so EXPECT_EQ reads the atomic's value.

- raft_manager_tests: Add ASSERT_TRUE for RpcServer::start() in SetUp()
- storage_manager_tests: Fix buffer overflow by using PAGE_SIZE instead of 512
@poyrazK poyrazK merged commit 267a75a into main Apr 11, 2026
9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant