fix(server): different graph name share the same backend#3027
Conversation
imbajin
left a comment
There was a problem hiding this comment.
修复方向正确,用图命名空间前缀隔离 server ID 能解决跨图 key 碰撞问题。但存在一个阻塞性逻辑缺陷 + 测试缺失,建议修复后再合并。
PR 勾选了"已有测试覆盖",但全量搜索 hugegraph-server/hugegraph-test/ 后,没有任何测试覆盖"两图共享同一 backend + 相同 nodeId"的场景。
建议补充回归测试,至少验证:
~sys_graph和hugegraph使用相同nodeId(如"server-1")初始化时,selfNodeId()返回值互不相同- 两个图能并发启动,不抛出 parse error 或
"already in cluster"异常
存量部署 backend 中的 ServerInfo vertex key 格式为 server-1;升级后代码改为以 DEFAULT-hugegraph/server-1 查询,旧记录查不到便重新写入新记录,旧的 server-1 记录以孤儿形式留在 backend 中不被清理。对功能影响有限,但建议在 PR 描述或 Release Notes 中注明,避免运维困惑。
🧹 小建议:spaceGraphName 字符集
spaceGraphName() 的值由用户配置拼接而来,没有对 / 字符的校验。若用户将 graphSpace 配置为含 / 的字符串,生成的 ID 会出现多个 /,可能引发后续解析歧义。建议加一行断言或在文档中说明 graphSpace 不允许包含 /。
| // Scope server id to graph to avoid cross-graph collision | ||
| return IdGenerator.of(this.graph.spaceGraphName() + "/" + | ||
| this.globalNodeInfo.nodeId().asString()); | ||
| } |
There was a problem hiding this comment.
initServerInfo() 中读写路径 ID 不一致(阻塞性问题)
此处修改后 selfNodeId() 返回命名空间化的 ID(如 DEFAULT-hugegraph/server-1),但调用方 initServerInfo() 的存在性检查仍使用原始裸 ID,两条路径的 key 不同:
// initServerInfo()(本 PR 未修改这部分)
Id serverId = nodeInfo.nodeId(); // ❌ 裸 ID:"server-1"
HugeServerInfo existed = this.serverInfo(serverId); // ❌ 用 "server-1" 查询
// ...
this.globalNodeInfo = nodeInfo; // line 140,赋值后 selfNodeId() 才变
this.saveServerInfo(this.selfNodeId(), ...); // ✅ 保存 "DEFAULT-hugegraph/server-1"后果:
- lines 109–122 的 alive 检查(防止同名节点重复启动的保护)永远查不到已命名空间化的旧记录,保护逻辑被静默跳过
- 重启时旧的命名空间记录不会被正确识别,会直接写入新记录,旧记录成为孤儿数据积累在 backend 中
建议修法: 在 initServerInfo() 入口先设置 globalNodeInfo,再统一用 selfNodeId() 做后续查询,保持读写 key 一致:
public synchronized void initServerInfo(GlobalMasterInfo nodeInfo) {
E.checkArgument(nodeInfo != null, "The global node info can't be null");
this.globalNodeInfo = nodeInfo; // 先赋值,selfNodeId() 立即返回命名空间 ID
Id serverId = this.selfNodeId(); // ✅ 与 save / remove 路径使用同一个 key
HugeServerInfo existed = this.serverInfo(serverId);
if (existed != null && existed.alive()) {
// ... 存活检查 & 等待逻辑不变 ...
}
E.checkArgument(existed == null || !existed.alive(),
"The server with name '%s' already in cluster", serverId);
// master 唯一性检查不变 ...
this.saveServerInfo(serverId, this.selfNodeRole());
}这样读取和写入始终使用同一个 ID,alive 检查和 master 唯一性保护才能真正生效。
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #3027 +/- ##
=============================================
+ Coverage 35.85% 93.25% +57.40%
+ Complexity 338 65 -273
=============================================
Files 802 9 -793
Lines 67995 267 -67728
Branches 8902 22 -8880
=============================================
- Hits 24381 249 -24132
+ Misses 41008 8 -41000
+ Partials 2606 10 -2596 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Purpose of the PR
During backend initialization, the server creates two graphs — ~sys_graph and hugegraph (default) — that share the same store backend. This leads to a key collision:
Main Changes
Change
ServerInfoManager.selfNodeId()which returns "server-1" previously to "{graphname}/server-1"Verifying these changes
Does this PR potentially affect the following parts?
Documentation Status
Doc - TODODoc - DoneDoc - No Need