Antalya 26.3: Fix rescheduleTasksFromReplica #1747
…rash Antalya 26.1: Fix rescheduleTasksFromReplica
Verification report: Altinity/ClickHouse PR #1747
Conclusion
PR is merged. PR test workflow is clean at the test level on head; only the chronic antalya-26.3 regression suites are red, all at baseline rates. No PR-caused regression found. The diff is 3 lines in
CI on head
| Check | Top failing tests on PR-1747 builds (30d) | Baseline (antalya-26.3, 30d) | Class |
|---|---|---|---|
| Swarms (Release + Aarch64) | swarm joins / join clause, cluster discovery / multiple paths, node failure / network failure, node failure / cpu overload, swarm join sanity / join with clause (×2 each) | 30–44% on every PR | Pre-existing broken |
| S3Export (partition) (Release + Aarch64) | sanity / no partition by (×2) | 50% | Pre-existing broken |
| Iceberg (1) (Release + Aarch64) | rest catalog / sort key timezone / day transform utc (×2), rest catalog / iceberg iterator race condition (×2) | 41% / 28% | Missing-dep + pre-existing flaky |
| Iceberg (2) (Release + Aarch64) | chronic glue-catalog / race-condition variants | chronic | Pre-existing flaky |
| Parquet (Release + Aarch64) | postgresql/mysql round-trip compression-type variants (×2 each) | ~36% | Pre-existing flaky |
Regression DB on /PRs/1747/ builds (30d): 152 Fail / 5,358 OK ≈ 2.8%. Every top failure matches the all-PR baseline fail rate on antalya-26.3.
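The 2.8% figure can be rechecked with a few lines of Python. This is only a sketch: it assumes the rate is failures divided by all recorded results (Fail + OK), which the report does not state explicitly, and `fail_rate` is a hypothetical helper, not part of any regression tooling.

```python
def fail_rate(failed: int, passed: int) -> float:
    """Return failures as a percentage of all recorded results."""
    total = failed + passed
    if total == 0:
        raise ValueError("no results recorded")
    return 100.0 * failed / total


# Counts taken from the Regression DB line above.
rate = fail_rate(152, 5358)
print(f"{rate:.1f}%")  # prints 2.8%
```

Dividing by the OK count alone (152 / 5,358) also rounds to 2.8%, so the reported figure holds under either reading.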
Related to PR diff?
PR is a 3-line fix in rescheduleTasksFromReplica (1 file, replicated-task scheduling path).
| Failing test | Diff overlap | Related? |
|---|---|---|
| swarms / *, parquet / *, s3_export_partition / *, iceberg / * | none; none of these suites exercise rescheduleTasksFromReplica | No |
No failing test intersects the rescheduling code path.
Recommendations
- No action on this PR. Merged and effectively clean — PR test workflow is green at the test level, and the regression failures are 100% chronic baseline.
- Re-verify after the companion 26.1 → 26.3 frontports land — same list as the prior 26.3 verification reports.
- Same chronic-baseline cleanup recommendation as VERIFICATION_PR_1640.md for swarms / parquet / s3_export_partition / iceberg scenarios.
Local checkout
cd /Users/alsugilyazova/workspace/altinity-clickhouse/ClickHouse
gh pr checkout 1747 --repo Altinity/ClickHouse
# HEAD: c2744abe886243da0b4e82745ba22dbd7c198c27
Audit: PR #1747 — Antalya 26.3: Fix
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Fix rescheduleTasksFromReplica (#1568 by @ianton-ru).
CI/CD Options
Exclude tests:
Regression jobs to run:
Cherry-picked from #1568.
Documentation entry for user-facing changes
Fix incorrect change from c523f29
getReplicaForFile uses replica_to_files_to_be_processed to find the best replica for a file. When the lost replica is removed only after the getReplicaForFile call, getReplicaForFile still chooses the same (lost) replica, so rescheduling has no effect: the files are chosen only in getAnyUnprocessedFile and executed on random replicas. This PR fixes the order, so files are now matched with their new best replicas.
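The ordering problem can be illustrated with a small toy model in Python. This is a sketch under stated assumptions, not the ClickHouse implementation: `best_replica`, `initial_assignment` and `reschedule` are hypothetical stand-ins, and the CRC32 scoring rule is invented for the example. The only property carried over from the description above is that the best replica is chosen among the replicas still present in the table.

```python
import zlib


def best_replica(file_name, replicas):
    """Toy scoring rule (assumption): the highest CRC32 of 'replica:file' wins."""
    return max(sorted(replicas),
               key=lambda r: zlib.crc32(f"{r}:{file_name}".encode()))


def initial_assignment(files, replicas):
    """Map each replica to the files it is the best match for."""
    table = {r: [] for r in replicas}
    for f in files:
        table[best_replica(f, replicas)].append(f)
    return table


def reschedule(replica_to_files, lost, remove_first):
    """Return {file: new replica} for the files owned by the lost replica."""
    state = {r: list(fs) for r, fs in replica_to_files.items()}
    pending = list(state[lost])
    if remove_first:
        del state[lost]        # fixed order: drop the lost replica, then match
    assignment = {f: best_replica(f, state) for f in pending}
    state.pop(lost, None)      # buggy order: the removal happens only here
    return assignment


files = [f"part-{i}.parquet" for i in range(8)]
replicas = ["replica-1", "replica-2", "replica-3"]
table = initial_assignment(files, replicas)
lost = best_replica(files[0], replicas)  # guaranteed to own at least one file

buggy = reschedule(table, lost, remove_first=False)
fixed = reschedule(table, lost, remove_first=True)
print("buggy:", buggy)   # every file maps back to the lost replica
print("fixed:", fixed)   # every file maps to a surviving replica
```

In the toy model the buggy order is a no-op by construction: the lost replica owned exactly the files it scored best for, so matching before removal returns it again. Removing it first forces each file onto a surviving replica.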