chore(ci): skip refresh-counts PR when only timestamps changed #606

Merged: amavashev merged 1 commit into main from chore/refresh-counts-skip-noop-prs on May 9, 2026
@amavashev
Contributor

Summary

Follow-up to PR #604. The unified refresh-counts.yml opens a PR every day at 06:00 UTC; when no count actually moves between runs, the diff is just a fetchedAt bump (update-github-counts.mjs writes the timestamp unconditionally on line 254). PR #605 today hit exactly this case — 1 file / +1/-1 line, only fetchedAt changed, closed without merging because nothing material was captured.

What changed

.github/workflows/refresh-counts.yml step 3:

  • Pull the textual diff for both watched files via git diff --no-color --
  • Filter to real change lines: ^[+-][^+-] excludes the +++ b/path / --- a/path file-header lines
  • Drop any change lines that touch only fetchedAt or lastVerifiedAt
  • Open a PR only if material change lines remain

The data fetch in steps 1–2 still runs unconditionally: fresh data is always pulled and HWM cursors still advance. Only the noise PR is suppressed.
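
The step 3 filter described above can be sketched in JavaScript. This is only an illustration of the logic, not the workflow's actual implementation (which this description does not quote). The helper name `materialLines`, the bare file path in the sample diff, and the sample timestamp values are assumptions made for the example; only the `^[+-][^+-]` pattern, the two field names, and the npm counts come from the description.

```javascript
// Hypothetical helper: returns the change lines of a unified diff that are
// neither file headers nor timestamp-only edits.
const TIMESTAMP_KEYS = ["fetchedAt", "lastVerifiedAt"];

function materialLines(diffText) {
  return diffText
    .split("\n")
    // Keep added/removed lines. "+++ b/path" and "--- a/path" headers fail
    // the second character class and drop out, as do context and hunk lines.
    .filter((line) => /^[+-][^+-]/.test(line))
    // Drop lines that set one of the freshness keys (assumes one JSON field
    // per line, as in pretty-printed output).
    .filter((line) => !TIMESTAMP_KEYS.some((key) => line.includes(`"${key}"`)));
}

// Made-up sample diff: one real count change plus a fetchedAt bump.
const sampleDiff = [
  "diff --git a/installs-cache.json b/installs-cache.json",
  "--- a/installs-cache.json",
  "+++ b/installs-cache.json",
  "@@ -1,4 +1,4 @@",
  " {",
  '-  "npm": 4639,',
  '+  "npm": 4700,',
  '-  "fetchedAt": "2026-05-08T06:00:00.000Z"',
  '+  "fetchedAt": "2026-05-09T06:00:00.000Z"',
  " }",
].join("\n");

const remaining = materialLines(sampleDiff);
console.log(remaining.length > 0 ? "open PR" : "skipping PR");
```

With the sample above, the two `npm` lines survive both filters, so the sketch takes the "open PR" branch. A diff containing only the `fetchedAt` pair would leave nothing and take the skip branch.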

Why filter both fetchedAt and lastVerifiedAt

installs-cache.json carries fetchedAt (ISO-8601 datetime, set on every successful update-github-counts.mjs run, line 254). manual-package-counts.json carries lastVerifiedAt (date-only YYYY-MM-DD, set whenever the GHCR sub-step succeeds, line 263). Both are pure freshness indicators — neither carries data anyone reads. Filtering both covers the no-op-refresh case for both files.

Verification

Tested the filter regex against three synthetic diffs locally:

=== Test 1: timestamp-only diff (should skip) ===
material lines: []
→ would skip PR (correct)

=== Test 2: real count change + timestamp (should open) ===
material lines:
-  "npm": 4639,
+  "npm": 4700,
-  "total": 11209,
+  "total": 11270,
→ would open PR (correct)

=== Test 3: lastVerifiedAt only (should skip) ===
material lines: []
→ would skip PR (correct)

YAML re-parsed cleanly via the yaml package — 5 steps, unchanged structure.

Failure modes considered

  • Timestamp value contains the substring fetchedAt: not possible — the timestamps are ISO datetime / date strings, no field-name contamination.
  • total changes but no individual count did: not possible, because cache.total = Math.max(npm + pypi + crates + releases + ghPackages + maven, cache.total) (script lines 251–253). If all components are unchanged, newTotal is unchanged and total stays at its existing value.
  • JSON key reordering producing spurious diff lines: not possible — JSON.stringify preserves insertion order, and the script only mutates values, never reorders.
  • Non-fetchedAt change on the same line as fetchedAt: not possible — pretty-printed JSON puts each field on its own line.
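
Three of the claims above (the total high-water mark, stable key order, and one field per line) can be checked with a small JavaScript sketch. The cache object, its values, the `refresh` function, and the timestamps are hypothetical stand-ins; only the `Math.max` formula and the key names in it are taken from the description, and the real script may differ in every other respect.

```javascript
// Hypothetical cache contents: the key names follow the formula quoted above,
// the values and the fetchedAt timestamps are made up for illustration.
const cache = {
  npm: 10, pypi: 5, crates: 2, releases: 1, ghPackages: 0, maven: 0,
  total: 20, // high-water mark, deliberately above the current sum of 18
  fetchedAt: "2026-05-08T06:00:00.000Z",
};

// Hypothetical refresh step: overwrites values in place, never deletes or
// re-adds keys, so the insertion order seen by JSON.stringify is unchanged.
function refresh(target, counts, now) {
  const sum = counts.npm + counts.pypi + counts.crates +
    counts.releases + counts.ghPackages + counts.maven;
  Object.assign(target, counts);
  target.total = Math.max(sum, target.total); // total can only stay or grow
  target.fetchedAt = now; // written unconditionally
  return target;
}

const keysBefore = Object.keys(cache);
const before = JSON.stringify(cache, null, 2).split("\n");

// Same counts as before, passed in a different key order on purpose.
const sameCounts = { maven: 0, ghPackages: 0, releases: 1, crates: 2, pypi: 5, npm: 10 };
refresh(cache, sameCounts, "2026-05-09T06:00:00.000Z");

const after = JSON.stringify(cache, null, 2).split("\n");
const changedLines = after.filter((line, i) => line !== before[i]);
console.log(changedLines);
```

Under these assumptions the only line that differs between the two serializations is the `fetchedAt` line, which is exactly the case the filter is meant to suppress.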

Test plan

  • YAML parses, 5 steps, structure unchanged
  • Filter regex verified against three synthetic diff cases (timestamp-only-skip / count-change-open / lastVerifiedAt-only-skip)
  • Live test: trigger workflow_dispatch shortly after merge to confirm a no-op refresh produces the "skipping PR" log line and no PR
  • Live test: when registry counts grow on a real day, confirm a PR with the expected small diff is still opened

The unified refresh-counts.yml workflow opens a PR every day at 06:00
UTC. When no count actually moves between runs (npm/PyPI/crates/clones/
releases/ghPackages all unchanged), the diff is just a `fetchedAt`
bump from update-github-counts.mjs writing the timestamp
unconditionally — and on manual-package-counts.json, just a
`lastVerifiedAt` bump when the GHCR sub-step succeeds.

PR #605 (2026-05-09) hit exactly this case: 1 file / +1/-1 line, only
fetchedAt moved, and the PR was closed without merging because there
was nothing material to capture. The same pattern would have repeated
on every day without count changes.

Step 3 now extracts the textual diff for both files, filters to real
change lines (`^[+-][^+-]` excludes the `+++ b/`/`--- a/` headers),
drops change lines that touch only `fetchedAt` or `lastVerifiedAt`,
and only opens a PR if material change lines remain.

The data fetch in steps 1–2 still runs unconditionally — fresh data
is always pulled, HWM cursors still advance — only the noise PR is
suppressed.

Verified the filter regex against three synthetic diffs:
- fetchedAt-only diff → no material lines → skip (correct)
- npm/total change + timestamp → material lines listed → open PR (correct)
- lastVerifiedAt-only diff → no material lines → skip (correct)

amavashev merged commit 384303a into main on May 9, 2026 (5 checks passed)
amavashev deleted the chore/refresh-counts-skip-noop-prs branch on May 9, 2026 at 10:54