Fix wait / sleep which is causing many PRs to fail performance tests in CI for reasons unrelated to the PR#77726
Fix wait / sleep which is causing many PRs to fail performance tests in CI for reasons unrelated to the PR#77726danluu wants to merge 5 commits intoWordPress:trunkfrom
Conversation
|
👋 Thanks for your first Pull Request and for helping build the future of Gutenberg and WordPress, @danluu! In case you missed it, we'd love to have you join us in our Slack community. If you want to learn more about WordPress development in general, check out the Core Handbook full of helpful information. |
0cf755b to
e51d754
Compare
|
Warning: Type of PR label mismatch To merge this PR, it requires exactly 1 label indicating the type of PR. Other labels are optional and not being checked here.
Read more about Type labels in Gutenberg. Don't worry if you don't have the required permissions to add labels; the PR reviewer should be able to help with the task. |
|
The following accounts have interacted with this PR and/or linked issues. I will continue to update these lists as activity occurs. You can also manually ask me to refresh this list by adding the If you're merging code through a pull request on GitHub, copy and paste the following into the bottom of the merge commit message. To understand the WordPress project's expectations around crediting contributors, please review the Contributor Attribution page in the Core Handbook. |
e51d754 to
72498c6
Compare
dmsnell
left a comment
There was a problem hiding this comment.
these two tests both define isVisibleElement and isReadyElement and they look identical. let’s extract them? I think that will make the diff look much smaller, especially if we define these helpers somewhere in the module global scope so that they aren’t right there in the waitForFunction lines.
|
This seems a bit surprising to me, but the initial results look like this clearly demonstrates slowing down the performance tests runs. The revert I made in #77895 seems to be running within around 39 min while this branch is closer to 42 min. I’m not sure why this seems to contradict the 2.5 min speedup you found; perhaps I made a mistake in creating the revert PR.
|
|
Interesting. It's definitely possible this fix doesn't help. I ran it twice and got a speedup both times, but the process is fairly noise. If we wanted a better measurement, we could try running a bunch of times locally (which will be less accurate, but should have less noise and can directionally indicate what should happen), or add instrumentation that shows what's going on. |
I would be more curious about instrumentation, because I could believe there are platform differences in GitHub actions that lead to a different runtime locally vs. in the CI runners. these jobs are more or less running simultaneously and both share the same merge-base with perhaps we could log how long we wait for the different canvas events? anyway I think I take that back; I would be very interested to know if running locally can reliably produce the opposite result. by any chance, you weren’t comparing this branch against |
|
At the time, I was comparing to trunk the moment #77725 was merged and with this applied to trunk before #77725. Another thought is that, when doing a comparison for this, we could run only the part that's impacted by this, which will reduce the noise from unrelated parts of the job (which is where most the run time is from and, I would guess, where most of the noise is coming from). |
|
I tried running the relevant ests locally (or, really, having my agent run the relevant tests locally). My agent decided to only run 22 out of the 23 relevant tests because one of the makes some kind of persistent change (if I were doing it myself, I would've reset the state at the start each time to run every test).
Local speedup for this affected slice: ~34s per branch run. The main signal is in the Site Editor Loading iterations:
Apologies for the pointless mean and median in the AI generated table. I think this happens twice per CI run, so that's 68s when run locally on the laptop I used (M3 Max, lower core config for the M3 Max). I don't have a full CI job run time for this machine (I did it on another machine at one point, but 100% of the memory on that machine is being used for various fuzz tests at the moment). In any case, I doubt the 23rd test reverses this (in all likelihood, it increases the delta) and from a performance / gain standpoint, I feel pretty comfortable with the idea that this speeds up the CI job (from a correctness standpoint, this is AI written code in a codebase I know nothing about, I'm less confident there). |
|
Alright, here are the results with (allegedly) all relevant tests:
Also, looking at the Performance Tests job run time, it seems to have improved over when this PR was created for some reason (possibly just random variance):
Unfortunately, we don't seem to break out how long each part takes (yet?). |
796b338 to
0a1bf7d
Compare
|
If we want reliable and fast code to detect if the site editor has finished loading, we already have it inside the The
Exactly the same logic should be used by e2e tests, there is no need to do something different, like a 60s timeout. We should move the logic to something that is not a React hook, which is usable only inside a React component context. For example, to a store selector that we can call with await page.waitForFunction( () => ! window.wp.data.select( 'core/site-editor' ).isSiteEditorLoading() );This should be the one-liner that does all the waiting we need. And it looks directly at the site editor internal state, doesn't depend on catching the DOM elements with loaders. I'll try to create an alternative PR that implements this idea, I'm curious how the results change. Both in reliability and timing. |


Closes #77895
This PR was originally created to stop every PR from having a failing CI run due to performance tests timing out at 60 minutes, which was blocking every PR for the #77716 fuzzing project. There was as race with #77725 and that issue solved the problem. However, the approach is different and this PR eliminates the 3s wait that #77725 has in successful cases. This seems to speed up the entire Performance CI run by approximately 2.5 minutes (no doubt there's a lot of noise and the true speedup might be 2 minutes or 3 minutes and not 2.5 minutes).
I originally closed this as a dup, but on seeing that this version has some advantages over the other one, I'm gong to re-open it.
BEGIN AI GENERATED TEXT
The performance job is not timing out because the Site Editor test suite or the
Site Editor itself became 20+ minutes slower. It is timing out because the
visitSiteEditor()Playwright helper can wait a full 60 seconds for a canvasloader that has already appeared and disappeared.
The direct trigger was #77443. That PR did not change Site Editor loading code,
the performance spec, or the
visitSiteEditor()helper. It changed runtimetiming enough to expose an existing race in the helper.
The latent helper bug came from the sequence of earlier helper changes, most
importantly #68667. #68667 made a missed loader a successful fallback path, but
only after spending the full 60 second timeout waiting for the loader to become
visible.
Once #77443 was on trunk, PR performance jobs compared two branches that both
contained the timing trigger. The Site Editor loading test calls
visitSiteEditor()11 times per branch, so one artificial 60 second wait periteration adds about 22 minutes to a full PR performance run.
Relevant PRs and commits
d301cf7d5152: added a Site Editor canvas loader wait to avoidresolving
visitSiteEditor()before large canvas content finished loading.7e24ae4f91ee: limited that wait to Site Editor routes that areexpected to load the canvas. The PR description says the previous behavior
added about 2 minutes to each affected performance test.
ad4da873ce4: movedsetPreferences()before the loader wait. Thismade the helper start waiting for the loader later in the page lifecycle.
4085555f8ac: added the important latent bug. The helper tries towait for
.edit-site-canvas-loader, .edit-site-canvas-spinnerto becomevisible with a 60 second timeout, catches the timeout, and then waits for the
editor content region. If the loader was already gone, the helper succeeds
after wasting 60 seconds.
309dfc2e885: changedcatch ( error )tocatch {}. This wasmechanical and did not change behavior.
0f99b8a8044: rebuilt the Experiments screen. This is the PR thatchanged timing enough for performance CI to hit the existing helper race.
What the helper does
On trunk before the fix,
visitSiteEditor()contains this sequence:The catch block comment says "If the canvas loader is already disappeared, skip
the waiting", but the implementation cannot know that until after
canvasLoader.waitFor( { state: 'visible', timeout: 60_000 } )has alreadytimed out.
The Site Editor performance loading test hits this path 11 times per branch:
The test then records
results.firstBlockfrommetrics.getLoadingDurations().timeSinceResponseEnd, which isperformance.now() - responseEnd. That means the helper's 60 second wait isincluded in the
firstBlockmetric even if the first block was already present.Evidence from CI
The #77443 performance run compared the PR merge commit
cf45676490ae3900632a512f6deaa9ab259e2d54against trunk and still finished,but it was close to the 60 minute workflow limit. Its Site Editor summary showed
a signature 60 second
firstBlockvalue on the PR side while server-side workwas normal:
That rules out a PHP/server/DB slowdown as the main cause of the 60 second
number.
After #77443 landed, later PRs compared PR branches against trunk where both
sides had the timing trigger. For example, the #77675 performance run was
canceled at the 60 minute job timeout while running the second Site Editor
branch:
That is consistent with roughly 11 minutes of artificial loader-wait delay per
branch, plus normal setup/build/test time.
Local reproduction and double-check
I reproduced the behavior with the actual performance test path, not a reduced
direct
page.goto()probe. With instrumentation added around the oldvisitSiteEditor()helper on #77443's PR merge commit, the page was alreadyloaded before the 60 second wait started:
This answers the important question directly: yes, the content was present
before 60 seconds. The iframe already contained 1437 blocks before the helper
started the 60 second loader-visible wait.
I then ran the same instrumented path on the pre-#77443 base commit
90e45267257fc4a4da0095933ecc24fa3bae776d. There the loader was still presentwhen the helper started waiting, so the 60 second timeout was not hit:
This isolates the behavior change to #77443's merge comparison: the helper code
was the same, but the loader was missed after #77443 and not missed before it.
Why #77443 is the trigger but not the actual slowdown
The #77443 diff changes the Experiments screen and related settings plumbing:
It does not change these Site Editor or performance-test files:
So the correct interpretation is:
visitSiteEditor(): a missed loader costs 60 secondsand then still succeeds.
loader was already gone by the time
visitSiteEditor()started waiting forit to become visible.
firstBlock.the job past the workflow's 60 minute timeout.
Fix direction
The helper should not require the loader to appear. It should wait for either:
Then it should verify the loader is absent and the editor content/canvas is
ready. That preserves the intent of #61629 and #61816, while avoiding the
60-second penalty introduced by the fallback behavior from #68667.
END AI GENERATED TEXT
Note that this PR was AI generated (GPT-5.5)