Update pr-review skill model lineup #35174

Open
kubaflo wants to merge 1 commit into main from copilot/pr-review-model-update

@kubaflo kubaflo commented Apr 27, 2026

Note

Are you waiting for the changes in this PR to be merged?
It would be very helpful if you could test the resulting artifacts from this PR and let us know in a comment if this change resolves your issue. Thank you!

Updates the Phase 2 multi-model exploration list in the pr-review skill:

| Order | Before | After |
|-------|--------|-------|
| 1 | `claude-opus-4.6` | `claude-opus-4.6` (unchanged) |
| 2 | `claude-sonnet-4.6` | `claude-opus-4.7` |
| 3 | `gpt-5.3-codex` | `gpt-5.3-codex` (unchanged) |
| 4 | `gemini-3-pro-preview` | `gpt-5.5` |

Updated in both the model config table and the Phase 2 launch checklist in .github/skills/pr-review/SKILL.md.

Replace claude-sonnet-4.6 with claude-opus-4.7 and gemini-3-pro-preview
with gpt-5.5 in the Phase 2 multi-model exploration list.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings April 27, 2026 21:03
@github-actions

🚀 Dogfood this PR with:

⚠️ WARNING: Do not do this without first carefully reviewing the code of this PR to satisfy yourself it is safe.

curl -fsSL https://raw.githubusercontent.com/dotnet/maui/main/eng/scripts/get-maui-pr.sh | bash -s -- 35174

Or

  • Run remotely in PowerShell:
iex "& { $(irm https://raw.githubusercontent.com/dotnet/maui/main/eng/scripts/get-maui-pr.ps1) } 35174"

@github-actions

🔍 Skill Validation Results

✅ Static Checks Passed

Skills checked: 15 | Agents checked: 3

Full validator output
Found 1 skill(s)
[pr-review] 📊 pr-review: 3,267 BPE tokens [chars/4: 3,153] (standard ~), 22 sections, 7 code blocks
[pr-review]    ⚠  Skill is 3,267 BPE tokens (chars/4 estimate: 3,153) — approaching "comprehensive" range where gains diminish.
✅ All checks passed (1 skill(s))
Found 3 agent(s)
Validated 3 agent(s)

✅ All checks passed (3 agent(s))

⏭️ LLM Evaluation: Skipped

No changed skills with eval tests found.

🔍 Full results and investigation steps

Copilot AI left a comment


Pull request overview

Updates the .github/skills/pr-review skill instructions to reflect a new 4-model lineup used during Phase 2 (Try-Fix) multi-model exploration.

Changes:

  • Replace Phase 2 model #2 from claude-sonnet-4.6 to claude-opus-4.7.
  • Replace Phase 2 model #4 from gemini-3-pro-preview to gpt-5.5.
  • Update both the model configuration table and the Phase 2 launch checklist to stay consistent.

kubaflo commented Apr 28, 2026

/review

github-actions Bot commented Apr 28, 2026

Expert Code Review completed successfully!

@github-actions github-actions Bot left a comment


Expert Code Review — PR #35174

Methodology: 3 independent reviewers with adversarial consensus


Findings

| # | Severity | Consensus | File | Lines | Finding |
|---|----------|-----------|------|-------|---------|
| 1 | 🔴 CRITICAL | 3/3 reviewers | .github/skills/pr-review/SKILL.md | 53, 113 | claude-opus-4.7 is not in the platform's available model catalog. Attempt 2 of Phase 2 will fail at runtime. |
| 2 | 🔴 CRITICAL | 3/3 reviewers | .github/skills/pr-review/SKILL.md | 55, 117 | gpt-5.5 is not in the platform's available model catalog. Attempt 4 of Phase 2 will fail at runtime. |
| 3 | 🟢 MINOR | 3/3 reviewers | Other workflow files | | claude-sonnet-4.6 is still referenced in .github/workflows/shared/review-shared.md and .github/workflows/copilot-evaluate-tests.md, but these are separate workflows using it for their own purposes, not stale references from this PR. gemini-3-pro-preview is fully removed. No action needed. |

Details

Findings 1 & 2 — Unavailable model identifiers

The platform's task tool currently exposes these models: claude-opus-4.6, claude-opus-4.6-1m, claude-opus-4.5, claude-sonnet-4.6, claude-sonnet-4.5, claude-sonnet-4, claude-haiku-4.5, gpt-5.4, gpt-5.3-codex, gpt-5.2-codex, gpt-5.2, gpt-5.4-mini, gpt-5-mini, gpt-4.1.

Neither claude-opus-4.7 nor gpt-5.5 appears in this list. If these model IDs are not resolvable at runtime, 2 of 4 Phase 2 try-fix attempts will fail on every PR review, silently reducing fix exploration diversity by 50%.

If these models are expected to become available soon, consider gating the merge on their deployment. Otherwise, substitute with confirmed models.
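
The check behind Findings 1 and 2 can be sketched as a set-membership test. This is a minimal Python sketch, not the reviewers' actual tooling. The catalog is copied from the list quoted above and the lineup from the PR description. The names `AVAILABLE_MODELS`, `PHASE2_LINEUP`, and `unavailable_models` are hypothetical.

```python
# Catalog as quoted in this review; assumed here to be complete and current.
AVAILABLE_MODELS = {
    "claude-opus-4.6", "claude-opus-4.6-1m", "claude-opus-4.5",
    "claude-sonnet-4.6", "claude-sonnet-4.5", "claude-sonnet-4",
    "claude-haiku-4.5", "gpt-5.4", "gpt-5.3-codex", "gpt-5.2-codex",
    "gpt-5.2", "gpt-5.4-mini", "gpt-5-mini", "gpt-4.1",
}

# Proposed Phase 2 lineup from this PR, in attempt order.
PHASE2_LINEUP = ["claude-opus-4.6", "claude-opus-4.7", "gpt-5.3-codex", "gpt-5.5"]


def unavailable_models(lineup, catalog):
    """Return (attempt_number, model_id) pairs whose model is not in the catalog."""
    return [(i, m) for i, m in enumerate(lineup, start=1) if m not in catalog]


missing = unavailable_models(PHASE2_LINEUP, AVAILABLE_MODELS)
lost_fraction = len(missing) / len(PHASE2_LINEUP)
print(missing)        # [(2, 'claude-opus-4.7'), (4, 'gpt-5.5')]
print(lost_fraction)  # 0.5
```

Run against the quoted catalog, the sketch flags Attempts 2 and 4, which matches the 50% reduction described above. Whether the quoted catalog reflects what the runtime can actually resolve is the open question this review raises.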

Finding 3 — Other claude-sonnet-4.6 references

All 3 reviewers confirmed these are intentionally separate usages (agent models for different workflows), not stale references that should have been updated by this PR.
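
A search for leftover references like the one in Finding 3 could be sketched as follows. This is an illustrative Python helper, not the tool the reviewers used. The function name `find_references` and the file-suffix filter are assumptions.

```python
from pathlib import Path


def find_references(root, model_id, suffixes=(".md", ".yml", ".yaml")):
    """Return sorted (relative_path, line_number) pairs where model_id appears."""
    root = Path(root)
    hits = []
    for path in root.rglob("*"):
        # Only scan regular files with one of the assumed text suffixes.
        if not path.is_file() or path.suffix not in suffixes:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            # Plain substring match: "claude-sonnet-4" would also match "claude-sonnet-4.6".
            if model_id in line:
                hits.append((path.relative_to(root).as_posix(), lineno))
    return sorted(hits)
```

A call such as `find_references(".github", "claude-sonnet-4.6")` would list every file and line that still mentions the old model. Deciding whether each hit is stale or intentional remains a manual step, as it was in this review.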

Internal Consistency ✅

The model config table (lines 50–55) and the Phase 2 launch checklist (lines 110–118) are consistent with each other after this change. No within-file discrepancies.
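
The consistency check described above can be sketched in Python. The row pattern follows the table rows quoted in this review (for example ``| 2 | `claude-opus-4.7` |``). The header row, the checklist representation, and the names `models_from_table` and `mismatches` are hypothetical, since the checklist's exact format in SKILL.md is not shown here.

```python
import re

# Matches rows shaped like "| 2 | `claude-opus-4.7` |".
ROW = re.compile(r"^\|\s*(\d+)\s*\|\s*`([^`]+)`\s*\|\s*$")


def models_from_table(markdown):
    """Map attempt order to model ID for each matching table row."""
    out = {}
    for line in markdown.splitlines():
        m = ROW.match(line.strip())
        if m:
            out[int(m.group(1))] = m.group(2)
    return out


def mismatches(table_models, checklist_models):
    """Return the orders whose model differs between table and checklist."""
    orders = sorted(set(table_models) | set(checklist_models))
    return [o for o in orders if table_models.get(o) != checklist_models.get(o)]


# Hypothetical header row; the data rows mirror the lineup proposed in this PR.
table = """\
| Order | Model |
|-------|-------|
| 1 | `claude-opus-4.6` |
| 2 | `claude-opus-4.7` |
| 3 | `gpt-5.3-codex` |
| 4 | `gpt-5.5` |
"""
# Assumption: the launch checklist has already been parsed into the same shape.
checklist = {1: "claude-opus-4.6", 2: "claude-opus-4.7", 3: "gpt-5.3-codex", 4: "gpt-5.5"}

print(mismatches(models_from_table(table), checklist))  # []
```

An empty result corresponds to the "no within-file discrepancies" conclusion. A checklist that still named `claude-sonnet-4.6` for Attempt 2 would be reported as `[2]`.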

CI / Test Coverage

This PR modifies only a skill markdown file (no functional code). No CI tests are applicable or expected.

Generated by Expert Code Review for issue #35174 · ● 5.1M

  |-------|-------|
  | 1 | `claude-opus-4.6` |
- | 2 | `claude-sonnet-4.6` |
+ | 2 | `claude-opus-4.7` |

🔴 CRITICAL — Model not in available catalog (3/3 reviewers)

claude-opus-4.7 does not appear in the platform's current task-tool model catalog. The documented available models include claude-opus-4.6, claude-opus-4.6-1m, and claude-opus-4.5 — but not claude-opus-4.7.

If this model ID is not resolvable at runtime, Attempt 2 of every Phase 2 try-fix exploration will fail or be skipped, reducing fix diversity from 4 models to 3.

Recommendation: Confirm claude-opus-4.7 is a valid, deployed model identifier before merging. If not yet available, consider keeping claude-sonnet-4.6 or substituting with a confirmed model (e.g., claude-opus-4.6-1m or claude-opus-4.5).

  | 2 | `claude-opus-4.7` |
  | 3 | `gpt-5.3-codex` |
- | 4 | `gemini-3-pro-preview` |
+ | 4 | `gpt-5.5` |

🔴 CRITICAL — Model not in available catalog (3/3 reviewers)

gpt-5.5 does not appear in the platform's current task-tool model catalog. The documented available models include gpt-5.4, gpt-5.3-codex, gpt-5.2-codex, gpt-5.2, gpt-5.4-mini, gpt-5-mini, and gpt-4.1 — but not gpt-5.5.

If this model ID is not resolvable at runtime, Attempt 4 of every Phase 2 try-fix exploration will fail or be skipped, and the cross-pollination round will only cover 3 of 4 models.

Recommendation: Confirm gpt-5.5 is a valid, deployed model identifier before merging. If not yet available, consider keeping gemini-3-pro-preview or substituting with a confirmed model (e.g., gpt-5.4).
