Replies: 12 comments
---
Hi @Veivel! Great to see another contributor interested in Idea 5 — your GPU profiling thesis background is really relevant here. I've also been exploring this area and wanted to share my findings to help move the discussion forward.

**About Me**

I'm Sundaram Mahajan (@SUNDRAM07), actively contributing to Gemini CLI:

**Codebase Analysis — What Already Exists**

You're right that a good chunk of instrumentation is already done. Here's what I found after a deep dive:
**Prototype Work**

I've built a working prototype locally that adds a

**Questions for Mentors**
---
Hi! I'm Manan, a student interested in GSoC Project #5 (Performance Monitoring and Optimization Dashboard). I've been exploring the Gemini CLI codebase, and the existing telemetry infrastructure is substantial.

My approach focuses on building a

I've already submitted contributions to the repo:

One question for the mentors: would you prefer the dashboard as a

Looking forward to feedback!
---
Hi Sehoon, I'm Manan, a final-year student at BITS Pilani. I'm really interested in GSoC Project #5 (Performance Monitoring and Optimization Dashboard). As someone who loves digging into performance bottlenecks and making things measurable, this project immediately stood out to me.

I've been exploring the Gemini CLI codebase, and I can see there's already a solid telemetry foundation in place — which makes me think the real challenge here is surfacing that data in a way that's actually useful to developers and CI pipelines. That's the part I'm most excited about. I've submitted 6 PRs to the repo (#20757, #20758, #20788, #20789, #20793, #20794) to get comfortable with the codebase and contribution process.

Would love to hear what you think the highest-priority aspect of this project is — the developer-facing dashboard, the CI regression detection, or something else? That would really help me focus my proposal.

Looking forward to your thoughts!

Manan
---
Quick update — I've pushed a working prototype to my fork on the
94 tests passing across 4 test suites. Built it to extend the existing

@sehoon38 would love your thoughts on whether this aligns with what you had in mind for the project; happy to adjust the approach based on feedback.
---
Hi everyone! Just to give a quick introduction: I'm a student at UMich currently working with Qualcomm through a university program (MDP), and I'll be interning at Oracle this summer. I've recently been taking a deeper dive into open source, and it has been incredibly fulfilling so far. It's awesome to see how much thought @Veivel, @SUNDRAM07, and @aishop-lab are putting into this dashboard and the different approaches being explored.

I've been tackling this from a slightly different architectural angle. To avoid building custom aggregators or manual rolling windows, I built a PoC that safely intercepts the CLI's existing OTel pipeline. I attached an

I plan to introduce a standalone

I've opened a Draft PR showing this backend plumbing. My next step is to build out that standalone command and wire the

cc: @sehoon38
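For anyone curious what that interception could look like, here is a rough sketch of the idea: a span processor that tees finished spans into an in-memory rolling window, leaving the normal exporters untouched. The interfaces below are trimmed local stand-ins for the real `@opentelemetry/sdk-trace-base` types so the snippet stands alone; the names are mine, not the PR's.

```typescript
// Stand-in for OTel JS types, so this sketch runs without the SDK installed.
interface ReadableSpan {
  name: string;
  startTime: [number, number]; // HrTime: [seconds, nanoseconds]
  endTime: [number, number];
}

interface SpanProcessorLike {
  onEnd(span: ReadableSpan): void;
  forceFlush(): Promise<void>;
  shutdown(): Promise<void>;
}

// Records per-span-name durations in a capped rolling window.
class InMemorySpanTap implements SpanProcessorLike {
  private durationsMs = new Map<string, number[]>();
  constructor(private readonly capacity = 1000) {}

  onEnd(span: ReadableSpan): void {
    const start = span.startTime[0] * 1e3 + span.startTime[1] / 1e6;
    const end = span.endTime[0] * 1e3 + span.endTime[1] / 1e6;
    const bucket = this.durationsMs.get(span.name) ?? [];
    bucket.push(end - start);
    if (bucket.length > this.capacity) bucket.shift(); // rolling window
    this.durationsMs.set(span.name, bucket);
  }

  // A dashboard command would poll this snapshot for rendering.
  snapshot(): Record<string, number[]> {
    return Object.fromEntries(this.durationsMs);
  }

  forceFlush(): Promise<void> { return Promise.resolve(); }
  shutdown(): Promise<void> { return Promise.resolve(); }
}
```

In the real CLI this would be registered alongside the existing processors rather than replacing them, so nothing downstream changes.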
---
Great progress from everyone here! Quick update from my side, and some observations that might be useful for all of us.

**My Update**

I've opened a Draft PR #21262 with the backend plumbing for:

42 tests across 3 test suites, core build clean.

**Some observations that might help everyone**

After spending time in the telemetry codebase, a few things stood out:
@anthonychen000 — The OTel-based approach with

@aishop-lab — Your

Looking forward to seeing how everyone's approaches evolve! 🚀
---
Good question on baseline persistence — yes, the

The serialization itself is straightforward once you have the type guard — the more interesting design question is when to update the baseline. My thinking is it should only happen on merge to main, not on every PR, to avoid baseline drift from feature branches.

On the v8 heap limit point — agreed that

Cost estimation is an interesting addition — that's a real gap for users running long sessions on 2.5 Pro.
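To make the type-guard point concrete, here is a minimal sketch of baseline persistence: a guard that rejects malformed or stale files instead of crashing on them. The schema fields are placeholders, not the prototype's actual ones.

```typescript
import * as fs from "node:fs";

// Hypothetical baseline shape; the real schema would come from the
// prototype's metrics types.
interface PerfBaseline {
  commit: string;
  startupMs: number;
  p99LatencyMs: number;
}

// Type guard: anything that fails validation is treated as "no baseline".
function isPerfBaseline(value: unknown): value is PerfBaseline {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.commit === "string" &&
    typeof v.startupMs === "number" &&
    typeof v.p99LatencyMs === "number"
  );
}

function loadBaseline(path: string): PerfBaseline | undefined {
  if (!fs.existsSync(path)) return undefined;
  try {
    const parsed: unknown = JSON.parse(fs.readFileSync(path, "utf8"));
    return isPerfBaseline(parsed) ? parsed : undefined;
  } catch {
    return undefined; // corrupt file: fall back to "no baseline"
  }
}

function saveBaseline(path: string, baseline: PerfBaseline): void {
  // Intended to run only on merge to main, so feature branches never
  // shift the reference point.
  fs.writeFileSync(path, JSON.stringify(baseline, null, 2) + "\n");
}
```

The "update only on merge to main" policy then just means `saveBaseline` is gated behind a CI branch check.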
---
Hi all, great to see all these different approaches evolving!

@SUNDRAM07: Good catch on the OTel time mismatch! I handled it in my initial PR by calling

As for implementing custom

I wanted to prioritize keeping the core chat lightweight over maintaining this UI consistency. I just pushed an update (

Really enjoying both of your ideas about cost estimation and baselines. This is definitely something I'll take a deeper look into, and I'm excited to brainstorm with you both.
---
This thread is turning into a proper design review and I'm here for it 😄

On the timestamp fix — @anthonychen000 good catch with

On

On V8 heap monitoring — @aishop-lab your finding about

On baseline persistence — Where does baseline data live? If

On cost estimation — Per-request granularity with session-level rollups. Per-request lets the agent optimize mid-session ("switch to Flash for this query"); session-level gives the "$0.47 today" dashboard. Both layers, negligible overhead.

Really enjoying seeing everyone's approaches converge 🚀
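To make the two-layer cost idea concrete, a rough sketch of per-request estimates rolling up into a session total. The prices here are illustrative placeholders, not real Gemini pricing, and the names are hypothetical.

```typescript
interface RequestCost {
  model: string;
  inputTokens: number;
  outputTokens: number;
  usd: number;
}

// USD per one million tokens. Placeholder numbers for the sketch only;
// real values would be looked up from current pricing.
const PRICE_PER_MTOK: Record<string, { input: number; output: number }> = {
  "gemini-2.5-pro": { input: 1.25, output: 10.0 },
  "gemini-2.5-flash": { input: 0.15, output: 0.6 },
};

// Per-request layer: lets the agent reason about cost mid-session.
function estimateRequestCost(
  model: string,
  inputTokens: number,
  outputTokens: number,
): RequestCost {
  const price = PRICE_PER_MTOK[model];
  if (!price) throw new Error(`unknown model: ${model}`);
  const usd =
    (inputTokens * price.input + outputTokens * price.output) / 1_000_000;
  return { model, inputTokens, outputTokens, usd };
}

// Session layer: the "$0.47 today" rollup for the dashboard summary.
function sessionTotalUsd(requests: RequestCost[]): number {
  return requests.reduce((sum, r) => sum + r.usd, 0);
}
```

Both layers are pure arithmetic over token counts the telemetry already records, so the overhead really is negligible.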
---
From my point of view, extending the existing telemetry foundation is the right instinct, but the most valuable first version of this project is probably a small, trustworthy summary rather than a very dense dashboard. Startup timing, model latency percentiles, memory trend, and a short optimization summary already feel like a strong first cut. If those signals are reliable, the richer htop-style presentation can follow naturally.
---
@aishop-lab Good to know the

@aniruddhaadak80 Agreed — trustworthy and minimal beats flashy and noisy every time. Our prototype in PR #21262 follows exactly that philosophy: startup phases, P50/P90/P99 latency, V8 heap utilization against the actual limit, and a short list of auto-generated optimization suggestions. No fancy UI, just reliable numbers. If the data is solid, the visualization is the easy part.

One thing I'd add to the "V1 shortlist": cost visibility. Users running 2.5 Pro sessions have zero insight into token spend right now. Even a simple "this session used ~$0.35" in the summary would be a huge quality-of-life win — and it's cheap to compute (just multiply token counts by the known price-per-million).
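For the P50/P90/P99 side, the math is intentionally boring; a nearest-rank helper like this is all the summary needs. This is a simplified stand-in, not the prototype's actual code.

```typescript
// Nearest-rank percentile over a sorted copy of the samples.
// Simple and good enough for a dashboard summary.
function percentile(samples: number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(0, rank - 1)];
}

function latencySummary(samples: number[]): {
  p50: number;
  p90: number;
  p99: number;
} {
  return {
    p50: percentile(samples, 50),
    p90: percentile(samples, 90),
    p99: percentile(samples, 99),
  };
}
```

Nearest-rank avoids interpolation edge cases and always returns a latency that actually occurred, which keeps the summary honest.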
---
Update — I've wired the prototype into the real telemetry pipeline. The
Also found and fixed a cache hit rate bug while wiring: the original formula used

The
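As a general reference point, a hit-rate computation with the usual edge cases guarded looks roughly like this. Field names are hypothetical, not the prototype's actual ones.

```typescript
// Hypothetical counter shape for the sketch.
interface CacheCounters {
  hits: number;
  misses: number;
}

// Hit rate = hits / (hits + misses), with the zero-lookup case guarded
// so a fresh session reports 0 instead of NaN.
function cacheHitRate({ hits, misses }: CacheCounters): number {
  const lookups = hits + misses;
  if (lookups === 0) return 0;
  return hits / lookups;
}
```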
---
Hi! I'm Givarrel Veivel, a software engineer from Indonesia. I'll introduce myself concisely:
I'm very new to open source & GSoC, so I'm honestly unsure of the big-picture flow & next steps. I have been tinkering locally, though: from what I understand (correct me if I'm wrong), a good chunk of the heavy lifting of instrumentation has been done with OpenTelemetry (`packages/core/src/telemetry/`), while the token/session statistics with the `/stats` command have already been implemented. My task will tie together the existing instrumentation with the CLI rendering for `/perf`.

My questions at the moment:

- For `/perf`, I'm thinking of a layout like Glances or htop, but is there any particular expected UI?

It goes without saying, but I'm looking forward to your feedback and input. Thank you very much for your time and attention, excited to learn :)
cc @bdmorgan @sehoon38