Update index.md#1
Merged
Update architecture workflow
rdhyee added a commit that referenced this pull request on Feb 13, 2026
* Add H3 spatial indexing, two-tier facet loading, and benchmark optimizations (#5)

## Changes
- isamples_explorer.qmd: Two-tier facet loading (2KB summary for instant counts)
- parquet_cesium_isamples_wide.qmd: Zoom-adaptive H3 clustering with LOD
- zenodo_isamples_analysis.qmd: Data-driven H3 regional analysis
- narrow_vs_wide_performance.qmd: Added geospatial and facet benchmarks

## Fixes Applied (Codex review)
- Fixed MODE(n) → MODE(source) for cluster coloring
- Added camera listener cleanup to prevent leaks
- Added NaN guard for cluster label parsing
- Added user-facing warning for facet summary failures

Closes #1, #2, #3, #4

* Add progressive globe demo with H3 aggregated loading

Loads 580KB H3 res4 summary for instant globe render (<1s), then switches to res6/res8 on zoom with viewport filtering. Click triggers sample detail query from full 280MB parquet.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix progressive globe: render stats bar from OJS cells

DOM elements created in raw HTML aren't available when OJS cells execute. Move legend, stats bar, and phase indicator into OJS cells and add null guards on all getElementById calls.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Click cluster dot to fly-to and drill down to next H3 resolution

Clicking an H3 cluster now flies the camera to that location at an altitude that triggers the next resolution level (res4→res6→res8). The zoom watcher then automatically loads finer detail.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Redesign progressive globe: side panel + global data + info-only clicks

- Side-by-side layout: globe left, live info panel right (always visible)
- Load full H3 files globally (no viewport filtering), so no gaps when panning
- Click shows cluster info + nearby samples in side panel (no camera fly-to)
- Zoom watcher switches resolution automatically: res4 → res6 → res8
- Stats, legend, cluster card, and sample list all in side panel

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix infinite loop: replace OJS reactivity with imperative DOM updates

The side panel was causing a reactive cycle:
globeStatus change → sidePanel re-render → layout re-render → viewer re-create → phase1 re-run → globeStatus change → loop

Fix: all side panel content is static HTML. Stats, cluster card, and sample list are updated via getElementById/innerHTML only. No OJS mutable variables, no reactive cascade.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add sub-res8 individual sample drill-down to progressive globe

- New 4th zoom tier: below 120km altitude, switches from H3 clusters to individual sample points loaded from lite parquet (60MB vs 280MB)
- Two-stage sample card: instant metadata from lite file, lazy-loaded description from full wide parquet on click
- Viewport caching with 30% padding for smooth panning
- Stale-request guards for async camera/query flows
- Hysteresis thresholds (120km enter / 180km exit) to prevent flicker
- Separate PointPrimitiveCollection for samples vs clusters
- Cluster click queries now use lite parquet instead of wide (5x faster)

Data files on R2:
- isamples_202601_samples_map_lite.parquet (60MB, 6M rows, 9 columns)
- Still uses H3 summary files for res4/6/8 cluster view

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix bugs from Codex review: deadlock, schema mismatch, timing

- loadRes: wrap in try/catch/finally so `loading` flag always resets on query failure (was permanent deadlock; finding #2)
- Schema fix: cluster-click query used `n as source` but the lite parquet has a column named `source` (finding #4)
- Remove unnecessary ORDER BY on H3 loads (finding #8)
- Use .pop() instead of [0] for performance timing entries (finding #11)
- Add rel="noopener noreferrer" to target="_blank" link (finding #7)

Deferred: XSS escaping (trusted data), antimeridian handling, detail click caching, startup error fallback.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add progressive globe to sidebar navigation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix cluster-click query: remove description column missing from lite parquet

The samples_map_lite.parquet doesn't have a description column. Use place_name for nearby sample cards instead.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add URL state encoding for shareable deep links

- Hash-based URL state: lat, lng, alt, heading, pitch, mode, pid
- v=1 schema versioning for future compatibility
- parseNum with Number.isFinite (avoids lat=0 bug from Codex review)
- replaceState for continuous camera movement, pushState for mode transitions and sample/cluster selection
- Browser back/forward via hashchange listener with flight animation
- Suppress flag prevents hash write loops during navigation restore
- Deep-link startup: fly to position and restore sample card from pid
- Share View button copies current URL to clipboard with toast
- pid takes precedence over h3 (canonicalized on write)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix startup crash: move _initialHash before globalRect block that reads it

v._initialHash was set after the once() closure that references it, causing undefined.lat TypeError on page load.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
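
The `parseNum` item in the URL-state commit guards against a common pitfall: a truthiness check such as `if (lat)` treats a legitimate `lat=0` as missing. The following is a minimal sketch of that idea, assuming a hash of the form `#v=1&lat=...&lng=...`. The helper names `parseNumSketch` and `readHashSketch` are hypothetical and are not the explorer's actual `parseNum` or hash reader; only the fields named in the commit message are used.

```javascript
// Illustrative sketch only: hypothetical stand-ins, not the repository's code.

// Return a finite number, or null when the value is absent or malformed.
// Number('') and Number('  ') are both 0, so empty input is rejected first.
function parseNumSketch(raw) {
  if (raw === null || raw === undefined || raw.trim() === '') return null;
  const n = Number(raw);
  return Number.isFinite(n) ? n : null;
}

// Parse a location hash into a plain state object.
function readHashSketch(hash) {
  const params = new URLSearchParams(hash.replace(/^#/, ''));
  const state = {
    v: parseNumSketch(params.get('v')),
    lat: parseNumSketch(params.get('lat')),
    lng: parseNumSketch(params.get('lng')),
    alt: parseNumSketch(params.get('alt')),
    pid: params.get('pid'),
    h3: params.get('h3'),
  };
  // The commit message states that pid takes precedence over h3.
  if (state.pid !== null) state.h3 = null;
  return state;
}
```

Callers can then test `state.lat !== null` rather than relying on truthiness, so a view centered on the equator survives a round trip through the URL.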
rdhyee added a commit that referenced this pull request on Apr 10, 2026
* Add cross-filtering to Explorer facet counts

When any filter is active, facet counts now reflect the intersection of all OTHER active filters. For example, selecting SESAR as source updates material/context/specimen counts to show only what exists in SESAR data.

Uses parallel GROUP BY queries via DuckDB-WASM. Counts update via DOM manipulation to avoid resetting checkbox selections. Zero-count facet values are dimmed for visual clarity. When no filters are active, pre-computed summaries are used (instant).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix cross-filtering: use pre-computed cache + correct column mapping

- Add 6KB pre-computed cross-filter cache for instant single-filter lookups
- Add 21MB sample_facets view with URI-string columns for on-the-fly fallback
- Fix column name mismatch: wide parquet has p__* BIGINT[] columns, but facet values are URI strings; cross-filter now queries sample_facets
- Main whereClause uses pid subquery against sample_facets for facet filters
- Source filter still queries wide parquet directly (n column is correct)

Supplementary files on data.isamples.org:
- isamples_202601_facet_cross_filter.parquet (6 KB, 526 rows)
- isamples_202601_sample_facets_v2.parquet (21 MB, 6M rows)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix three cross-filter bugs

1. Multi-value within single facet: fast path now requires exactly one value in the active facet, not just one active dimension. Multiple selections (e.g., SESAR+GEOME) correctly fall through to on-the-fly queries.
2. Text search participates in cross-filtering: buildCrossFilterWhere now includes ILIKE conditions. sample_facets_v2 regenerated with label, description, place_name columns (63 MB on R2).
3. Clearing filters restores baseline counts: the update cell now resets all facet-count labels to baseline values and removes zero-count dimming when crossFilteredFacets is null.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Fix count universe inconsistency and blank-value mismatch

Codex review found two bugs:

1. facet_summaries counted all 6.68M records but sample_facets only had the 5.98M with coordinates, so counts jumped when toggling filters. Regenerated all three parquet files from the same base universe (lat IS NOT NULL). SESAR now consistently 4,389,231 across all files.
2. Baseline summaries included blank-string facet values, but on-the-fly queries excluded them with != ''. Regenerated summaries now exclude blanks, matching the on-the-fly behavior.

Also: removed dead getDisplayCounts(), fixed stale 0.3MB comment, added missing quote escaping on source cache lookup.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Add cross-filtering interaction tests

5 new tests in TestExplorerCrossFiltering:
- Baseline SESAR count matches summaries (>4M)
- Clicking source updates material counts (organicmaterial decreases)
- Clearing filter restores baseline counts
- Zero-count items get dimmed (opacity < 1)
- New parquet endpoints (cross_filter, sample_facets_v2) reachable

Cross-filter tests gracefully skip if data attributes not yet deployed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Clean up blank-string facet values in sample_facets

Convert blank strings to NULL with NULLIF in sample_facets_v2 generation (586 blank context rows → NULL). Remove redundant != '' guards from on-the-fly queries since IS NOT NULL now handles both.

Addresses Codex finding #2: blank values in sample_facets caused state mismatch with baseline summaries (which correctly excluded blanks).

Finding #1 (count universe mismatch) was a false positive: Codex cached stale files; live CDN has consistent counts across all three artifacts (SESAR=4,389,231, total=5,980,282).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
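
The cross-filtering rule described in these commits (each facet's counts reflect every active filter except its own) can be sketched as a small SQL-string builder. This is an illustration under stated assumptions, not the repository's `buildCrossFilterWhere`: the function names are hypothetical, the column names `source` and `material` are assumed, and facet names are assumed to come from a fixed allow-list, since they are interpolated as identifiers. Only the `sample_facets` table name, the `IS NOT NULL` guard, and the need for quote escaping come from the commit messages.

```javascript
// Illustrative sketch only: hypothetical helpers, assumed column names.

// Escape a value as a SQL string literal by doubling single quotes.
function sqlQuote(value) {
  return "'" + String(value).replace(/'/g, "''") + "'";
}

// Build a WHERE clause from every active facet EXCEPT the one being counted.
// `filters` maps facet column name -> array of selected values.
function crossFilterWhereSketch(filters, excludeFacet) {
  const clauses = [];
  for (const [facet, values] of Object.entries(filters)) {
    if (facet === excludeFacet || values.length === 0) continue;
    clauses.push(`${facet} IN (${values.map(sqlQuote).join(', ')})`);
  }
  return clauses.length ? 'WHERE ' + clauses.join(' AND ') : '';
}

// One GROUP BY query per facet; NULL facet values are excluded from counts.
function facetCountQuerySketch(filters, facet) {
  const where = crossFilterWhereSketch(filters, facet);
  const notNull = `${facet} IS NOT NULL`;
  const full = where ? `${where} AND ${notNull}` : `WHERE ${notNull}`;
  return `SELECT ${facet} AS value, COUNT(*) AS n FROM sample_facets ${full} GROUP BY ${facet}`;
}
```

Because the facet being counted is left out of its own WHERE clause, selecting a second value in that facet does not zero out its siblings, which is what lets users widen a selection within a single dimension.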
This was referenced Apr 24, 2026
This was referenced May 8, 2026
rdhyee added a commit that referenced this pull request on May 9, 2026
…ase 1) (#186)

* explorer: persist selected cluster identity in URL via &h3=<cell>

Phase 1 of EXPLORER_CLUSTER_URL_PROPOSAL.md (#182). Cluster selection now round-trips through the URL hash, complementing the existing &pid= for samples. Use case: share or bookmark a specific cluster you clicked, and have collaborators land on the same H3 cell with side-panel populated.

Encoding:
  #h3=843f6d3ffffffff

H3 cell index in canonical 15-char hex (no 0x prefix). The cell index encodes its own resolution; no separate &res= or &cluster_source= field needed. The existing &sources= filter (already URL-persisted) covers the only filter that affects cluster aggregation; material/context/object_type filters can't, per the comment at :1706-1710.

Mechanics:
- h3_cell carried into the runtime cluster .id at both add() sites (:992, :1335) as a hex string via row.h3_cell.toString(16). The parquet column is UBIGINT; converting to hex once at ingestion keeps the URL representation canonical.
- _globeState.selectedH3 added; mutated by cluster-click (:923) and cleared by sample-click (:895) for mutual exclusion. Same pattern as selectedPid.
- readHash parses h3 (:626); buildHash emits h3 when set (:645).
- fetchClusterByH3 helper at :1791 looks up the row across res4/res6/res8 parquets via UNION ALL. DuckDB-WASM doesn't accept 0x... literals, so hex is converted to decimal in JS via BigInt and CAST AS UBIGINT in SQL.
- hydrateClusterUI helper at :1827 mirrors the cluster-click side-panel + nearby-samples query, called from both the boot deep-link (:2266) and the back/forward hashchange handler (:1899).
- Mutual-exclusion at hydration time: &pid= wins if both are present, per the proposal §4.

EXPLORER_STATE.md §2 updated with the new h3 row.

Verified locally:
- URL #h3=843f6d3ffffffff (a known res4 cell with 151,334 OpenContext samples in central Turkey) round-trips: side panel shows 'Selected Cluster / OpenContext / H3 res4 / 151,334 samples / 37.6619, 32.8334' with 30 nearby samples loaded.
- Empty hash + hash without h3/pid both load without errors.

Closes Phase 1 of #182. Phase 2 (unified &sel=) deferred unless a third selection type appears.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* explorer: address Codex review of #186

Five fixes from Codex review:

1. Race-safe hash hydration (BLOCKER)
   Both pid and h3 hashchange branches now use a monotonic `viewer._selGen` token, bumped per hashchange and rechecked after every await. Fast back/forward across pid/h3/empty no longer lets stale fetch results repaint the side panel.
2. Strict h3 validation
   Replaced `replace(/[^0-9a-fA-F]/g, '')` with `/^[0-9a-f]{15}$/i.test()` over a lowercased input. Reject malformed input rather than silently strip: `h3=xxx843f...` no longer becomes a different lookup key.
3. Canonical lowercase normalization
   After successful lookup, runtime `selectedH3` is set from the parquet row's `h3_cell.toString(16)` (always lowercase), not the raw URL token. Subsequent `buildHash` writes always emit canonical form regardless of what the user typed. Boot deep-link applies the same normalization.
4. Resolution routing instead of UNION ALL
   Canonical H3 cells encode resolution in the 2nd hex char (after the leading-zero strip). `RES_TO_H3_URL[parseInt(lower[1], 16)]` picks the right parquet directly, so one fetch instead of three on every &h3= load.
5. Mutual-exclusion in buildHash
   Changed independent `if`s for `selectedPid` / `selectedH3` to `else if`, making the runtime invariant load-bearing in one place.

Also: unknown / malformed h3 now actively clears the cluster card and nearby-samples list, matching the empty-hash and missing-pid paths (previously left stale content).

Verified locally:
- Uppercase #h3=843F6D3FFFFFFFF: hydrates, then runtime canonicalizes.
- Unknown well-formed cell #h3=843ffffffffffff: side panel clears, no errors.
- Non-hex #h3=zzz_NOT_HEX_zzz: silent reject, no JS errors.
- Known #h3=843f6d3ffffffff: round-trips identically.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* explorer: thread freshness check into hydrateClusterUI; bump _selGen earlier

Codex's second review found the previous race fix was incomplete: hydrateClusterUI has its own internal `await db.query(...)` for the nearby-samples list, then calls updateSamples(samples). The hashchange-handler-side selGen check happened only AFTER hydrateClusterUI returned, so a stale fetch INSIDE hydrateClusterUI could still repaint the side panel with samples for an older h3 selection.

Fix: hydrateClusterUI now accepts an optional `isStale` predicate and checks it after its inner await, before updateSamples (and before the catch-path's "Query failed" message). The hashchange caller passes `() => selGen !== viewer._selGen`. The cluster-click and boot-deep-link callers leave it undefined; clicks are user-serialized and there's only one boot, so no race possible there.

Also (Codex non-blocking nits):
- Bump `_selGen` at the very top of the hashchange handler, before the lat/lng early return, so even hashchanges that lack lat/lng invalidate any in-flight stale work.
- Reject non-cell H3 modes (`lower[0] !== '8'`) in fetchClusterByH3, a defensive guard against edges/vertices/etc. ever ending up in `&h3=`.

Verified locally: known-good `#h3=843f6d3ffffffff` round-trips identically (151,334 OpenContext samples, 30 nearby rendered).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* explorer: address Codex review v3 (boot race + source-filter consistency)

Two P2 findings from Codex's third pass on PR #186:

1. Boot deep-link could still race with a later hashchange.
   The hashchange listener registers earlier in the same OJS cell, so a slow initial &h3= or &pid= lookup can be superseded by browser back/forward (or a manual hash edit) during the await; the boot path would then finish later and repaint stale data. Apply the same _selGen guard to the boot path: bump the token at boot start, capture bootSelGen, define isBootStale = () => bootSelGen !== viewer._selGen, and check it after every await (pid lookup, wide-parquet description fetch, h3 lookup, and inside hydrateClusterUI via the existing isStale-predicate parameter).
2. fetchClusterByH3 bypassed the active source filter.
   The cluster lookup did `WHERE h3_cell = ?` without sourceFilterSQL, so an &h3= URL whose dominant_source is currently unchecked in ?sources= would still hydrate a cluster card for a dot the user can't see on the globe. Worse, hydrateClusterUI's nearby-samples query DOES apply source filter, producing a mismatched panel: full unfiltered cluster card with a filtered-down samples list. Add sourceFilterSQL('dominant_source') to the lookup; an excluded source now returns null and the side panel stays empty (matching what the globe shows).

Verified locally:
- ?sources=SESAR,GEOME,SMITHSONIAN#h3=843f6d3ffffffff (the cluster's dominant_source OPENCONTEXT is excluded) → side panel stays empty.
- ?sources= default (all checked) #h3=843f6d3ffffffff → hydrates as before with 151,334 OpenContext samples and 30 nearby rows.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* explorer: source-filter invalidates selection; boot finalize in try/finally

Two findings from Codex's fourth review:

1. Source-filter changes don't invalidate selection state.
   When the user unchecked the source for an already-hydrated cluster (or sample), the globe correctly hid the dot but the side panel and `&h3=` / `&pid=` URL stayed stale. Source filter changes also raced against in-flight selection lookups since they didn't bump `_selGen`.
   Fix: in the source-filter change handler (`:1690`), bump `_selGen` immediately, then after the existing globe-data reload, re-validate the current selection under the new filter:
   - Cluster (selectedH3): re-run fetchClusterByH3 (already honors sourceFilterSQL after v3); if returns null, clear selectedH3, cluster card, samples list, and rewrite the URL via replaceState.
   - Sample (selectedPid): probe lite_url with the same source filter; if no match, clear selectedPid + side panel + URL.
   Both branches re-check `_selGen` after the await to bail if a newer filter change has fired.
2. Boot's stale-abort early-returns skipped `_suppressHashWrite = false`.
   A no-lat/lng hashchange during boot's awaits could leave hash writes suppressed forever (the lat/lng path clears it via _suppressTimer; a stale-aborted boot leaves it set with no later cleanup).
   Fix: wrap the boot deep-link block in try/finally; move the `_suppressHashWrite = false` assignment into the finally so it runs on every path, including stale-abort early returns.

Verified locally:
- Load #h3=843f6d3ffffffff (OpenContext cluster); side panel hydrates.
- Uncheck OPENCONTEXT in the source filter → `&h3=` drops from URL, cluster card returns to empty state, samples list clears, ?sources= written with the remaining 3 sources. Globe also re-renders without OpenContext clusters.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* explorer: fix UBIGINT precision-loss in h3_cell + rehydrate cluster on filter

Two issues from Codex's fifth review:

1. (P2 NEW) Selected cluster surviving the filter wasn't being rehydrated.
   When the user toggled a non-cluster source (e.g. unchecked SESAR while the selected cluster's dominant_source = OPENCONTEXT), the cluster stayed in URL but the nearby-samples list could now show stale rows from unchecked sources or miss newly-checked ones (hydrateClusterUI's nearby query uses sourceFilterSQL('source')).
   Fix: in the source-filter handler's revalidate branch, when meta is truthy (cluster still valid), call hydrateClusterUI(meta, isStale) to refresh the side panel under the new filter, not just leave it.
2. (UBIGINT precision regression, surfaced by testing #1)
   DuckDB-WASM returns h3_cell (UBIGINT > 2^53) as a JS Number, which loses precision on .toString(16). Boot worked because the SQL WHERE matched at the parquet level, but `selectedH3 = meta.h3_cell` (lossy roundtrip) stored a corrupted hex; subsequent revalidations against the corrupted key would never match and the panel would clear. The bug was latent in PR-as-of-ebd7978; the rehydrate branch above made it visible.
   Fix: SQL SELECT now CASTs h3_cell to VARCHAR (decimal string), and JS converts to hex via BigInt(decString).toString(16), with no precision loss. Applied at the two cluster-render sites (phase1, loadRes). fetchClusterByH3's return now uses the validated input `lower` as the canonical hex so the helper is also lossless.
   `to_hex()` in DuckDB-WASM doesn't exist (tried first, errored "Catalog Error: Scalar Function with name to_hex does not exist!"); the VARCHAR cast + JS BigInt is portable across versions.

Verified locally:
- Boot at #h3=843f6d3ffffffff hydrates correctly.
- Uncheck SESAR (OPENCONTEXT survives): cluster card unchanged, samples re-rendered with only OpenContext rows, &h3= preserved.
- Uncheck OPENCONTEXT (cluster's own source): card + samples cleared, &h3= dropped from URL.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* docs: sync EXPLORER_STATE.md h3 row to current implementation

Codex's sixth review only finding (P3, non-runtime): the EXPLORER_STATE.md description still reflected the original v1 implementation:
- "regex `[^0-9a-fA-F]` strip" → now strict `/^[0-9a-f]{15}$/i` reject-not-strip
- "UNION ALL across all 3 parquets" → now resolution-routed via RES_TO_H3_URL
- Missing: cell-mode guard (`lower[0] === '8'`)
- Missing: source filter applied (sourceFilterSQL('dominant_source'))
- Missing: UBIGINT precision-loss workaround (CAST AS VARCHAR + BigInt)
- Missing: source-filter change re-validation
- Missing: _selGen race guard

Updated the h3 row to describe current behavior so future URL-state work finds accurate docs.

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
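
The h3 handling described across these commits (strict 15-character validation, the cell-mode guard, resolution read from the second hex character, and lossless hex/decimal conversion through `BigInt`) can be sketched together. This is a minimal illustration, assuming only what the commit messages state; `parseH3Sketch` and `decimalToHexSketch` are hypothetical names, not the repository's `fetchClusterByH3`.

```javascript
// Illustrative sketch only: hypothetical helpers based on the commit text.

// Validate an &h3= token and derive the values a lookup would need.
// Returns null for malformed input instead of stripping bad characters.
function parseH3Sketch(raw) {
  const lower = String(raw).toLowerCase();
  if (!/^[0-9a-f]{15}$/.test(lower)) return null; // strict: exactly 15 hex chars
  if (lower[0] !== '8') return null;              // cell-mode indexes only
  return {
    hex: lower,                                   // canonical lowercase form
    resolution: parseInt(lower[1], 16),           // resolution lives in the 2nd char
    decimal: BigInt('0x' + lower).toString(10),   // decimal string for CAST AS UBIGINT
  };
}

// Convert a decimal string (e.g. from CAST(h3_cell AS VARCHAR)) back to hex.
function decimalToHexSketch(decString) {
  return BigInt(decString).toString(16);
}

// Why BigInt matters: H3 indexes exceed 2^53, so going through a JS Number
// rounds the value and yields a different hex string.
function lossyHexSketch(hex) {
  return Number(BigInt('0x' + hex)).toString(16);
}
```

Passing the decimal string to SQL and converting back with `decimalToHexSketch` round-trips exactly, whereas `lossyHexSketch` shows the kind of corruption the fifth review surfaced when the value passed through a JS Number.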
rdhyee added a commit that referenced this pull request on May 13, 2026
Codex flagged three blockers on the cumulative M-1A + M-1B diff:

1. Mobile overlay collided with the Cesium toolbar. At max-width: 900px the overlay was switching to left: 8px, covering the toolbar column at left: 5px. Removed the mobile left-shift; keep the overlay offset at left: 50px so the vertical toolbar remains hit-targetable. Width adjusts to calc(100% - 58px) instead.
2. Base-layer picker dropdown could be occluded by the overlay (dropdown opens at left: 36px; overlay starts at left: 50px). Bump .cesium-baseLayerPicker-dropDown z-index to 1100 so it wins the z-stack against the overlay's 1000.
3. No automated coverage for overlay-vs-toolbar collision. Add tests/playwright/explorer-map-overlay.spec.js with four specs: desktop overlap, mobile overlap, dropdown z-index ordering, and a smoke test that the sidebar↔in-map input mirror is bidirectional.

Also picked up the small IME concern from Codex ask #1: gate both Enter handlers (in-map and sidebar) on `!e.isComposing && e.keyCode !== 229` so IMEs that emit Enter to commit a candidate don't trigger a search on the pre-commit value.

Non-blocker items (URL-clears-search edge case, aria-label on #sampleSearch, aria-describedby for hints) deferred, to be picked up in the EXPLORER_STATE.md / a11y commit at the end of the PR.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
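
The IME gate mentioned in this commit can be expressed as a small predicate shared by both Enter handlers. This is a sketch under the condition quoted in the commit message; the function name `isCommittedEnterSketch` and the `runSearch` callback in the usage comment are hypothetical.

```javascript
// Illustrative sketch only: hypothetical helper built on the quoted condition.

// True only for an Enter keydown that is not part of IME composition.
// Browsers report keyCode 229 for keydown events that an IME is processing,
// and isComposing is true between compositionstart and compositionend.
function isCommittedEnterSketch(e) {
  return e.key === 'Enter' && !e.isComposing && e.keyCode !== 229;
}

// Usage in a browser (not executed here):
//   input.addEventListener('keydown', (e) => {
//     if (isCommittedEnterSketch(e)) runSearch(input.value);
//   });
```

Checking both `isComposing` and `keyCode` is deliberately redundant, since the two signals are not reported identically by every browser and input method.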