Plan: unify Search Explorer and Interactive Explorer into one page #156

@rdhyee

Context

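iSamples currently hosts two interactive map pages that have been converging for months: the Search Explorer (tutorials/isamples_explorer.qmd) and the Interactive Explorer (tutorials/progressive_globe.qmd).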

Commit bd0a094 (April 30) explicitly "rewrote the explorer on the progressive_globe foundation for speed + added results table." The two files already share assets/js/source-palette.js, Cesium 1.127, ION token, and preload hints. The globe page already links to the explorer in See Also. Convergence is in motion — this issue tracks completing it.

Direction: a single unified page at the new canonical URL /explorer.html (top-level). Implementation is built in tutorials/progressive_globe.qmd during Phases 1–4, then renamed to explorer.qmd at site root in Phase 5 with redirects from both old URLs. The DOM-first architecture is the destination model; Explorer's facet UI, cross-filter counts, table view, URL params, and SKOS labels port into it as plain-JS DOM handlers (no OJS reactive cells added).

Why now: the grant ends in July 2026, and the May website-cleanup deadline falls ahead of the June 2 keynote and the June workshop. Mandate: don't start new infrastructure; polish what exists. Two pages doing the same job is technical debt blocking the cleanup.

Plan v3 — further hardened after second Codex review. Five implementation contracts now explicit in the relevant phases: (a) facet data contract standardized on v2 parquet with URI-valued checkboxes (Phase 1); (b) portable predicate builder using EXISTS or pid IN (...) instead of alias-dependent fragments (Phase 1, reused in Phases 2 and 4); (c) all filter dimensions named in URL params with comma-list pattern (Phase 3); (d) preview-safe redirects using document-relative URLs (Phase 5); (e) asset path adjustment after moving to root (Phase 5).

Plan v2 changes (preserved): canonical URL decided up front; cluster-mode filter honesty merged into Phase 1; Phase 4 table mode scope-narrowed to Globe/Table toggle.

Phased PR Plan

Five PRs, each smoke-tested independently. Phases 1–4 modify tutorials/progressive_globe.qmd only. Phase 5 handles the rename, redirects, navbar, and tests.

Phase 1 — Specimen Type filter + SKOS prefLabels + honest cluster-mode UX + portable predicate refactor (medium-large, ~7 hr)

Facet data contract (new in v3): standardize on isamples_202601_sample_facets_v2.parquet (URI strings), which is the explorer's facets file, not the globe's older short-label file. Update the globe's samples view registration accordingly. Checkbox value attributes store full URIs; SKOS / prettyLabel() is display-only. If the checkboxes stored short labels instead, prettyLabel() would have nothing to look up.

Portable predicate refactor (new in v3): refactor facetFilterSQL() from emitting alias-dependent fragments (AND f.material IN (...)) to a portable predicate using EXISTS (SELECT 1 FROM sample_facets f WHERE f.pid = l.pid AND f.material IN (...)) or equivalently pid IN (SELECT DISTINCT pid FROM sample_facets WHERE material IN (...)). Avoids alias mismatch and duplicate rows from JOINs against multi-valued facets. Required for Phase 4 table mode but ship in Phase 1 to avoid backward refactoring. Smoke-test that the existing JOIN-based query in progressive_globe.qmd:855-878 still produces correct counts after the refactor.
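
The refactor described above can be sketched roughly as follows. This is a minimal illustration, not the existing facetFilterSQL(): the sample_facets table, pid, and material names come from this issue, while the context and object_type column names, the sqlLiteral() helper, and the pidExpr parameter are assumptions made for the example.

```javascript
// Sketch only. Column names other than `material` are assumed.
const FACET_COLUMNS = ["material", "context", "object_type"];

/** Quote a value as a SQL string literal by doubling embedded single quotes. */
function sqlLiteral(value) {
  return `'${String(value).replace(/'/g, "''")}'`;
}

/**
 * Build one EXISTS predicate per active facet dimension.
 * `pidExpr` is whatever the outer query calls its pid ("l.pid", "pid", ...),
 * so the fragment no longer requires the outer query to define an `f` alias.
 */
function facetFilterSQL(selected, pidExpr = "pid") {
  const clauses = [];
  for (const [column, values] of Object.entries(selected)) {
    // Whitelist column names so only known identifiers reach the SQL text.
    if (!FACET_COLUMNS.includes(column)) {
      throw new Error(`Unknown facet column: ${column}`);
    }
    if (!values || values.length === 0) continue;
    const list = values.map(sqlLiteral).join(", ");
    clauses.push(
      `EXISTS (SELECT 1 FROM sample_facets f WHERE f.pid = ${pidExpr} AND f.${column} IN (${list}))`
    );
  }
  return clauses.length ? ` AND ${clauses.join(" AND ")}` : "";
}
```

Because each dimension is tested with EXISTS rather than joined, a sample carrying two matching material URIs still yields a single outer row.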

Specimen Type filter + SKOS labels:

  • Preload vocab_labels.parquet (~60 KB) in YAML include-in-header.
  • Add collapsible "Specimen Type" panel (#objectTypeFilter) in the side panel, mirroring the Material/Feature pattern at progressive_globe.qmd:155-187.
  • Extend facetFilters OJS cell at progressive_globe.qmd:662-707 to also pull object_type rows from facet_summaries_url and populate #objectTypeFilterBody.
  • Port prettyLabel(uri) from isamples_explorer.qmd:486-495 (pure JS, no reactivity) — apply when rendering material/feature/object_type checkboxes so URIs display as human labels.
  • Extend getCheckedValues() and the refactored facetFilterSQL() (progressive_globe.qmd:256-275) to include object_type.
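
As an illustration of keeping URIs as checkbox values while showing human labels, here is a hedged sketch. It is not the prettyLabel() from isamples_explorer.qmd: the Map of URI to prefLabel, the makePrettyLabel() factory, the checkboxModel() helper, and the last-segment fallback are all assumptions for the example.

```javascript
// Sketch only. Assumes vocab_labels.parquet rows have been loaded into a
// Map of uri -> prefLabel; the real loader and fallback may differ.
function makePrettyLabel(labelsByUri) {
  return function prettyLabel(uri) {
    if (!uri) return "";
    const known = labelsByUri.get(uri);
    if (known) return known;
    // Assumed fallback: show the last path or fragment segment of the URI.
    const tail = String(uri).split(/[\/#]/).filter(Boolean).pop();
    return tail ?? String(uri);
  };
}

/** The checkbox keeps the full URI as its value; the label is display-only. */
function checkboxModel(uri, count, prettyLabel) {
  return { value: uri, label: prettyLabel(uri), count };
}
```

With this split, the SQL predicate always receives the URI from the value attribute, and a missing label degrades to a readable tail rather than breaking the filter.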

Cluster-mode honesty (ships in same PR): the H3 summary parquets only carry dominant_source, so material/feature/specimen filters cannot apply at cluster zoom. When any non-source facet filter is active and mode is cluster, show a persistent status note: "These filters apply at neighborhood zoom — zoom in or click a cluster to see filtered samples." If the camera is at res4/res6, auto-enter res8 on filter change to minimize the gap. At point mode (<120 km), the existing JOIN handles filters correctly. Document the limitation in a code comment; revisit if/when DuckDB-WASM gains H3 extension support.
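
The decision logic in this paragraph could look roughly like the sketch below. The function name clusterFilterPlan(), the shape of its argument, and the representation of H3 resolutions as the numbers 4, 6, and 8 are assumptions for illustration; only the rule itself comes from the plan.

```javascript
// Sketch only. Facet keys and the numeric resolution encoding are assumed.
const NON_SOURCE_FACETS = ["material", "context", "object_type"];

/**
 * Decide whether to show the cluster-mode status note and which H3
 * resolution to move to after a filter change.
 */
function clusterFilterPlan({ mode, resolution, selected }) {
  const hasFacetFilter = NON_SOURCE_FACETS.some(
    (facet) => (selected[facet] ?? []).length > 0
  );
  if (mode !== "cluster" || !hasFacetFilter) {
    // Point mode applies filters in SQL; source-only filters work on clusters.
    return { showNote: false, targetResolution: resolution };
  }
  // At res4/res6, jump to res8 to shorten the gap before point mode.
  const targetResolution = resolution < 8 ? 8 : resolution;
  return { showNote: true, targetResolution };
}
```

Keeping the rule in one pure function makes the facet honesty regression test straightforward to express without a Cesium camera.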

Phase 2 — Cross-filtered live counts (medium, ~6 hr)

Port the explorer's cross-filter machinery to plain-JS, using the portable predicate from Phase 1:

  • Add cross_filter_url constant (already-defined cache parquet, 6 KB).
  • Copy buildCrossFilterWhere(excludeFacet) from isamples_explorer.qmd:500-548: strip OJS reactive references (searchInput?.trim() → document.getElementById('sampleSearch').value.trim()) and adapt it to call the Phase 1 portable predicate builder rather than emitting alias-dependent fragments.
  • Copy crossFilteredFacets cell logic from isamples_explorer.qmd:565-652 as an async function updateCrossFilteredCounts() triggered from each filter change listener.
  • Add data-facet and data-value attributes to the count <span> elements so updates are in-place mutations (no re-render).
  • Use the globe's existing db.query() API (DuckDBClient.of, progressive_globe.qmd:438) — not the explorer's manual runQuery().
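
The core idea of the cross-filter, counting each facet under every filter except its own, can be sketched as below. This is not the explorer's buildCrossFilterWhere(): passing the selection as a second argument, the pidExpr parameter, and the assumption that source is queryable as a column of sample_facets are all hypothetical.

```javascript
// Sketch only. Facet/column names and the `selected` argument are assumed.
const FACETS = ["source", "material", "context", "object_type"];

function sqlLiteral(value) {
  return `'${String(value).replace(/'/g, "''")}'`;
}

/** Portable predicate in the `pid IN (SELECT DISTINCT ...)` form. */
function facetPredicate(column, values, pidExpr) {
  const list = values.map(sqlLiteral).join(", ");
  return `${pidExpr} IN (SELECT DISTINCT pid FROM sample_facets WHERE ${column} IN (${list}))`;
}

/**
 * WHERE clause for counting `excludeFacet`: apply every other active
 * filter, but not the facet's own selection, so its options stay visible.
 */
function buildCrossFilterWhere(excludeFacet, selected, pidExpr = "pid") {
  const parts = FACETS
    .filter((facet) => facet !== excludeFacet)
    .filter((facet) => (selected[facet] ?? []).length > 0)
    .map((facet) => facetPredicate(facet, selected[facet], pidExpr));
  return parts.length ? `WHERE ${parts.join(" AND ")}` : "";
}
```

An updateCrossFilteredCounts() function would call this once per facet and write each result into the matching span by its data-facet and data-value attributes.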

Phase 3 — URL query params + multi-term search (small-medium, ~5 hr)

Add readQueryParams() and writeQueryParams() alongside the existing readHash()/buildHash() at progressive_globe.qmd:277. Reconcile URL state model:

  • Hash (#v=1&lat=&lng=&alt=&pid=) — camera + selected sample (already working, unchanged).
  • Query params — all bookmarkable filter dimensions (updated in v3 — full list):
    • q= — search query
    • sources=A,B,C — source filter (comma-list, matching existing sources=)
    • material=A,B,C — material URIs (comma-list, URI-encoded)
    • context=A,B,C — sampled feature URIs (comma-list, URI-encoded)
    • object_type=A,B,C — specimen type URIs (comma-list, URI-encoded)
    • maxSamples=N — table row cap
    • view=globe|table
    • perf=1 — opt-in performance panel
  • Both hash and query params coexist. Share URL example: /explorer.html?q=basalt&sources=SESAR&material=https%3A%2F%2Fw3id.org%2Fisample%2Fcontrolledvocabulary%2Fmaterialtype%2Frockorsediment#v=1&lat=37.5&lng=-122&alt=200000.

On load: hydrate #sampleSearch, all filter checkboxes, view mode, and maxSamples from query params. On every filter change and search submit: call writeQueryParams() via history.replaceState.
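
A minimal sketch of the query-param round trip, using the standard URLSearchParams API. The parameter names come from the list above; taking the search string and state object as arguments (rather than reading location and the DOM directly) and the default values are assumptions made so the example runs outside a browser.

```javascript
// Sketch only. State shape and defaults are assumed for illustration.
const LIST_PARAMS = ["sources", "material", "context", "object_type"];

/** Parse a location.search string into a plain filter-state object. */
function readQueryParams(search) {
  const params = new URLSearchParams(search);
  const state = {
    q: params.get("q") ?? "",
    view: params.get("view") === "table" ? "table" : "globe",
    perf: params.get("perf") === "1",
  };
  for (const key of LIST_PARAMS) {
    const raw = params.get(key); // already percent-decoded
    state[key] = raw ? raw.split(",").filter(Boolean) : [];
  }
  const max = Number.parseInt(params.get("maxSamples") ?? "", 10);
  state.maxSamples = Number.isFinite(max) ? max : null;
  return state;
}

/** Serialize filter state back to a query string ("" when all defaults). */
function writeQueryParams(state) {
  const params = new URLSearchParams();
  if (state.q) params.set("q", state.q);
  for (const key of LIST_PARAMS) {
    if (state[key]?.length) params.set(key, state[key].join(","));
  }
  if (state.maxSamples != null) params.set("maxSamples", String(state.maxSamples));
  if (state.view === "table") params.set("view", "table");
  if (state.perf) params.set("perf", "1");
  const qs = params.toString(); // percent-encodes URIs and commas
  return qs ? `?${qs}` : "";
}
```

In the page, the result would be applied with something like history.replaceState(null, "", location.pathname + writeQueryParams(state) + location.hash), which leaves the camera hash untouched. One caveat of the comma-list pattern: a URI that itself contains a comma would be split on read.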

Fold in the explorer's multi-term search + FTS relevance ranking from #95 in this phase since the search wiring already changes here. Search state becomes bookmarkable on day one of the new search behavior.

Phase 4 — Table view (medium, ~4 hr — scope-narrowed)

Add a binary view toggle (Globe / Table only — drop the explorer's three-way Globe/List/Table) above .globe-layout:

<div id="viewToggle">
  <button data-view="globe" class="active">Globe</button>
  <button data-view="table">Table</button>
</div>

When view === 'table': hide .globe-layout with display:none (do not destroy the Cesium viewer, just hide it) and show #tableContainer. Reuse the Phase 1 portable predicate for the WHERE clause: no separate query builder, no alias mismatch risk, no duplicate rows. Render a paginated HTML <table> (default page size 100, configurable up to 1K). No upfront row dump; pagination keeps memory bounded.

The maxSamples slider applies only to the table mode's hard cap (1K–100K, default 25K); globe stays at its 5K viewport budget. If table parity becomes too large or risky, ship Phases 1–3 and 5 first and defer table mode to a follow-up issue.
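
The interaction between page size and the maxSamples cap can be sketched as a small pure function. The limits (page size up to 1K, cap 1K to 100K with a 25K default) are taken from this phase; the names tablePage() and pageClause() and the clamping behavior for out-of-range pages are assumptions.

```javascript
// Sketch only. Function names and clamping choices are assumed.
const clamp = (n, lo, hi) => Math.min(hi, Math.max(lo, n));

/**
 * Compute the LIMIT/OFFSET window for one table page, never reading
 * past the maxSamples hard cap.
 */
function tablePage({ page, pageSize = 100, maxSamples = 25000, totalRows }) {
  const size = clamp(pageSize, 1, 1000);
  const cap = clamp(maxSamples, 1000, 100000);
  const visibleRows = Math.min(totalRows, cap);
  const pageCount = Math.max(1, Math.ceil(visibleRows / size));
  const current = clamp(page, 1, pageCount);
  const offset = (current - 1) * size;
  const limit = Math.max(0, Math.min(size, visibleRows - offset));
  return { page: current, pageCount, limit, offset };
}

/** SQL suffix appended after the portable-predicate WHERE clause. */
function pageClause({ limit, offset }) {
  return `LIMIT ${limit} OFFSET ${offset}`;
}
```

Only one page of rows is requested per query, which is what keeps memory bounded regardless of how many samples match.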

Test: /explorer.html?q=basalt&sources=SESAR&view=table should land on a pre-filtered, paginated table.

Phase 5 — Rename, redirects, navbar, tests (small-medium, ~3 hr)

Files: new explorer.qmd at site root, tutorials/progressive_globe.qmd (→ redirect stub), tutorials/isamples_explorer.qmd (→ redirect stub), _quarto.yml, how-to-use.qmd, tutorials/index.qmd, index.qmd, tests/test_explorer.py → tests/test_globe.py migration.

  1. Rename: move the unified content from tutorials/progressive_globe.qmd to explorer.qmd at site root. Output is /explorer.html.

  2. Asset path fix (new in v3): after the move, change the source palette import from ../assets/js/source-palette.js to assets/js/source-palette.js. The current .. accidentally still works on production (browsers swallow .. at root) but breaks GitHub Pages PR previews whose base path is /isamplesorg.github.io/. Audit the unified file for any other ../ paths and resolve them similarly.

  3. Two preview-safe redirect stubs (updated in v3) — one per old URL — each passes location.search + location.hash through using a document-relative URL so previews work:

    location.replace(new URL(`../explorer.html${location.search}${location.hash}`, location.href).href);

    Absolute /explorer.html would break GitHub Pages PR previews at username.github.io/isamplesorg.github.io/.... Keep both files in _quarto.yml so Quarto continues to build them at their public URLs. This preserves all inbound deep links from search engines, shared URLs, and external sites.

  4. _quarto.yml: navbar Interactive Explorer href changes from tutorials/progressive_globe.qmd → explorer.qmd. Remove the Search Explorer sidebar entries at lines 21-22 and 68-69.

  5. Update internal links: how-to-use.qmd:39, tutorials/index.qmd:12, and any reference in index.qmd to point at explorer.html.

  6. Migrate Playwright tests: rename tests/test_explorer.py → tests/test_globe.py (or test_explorer_v2.py) targeting /explorer.html. Unskip the cross-filter tests deferred in #155 (explorer: dynamic cross-filter facet counts); native checkboxes respond to programmatic .click(), unlike OJS Inputs.checkbox.

Phase 5 ships last so both pages remain live and independently functional through the migration — any regression up to this point is a single-PR revert away.
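
To make the path behavior in steps 2 and 3 concrete, the sketch below uses the standard URL constructor to resolve the redirect target and the palette import against both a production and a preview page address. The helper names redirectTarget() and resolveAsset() are hypothetical, and the preview URLs shown in the usage note are illustrative stand-ins for a real PR preview.

```javascript
// Sketch only. Helper names are assumed; resolution uses the standard URL API.

/** Where a redirect stub under tutorials/ would send the browser. */
function redirectTarget(href) {
  const here = new URL(href);
  // Document-relative: climbs out of tutorials/ whatever the base path is.
  return new URL(`../explorer.html${here.search}${here.hash}`, here.href).href;
}

/** How a relative asset path resolves from a given page URL. */
function resolveAsset(relativePath, pageHref) {
  return new URL(relativePath, pageHref).href;
}
```

For example, redirectTarget("https://username.github.io/isamplesorg.github.io/tutorials/isamples_explorer.html?q=basalt") stays inside the /isamplesorg.github.io/ base path, whereas an absolute /explorer.html would not. Likewise, resolving ../assets/js/source-palette.js from a root-level explorer.html lands on /assets/... in production (the extra .. is clamped at the root) but escapes the base path on a preview, while assets/js/source-palette.js resolves correctly in both.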

Critical Files

  • tutorials/progressive_globe.qmd — host page during Phases 1–4; renamed to explorer.qmd in Phase 5
  • tutorials/isamples_explorer.qmd — source of facet UI, cross-filter, table; reduced to redirect in Phase 5
  • _quarto.yml — navbar updates in Phase 5
  • how-to-use.qmd:39, tutorials/index.qmd:12, index.qmd — link updates in Phase 5
  • tests/test_explorer.py — migrate in Phase 5
  • tests/playwright/cesium-queries.spec.js — extend with new selectors per phase

Reused functions / patterns:

  • getCheckedValues(elementId), sourceFilterSQL(), facetFilterSQL() — already in globe at lines 234–275; refactor facetFilterSQL() to portable predicate in Phase 1 and extend to handle object_type.
  • buildCrossFilterWhere(), crossFilteredFacets, prettyLabel() — port from explorer lines 500–652, 486–495.
  • DuckDBClient.of() db.query() — globe's existing pattern at line 438.
  • Hash read/write — readHash(), buildHash() at globe lines 211–252; mirror for query params.
  • Source palette — already centralized in assets/js/source-palette.js.

Verification

Per-phase smoke test (Playwright): render with quarto render, run smoke test on the built HTML, visual check, fix-and-repeat, then commit + PR.

Phase-specific Playwright tests to add to tests/playwright/cesium-queries.spec.js:

  • Phase 1:
    • #objectTypeFilterBody input[type=checkbox] count > 0 after 10s; material labels are human-readable (not URIs).
    • Facet honesty regression test (per Codex): selecting material/context/specimen at high altitude shows the explanatory status note; selecting the same filter at point zoom (<120 km) constrains the sample query.
    • Portable predicate regression test (new in v3): a sample in the lite parquet with two material URIs appears exactly once in cluster-zoom counts and point-zoom rendering, not duplicated.
  • Phase 2: span.facet-count[data-facet='source'] exists; SESAR count > 4M; selecting SESAR drops other-source counts.
  • Phase 3: navigate to ?q=basalt&sources=SESAR&material=<uri> → search input, source checkbox, and material checkbox all hydrate correctly.
  • Phase 4: ?view=table → table visible, globe hidden; pagination visible; toggling back to globe re-renders points.
  • Phase 5:
    • tutorials/progressive_globe.html?q=basalt&sources=SESAR and tutorials/isamples_explorer.html?q=basalt&sources=SESAR both redirect to /explorer.html with params intact.
    • Preview-safe redirect test: redirect works on a GitHub Pages preview URL with non-root base path, not just on isamples.org.
    • Asset path test: source palette loads correctly from /explorer.html on both production and preview deploys.

Manual browser verification per phase: cross-filter latency under 5s; mobile 900 px breakpoint collapses cleanly; hash deep-link round-trip via incognito; ?perf=1 panel works.

Rollback: each phase is one PR. Reverting Phase N leaves earlier phases intact. Until Phase 5 ships, both pages remain live and independently functional.

Resolved Decisions

  1. Canonical URL: /explorer.html (top-level). Matches the page's hero navbar position; "progressive_globe" describes an implementation, not a user task.
  2. Single page: yes — no two-page fallback.
  3. Branch / PR strategy: each PR branched from main (easier review, smaller blast radius per merge).
  4. Facet data contract (v3): standardize on sample_facets_v2.parquet with URI-valued checkboxes from Phase 1.
  5. Predicate shape (v3): portable EXISTS / pid IN (...) predicate built in Phase 1, reused in Phases 2 and 4.

Plan prepared collaboratively with Claude Code, hardened across two rounds of Codex review. Not yet implemented — filing for visibility and future execution.
