diff --git a/COLLECTION_FACET_CODEX_PROMPT.md b/COLLECTION_FACET_CODEX_PROMPT.md
new file mode 100644
index 00000000..43fc9581
--- /dev/null
+++ b/COLLECTION_FACET_CODEX_PROMPT.md
@@ -0,0 +1,113 @@
+# Codex prompt — Option A: first-class `collection` facet in the iSamples explorer
+
+> Paste the block below into Codex (run from `~/C/src/iSamples/isamplesorg.github.io`,
+> which has `AGENTS.md`; the repo root has `.codex/config.toml` with Playwright MCP).
+> Tracks issue isamplesorg/isamplesorg.github.io#243. Plan-first with a sign-off gate.
+
+---
+
+```
+GOAL
+Add a first-class "collection" dimension to the iSamples interactive explorer
+(explorer.qmd) so users can filter samples to a named collection — e.g. the
+OpenContext project "PKAP Survey Area" — and layer the existing material /
+context / object_type facets on top. Full background, data analysis, and the
+two-phase plan are in issue #243.
+
+DO THIS IN TWO STAGES. Stage 1: produce a written implementation plan and STOP
+for my sign-off. Stage 2 (only after I approve): implement.
+
+=== KEY DESIGN FACTS (already verified — do not re-derive) ===
+- A "collection" is the `label` of a SamplingSite entity. It is NOT on the
+  MaterialSampleRecord rows; it is reached by traversal:
+    MaterialSampleRecord.p__produced_by[1] -> SamplingEvent
+    SamplingEvent.p__sampling_site[1]       -> SamplingSite.label
+  (All within the wide parquet; `otype` column distinguishes entity types.)
+- Cardinality: ~60,268 distinct SamplingSite labels; only ~1.63M of 6.35M
+  samples have a site (sparse facet, mostly OpenContext). PKAP = 15,446 samples.
+- Doing this traversal LIVE in DuckDB-WASM per interaction is NOT viable (it is
+  the array-join pattern profiled as the in-browser bottleneck). MUST precompute.
+- Data is served from https://data.isamples.org/ (Cloudflare Worker -> R2).
+  NEVER reference raw pub-*.r2.dev URLs.
+
+=== HOW FACETS WORK TODAY (anchors in explorer.qmd) ===
+- Parquet URL constants: R2_BASE (:683), wide_url=/current/wide.parquet (:690),
+  facets_url=…sample_facets_v2.parquet (:692), facet_summaries_url (:693),
+  cross_filter_url (:695), vocab_labels_url (:698), lite_url (:687),
+  h3_res{4,6,8}_url (:684-686).
+- The facet filter predicate (:942):
+    AND pid IN (SELECT DISTINCT pid FROM read_parquet('${facets_url}')
+                WHERE <conds>)
+  i.e. per-sample facet values live in sample_facets_v2.parquet, keyed by pid.
+- Facet checkbox lists + counts are rendered by renderFilter(...) (:~1792) from
+  facet_summaries (value -> count); cross-filtered counts use facet_cross_filter.
+- material/context/object_type values are vocabulary URIs labeled via
+  vocab_labels.parquet. NOTE: a collection's "value" is a SamplingSite identity
+  (site_id) labeled from the NEW collections dimension below — NOT a vocab URI.
+- URL/state contract is normative in EXPLORER_STATE.md. The five query params
+  today are search, sources, material, context, object_type (+ search_scope).
+  A new `collection` param must follow the SAME lifecycle as `material`:
+  applyQueryToFacetFilters (hydrate), handleFacetFilterChange ->
+  writeQueryState() (write-back), cross-filter count recompute, param removed
+  when empty. Honor the Quarto `?q=` collision note (use `collection`, not `q`).
+- Cluster-mode honesty: H3 summary parquets only carry dominant_source, so
+  material/context/object_type filters do NOT affect zoomed-out clusters (the
+  #facetNote). A `collection` facet inherits this unless collection is also
+  added to the H3 summaries — call this out; do not silently break the note.
+
+=== STEP 0 (do first, report findings) ===
+Locate the build pipeline that PRODUCES the supplementary parquets
+(sample_facets_v2, samples_map_lite, h3_summary_res{4,6,8}, facet_summaries,
+facet_cross_filter) and uploads them to R2. They are NOT in this repo's
+scripts/. Search the sibling repos and data dirs:
+  ~/C/src/iSamples/{isamples-python,pqg,isamplesorg.github.io-duckdb-spike}
+  ~/Data/iSample/  (esp. pqg_refining/)
+and any notebooks. Also read workers/data-isamples-org/README.md for the R2
+serving/versioning layer. Report exactly how each file is built and uploaded,
+or state that a build path must be created from scratch.
+
+=== STAGE 1 DELIVERABLE: a written plan covering ===
+1. Build: a new script (e.g. scripts/build_collections.py) that, from
+   /current/wide.parquet, computes per-sample (pid -> site_id, site_label) via
+   the traversal, and emits:
+     a) collections.parquet  — dimension, ~60K rows:
+        site_id, label, source, n_samples, centroid_lat, centroid_lng,
+        bbox(min/max lat/lng). Powers the "search the long tail" half of the UX
+        and the Featured-Collections presets (collections.qmd).
+     b) an added `site_id` (+ maybe site_label) column on sample_facets_v2
+        (regenerate as v3 if v2's builder is unavailable; keep pid as the key so
+        the :942 predicate extends with one more AND condition).
+     c) collection rows in facet_summaries (site_id -> count) so the checkbox
+        list + counts render via the existing machinery. Decide whether to add
+        collection to facet_cross_filter now or defer (note the consequence).
+   Define a stable site_id (hash of label, or the SamplingSite pid). Specify
+   versioned filenames + the /current alias, consistent with existing files.
+2. Explorer wiring (explorer.qmd), mirroring `material` exactly:
+   - new collection facet container + a `?collection=` URL param on the
+     EXPLORER_STATE.md lifecycle.
+   - DUAL UX (my decision): top-N collections (>= a sample-count threshold) as
+     checkboxes reusing renderFilter; PLUS a type-to-search input over
+     collections.parquet for the long tail (60K). Specify how a search-selected
+     collection becomes an active filter value alongside the checkboxes.
+   - extend the :942 predicate (or facets subquery) with the collection
+     condition; ensure cross-filter counts and #facetNote stay correct.
+3. data.qmd + collections.qmd updates: document collections.parquet; once the
+   facet exists, upgrade the Featured-Collections preset links from
+   geographic-only to a real &collection=<site_id> filter.
+4. Test plan: extend tests/ (pytest + Playwright). At minimum a Playwright check
+   that ?collection=<PKAP site_id> yields the PKAP sample set and that layering
+   ?material=… narrows it; reproducible DuckDB snippets for the counts.
+5. Risks / migration: snapshot-version coupling (site_id stability across
+   rebuilds), the sparse-facet UX for non-collection sources, cluster-mode
+   honesty, and file-size deltas.
+
+=== CONSTRAINTS ===
+- Read AGENTS.md, ../CLAUDE.md, EXPLORER_STATE.md before planning.
+- explorer.qmd is ~3,500 lines of working OJS/JS — make INCREMENTAL, additive
+  changes mirroring existing facet code; do not refactor working paths.
+- Quarto OJS gotcha: cells use `name = value`, NOT top-level const/let/var.
+- Static site, no hot reload: note where `quarto preview` + browser refresh is
+  needed to verify.
+- Verify against https://data.isamples.org/ only; never raw pub-*.r2.dev.
+- STOP after the Stage 1 plan and wait for my approval before writing code.
+```
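The two-hop traversal named under KEY DESIGN FACTS can be sketched in plain Python. This is a minimal illustration under assumptions, not the build pipeline: the rows are made-up stand-ins for the wide parquet, `first` and `collection_labels` are hypothetical helper names, and relationship arrays are assumed to hold `row_id` values. DuckDB's `[1]` is the first element, which is index `0` in Python.

```python
# Toy rows standing in for the wide parquet; every value here is invented.
rows = [
    {"row_id": 1, "otype": "MaterialSampleRecord", "pid": "sample-a", "p__produced_by": [10]},
    {"row_id": 2, "otype": "MaterialSampleRecord", "pid": "sample-b", "p__produced_by": [11]},
    {"row_id": 3, "otype": "MaterialSampleRecord", "pid": "sample-c", "p__produced_by": None},
    {"row_id": 10, "otype": "SamplingEvent", "p__sampling_site": [20]},
    {"row_id": 11, "otype": "SamplingEvent", "p__sampling_site": [21]},
    {"row_id": 20, "otype": "SamplingSite", "label": "PKAP Survey Area"},
    {"row_id": 21, "otype": "SamplingSite", "label": "PKAP Survey Area"},
]


def first(values):
    """Return the first array element (DuckDB's [1]), or None if empty/missing."""
    return values[0] if values else None


def collection_labels(rows):
    """Map sample pid -> SamplingSite label via Sample -> Event -> Site."""
    by_id = {r["row_id"]: r for r in rows}
    out = {}
    for r in rows:
        if r["otype"] != "MaterialSampleRecord":
            continue
        event = by_id.get(first(r.get("p__produced_by")))
        site = by_id.get(first(event.get("p__sampling_site"))) if event else None
        if site and site.get("label"):
            out[r["pid"]] = site["label"]
    return out


labels = collection_labels(rows)
print(labels)
```

Two distinct site rows share one label here, and `sample-c` has no event, so it gets no collection. That mirrors the sparse-facet point above: samples without a site simply drop out of the mapping.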
diff --git a/EXPLORER_STATE.md b/EXPLORER_STATE.md
index 253ee32e..a3cfc1f8 100644
--- a/EXPLORER_STATE.md
+++ b/EXPLORER_STATE.md
@@ -34,6 +34,7 @@ citations.
 | `material` | DOM `#materialFilterBody` checkboxes | omitted (= no filter) | CSV of full URIs | `applyQueryToFacetFilters()` at end of `facetFilters` (`:1061`) | `writeQueryState()` from `handleFacetFilterChange` (`:1642`) | none — checkbox `value` already constrained by render | empty checked set ⇒ param removed (`:459`) |
 | `context` | DOM `#contextFilterBody` checkboxes | omitted | CSV of full URIs | same as `material` | same as `material` | none | same |
 | `object_type` | DOM `#objectTypeFilterBody` checkboxes | omitted | CSV of full URIs | same as `material` | same as `material` | none | same |
+| `collection` | DOM `#collectionFilterBody` checkboxes | omitted (= no filter) | CSV of `collection_id`s (16-hex) | `applyQueryToFacetFilters()` (after the `facetFilters` cell renders top-N ∪ URL ids) | `writeQueryState()` from `handleFacetFilterChange` | none | #243. Values are collection ids from `collections.parquet`, NOT vocab URIs. Filters via a 2nd subquery in `facetFilterSQL()` against `sample_collections.parquet`. NOT cross-filtered (no cross_filter cache); counts shown are the collection's static total. The `#collectionSearch` box adds long-tail rows beyond the top-N checkboxes |
 | ~~`view`~~ | _removed in mockup-v1 (#200)_ | — | — | — | — | — | The Globe/Table toggle is gone — the samples table is now permanent below the globe. `writeQueryState()` does `params.delete('view')` to canonicalize legacy bookmarks. See §6 "Mockup-v1 addendum" |
 | `search_scope` | local closure `_searchScope` in `zoomWatcher` | omitted (= `world`) | `area` only; absent ⇒ world | `_searchScope` hydrated at top of `zoomWatcher` from `params.get('search_scope')` | `persistSearchScope()` from `doSearch()` and button clicks | exact match `'area'` | sidebar `#sampleSearchSidebar` Enter always submits `world`, never `area` — see §6 mockup-v1 addendum |
 | `page` | inner closure `let page = 0` in `tableView` | not in URL | — | — | resets to 0 on `refreshTable()`; ±1 on prev/next | clamped to `[0, totalPages-1]` | **#163 item 6** — table page is intentionally not URL state today; if/when added, must coexist with the cross-filter contract below |
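The `collection` row above follows the same hydrate / write-back / remove-when-empty lifecycle as `material`. A rough Python sketch of that contract is below; `read_csv_param` and `write_csv_param` are hypothetical names, not functions from explorer.qmd, and the explorer itself works through the browser's `URLSearchParams`. Commas are left unescaped here for readability, which is an assumption about presentation only.

```python
from urllib.parse import parse_qsl, urlencode


def read_csv_param(query, key):
    """Hydrate: return the CSV values of `key`, or [] when the param is absent."""
    params = dict(parse_qsl(query, keep_blank_values=True))
    return [v for v in params.get(key, "").split(",") if v]


def write_csv_param(query, key, values):
    """Write-back: set `key` to a CSV of values, or drop it when values is empty."""
    params = [(k, v) for k, v in parse_qsl(query, keep_blank_values=True) if k != key]
    if values:
        params.append((key, ",".join(values)))
    return urlencode(params, safe=",")


q = write_csv_param("search=pottery", "collection",
                    ["dd74c71982da0e21", "20365f0e3b27dc8e"])
print(q)
print(read_csv_param(q, "collection"))
print(write_csv_param(q, "collection", []))
```

An empty checked set removes the parameter entirely rather than writing `collection=`, which matches the "omitted (= no filter)" default in the table.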
diff --git a/_quarto.yml b/_quarto.yml
index 685a07f9..140713d9 100644
--- a/_quarto.yml
+++ b/_quarto.yml
@@ -14,6 +14,8 @@ website:
         text: Home
       - href: explorer.qmd
         text: Interactive Explorer
+      - href: collections.qmd
+        text: Collections
       - text: How to Use
         menu:
           - text: Overview
diff --git a/collections.qmd b/collections.qmd
new file mode 100644
index 00000000..e2fa2797
--- /dev/null
+++ b/collections.qmd
@@ -0,0 +1,62 @@
+---
+title: "Featured Collections"
+subtitle: "Jump straight to well-known sample collections on the interactive globe"
+toc: true
+categories: [explore, collections]
+---
+
+::: {.callout-note}
+**Identity-based collection filtering** (issue
+[#243](https://github.com/isamplesorg/isamplesorg.github.io/issues/243)). Each
+link applies the explorer's `collection` facet (`&collection=<id>`) so you see
+*exactly* that collection's samples — not just whatever is near a location — and
+flies the globe to the collection's centroid. From there, layer on the Material,
+Sampled Feature, or Specimen Type facets to narrow further.
+:::
+
+## How to use these
+
+1. Click **Open in Explorer** — the `collection` facet filters to exactly that
+   collection's samples and the globe flies to its centroid in point mode.
+2. **Layer on facets**: open the *Material*, *Sampled Feature*, or *Specimen
+   Type* panels and check values to narrow within the collection.
+3. **Find any collection** — in the explorer, open the **Collection** panel and
+   type in its search box; the top ~100 collections also appear as checkboxes.
+4. **Share what you see** — the URL captures the full view (`collection` +
+   other facets + camera), so you can bookmark or send any state you reach.
+
+## Featured collections
+
+These are among the largest OpenContext project areas in the current snapshot
+(`202604`). PKAP leads as the worked example; the rest are sorted by sample count.
+
+| Collection | Source | Samples | |
+|---|---|---:|---|
+| **PKAP — Pyla-Koutsopetria Survey Area** (Cyprus) | OpenContext | 15,446 | [Open in Explorer](explorer.html?collection=dd74c71982da0e21#v=1&lat=34.9836&lng=33.7071&alt=40000&mode=point) |
+| Çatalhöyük (Turkey) | OpenContext | 145,884 | [Open in Explorer](explorer.html?collection=20365f0e3b27dc8e#v=1&lat=37.6682&lng=32.8272&alt=40000&mode=point) |
+| Petra Great Temple (Jordan) | OpenContext | 108,846 | [Open in Explorer](explorer.html?collection=1ef8673aa89023c1#v=1&lat=30.3287&lng=35.4421&alt=40000&mode=point) |
+| Polis Chrysochous (Cyprus) | OpenContext | 52,252 | [Open in Explorer](explorer.html?collection=756f324a7d902068#v=1&lat=35.0349&lng=32.4218&alt=40000&mode=point) |
+| Kenan Tepe (Turkey) | OpenContext | 42,294 | [Open in Explorer](explorer.html?collection=732469b20b632815#v=1&lat=37.8307&lng=40.8137&alt=40000&mode=point) |
+| Poggio Civitate (Italy) | OpenContext | 41,679 | [Open in Explorer](explorer.html?collection=a5e653d3b3704b95#v=1&lat=43.1529&lng=11.4016&alt=40000&mode=point) |
+| Ilıpınar (Turkey) | OpenContext | 36,947 | [Open in Explorer](explorer.html?collection=2308de8c25a27090#v=1&lat=40.4683&lng=29.3091&alt=40000&mode=point) |
+| Čḯxwicən (Washington, USA) | OpenContext | 29,793 | [Open in Explorer](explorer.html?collection=84eb590024898ba9#v=1&lat=48.1315&lng=-123.4628&alt=40000&mode=point) |
+| Heit el-Ghurab / Giza (Egypt) | OpenContext | 28,940 | [Open in Explorer](explorer.html?collection=cb1775e663696ce6#v=1&lat=29.9711&lng=31.1413&alt=40000&mode=point) |
+| Domuztepe (Turkey) | OpenContext | 22,394 | [Open in Explorer](explorer.html?collection=d452bbb04ea0d100#v=1&lat=37.3226&lng=37.0349&alt=40000&mode=point) |
+| Forcello Bagnolo San Vito (Italy) | OpenContext | 18,573 | [Open in Explorer](explorer.html?collection=c59e2c8620cde574#v=1&lat=45.0897&lng=10.8754&alt=40000&mode=point) |
+| Chogha Mish (Iran) | OpenContext | 16,827 | [Open in Explorer](explorer.html?collection=49e189be61689b3d#v=1&lat=32.2240&lng=48.5559&alt=40000&mode=point) |
+
+## What a preset URL is made of
+
+```
+explorer.html
+  ?collection=dd74c71982da0e21  # the collection facet (PKAP Survey Area)
+  #v=1                          # hash schema version
+  &lat=34.9836&lng=33.7071      # camera target (collection centroid)
+  &alt=40000                    # 40 km altitude → point mode
+  &mode=point                   # force individual sample dots
+```
+
+The `collection` value is a stable id (a hash of source + collection name) from
+`collections.parquet`. To build your own view, apply any combination of facets
+and camera in the explorer, then copy the browser's URL — every part of the
+state is encoded there.
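The preset URL anatomy documented in collections.qmd can be reproduced with a few lines of Python. This is a sketch only: `preset_url` is a hypothetical helper, and the defaults (`alt=40000`, `mode="point"`, four decimal places) are taken from the table rows above rather than from any generator script.

```python
from urllib.parse import urlencode


def preset_url(collection_id, lat, lng, alt=40000, mode="point"):
    """Compose a Featured-Collections link: query = facet, hash = camera state."""
    query = urlencode({"collection": collection_id})
    fragment = urlencode({
        "v": 1,                 # hash schema version
        "lat": f"{lat:.4f}",    # camera target (collection centroid)
        "lng": f"{lng:.4f}",
        "alt": alt,             # altitude in metres
        "mode": mode,           # force individual sample dots
    })
    return f"explorer.html?{query}#{fragment}"


url = preset_url("dd74c71982da0e21", 34.9836, 33.7071)
print(url)
```

Keeping the facet in the query string and the camera in the hash follows the split shown in the "What a preset URL is made of" block.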
diff --git a/data.qmd b/data.qmd
index be0c0751..f1ceb6b3 100644
--- a/data.qmd
+++ b/data.qmd
@@ -57,6 +57,7 @@ cite `https://data.isamples.org/<file>`.
 | Aggregate map clusters by zoom | [`h3_summary_res{4,6,8}.parquet`](https://data.isamples.org/isamples_202601_h3_summary_res4.parquet) | ≤ 2.4 MB each |
 | Filter by material / context / object-type | [`sample_facets_v2.parquet`](https://data.isamples.org/isamples_202601_sample_facets_v2.parquet) | 63 MB |
 | Walk relationships (graph queries) | [`isamples_202512_narrow.parquet`](https://data.isamples.org/isamples_202512_narrow.parquet) | 820 MB |
+| Browse / filter by collection (e.g. an OpenContext project) | [`collections.parquet`](https://data.isamples.org/isamples_202604_collections.parquet) + [`sample_collections.parquet`](https://data.isamples.org/isamples_202604_sample_collections.parquet) | 3 MB + 13 MB |
 | Translate vocabulary URIs to human-readable labels | [`vocab_labels.parquet`](https://data.isamples.org/vocab_labels.parquet) | 58 KB |
 
 ## 3. Copy-pasteable DuckDB snippets
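The two files in the new data.qmd row join on `collection_id`, which this change describes as a 16-hex digest of (source, label). A Python approximation of that key is sketched below, assuming an MD5 over the source, a unit-separator character, and the label. The function name is hypothetical, the source string `"OPENCONTEXT"` is an assumed value, and the output has not been checked against the published ids.

```python
import hashlib


def collection_id(source, label):
    """First 16 hex chars of md5(source + U+001F + label); None source -> ''."""
    key = f"{source or ''}\x1f{label}"
    return hashlib.md5(key.encode("utf-8")).hexdigest()[:16]


cid = collection_id("OPENCONTEXT", "PKAP Survey Area")  # source value is assumed
print(cid)
```

Because the key depends only on source and label, it stays the same across rebuilds as long as neither string changes, and the same label under two sources yields two different ids.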
diff --git a/explorer.qmd b/explorer.qmd
index c0b5c9e4..60b7b88e 100644
--- a/explorer.qmd
+++ b/explorer.qmd
@@ -640,6 +640,16 @@ Specimen Type <span>▾</span>
 <em style="font-size: 11px; color: #999;">Loading...</em>
 </div>
 </div>
+<div class="filter-section" id="collectionFilter">
+<div class="filter-header" onclick="this.nextElementSibling.style.display = this.nextElementSibling.style.display === 'none' ? 'block' : 'none'">
+Collection <span>▾</span>
+</div>
+<div class="filter-body" style="display: none;" id="collectionFilterWrap">
+<input type="text" id="collectionSearch" placeholder="Search collections…" autocomplete="off" aria-label="Search collections" style="width: 100%; box-sizing: border-box; margin-bottom: 6px; font-size: 12px; padding: 3px 5px;">
+<div id="collectionSearchResults" style="display: none; max-height: 160px; overflow-y: auto; border: 1px solid #eee; margin-bottom: 6px;"></div>
+<div id="collectionFilterBody"><em style="font-size: 11px; color: #999;">Loading...</em></div>
+</div>
+</div>
 <div id="facetNote" style="display: none; font-size: 11px; color: #888; margin-top: 4px; font-style: italic;">
 Material / feature / specimen filters apply at sample zoom level — zoom in or click a cluster.
 </div>
@@ -696,6 +706,18 @@ cross_filter_url = `${R2_BASE}/isamples_202601_facet_cross_filter.parquet`
 // SKOS prefLabels for Material / Sampled Feature / Specimen Type URIs.
 // ~60 KB lookup; falls back to URI tail if a URI isn't covered.
 vocab_labels_url = `${R2_BASE}/vocab_labels.parquet`
+// Collection facet (#243). Additive files built by scripts/build_collections.py
+// from the wide parquet's Sample→Event→Site traversal — they touch none of the
+// existing facet files. `collections` is the dimension (collection_id, label,
+// source, n_samples, centroid_lat/lng, bbox); `sample_collections` maps
+// pid → collection_id. A "collection" is a SamplingSite *label* (e.g. the
+// OpenContext project "PKAP Survey Area"), keyed by a stable hash of
+// (source, label).
+collections_url = `${R2_BASE}/isamples_202604_collections.parquet`
+sample_collections_url = `${R2_BASE}/isamples_202604_sample_collections.parquet`
+// How many top collections (by sample count) render as checkboxes; the long
+// tail (~60K) is reachable via the search box.
+COLLECTION_FACET_TOPN = 100
 
 // Canonical palette — see issue #113. Path-relative so this works under
 // both isamples.org (custom domain at root) and project-pages fork
@@ -805,6 +827,10 @@ function applyQueryToFacetFilters() {
     setCheckedValues('materialFilterBody', csvParamValues(params, 'material'));
     setCheckedValues('contextFilterBody', csvParamValues(params, 'context'));
     setCheckedValues('objectTypeFilterBody', csvParamValues(params, 'object_type'));
+    // Collection checkboxes are rendered as the union of top-N and the URL's
+    // collection ids (see facetFilters cell), so the values below already have
+    // matching rows by the time this runs.
+    setCheckedValues('collectionFilterBody', csvParamValues(params, 'collection'));
 }
 
 
@@ -823,6 +849,7 @@ function writeQueryState() {
         ['material', 'materialFilterBody'],
         ['context', 'contextFilterBody'],
         ['object_type', 'objectTypeFilterBody'],
+        ['collection', 'collectionFilterBody'],
     ].forEach(([key, containerId]) => {
         const values = getCheckedValues(containerId);
         if (values.length > 0) params.set(key, values.join(','));
@@ -881,7 +908,8 @@ function getCheckedValues(containerId) {
 function hasFacetFilters() {
     return getCheckedValues('materialFilterBody').length > 0
         || getCheckedValues('contextFilterBody').length > 0
-        || getCheckedValues('objectTypeFilterBody').length > 0;
+        || getCheckedValues('objectTypeFilterBody').length > 0
+        || getCheckedValues('collectionFilterBody').length > 0;
 }
 
 // Single source of truth for #facetNote visibility. The note ("filter
@@ -938,8 +966,20 @@ function facetFilterSQL() {
         const list = ot.map(s => `'${escSql(s)}'`).join(',');
         conds.push(`object_type IN (${list})`);
     }
-    if (conds.length === 0) return '';
-    return ` AND pid IN (SELECT DISTINCT pid FROM read_parquet('${facets_url}') WHERE ${conds.join(' AND ')})`;
+    let sql = '';
+    if (conds.length > 0) {
+        sql += ` AND pid IN (SELECT DISTINCT pid FROM read_parquet('${facets_url}') WHERE ${conds.join(' AND ')})`;
+    }
+    // Collection facet (#243) lives in its own membership file, so it adds a
+    // second, independent subquery instead of another condition on `facets_url`.
+    // Multiple checked collections are OR'd (IN list); the result is AND'ed with
+    // the material/etc. predicate above.
+    const coll = getCheckedValues('collectionFilterBody');
+    if (coll.length > 0) {
+        const list = coll.map(s => `'${escSql(s)}'`).join(',');
+        sql += ` AND pid IN (SELECT pid FROM read_parquet('${sample_collections_url}') WHERE collection_id IN (${list}))`;
+    }
+    return sql;
 }
 
 // Shared viewport-padding factor. The samples table (PR #219), the
@@ -1792,6 +1832,102 @@ facetFilters = {
         renderFilter('materialFilterBody', 'material', grouped.material);
         renderFilter('contextFilterBody', 'context', grouped.context);
         renderFilter('objectTypeFilterBody', 'object_type', grouped.object_type);
+
+        // --- Collection facet (#243): top-N checkboxes + search-the-tail ---
+        // Reads from the additive collections.parquet dimension. Counts here are
+        // the collection's total sample count (static); unlike material/context/
+        // object_type they are NOT cross-filtered (no cross_filter cache for
+        // collections yet) — the dots and table still respect the filter via
+        // facetFilterSQL(). data-facet="collection" keeps applyFacetCounts() from
+        // touching these rows.
+        try {
+            const collBody = document.getElementById('collectionFilterBody');
+            const collSearch = document.getElementById('collectionSearch');
+            const collResults = document.getElementById('collectionSearchResults');
+            const urlCollIds = csvParamValues(new URLSearchParams(location.search), 'collection') || [];
+
+            const collRowHtml = (id, label, count) =>
+                `<label class="facet-row" data-facet="collection" data-value="${escAttr(id)}" title="${escAttr(label)}"><input type="checkbox" value="${escAttr(id)}"> ${escText(label)} <span class="facet-count" data-facet="collection" data-value="${escAttr(id)}" style="color:#999">(${Number(count).toLocaleString()})</span></label>`;
+
+            // Top-N by sample count, plus any ids named in the URL so deep links
+            // restore long-tail selections that aren't in the top-N.
+            const topRows = await db.query(`
+                SELECT collection_id, label, n_samples
+                FROM read_parquet('${collections_url}')
+                ORDER BY n_samples DESC
+                LIMIT ${COLLECTION_FACET_TOPN}
+            `);
+            const seen = new Set(topRows.map(r => r.collection_id));
+            let extraRows = [];
+            const missing = urlCollIds.filter(id => !seen.has(id));
+            if (missing.length) {
+                const list = missing.map(s => `'${escSql(s)}'`).join(',');
+                extraRows = await db.query(`
+                    SELECT collection_id, label, n_samples
+                    FROM read_parquet('${collections_url}')
+                    WHERE collection_id IN (${list})
+                `);
+            }
+            const allRows = extraRows.concat(topRows);
+            if (collBody) {
+                collBody.innerHTML = allRows.length
+                    ? allRows.map(r => collRowHtml(r.collection_id, r.label, r.n_samples)).join('')
+                    : '<em style="font-size: 11px; color: #999;">No collections</em>';
+            }
+
+            // Search box → query the full dimension by label; clicking a result
+            // injects a checked row into the body (if absent) and fires the same
+            // change event the checkboxes do, so the existing handler reruns.
+            if (collSearch && collResults && collBody) {
+                let collSearchTimer = null;
+                const runCollSearch = async () => {
+                    const term = collSearch.value.trim();
+                    if (term.length < 2) { collResults.style.display = 'none'; collResults.innerHTML = ''; return; }
+                    const esc = escapeIlikePattern(term);
+                    const rows = await db.query(`
+                        SELECT collection_id, label, n_samples
+                        FROM read_parquet('${collections_url}')
+                        WHERE label ILIKE '%${esc}%' ESCAPE '\\'
+                        ORDER BY n_samples DESC
+                        LIMIT 25
+                    `);
+                    collResults.innerHTML = rows.length
+                        ? rows.map(r =>
+                            `<div class="collection-search-row" data-id="${escAttr(r.collection_id)}" data-label="${escAttr(r.label)}" data-count="${Number(r.n_samples)}" style="padding: 3px 5px; cursor: pointer; font-size: 12px;">${escText(r.label)} <span style="color:#999">(${Number(r.n_samples).toLocaleString()})</span></div>`
+                          ).join('')
+                        : '<em style="font-size: 11px; color: #999; padding: 4px; display:block;">No matches</em>';
+                    collResults.style.display = 'block';
+                };
+                collSearch.addEventListener('input', () => {
+                    clearTimeout(collSearchTimer);
+                    collSearchTimer = setTimeout(runCollSearch, 250);
+                });
+                collResults.addEventListener('click', (ev) => {
+                    const row = ev.target.closest('.collection-search-row');
+                    if (!row) return;
+                    const id = row.getAttribute('data-id');
+                    const selector = `input[type="checkbox"][value="${id}"]`;
+                    let cb = collBody.querySelector(selector);
+                    if (!cb) {
+                        collBody.insertAdjacentHTML('afterbegin',
+                            collRowHtml(id, row.getAttribute('data-label'), row.getAttribute('data-count')));
+                        cb = collBody.querySelector(selector);
+                    }
+                    if (cb && !cb.checked) {
+                        cb.checked = true;
+                        collBody.dispatchEvent(new Event('change', { bubbles: true }));
+                    }
+                    collSearch.value = '';
+                    collResults.style.display = 'none';
+                    collResults.innerHTML = '';
+                });
+            }
+        } catch (err) {
+            console.warn('collection facet setup failed:', err);
+            const collBody = document.getElementById('collectionFilterBody');
+            if (collBody) collBody.innerHTML = '<em style="font-size: 11px; color: #999;">Collections unavailable</em>';
+        }
+
         applyFacetCounts('source', null);
         applyQueryToFacetFilters();
 
@@ -2960,6 +3096,7 @@ zoomWatcher = {
             material: getCheckedValues('materialFilterBody').slice().sort(),
             context: getCheckedValues('contextFilterBody').slice().sort(),
             object_type: getCheckedValues('objectTypeFilterBody').slice().sort(),
+            collection: getCheckedValues('collectionFilterBody').slice().sort(),
         });
     }
 
@@ -3427,6 +3564,7 @@ zoomWatcher = {
     document.getElementById('materialFilterBody').addEventListener('change', handleFacetFilterChange);
     document.getElementById('contextFilterBody').addEventListener('change', handleFacetFilterChange);
     document.getElementById('objectTypeFilterBody').addEventListener('change', handleFacetFilterChange);
+    document.getElementById('collectionFilterBody')?.addEventListener('change', handleFacetFilterChange);
 
     // --- Camera change handler ---
     let timer = null;
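The predicate composition in `facetFilterSQL()` (collections OR'd within their IN list, then AND'ed with the material/context/object_type subquery) can be exercised on toy tables. The sketch below uses Python's `sqlite3` as a stand-in for DuckDB-WASM, with in-memory tables replacing the parquet files. All rows and ids are made up, and `esc_sql` only assumes that `escSql` doubles single quotes.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE samples (pid TEXT);
    CREATE TABLE facets (pid TEXT, material TEXT);
    CREATE TABLE sample_collections (pid TEXT, collection_id TEXT);
    INSERT INTO samples VALUES ('s1'), ('s2'), ('s3'), ('s4');
    INSERT INTO facets VALUES ('s1','rock'), ('s2','bone'), ('s3','rock'), ('s4','rock');
    INSERT INTO sample_collections VALUES ('s1','aaaa'), ('s2','aaaa'), ('s3','bbbb');
""")


def esc_sql(value):
    """Assumed behaviour of escSql: double any single quote."""
    return value.replace("'", "''")


def in_list(values):
    return ",".join(f"'{esc_sql(v)}'" for v in values)


def facet_filter_sql(materials, collections):
    """Build the AND-ed suffix: one subquery per file, each only if non-empty."""
    sql = ""
    if materials:
        sql += (" AND pid IN (SELECT DISTINCT pid FROM facets"
                f" WHERE material IN ({in_list(materials)}))")
    if collections:
        sql += (" AND pid IN (SELECT pid FROM sample_collections"
                f" WHERE collection_id IN ({in_list(collections)}))")
    return sql


def matching_pids(materials, collections):
    query = ("SELECT pid FROM samples WHERE 1=1"
             + facet_filter_sql(materials, collections) + " ORDER BY pid")
    return [row[0] for row in con.execute(query)]


print(matching_pids([], ["aaaa", "bbbb"]))   # union of the two collections
print(matching_pids(["rock"], ["aaaa"]))     # collection narrowed by material
```

Keeping the collection condition in its own subquery means the existing facets file is untouched, at the cost of a second parquet read whenever a collection is checked.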
diff --git a/scripts/build_collections.py b/scripts/build_collections.py
new file mode 100644
index 00000000..bea7ef70
--- /dev/null
+++ b/scripts/build_collections.py
@@ -0,0 +1,201 @@
+#!/usr/bin/env python3
+"""
+Build the supplementary parquet files that power the explorer's *collection*
+facet (issue #243).
+
+A "collection" is the human-readable **label** of a SamplingSite (e.g. the
+OpenContext project "PKAP Survey Area"). That identity does NOT live on the
+MaterialSampleRecord rows the explorer renders; it is reached by traversal
+through the wide parquet's relationship arrays:
+
+    MaterialSampleRecord.p__produced_by[1]  -> SamplingEvent.row_id
+    SamplingEvent.p__sampling_site[1]        -> SamplingSite.row_id
+    SamplingSite.label                       -> the collection name
+
+Many SamplingSite rows share one label (e.g. ~1,336 rows are "PKAP Survey
+Area"), so a collection aggregates over all of them. We therefore key a
+collection on a stable hash of (source, label), NOT on a site pid.
+
+Doing this traversal live in DuckDB-WASM per facet interaction is the
+documented in-browser bottleneck, so we precompute here. Two ADDITIVE outputs
+(they touch none of the existing facet files):
+
+  1. collections.parquet      -- dimension, one row per collection:
+       collection_id, label, source, n_samples,
+       centroid_lat, centroid_lng, min_lat, max_lat, min_lng, max_lng
+     Powers the top-N checkbox list, the long-tail search box, and the
+     Featured-Collections preset camera targets.
+
+  2. sample_collections.parquet -- membership, one row per sample that has a
+     collection: pid, collection_id
+     The explorer filters with:
+       AND pid IN (SELECT pid FROM read_parquet('<sample_collections>')
+                   WHERE collection_id IN (...))
+     exactly parallel to the existing facet predicate at explorer.qmd:942.
+
+Usage:
+    python build_collections.py \
+        --wide https://data.isamples.org/current/wide.parquet \
+        --out-dir /tmp/collections_build \
+        --snapshot 202604
+
+Verify against the live data without writing files:
+    python build_collections.py --dry-run
+"""
+from __future__ import annotations
+
+import argparse
+import os
+import sys
+import time
+
+import duckdb
+
+DEFAULT_WIDE = "https://data.isamples.org/current/wide.parquet"
+
+
+def build(wide_url: str, out_dir: str, snapshot: str, dry_run: bool) -> dict:
+    con = duckdb.connect()
+    con.sql("INSTALL httpfs; LOAD httpfs;")
+
+    t0 = time.time()
+    # Pull only the columns the traversal needs, for the three entity types.
+    con.sql(
+        f"""
+        CREATE TEMP TABLE w AS
+        SELECT row_id, pid, otype, n AS source, label, latitude, longitude,
+               p__produced_by, p__sampling_site
+        FROM read_parquet('{wide_url}')
+        WHERE otype IN ('MaterialSampleRecord','SamplingEvent','SamplingSite')
+        """
+    )
+    print(f"[1/4] loaded traversal columns in {time.time() - t0:.1f}s")
+
+    # Lookup tables for the two hops.
+    con.sql(
+        "CREATE TEMP TABLE site AS "
+        "SELECT row_id AS site_rid, label AS site_label "
+        "FROM w WHERE otype='SamplingSite' AND label IS NOT NULL"
+    )
+    # Unnest the sampling_site array so an event with multiple sites maps to
+    # all of them (not just the first).
+    con.sql(
+        "CREATE TEMP TABLE evt AS "
+        "SELECT row_id AS evt_rid, UNNEST(p__sampling_site) AS site_rid "
+        "FROM w WHERE otype='SamplingEvent' AND p__sampling_site IS NOT NULL"
+    )
+
+    # Per-sample collection membership. Unnest BOTH relationship arrays
+    # (produced_by → events, sampling_site → sites) so a sample with multiple
+    # events / a site list joins through all of them — otherwise a member could
+    # be silently dropped from a non-first collection. DISTINCT collapses the
+    # fan-out to one row per (pid, collection). collection_id is a stable 16-hex
+    # digest of (source, label) so it survives rebuilds and is URL-safe.
+    con.sql(
+        """
+        CREATE TEMP TABLE memb AS
+        SELECT DISTINCT
+            s.pid AS pid,
+            substr(md5(coalesce(s.source,'') || '\x1f' || st.site_label), 1, 16) AS collection_id,
+            st.site_label AS label,
+            s.source AS source,
+            s.latitude AS lat,
+            s.longitude AS lng
+        FROM (
+            SELECT pid, source, latitude, longitude, UNNEST(p__produced_by) AS evt_rid
+            FROM w
+            WHERE otype='MaterialSampleRecord' AND pid IS NOT NULL
+              AND p__produced_by IS NOT NULL
+        ) s
+        JOIN evt e ON e.evt_rid = s.evt_rid
+        JOIN site st ON st.site_rid = e.site_rid
+        """
+    )
+    print(f"[2/4] built membership in {time.time() - t0:.1f}s")
+
+    # Collections dimension (one row per collection).
+    con.sql(
+        """
+        CREATE TEMP TABLE collections AS
+        SELECT
+            collection_id,
+            any_value(label) AS label,
+            any_value(source) AS source,
+            COUNT(DISTINCT pid) AS n_samples,
+            round(median(lat), 5) AS centroid_lat,
+            round(median(lng), 5) AS centroid_lng,
+            round(min(lat), 5) AS min_lat,
+            round(max(lat), 5) AS max_lat,
+            round(min(lng), 5) AS min_lng,
+            round(max(lng), 5) AS max_lng
+        FROM memb
+        GROUP BY collection_id
+        """
+    )
+
+    stats = {
+        "samples_with_collection": con.sql("SELECT COUNT(DISTINCT pid) FROM memb").fetchone()[0],
+        "n_collections": con.sql("SELECT COUNT(*) FROM collections").fetchone()[0],
+        "pkap_samples": con.sql(
+            "SELECT n_samples FROM collections WHERE label='PKAP Survey Area'"
+        ).fetchone(),
+    }
+    print(f"[3/4] aggregated {stats['n_collections']:,} collections; "
+          f"{stats['samples_with_collection']:,} samples carry at least one")
+    pkap = stats["pkap_samples"][0] if stats["pkap_samples"] else None
+    print(f"      PKAP Survey Area -> {pkap} samples "
+          "(expected ~15,446)")
+
+    print("\n      Top 10 collections by sample count:")
+    print(con.sql(
+        "SELECT label, source, n_samples, centroid_lat, centroid_lng "
+        "FROM collections ORDER BY n_samples DESC LIMIT 10"
+    ).df().to_string(index=False))
+
+    if dry_run:
+        print("\n[4/4] --dry-run: no files written")
+        return stats
+
+    os.makedirs(out_dir, exist_ok=True)
+    dim_path = os.path.join(out_dir, f"isamples_{snapshot}_collections.parquet")
+    memb_path = os.path.join(out_dir, f"isamples_{snapshot}_sample_collections.parquet")
+
+    con.sql(
+        f"COPY (SELECT * FROM collections ORDER BY n_samples DESC) "
+        f"TO '{dim_path}' (FORMAT PARQUET, COMPRESSION ZSTD)"
+    )
+    con.sql(
+        # Order by collection_id so the explorer's `WHERE collection_id IN (...)`
+        # filter can prune row groups (and it compresses better).
+        f"COPY (SELECT DISTINCT pid, collection_id FROM memb ORDER BY collection_id, pid) "
+        f"TO '{memb_path}' (FORMAT PARQUET, COMPRESSION ZSTD)"
+    )
+    print(f"\n[4/4] wrote:\n  {dim_path} ({os.path.getsize(dim_path)/1e6:.1f} MB)"
+          f"\n  {memb_path} ({os.path.getsize(memb_path)/1e6:.1f} MB)")
+    stats["dim_path"] = dim_path
+    stats["memb_path"] = memb_path
+    return stats
+
+
+def main(argv=None) -> int:
+    ap = argparse.ArgumentParser(description="Build collection facet parquet files (#243)")
+    ap.add_argument("--wide", default=DEFAULT_WIDE,
+                    help="wide parquet URL (default: %(default)s)")
+    ap.add_argument("--out-dir", default="/tmp/collections_build",
+                    help="output directory (default: %(default)s)")
+    ap.add_argument("--snapshot", default="202604",
+                    help="snapshot tag for filenames (default: %(default)s)")
+    ap.add_argument("--dry-run", action="store_true",
+                    help="compute and report, but write no files")
+    args = ap.parse_args(argv)
+
+    try:
+        build(args.wide, args.out_dir, args.snapshot, args.dry_run)
+    except Exception as exc:  # noqa: BLE001
+        print(f"ERROR: {exc}", file=sys.stderr)
+        return 1
+    return 0
+
+
+if __name__ == "__main__":
+    raise SystemExit(main())
diff --git a/tests/test_collections.py b/tests/test_collections.py
new file mode 100644
index 00000000..aafd83a1
--- /dev/null
+++ b/tests/test_collections.py
@@ -0,0 +1,61 @@
+"""
+Feature: Collections (issue #243)
+  As someone exploring iSamples
+  I want to jump to a named collection (e.g. an OpenContext project) and
+  filter the explorer to exactly its samples
+  So that I can browse meaningful groupings, not just map locations.
+
+These tests validate the static markup the feature ships: the Collections
+landing page and the explorer's `collection` facet DOM. They do NOT require the
+collections.parquet / sample_collections.parquet files to be live on R2; the
+data-layer behavior is verified separately (see scripts/build_collections.py and
+the data-contract checks). Run the live facet verification after those two files
+are uploaded to data.isamples.org.
+"""
+from conftest import SITE_URL
+
+COLLECTIONS_URL = f"{SITE_URL}/collections.html"
+EXPLORER_URL = f"{SITE_URL}/explorer.html"
+
+# Stable id for PKAP Survey Area = substr(md5('OPENCONTEXT\x1fPKAP Survey Area'), 1, 16)
+PKAP_COLLECTION_ID = "dd74c71982da0e21"
+
+
+class TestCollectionsPage:
+    """Scenario: the Collections landing page lists featured collections."""
+
+    def test_page_renders(self, page):
+        page.goto(COLLECTIONS_URL, wait_until="domcontentloaded")
+        assert page.get_by_text("Featured Collections").count() > 0
+
+    def test_lists_pkap(self, page):
+        page.goto(COLLECTIONS_URL, wait_until="domcontentloaded")
+        assert page.get_by_text("PKAP", exact=False).count() > 0
+
+    def test_presets_use_collection_param(self, page):
+        """Each preset links into the explorer with a ?collection=<id> filter."""
+        page.goto(COLLECTIONS_URL, wait_until="domcontentloaded")
+        links = page.locator("a[href*='explorer.html?collection=']")
+        assert links.count() >= 12
+
+    def test_pkap_preset_id(self, page):
+        page.goto(COLLECTIONS_URL, wait_until="domcontentloaded")
+        assert page.locator(
+            f"a[href*='collection={PKAP_COLLECTION_ID}']"
+        ).count() >= 1
+
+
+class TestExplorerCollectionFacet:
+    """Scenario: the explorer exposes a Collection facet (search + checkboxes)."""
+
+    def test_collection_filter_section_present(self, page):
+        page.goto(EXPLORER_URL, wait_until="domcontentloaded")
+        assert page.locator("#collectionFilter").count() == 1
+
+    def test_collection_search_box_present(self, page):
+        page.goto(EXPLORER_URL, wait_until="domcontentloaded")
+        assert page.locator("#collectionSearch").count() == 1
+
+    def test_collection_body_present(self, page):
+        page.goto(EXPLORER_URL, wait_until="domcontentloaded")
+        assert page.locator("#collectionFilterBody").count() == 1