Expand how-to-use.qmd into a full data catalog (#123)
Replaces the minimal (and slightly inaccurate — res4 was listed as
~70 KB, actually 580 KB; lite as ~150 MB, actually 60 MB) Data Files
table with a proper catalog organized by use case:
- Architecture note: all files served via data.isamples.org backed by
Cloudflare R2 with the immutable cache-control Worker (deployed
2026-04-17). File naming convention documented.
- Primary datasets (wide, wide+H3, narrow) with size, shape, row count,
and when-to-use guidance.
- Pre-aggregated helpers (facet_summaries, facet_cross_filter,
sample_facets_v2) with their tiny sizes and why they exist.
- H3 geospatial aggregates at three resolutions with typical altitude.
- Lite sample-point file.
- Cross-reference matrix: which tutorial uses which file.
- Python quick-query recipe.
Intended as the single source of truth that tutorials can link into
rather than re-describing each file.
Co-authored-by: Claude Opus 4.7 <noreply@anthropic.com>
All three represent the same underlying data (SESAR + OpenContext + GEOME +
Smithsonian) with identical semantics; they differ only in serialization
strategy. See the
[Technical: Narrow vs Wide tutorial](/tutorials/narrow_vs_wide_performance.html)
for a performance comparison.

### Pre-aggregated helpers {.unnumbered}

Small lookup tables computed ahead of time so a page can render facets
and counts instantly, without touching the 278 MB primary file:

| File | Size | Contents | Use when… |
|---|---:|---|---|
| [`isamples_202601_facet_summaries.parquet`](https://data.isamples.org/isamples_202601_facet_summaries.parquet) | 2 KB | `(facet_type, facet_value, count)` for source, material, context, object_type | You want instant initial facet counts with no filters applied |
| [`isamples_202601_facet_cross_filter.parquet`](https://data.isamples.org/isamples_202601_facet_cross_filter.parquet) | 6 KB | Pre-computed counts for single-facet selections | You want instant cross-filtered counts for a single active filter |
| [`isamples_202601_sample_facets_v2.parquet`](https://data.isamples.org/isamples_202601_sample_facets_v2.parquet) | 63 MB | `(pid, material, context, object_type)` facet URIs per sample | You need to filter on *combinations* of facets at query time |

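The "Use when…" guidance for the three helper files can be read as a small decision rule. The sketch below is one possible encoding of it, not code from the site: the function name `pick_facet_file` and the shape of `active_filters` (facet name mapped to a set of selected values) are hypothetical. It also assumes that "a single active filter" means exactly one facet with exactly one selected value; anything else falls back to the per-sample file, which can always answer the query.

```python
from __future__ import annotations

BASE_URL = "https://data.isamples.org"
PREFIX = "isamples_202601"  # release prefix shared by the files listed above


def pick_facet_file(active_filters: dict[str, set[str]]) -> str:
    """Return the URL of the smallest helper file that can serve the counts.

    `active_filters` is a hypothetical structure: facet name -> selected values.
    """
    # Ignore facets that have no selected values.
    active = {facet: vals for facet, vals in active_filters.items() if vals}

    if not active:
        # No filters: the 2 KB summary already holds every initial count.
        name = "facet_summaries"
    elif len(active) == 1 and len(next(iter(active.values()))) == 1:
        # Assumption: one facet with one value is a "single active filter".
        name = "facet_cross_filter"
    else:
        # Combinations of facets need the per-sample table (63 MB).
        name = "sample_facets_v2"

    return f"{BASE_URL}/{PREFIX}_{name}.parquet"
```

For example, `pick_facet_file({"material": {"rock"}})` resolves to the cross-filter file, while adding a second facet switches to `sample_facets_v2`. The value `"rock"` is only a placeholder; the real files store facet URIs.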
### Geospatial aggregates (H3) {.unnumbered}

Hexagonal H3 cells pre-aggregated at three resolutions for zoom-adaptive
globe rendering. Each row:
`h3_cell, center_lat, center_lng, sample_count, dominant_source, source_count`.

| File | Size | Cells | Typical altitude |
|---|---:|---:|---|
| [`isamples_202601_h3_summary_res4.parquet`](https://data.isamples.org/isamples_202601_h3_summary_res4.parquet) | 580 KB | ~38 K | Continental (world view) |
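
To make the row layout concrete, here is a minimal sketch of how one such aggregate row might be derived from per-sample `(h3_cell, source)` pairs. The column semantics are assumptions, since the catalog does not define them: `dominant_source` is taken to be the source with the most samples in the cell, and `source_count` the number of distinct sources. `summarize_cells` is a hypothetical helper, the cell IDs are placeholders, and the center coordinates are left out because they come from the H3 library rather than from the samples.

```python
from collections import Counter, defaultdict


def summarize_cells(samples):
    """Aggregate (h3_cell, source) pairs into one summary row per cell."""
    # Count samples per source within each cell.
    per_cell = defaultdict(Counter)
    for cell, source in samples:
        per_cell[cell][source] += 1

    rows = []
    for cell, counts in per_cell.items():
        # Assumed meaning: the source contributing the most samples.
        dominant, _ = counts.most_common(1)[0]
        rows.append({
            "h3_cell": cell,
            "sample_count": sum(counts.values()),
            "dominant_source": dominant,
            "source_count": len(counts),  # distinct sources in this cell
        })
    return rows


# Placeholder cell IDs; real H3 indexes are 64-bit values.
demo = [
    ("cell_a", "SESAR"), ("cell_a", "SESAR"), ("cell_a", "GEOME"),
    ("cell_b", "OpenContext"),
]
rows = summarize_cells(demo)
```

In this toy input, `cell_a` ends up with three samples from two sources and SESAR as the dominant one.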