release(v1.5.3): Open-Meteo forecast-join correctness (#67) + research() docs (#52) by helloiamvu · Pull Request #71 · mostlyrightmd/mostlyright-sdk

helloiamvu · 2026-06-06T13:07:30Z

Summary

Integration PR from merged-vision → main for the v1.5.3 patch release. Bundles two merged feature PRs.

PR	Issue	Type
#69	#67	fix — OM/IEM forecast discrimination
#70	#52	docs — research() granularity

What's in 1.5.3

Fixed (#67): build_pairs_row() split forecast records by issued_at presence, but Phase 20+ Open-Meteo rows carry a derived issued_at and were misrouted into the IEM MOS aggregation path — silently nulling forecast temps and polluting IEM run-selection for multi-source callers. Now split by the authoritative source field, with:

legacy source-less/issued_at-less rows kept as Open-Meteo (backward compat),
temperature_f fallback + pop_6hr_pct/qpf_6hr_in + fcst_issued_at provenance preserved through the source-routed path,
lookahead-leakage guard: OM rows from runs issued after market close are excluded from aggregation, not just the displayed timestamp.

Docs (#52): research() Returns docstring clarifies the daily-row return granularity.

Verification

uv run pytest -m "not live" — green on integrated merged-vision (incl. the HARD parity gate tests/test_parity.py).
uv build — all three wheels build at 1.5.3 (mostlyrightmd, mostlyrightmd-weather, mostlyrightmd-markets).
Codex review (gpt-5.5, medium) on fix(pairs): discriminate OM/IEM forecasts by source, not issued_at (#67) #69 — 4 iterations (1 P1 leakage + 3 P2) all resolved, final clean. docs(research): clarify research() returns daily rows, not hourly obs (#52) #70 clean.

Version

1.5.2 → 1.5.3 across all three pyproject.toml files; CHANGELOG entry added.

⚠️ Release tag held: the v1.5.3 tag (which triggers PyPI trusted publishing per CLAUDE.md) is prepared locally but not pushed — push it after this PR merges to main so publication happens from the merged main commit, not the integration branch.

TS Parity

#67 is a Python-internal forecast-join fix (no public column change); the TS twin's buildPairsRow should get a parity ticket per CROSS-SDK-SYNC.md. #70 is docs-only (N/A).

…trim (closes #64)

- Resolve the latent GFS precipitation twin bug by adding a `_pick_record` helper to forecast_nwp.py. It filters duplicates by prioritizing instantaneous records and breaking ties using lowest `record_no`. - Implement `cloud_cover_pct`, `visibility_m`, and `cloud_ceiling_m` for GFS and HRRR. - Define and integrate physics-bounds QC rules for the three new fields. - Regenerate schema.forecast_nwp.v1 JSON schemas. - Add comprehensive test coverage in test_qc_rules_nwp.py, test_forecast_nwp.py, and test_forecast_nwp_multi_cycle.py. - Include planning artifacts and research briefs in .briefs/ directory.

Reverted accidental inclusion of all extras in root dependency list. The [nwp] extra pulls eccodes which is not available in CI without the full extra install, causing test detection guards to pass while the actual eccodes binary fails to load.

Phase 20+ Open-Meteo rows carry a derived issued_at, so the old issued_at-presence split in build_pairs_row() misrouted them into the IEM MOS aggregation path — silently nulling forecast temps and polluting IEM run-selection when both sources are combined. Discriminate by the authoritative source field instead (open_meteo* -> Open-Meteo, else IEM), matching the contract research._fetch_open_meteo_range already documents. Also teach _aggregate_fcst_temps_openmeteo to fall back to a pre-converted temperature_f so source-discriminated rows from the research() wrapper (which emits temperature_f, not temperature_c) aggregate correctly instead of returning null. TDD: 3 new regression tests in test_pairs.py cover the derived-issued_at classification, IEM-still-preferred, and the research-wrapper temperature_f shape.

… split (#67)

…#52) A user asked why research() returns daily means and how to access hourly station observations. Clarify in the Returns section that research() yields one daily settlement-summary row per date (obs_* are window aggregates), not sub-daily rows, and point to weather.obs() and the Sprint 0.5+ raw_metar roadmap.

Codex review caught a regression: research._fetch_open_meteo_range emits OM rows carrying pop_6hr_pct / qpf_6hr_in (not precipitation_probability_pct and no QPF read). Pre-#67 those rows flowed through the IEM branch, which set both fields; after source-routing they hit the OM branch, which only read precipitation_probability_pct and never set QPF — silently nulling precip columns for research(forecast_source="open_meteo"). OM branch now accepts the pop_6hr_pct alias (explicit None-checks keep a valid 0.0) and sums qpf_6hr_in over the window, matching IEM semantics. +2 regression tests.

…odex P2) Codex flagged that the pure source-prefix split regressed the previously documented Open-Meteo shape (no source, no issued_at, temperature_c) to the IEM branch -> null temps. Extract _is_open_meteo_record(): source prefixed open_meteo OR (no source AND no issued_at). Real IEM rows always carry an issued_at, so a record missing both can only be legacy OM. +1 regression test.

… codex P2) Codex flagged leakage-safety provenance loss: Phase 20+ OM rows carry a derived issued_at; pre-#67 the IEM branch set fcst_issued via _select_best_run, but after source routing the OM branch left fcst_issued_at None for forecast_source="open_meteo". Restore it as the most-recent OM issued_at at-or-before market close — never leaking a run issued after settlement. +2 regression tests.

…P1 leakage) Codex P1: my provenance fix filtered issued_at only for the displayed timestamp; the OM aggregation still summed window rows issued AFTER market close, leaking their temp/POP/QPF into the training pair (lookahead). Pre-#67 _select_best_run applied this cutoff; the OM branch did not. Now filter window_om by issued_at <= market_close before aggregating temps/POP/QPF/ issued_at. Legacy source-less rows (no issued_at) are kept — nothing to leak. Strengthened test asserts the after-close temp is excluded, not just hidden.

…h() docs (#52) Bump all three packages 1.5.2 -> 1.5.3. - #67 (#69): build_pairs_row() now discriminates OM/IEM forecasts by source instead of issued_at presence, fixing silent null temps + IEM run-selection pollution for Phase 20+ multi-source callers; closes a lookahead-leakage path for after-close OM runs. Parity gate unaffected. - #52 (#70): research() docstring clarifies daily-row return granularity.

github-actions · 2026-06-06T13:07:39Z

✅ Docs-required check: PASS

API-surface change includes docs updates — no reminder needed.

API-surface files changed:

packages-ts/codegen/src/codegen.ts
packages-ts/core/src/schemas/validators/format-map.ts
packages-ts/core/src/schemas/validators/index.ts
packages-ts/core/src/schemas/validators/schema_forecast_nwp_v1.d.ts
packages-ts/core/src/schemas/validators/schema_forecast_nwp_v1.js
packages/core/src/mostlyright/core/schemas/forecast_nwp.py
packages/core/src/mostlyright/research.py
packages/weather/src/mostlyright/weather/_fetchers/_nwp_grids/gfs.py
packages/weather/src/mostlyright/weather/_fetchers/_nwp_grids/hrrr.py
packages/weather/src/mostlyright/weather/_fetchers/_open_meteo.py
packages/weather/src/mostlyright/weather/forecast_nwp.py
packages/weather/src/mostlyright/weather/qc/rules_nwp.py

Docs files changed:

.briefs/cloud-cover-deep-research.md
.briefs/github-issue-63-nwp-fields-review.md
.briefs/github-issue-pairs-source-misclassification.md
.briefs/implementation_plan.md
.briefs/issue-63-review-report.md
.briefs/issue-64-brief.md
.briefs/issue-64-gemini-review-brief.md
.briefs/issue-pairs-source-misclassification.md
.briefs/task.md
.briefs/walkthrough.md
CHANGELOG.md

github-actions · 2026-06-06T13:07:44Z

Parity ticket gate: PASSED

parity-ticket-check: PR touches BOTH Python and TypeScript trigger surfaces — paired-language change satisfies the gate.

See CROSS-SDK-SYNC.md §2 for the workflow.

…riable trim (#64)

, codex P2) Codex P2 on #66: _fetch_open_meteo_range clamped the fetch span to the request's [start, end], so a subrange request wrote an incomplete monthly forecast partition; a later same-month window then read it as a hit and silently dropped the uncached days. Fetch on full-month boundaries (clamped only to today to avoid future dates) so every written elapsed-month partition is complete. Current UTC month is never read-served nor written, so its partialness stays harmless. +1 regression test.

…precip fix (#63)

… P2) Codex P2 on #68: scripts/export_schemas.py exported schema.forecast_nwp.v1 but the TS codegen SCHEMA_FILES list omitted it, so pnpm codegen never emitted a ForecastNwpV1 type/validator and TS consumers stayed unaware of the new public NWP columns (dual-SDK parity gap). Add the schema to the TS codegen list and regenerate: new generated forecast_nwp.v1.ts + ajv validator + barrel/format-map exports. Codegen + core validator tests green.

…-join fix (#67) + docs (#52) Minor release (features added -> 1.6.0, not a patch). Bump all three packages 1.5.3 -> 1.6.0; regenerate uv.lock (codex P2 on #71: lock was stale at 1.5.2/1.5.3). CHANGELOG: fold the never-published 1.5.3 entry into 1.6.0 and add the #63 NWP-fields and #64 OM-cache/throttle feature entries. Bundles PRs #67/#52 (already in merged-vision) + #66/#64 + #68/#63, each with its codex P2 resolved on this integration branch.

…#64, codex P2) Codex P2 on updated #71: the >14-day Previous-Runs chunking path concatenates per-chunk DataFrames, and pd.concat drops df.attrs (same for the Single-Runs boolean clip). Chunked/clipped frames then lacked attrs[source]/[retrieved_at], failing validate_dataframe's source_attr_required for long windows. Restamp the combined frame's provenance from the per-chunk frames (latest retrieved_at). +1 regression test on a 3-chunk window.

, codex P2) Codex P2 on updated #71: the #63 disambiguation silently picked lowest record_no for ANY ambiguous (variable, level), including multiple DISTINCT aggregation windows with no instantaneous record — populating a column from an arbitrary window (silently-wrong NWP data). Restore the loud-fail guard issue #63 intended: _pick_record resolves only the known cases (instantaneous preferred; identical-window twins like the GFS APCP pair) and returns None otherwise, so _extract_records raises GribIntegrityError. Corrected the PR-68 test that had encoded the silently-pick behavior + added a raise-path test.

…dow (#64, codex P2) Codex P2 on updated #71: the weight-aware polite delay scaled num_days by the requested window even for Single-Runs, which ignores start_date/end_date and returns a fixed ~168h horizon from run= in one call. An exact-cycle multi-month or year request therefore slept tens of seconds after a single response. Use a fixed 7-day horizon for the Single-Runs cost; Previous-Runs chunks unchanged. +1 regression test (year window -> single 0.2s sleep, not ~5s).

#63, codex P2) Codex P2 on updated #71: the new ForecastNwpV1 TS export shipped but the four packages-ts package.json files still reported 1.5.2, so release-ts-preflight would reject a vts-1.6.0 publish and npm consumers couldn't receive the new type. Bump @mostlyrightmd/{core,weather,markets} + meta to 1.6.0 (workspace:* cross-deps need no change) and note the dual PyPI+npm bump in the CHANGELOG.

… codex P1) Codex P1: research() now calls fetch_open_meteo(variables=...), a kwarg added in weather 1.6.0. The research extra still allowed weather>=1.0.0, so a core-1.6.0 + weather-1.5.x install would TypeError on research(forecast_source="open_meteo") before fetching. Bump the floor to >=1.6.0,<2.0 and relock.

…uns (#64, codex P2) Codex P2: only Previous-Runs was chunked, but seamless/live also bill by start_date/end_date, so a >14-day seamless window still went out as one high-weight call — the exact rate-limit behavior #64 fixes. Chunk every endpoint except Single-Runs (which sends run= for a fixed horizon). +1 test on a 30-day seamless window.

…codex P1) Codex P1: narrowing the research-extra weather pin to >=1.6.0,<2.0 broke test_core_research_extra_pins_weather_to_active_major, which only accepted >=1.0.0,<2.0. Update the contract to require an active-major (<2.0) pin with a >=1.6.0 floor (the fetch_open_meteo variables= kwarg requirement), accepting any 1.x>=6 floor.

…are (#67, codex P2) Codex P2: rows from fetch_open_meteo(...).to_dict('records') carry pandas Timestamp valid_at/issued_at. The source-routed OM branch compares them to ISO string window bounds -> TypeError for direct callers of build_pairs_row (the research() wrapper stringifies, but the internal builder should be robust). Add _to_iso_z() and normalize the OM rows up front (shallow copies; caller dicts untouched). +1 regression test with pd.Timestamp fields.

…ex P2) Codex P2: raw fetch_open_meteo(...).to_dict('records') rows name the Celsius temp column temp_c (not temperature_c/temperature_f). The source-routed OM aggregation read only the latter two, returning null highs/lows for direct callers. Accept temp_c (°C->F) alongside temperature_c and the temperature_f fallback. Updated the Timestamp regression test to use the canonical temp_c.

zach and others added 14 commits June 4, 2026 20:15

fix(weather): wire forecast cache + weight-aware throttle + variable …

bc5759f

…trim (closes #64)

fix: update uv.lock after pyproject.toml reversion

b89b724

docs(pairs): update module docstring to reflect source-based forecast…

9ca5b48

… split (#67)

Merge PR #69: fix OM/IEM forecast discrimination by source (#67)

237c437

Merge PR #70: clarify research() daily granularity (#52)

b6c98aa

helloiamvu requested a review from Tarabcak June 6, 2026 13:07

helloiamvu added 13 commits June 6, 2026 16:17

Merge PR #66: Open-Meteo forecast cache wiring + weight throttle + va…

dabb364

…riable trim (#64)

Merge PR #68: NWP cloud_cover_pct/visibility_m/cloud_ceiling_m + GFS …

1fcbfdf

…precip fix (#63)

helloiamvu merged commit 4c0bd19 into main Jun 6, 2026
19 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

release(v1.5.3): Open-Meteo forecast-join correctness (#67) + research() docs (#52)#71

release(v1.5.3): Open-Meteo forecast-join correctness (#67) + research() docs (#52)#71
helloiamvu merged 28 commits into
mainfrom
merged-vision

helloiamvu commented Jun 6, 2026

Uh oh!

github-actions Bot commented Jun 6, 2026 •

edited

Loading

Uh oh!

github-actions Bot commented Jun 6, 2026 •

edited

Loading

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

helloiamvu commented Jun 6, 2026

Summary

What's in 1.5.3

Verification

Version

TS Parity

Uh oh!

github-actions Bot commented Jun 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

github-actions Bot commented Jun 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

github-actions Bot commented Jun 6, 2026 •

edited

Loading

github-actions Bot commented Jun 6, 2026 •

edited

Loading