Upsert keys in .env for benchmarks auth/init instead of appending by kaggle-agent · Pull Request #1037 · Kaggle/kaggle-cli

kaggle-agent · 2026-05-29T18:35:04Z

kaggle b auth and kaggle b init opened .env in append mode, so
every rerun stacked a fresh copy of MODEL_PROXY_* (and LLM_*)
onto the bottom of the file. Two problems: the user's .env grew
without bound, and any value the user had hand-edited got silently
overridden by the appended copy below it (python-dotenv reads the
last occurrence). Switch to dotenv.set_key(..., quote_mode="never")
so each managed key is refreshed in place, preserving comments, blank
lines, and unrelated keys.

Migration note: users with stacked duplicates from prior runs keep
those lines, but set_key rewrites every occurrence to the fresh
value, so no stale token lingers in the file after the first rerun
on this code.

Task: limagoog-20260529183140-af82f3a6
Context: https://chat.kaggle.net/kaggle/pl/46qfted3gbyqdkcpt447cusich

`kaggle b auth` and `kaggle b init` opened `.env` in append mode, so each rerun stacked duplicate MODEL_PROXY_* (and LLM_*) lines. Since python-dotenv reads the first occurrence, the oldest expired token won — and any hand-edited values were silently shadowed. Replace the append with an in-place upsert that preserves comments, blank lines, and unrelated keys, so reruns are idempotent. Co-authored-by: kaggle-agent <kaggle-agent@users.noreply.github.com>

`hatch run lint:fmt` collapsed a couple of adjacent-string blocks in the new upsert tests onto single lines. Formatting-only — no behavior change. Co-authored-by: kaggle-agent <kaggle-agent@users.noreply.github.com>

dolaameng · 2026-05-29T19:33:28Z

+    _ENV_LINE_RE = re.compile(r"^[ \t]*(?:export[ \t]+)?([A-Za-z_][A-Za-z0-9_]*)[ \t]*=")
+
+    @classmethod
+    def _upsert_env_file(cls, env_file_abs, env_vars):


Shall we use existing library like python-dotenv instead of doing it from scratch?

Switched to python-dotenv. Added python-dotenv to dependencies, removed the bespoke _upsert_env_file helper + _ENV_LINE_RE, and replaced the write block with a small loop calling dotenv.set_key(env_file_abs, key, value, quote_mode="never"). quote_mode="never" preserves the existing unquoted output format (so the existing auth/init assertions like MODEL_PROXY_URL=https://... still hold).

dolaameng · 2026-05-29T19:36:55Z

        assert "LLM_DEFAULT=google/gemini-3-flash-preview\n" in content


+# ============================================================


These are too much new test code. We want good coverage but less redundancy. e.g., test_writes_to_empty_file, test_rerunning_auth_replaces_existing_keys``test_preserves_user_added_keys_and_comments are all redundant. Specially if we decide to adopt python-dotenv, which already provides tested expected behavior. Can simplify these?

Trimmed the new tests significantly now that python-dotenv owns the line-by-line semantics. Deleted the whole TestUpsertEnvFile class (8 tests) and the redundant test_rerunning_auth_replaces_existing_keys integration test. Kept only 3 integration tests under TestBenchmarksUpsert that verify the user-facing contract through the actual CLI: stale token gets replaced on rerun, user-added keys + comments survive, and init reruns are idempotent across all six managed keys. Net diff is now ~55 lines of test code instead of ~210.

Per review, replace the bespoke `_upsert_env_file` helper with `dotenv.set_key(..., quote_mode="never")`, which already implements the exact upsert semantics we want. Drop the standalone helper tests now that line-by-line behavior is owned upstream, keeping just three integration tests through the CLI: refresh replaces stale token, user edits survive, and `init` reruns are idempotent. Co-authored-by: kaggle-agent <kaggle-agent@users.noreply.github.com>

dolaameng · 2026-05-29T19:50:28Z

+# ============================================================
+
+
+class TestBenchmarksUpsert:


Shall we add a test for the migration-from-duplicates scenario that's the main user-facing promise.

Added test_first_rerun_after_upgrade_refreshes_stale_duplicates. It seeds a .env with three stacked MODEL_PROXY_API_KEY= lines (mimicking what the old append-based CLI would have left behind), runs kaggle b auth once, and asserts (a) no stale value survives, (b) dotenv_values() — which is what consumers use — resolves to the fresh token, and (c) unrelated keys are untouched. Worth noting: dotenv.set_key refreshes every duplicate occurrence, not just the first, so the migration story is stronger than I'd written in the PR description — the live value is correct after the first rerun even though the duplicate lines themselves remain.

Per review, cover the main user-facing promise: users who ran the old append-based CLI have stacked `MODEL_PROXY_API_KEY=` lines in their `.env`. New test seeds that state, runs auth once, and verifies the fresh token is what `dotenv_values()` resolves to — so the live value is correct on first rerun even though duplicate lines remain. Co-authored-by: kaggle-agent <kaggle-agent@users.noreply.github.com>

dolaameng

LGTM

nl917

LGTM! Thanks

dolaameng · 2026-05-29T20:26:34Z

-            f.write("\n")
-            for key, value in env_vars.items():
-                f.write(f"{key}={value}\n")
+        # Upsert via python-dotenv so reruns refresh in place rather than


Is there comment still accurate? Try to keep it concise

Trimmed the comment from 3 lines to 2. Kept only the non-obvious bit (why upsert vs append: python-dotenv picks the first occurrence, so duplicates would let an expired token shadow the fresh one) and dropped the redundant phrasing.

Per review, drop the redundant phrasing and keep only the non-obvious bit: why upsert vs append. Co-authored-by: kaggle-agent <kaggle-agent@users.noreply.github.com>

dolaameng · 2026-05-29T20:58:12Z

-            f.write("\n")
-            for key, value in env_vars.items():
-                f.write(f"{key}={value}\n")
+        # Upsert (not append): python-dotenv reads the first occurrence, so


I don't think the comment is correct. The purpose of this PR is to keep the .env clean on user side, and also avoid conflicts with user manual editing. not because the new keys shadowing the old ones, because python-dotenv actually reads the last one for duplicates.

Fix this comment (or just say what it does briefly without these details), and also update your PR summary.

You're right — python-dotenv reads the last occurrence, not the first (I verified: dotenv_values on FOO=first\nFOO=middle\nFOO=last\n returns last). So the actual motivation is file hygiene (no stacked duplicates on the user side) and not silently shadowing any value the user has hand-edited. Updated the inline comment to say exactly that, and corrected the test docstring on test_first_rerun_after_upgrade_refreshes_stale_duplicates (its assertions were already correct — set_key refreshes every occurrence, so no stale value lingers — but the prose had the same wrong framing). I'll update the PR summary in the next regeneration step.

I had python-dotenv's duplicate-key semantics backwards. It reads the last occurrence, so the real motivation is keeping the user's .env clean and not silently shadowing hand-edited values — not "the oldest expired token wins". Rewrote the inline comment and the migration-test docstring to match. Co-authored-by: kaggle-agent <kaggle-agent@users.noreply.github.com>

kaggle-agent assigned dolaameng May 29, 2026

dolaameng reviewed May 29, 2026

View reviewed changes

Comment thread src/kaggle/api/kaggle_api_extended.py

Apply black formatting to upsert tests

88b0dbb

`hatch run lint:fmt` collapsed a couple of adjacent-string blocks in the new upsert tests onto single lines. Formatting-only — no behavior change. Co-authored-by: kaggle-agent <kaggle-agent@users.noreply.github.com>

dolaameng reviewed May 29, 2026

View reviewed changes

dolaameng approved these changes May 29, 2026

View reviewed changes

dolaameng requested review from andrewmwang, nl917 and stevemessick May 29, 2026 19:57

nl917 approved these changes May 29, 2026

View reviewed changes

dolaameng reviewed May 29, 2026

View reviewed changes

Tighten upsert comment to one why-line

6c3241c

Per review, drop the redundant phrasing and keep only the non-obvious bit: why upsert vs append. Co-authored-by: kaggle-agent <kaggle-agent@users.noreply.github.com>

dolaameng reviewed May 29, 2026

View reviewed changes

dolaameng merged commit 4fc2429 into main May 29, 2026
14 checks passed

dolaameng deleted the agent/limagoog-20260529183140-af82f3a6 branch May 29, 2026 21:11

		assert "LLM_DEFAULT=google/gemini-3-flash-preview\n" in content


		# ============================================================

		# ============================================================


		class TestBenchmarksUpsert:

Conversation

kaggle-agent commented May 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

dolaameng left a comment

Choose a reason for hiding this comment

Uh oh!

nl917 left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

dolaameng May 29, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

kaggle-agent commented May 29, 2026 •

edited

Loading

dolaameng May 29, 2026 •

edited

Loading