Skip to content

Stabilize CLI, modularize commands, migrate packaging to uv + pyproject.toml#25

Draft
Copilot wants to merge 2 commits into
masterfrom
copilot/explore-codebase-improvement-plan
Draft

Stabilize CLI, modularize commands, migrate packaging to uv + pyproject.toml#25
Copilot wants to merge 2 commits into
masterfrom
copilot/explore-codebase-improvement-plan

Conversation

Copy link
Copy Markdown

Copilot AI commented May 21, 2026

Several CLI commands were broken or incomplete, core search logic had a string/sequence confusion bug, and the project used a legacy setup.py + requirements.txt stack. This PR addresses all three areas.

Core bug fixes

  • search_by_ids_in_flora: a plain str was iterated as a character sequence — "geo" would match inside query string "geo-nuts". Fixed by normalizing to [str] before the membership test.
  • Flora.__init__: remote flora called self.workdir before it was set; plain str paths were rejected. Both fixed.
  • Flora._convert_to_herb: Herb was constructed without base_path, falling back to a config file lookup that fails in CI. Now passes self.workdir / id.
  • upload / validate: used __CWD__ = Path(__file__).parent (the package directory, evaluated at import time) instead of Path.cwd(). Fixed.
  • validate: called MetaData() with no args (raises); MetaData.validate() returned None — CLI iterated it. Implemented a real structured return: {"data": [{"path": {...}, "name": {...}}]}.
  • add: implementation was cut off at a # TODO. Completed — fetches remote metadata and calls fl.add(Herb(metadata)).
  • Broad Exception(...) raises replaced with ValueError, FileNotFoundError, TypeError, ConnectionError, click.ClickException.

CLI restructuring

Split the 600-line monolithic command.py into focused modules:

Module Commands
dataherb/cmd/flora.py search, download, add, remove
dataherb/cmd/dataset.py create, upload, validate
dataherb/cmd/configure.py configure
dataherb/cmd/serve.py serve

command.py is now ~50 lines — a thin registration file using dataherb.add_command(...).

Tests

35 new tests (43 total):

  • tests/core/test_search.py — exact-match, multi-id, empty, and the string-safety regression
  • tests/core/test_metadata.pyMetaData.load, validate (success/error/missing), create
  • tests/test_cmd/test_cli.py — CLI regression via CliRunner: search, validate, upload abort
  • tests/integration/test_flora.py — add/remove roundtrip, unknown-id returns None, duplicate-ID guard

Packaging migration

Replaced setup.py + requirements*.txt with pyproject.toml (hatchling backend) and a uv.lock lockfile.

# install project + dev deps
uv sync --extra dev

# run tests
uv run pytest

CI updated to use astral-sh/setup-uv@v5 instead of setup-python + manual pip installs.

Also fixed a distutils.dir_util.copy_tree import in save_mkdocs.py that broke on Python 3.12 (module removed); replaced with shutil.copytree(..., dirs_exist_ok=True).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants