feat(python/sedonadb-expr): Add Python documentation sources based on SQL documentation#862

Open
paleolimbot wants to merge 29 commits into apache:main from paleolimbot:python-expr

paleolimbot (Member) commented May 20, 2026

Like #851 but for Python.

This PR generates a package, sedonadb-expr, at build time from the SQL documentation, so each function carries its documentation inline. Because it is a separate package, users who don't need it don't pay the cost of loading it, which is non-trivial given that there are hundreds of functions and thousands of lines of functions and docs.
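As a rough illustration of the build-time generation described above, here is a minimal sketch that turns one documentation entry into Python source with an inline docstring and checks that the result compiles. The `DOC_ENTRY` layout and the `render_method` helper are hypothetical; the actual generator in this PR parses `.qmd` files and may be structured quite differently.

```python
# Hypothetical, simplified stand-in for one parsed SQL documentation entry.
# The real .qmd layout and the real generator in this PR may differ.
DOC_ENTRY = {
    "sql_name": "st_buffer",
    "method_name": "buffer",
    "args": ["distance"],
    "description": "Return a geometry covering all points within a distance.",
}


def render_method(entry):
    """Render Python source for one accessor method with an inline docstring."""
    params = "".join(f", {arg}" for arg in entry["args"])
    lines = [
        f"def {entry['method_name']}(self{params}):",
        # repr() yields a valid string literal, which Python treats as the docstring
        f"    {entry['description']!r}",
        f"    return self._call({entry['sql_name']!r}{params})",
    ]
    return "\n".join(lines) + "\n"


source = render_method(DOC_ENTRY)

# Compiling the generated text surfaces syntax errors at build time
# instead of at import time.
namespace = {}
exec(compile(source, "<generated>", "exec"), namespace)
```

Because the docstring is part of the generated source, editors and `help()` can show it without any lookup at runtime.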

The pattern here is the "accessor" pattern, which pandas, cuDF, and Esri all use to get nice type completion (I think this was invented slightly after GeoPandas subclassed the GeoSeries).

import sedona.db

sd = sedona.db.connect()
t = sd.funcs.table.sd_random_geometry()
t.select(geom=t.geometry.geo.buffer(10)).show(2)
# ┌──────────────────────────────────────────────────────────────────────────────────────────────────┐
# │                                               geom                                               │
# │                                             geometry                                             │
# ╞══════════════════════════════════════════════════════════════════════════════════════════════════╡
# │ MULTIPOLYGON(((68.9138184644943 64.1641126648927,69.10596566046199 62.21320944473141,69.6750231… │
# ├╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌╌┤
# │ MULTIPOLYGON(((35.236552927729704 78.19983483536834,35.428700123697396 76.24893161520706,35.997… │
# └──────────────────────────────────────────────────────────────────────────────────────────────────┘
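
For readers unfamiliar with the accessor pattern, the following is a minimal, self-contained sketch of the idea, not the code in this PR. The `Expr` and `GeoMethods` classes here are toy stand-ins that build SQL-like strings; only the names `geo`, `_call`, `buffer`, and `as_text` are taken from this PR, and their real signatures are assumptions.

```python
class GeoMethods:
    """Toy namespace object returned by ``Expr.geo`` (hypothetical stand-in)."""

    def __init__(self, expr):
        self._expr = expr

    def buffer(self, distance):
        """Buffer the geometry by ``distance``."""
        return self._expr._call("st_buffer", distance)

    def as_text(self):
        """Return the geometry as well-known text."""
        return self._expr._call("st_astext")


class Expr:
    """Toy expression that only tracks a SQL-like string."""

    def __init__(self, sql):
        self.sql = sql

    def _call(self, name, *args):
        # The expression itself becomes the first argument of the function call.
        rendered = ", ".join([self.sql, *map(str, args)])
        return Expr(f"{name}({rendered})")

    @property
    def geo(self):
        # The accessor groups geometry methods under one attribute, so
        # completion on ``expr.geo.`` lists only geometry functions.
        return GeoMethods(self)

    def __repr__(self):
        return f"Expr({self.sql})"


print(repr(Expr("foofy").geo.as_text()))  # Expr(st_astext(foofy))
```

Keeping the methods on a separate namespace class means the core expression type stays small, while the namespace class can be as large as the function catalogue requires.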

Copilot AI (Contributor) left a comment

Pull request overview

Adds an optional, build-time-generated Python package (sedonadb-expr) that exposes SQL-function documentation and geometry/geography accessor helpers (.geo) for improved IDE completion and inline docs, while keeping the core sedonadb import/load cost low for non-interactive usage.

Changes:

  • Introduces new sedonadb-expr Python package with Hatch build hook + codegen from docs/reference/sql/*.qmd.
  • Adds .geo accessors on Expr, Literal, and Functions to route calls through generated GeoMethods / GeoFunctions.
  • Updates CI to install and test sedonadb-expr alongside sedonadb.
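
One way the optional package could stay off the import path until it is needed is a deferred import inside the accessor property. The sketch below assumes this approach; `load_optional` is a hypothetical helper, and the `GeoMethods(self)` constructor call is an assumption about the generated class rather than something this PR confirms.

```python
import importlib


def load_optional(module_name, attr):
    """Import ``module_name`` on first use and return ``attr`` from it."""
    try:
        module = importlib.import_module(module_name)
    except ModuleNotFoundError as exc:
        raise ImportError(
            f"'{module_name}' is required for this accessor; install it to enable it"
        ) from exc
    return getattr(module, attr)


class Expr:
    """Toy expression type used only to show where the deferred import sits."""

    @property
    def geo(self):
        # Deferred import: the loading cost is paid only when .geo is accessed.
        # Passing ``self`` to the constructor is an assumption of this sketch.
        methods_cls = load_optional("sedonadb_expr", "GeoMethods")
        return methods_cls(self)
```

Python caches imported modules in `sys.modules`, so repeated access to `.geo` pays the import cost only once.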

Reviewed changes

Copilot reviewed 15 out of 15 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| python/sedonadb/tests/expr/test_function_expression.py | Adds tests for piped expressions and new .geo accessors. |
| python/sedonadb/python/sedonadb/functions/__init__.py | Adds Functions.geo property exposing generated geo functions. |
| python/sedonadb/python/sedonadb/expr/literal.py | Adds Literal.geo and internal _call used by generated accessors. |
| python/sedonadb/python/sedonadb/expr/expression.py | Adds Expr.geo and internal _call used by generated accessors. |
| python/sedonadb-expr/tests/test_codegen.py | Tests doc/code generation helpers and validates generated sources compile. |
| python/sedonadb-expr/tests/__init__.py | Initializes test package (license header only). |
| python/sedonadb-expr/README.md | Adds initial package README and installation snippet. |
| python/sedonadb-expr/python/sedonadb_expr/utils.py | Adds MISSING sentinel + trailing-missing argument filtering helper. |
| python/sedonadb-expr/python/sedonadb_expr/_codegen.py | Implements parsing of SQL docs and generation of accessor modules + docstrings. |
| python/sedonadb-expr/python/sedonadb_expr/__init__.py | Exposes GeoFunctions / GeoMethods from generated modules. |
| python/sedonadb-expr/pyproject.toml | Defines new package metadata and Hatch build hook configuration. |
| python/sedonadb-expr/hatch_build.py | Adds Hatch build hook to generate _version.py and _generated/ sources at build time. |
| python/sedonadb-expr/.gitignore | Ignores build-generated files and common Python artifacts. |
| python/sedonadb-expr/_version.py | Provides Hatch version source reading from workspace Cargo.toml. |
| .github/workflows/python.yml | Installs/tests sedonadb-expr and adjusts test/doctest working directories. |
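
The MISSING sentinel and trailing-missing filtering mentioned for utils.py can be sketched roughly as follows. This is an illustration of the general technique, not the helper from this PR: the name `drop_trailing_missing` is hypothetical, and raising on a gap in the middle of the argument list is an assumption of this sketch.

```python
class _Missing:
    """Sentinel type, distinct from None so that None can be a real argument."""

    def __repr__(self):
        return "MISSING"


MISSING = _Missing()


def drop_trailing_missing(args):
    """Return ``args`` as a list without the MISSING values at the end."""
    args = list(args)
    while args and args[-1] is MISSING:
        args.pop()
    # Assumption of this sketch: a gap before a supplied argument is an error,
    # since a positional SQL call cannot skip an argument in the middle.
    if any(arg is MISSING for arg in args):
        raise TypeError("an argument was omitted before a supplied one")
    return args


def buffer(geom, distance=MISSING, params=MISSING):
    # Optional arguments default to MISSING and are trimmed before the call.
    return ("st_buffer", drop_trailing_missing([geom, distance, params]))
```

With this shape, `buffer("g", 10)` forwards two arguments and `buffer("g", None)` still forwards an explicit `None`, which a `None` default could not distinguish from an omitted argument.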


Comment on lines +133 to +143
def test_geo_methods_accessor(con):
    pytest.importorskip("sedonadb_expr")

    # Check piped function from literal via .geo accessor
    e = con.lit(shapely.Point(0, 1)).geo.as_text()
    e = con.lit("POINT (0 1)").funcs.st_geomfromwkt()
    assert repr(e) == 'Expr(st_geomfromwkt(Utf8("POINT (0 1)")))'

    # Check piped function from Expr via .geo accessor
    e = con.col("foofy").geo.as_text()
    assert repr(e) == "Expr(st_astext(foofy))"
Comment thread python/sedonadb-expr/python/sedonadb_expr/_codegen.py
Comment thread python/sedonadb-expr/python/sedonadb_expr/_codegen.py
Comment thread python/sedonadb-expr/hatch_build.py Outdated
@paleolimbot paleolimbot marked this pull request as ready for review June 9, 2026 19:43
2 participants