Skip to content

feat(rust/sedona-raster-gdal): add RS_MetaData#833

Open
Kontinuation wants to merge 6 commits into
apache:mainfrom
Kontinuation:pr-g-rs-metadata
Open

feat(rust/sedona-raster-gdal): add RS_MetaData#833
Kontinuation wants to merge 6 commits into
apache:mainfrom
Kontinuation:pr-g-rs-metadata

Conversation

@Kontinuation

@Kontinuation Kontinuation commented May 11, 2026

Copy link
Copy Markdown
Member

Summary

  • add RS_MetaData backed by the GDAL dataset provider
  • expose raster dimensions, transform, srid, band count, and GDAL block size metadata
  • keep the change limited to the incremental metadata surface on top of the current GDAL raster work

Dependency

Testing

  • cargo test -p sedona-raster-gdal
  • cargo clippy -p sedona-raster-gdal -- -D warnings
  • cargo test -p sedona-gdal

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copilot AI left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 6 out of 7 changed files in this pull request and generated 4 comments.

Comment thread rust/sedona-raster-gdal/src/rs_metadata.rs
Comment thread rust/sedona-raster-gdal/src/rs_metadata.rs Outdated
Comment thread docs/reference/sql/rs_metadata.qmd Outdated
Comment thread rust/sedona-raster-gdal/src/rs_metadata.rs
@Kontinuation Kontinuation marked this pull request as ready for review June 9, 2026 01:56

@paleolimbot paleolimbot left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

It would be great to get the example to be user runnable (whatever we do here will also help make realistic examples for other functions, too). Other than that these are just nits.

Comment on lines +42 to +46
```sql
SELECT RS_MetaData(
RS_FromPath('../../../submodules/sedona-testing/data/raster/test4.tiff')
);
```

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we use the http:// url here so that this example is runnable by a user? Or enable RS_Example("fixture name")?

Comment on lines +48 to +50
Field::new("upperLeftX", DataType::Float64, true),
Field::new("upperLeftY", DataType::Float64, true),
Field::new("gridWidth", DataType::Int64, true),

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get that we probably have to do this for compatibility with Sedona Spark, but it's a bit of a bummer these aren't snake_case.

let mut num_bands_builder = UInt32Builder::with_capacity(capacity);
let mut tile_width_builder = UInt64Builder::with_capacity(capacity);
let mut tile_height_builder = UInt64Builder::with_capacity(capacity);
let mut struct_validity = Vec::with_capacity(capacity);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you use a NullsBuffer here?

Comment on lines +145 to +157
let srid = match raster.crs() {
None => 0i32,
Some(crs_str) => match deserialize_crs(crs_str) {
Ok(Some(crs_ref)) => crs_ref
.srid()
.ok()
.flatten()
.and_then(|s| i32::try_from(s).ok())
.unwrap_or(0),
_ => 0i32,
},
};
srid_builder.append_value(srid);

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get how this is needed for compatibility, but we should arguably change the SedonaSpark version to return a string here so that we avoid this being lossy for un-sridable tiffs.

Comment on lines +313 to +326
fn invoke_array_result(raster_array: StructArray) -> StructArray {
metadata_udf_tester()
.invoke_array(Arc::new(raster_array))
.unwrap()
.as_struct()
.clone()
}

fn assert_scalar_result(result: ScalarValue) -> Arc<StructArray> {
match result {
ScalarValue::Struct(struct_array) => struct_array,
other => panic!("Expected struct scalar result, got {other:?}"),
}
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For analyze agg we have a JSON based struct approach that could be copied or moved to sedona-testing and reused (but your tests here are quite readable and I'm not sure it's an improvement other than less helper code).

fn actual_result_json(struct_array: Arc<StructArray>) -> Value {
// Create a schema that matches the struct array fields
let fields = struct_array.fields().to_vec();
let schema = Schema::new(fields);
// Create a record batch with just the struct array converted to a column
let columns: Vec<ArrayRef> = struct_array.columns().to_vec();
let record_batch = RecordBatch::try_new(Arc::new(schema), columns).unwrap();
// Serialize to JSON using Arrow's JSON writer
let buf = Vec::new();
let mut writer = ArrayWriter::new(buf);
writer.write_batches(&[&record_batch]).unwrap();
writer.finish().unwrap();
// Get the JSON string
let json_str = String::from_utf8(writer.into_inner()).unwrap();
let json_value: Value = serde_json::from_str(&json_str).unwrap();
// Get the first (and only) row from the JSON array
json_value[0].clone()
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants