Skip to content

Implement Metadata:from_statx for Linux MetadataExt trait#156269

Open
asder8215 wants to merge 2 commits into
rust-lang:mainfrom
asder8215:statx_metadata
Open

Implement Metadata:from_statx for Linux MetadataExt trait#156269
asder8215 wants to merge 2 commits into
rust-lang:mainfrom
asder8215:statx_metadata

Conversation

@asder8215

@asder8215 asder8215 commented May 7, 2026

Copy link
Copy Markdown
Contributor

This PR implements the Metadata::from_statx for Linux MetadataExt ACP. You can create a std::fs::Metadata from a *const c_void populated by the statx syscall.

The tracking issue for this ACP is here.

I had a couple of questions regarding to what Josh said in the ACP:

We'd also be interested in adding a constant for "the statx flags std normally requests", so that code wanting to request everything std knows about can use those.
^I wanted a little bit more clarity on what Josh meant by this.

  • How should I go about adding in flags/masks for populating the Metadata struct with the *const c_void pointer containing the statx info? I'm assuming it's possible that not all of the fields from our statxbuf will be submitted or given to Metadata stat fields (since I understood that this function is supposed to be flexible), so we risk accessing something not initialized?
  • Should this function return a Result<Metadata>? Should I be checking whether the given *const c_void is a null pointer or not, and returning an Error if it is?

Documentation for this function is a WIP. There's probably a lot more I have to say for this function.

@rustbot rustbot added O-linux Operating system: Linux S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue. labels May 7, 2026
@rustbot

rustbot commented May 7, 2026

Copy link
Copy Markdown
Collaborator

r? @jhpratt

rustbot has assigned @jhpratt.
They will have a look at your PR within the next two weeks and either review your PR or reassign to another reviewer.

Use r? to explicitly pick a reviewer

Why was this reviewer chosen?

The reviewer was selected based on:

  • Owners of files modified in this PR: @ChrisDenton, libs
  • @ChrisDenton, libs expanded to 8 candidates
  • Random selection from Mark-Simulacrum, jhpratt, nia-e

@rust-log-analyzer

This comment has been minimized.

@rust-log-analyzer

This comment has been minimized.

@joshtriplett

Copy link
Copy Markdown
Member

I had a couple of questions regarding to what Josh said in the ACP:

We'd also be interested in adding a constant for "the statx flags std normally requests", so that code wanting to request everything std knows about can use those.
^I wanted a little bit more clarity on what Josh meant by this.

* How should I go about adding in flags/masks for populating the `Metadata` struct with the `*const c_void` pointer containing the statx info? I'm assuming it's possible that not all of the fields from our `statxbuf` will be submitted or given to `Metadata` stat fields (since I understood that this function is supposed to be flexible), so we risk accessing something not initialized?

You can provide a single public const STATX_MASK, of type c_int, for the appropriate mask. It's the caller's responsibility to make sure std doesn't encounter uninitialized data.

* Should this function return a `Result<Metadata>`? Should I be checking whether the given `*const c_void` is a null pointer or not, and returning an `Error` if it is?

It's an unsafe function; you have to pass it a valid pointer. We could panic if we get null, as a precaution, but I'm not sure we should; I think we should just assume the pointer is valid. The function should return a Metadata.

Comment thread library/std/src/os/linux/fs.rs
@asder8215

Copy link
Copy Markdown
Contributor Author

r? libs

Comment on lines +65 to +66
/// Currently [`Metadata::from_statx`] is only supported on Linux platforms with a target
/// environment of GNU.

@Darksonn Darksonn May 20, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was planning to wait for #154981 before implementing this to avoid this clause.

View changes since the review

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got you. I think the order doesn't matter right now since this function would automatically have musl support once #154981 lands as the PR updates the cfg_has_statx macro. The only thing that needs to be updated is the documentation for this function, which I'm down to keep my eyes on #154981 to see it get merged in.

Comment thread library/std/src/os/linux/fs.rs
/// This is the [`statx`] mask expected by [`Metadata::from_statx`], which sets both
/// `STATX_BASIC_STATS` and `STATX_BTIME`. See the [Linux man page] for statx for more
/// details.
///

@Mark-Simulacrum Mark-Simulacrum Jun 7, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, so it seems like we can't really extend this mask in the future if we want more information unless it's available on all Linux versions we support. E.g., if we for some reason wanted to pass STATX_MNT_ID, we couldn't add it here because the caller may not have allocated enough space for it in their statx they pass to the kernel.

I see that the man page on my system at least says that:

Note that, in general, the kernel does not reject values in mask other than the above. (For an exception, see EINVAL in errors.) Instead, it simply informs the caller which values are supported by this kernel and filesystem via the statx.stx_mask field.
[...]
The status information for the target file is returned in the statx structure pointed to by statxbuf. Included in this is stx_mask which indicates what other information has been returned. stx_mask has the same format as the mask argument and bits are set in it to indicate which fields have been filled in.

I'm wondering:

  • Does std check stax.stx_mask today, and if not, should we do so? Perhaps that's a reason to return Result from from_statx if we didn't in fact get what we asked for?
  • Should we provide a minimum allocation size that we 'recommend' for passing to libc/kernel so that we can extend this?

View changes since the review

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the ACP I suggested providing an opaque statx type in the stdlib so that the stdlib can tell the library what size (and alignment) to use for the pointer.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also from what Josh said in the ACP:

We don't want to add the opaque struct statx, because people will want to get at the internal structure (e.g. setting additional mask bits, or reading fields out before converting it). People would end up wanting to transmute, and if we create the struct then we have to manage the size and the memory. (Which might also prevent future clever APIs that use kernel-managed buffers or arrays of statx structures or similar, for instance.)

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm OK landing this as-is if we add unresolved questions to the ACP, but I think we shouldn't stabilize this without a plan for extending this constant. AFAICT, right now we can't really do so (per my previous comment).

I'll nominate for libs-api for now.

@rustbot rustbot added S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Jun 7, 2026
@rustbot

rustbot commented Jun 7, 2026

Copy link
Copy Markdown
Collaborator

Reminder, once the PR becomes ready for a review, use @rustbot ready.

@rustbot

rustbot commented Jun 10, 2026

Copy link
Copy Markdown
Collaborator

⚠️ Warning ⚠️

@rustbot rustbot added S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. and removed S-waiting-on-author Status: This is awaiting some action (such as code changes or more information) from the author. labels Jun 10, 2026
@Mark-Simulacrum Mark-Simulacrum added the I-libs-nominated Nominated for discussion during a libs team meeting. label Jun 21, 2026
stat.st_mtime = (*buf).stx_mtime.tv_sec as libc::time_t;
stat.st_mtime_nsec = (*buf).stx_mtime.tv_nsec as _;
stat.st_ctime = (*buf).stx_ctime.tv_sec as libc::time_t;
stat.st_ctime_nsec = (*buf).stx_ctime.tv_nsec as _;

@Mark-Simulacrum Mark-Simulacrum Jun 21, 2026

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason we can't copy from the passed pointer into our stat for the size that we're expecting to be valid? Presumably that is a fixed 'header' of the passed structure of known size?

I think with either implementation strategy we need some better tests -- I think right now we're just checking that the size is copied? It seems easy to forget to update this in the future.

View changes since the review

@asder8215 asder8215 Jun 21, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I took this from try_statx in sys/fs/unix.rs.

stat.st_dev = libc::makedev(buf.stx_dev_major, buf.stx_dev_minor) as _;
stat.st_ino = buf.stx_ino as libc::ino64_t;
stat.st_nlink = buf.stx_nlink as libc::nlink_t;
stat.st_mode = buf.stx_mode as libc::mode_t;
stat.st_uid = buf.stx_uid as libc::uid_t;
stat.st_gid = buf.stx_gid as libc::gid_t;
stat.st_rdev = libc::makedev(buf.stx_rdev_major, buf.stx_rdev_minor) as _;
stat.st_size = buf.stx_size as off64_t;
stat.st_blksize = buf.stx_blksize as libc::blksize_t;
stat.st_blocks = buf.stx_blocks as libc::blkcnt64_t;
stat.st_atime = buf.stx_atime.tv_sec as libc::time_t;
// `i64` on gnu-x86_64-x32, `c_ulong` otherwise.
stat.st_atime_nsec = buf.stx_atime.tv_nsec as _;
stat.st_mtime = buf.stx_mtime.tv_sec as libc::time_t;
stat.st_mtime_nsec = buf.stx_mtime.tv_nsec as _;
stat.st_ctime = buf.stx_ctime.tv_sec as libc::time_t;
stat.st_ctime_nsec = buf.stx_ctime.tv_nsec as _;

However, just to clarify the stat variable you're referring to is a libc::stat64 while our buf is a libc::statx looking at statx and stat64 struct representation:

pub struct statx {
    pub stx_mask: crate::__u32,
    pub stx_blksize: crate::__u32,
    pub stx_attributes: crate::__u64,
    pub stx_nlink: crate::__u32,
    pub stx_uid: crate::__u32,
    pub stx_gid: crate::__u32,
    pub stx_mode: crate::__u16,
    __statx_pad1: Padding<[crate::__u16; 1]>,
    pub stx_ino: crate::__u64,
    pub stx_size: crate::__u64,
    pub stx_blocks: crate::__u64,
    pub stx_attributes_mask: crate::__u64,
    pub stx_atime: statx_timestamp,
    pub stx_btime: statx_timestamp,
    pub stx_ctime: statx_timestamp,
    pub stx_mtime: statx_timestamp,
    pub stx_rdev_major: crate::__u32,
    pub stx_rdev_minor: crate::__u32,
    pub stx_dev_major: crate::__u32,
    pub stx_dev_minor: crate::__u32,
    pub stx_mnt_id: crate::__u64,
    pub stx_dio_mem_align: crate::__u32,
    pub stx_dio_offset_align: crate::__u32,
    __statx_pad3: Padding<[crate::__u64; 12]>,
}

pub struct stat64 {
    pub st_dev: crate::dev_t,
    pub st_ino: crate::ino64_t,
    pub st_nlink: crate::nlink_t,
    pub st_mode: crate::mode_t,
    pub st_uid: crate::uid_t,
    pub st_gid: crate::gid_t,
    __pad0: Padding<c_int>,
    pub st_rdev: crate::dev_t,
    pub st_size: off_t,
    pub st_blksize: crate::blksize_t,
    pub st_blocks: crate::blkcnt64_t,
    pub st_atime: crate::time_t,
    pub st_atime_nsec: i64,
    pub st_mtime: crate::time_t,
    pub st_mtime_nsec: i64,
    pub st_ctime: crate::time_t,
    pub st_ctime_nsec: i64,
    __reserved: Padding<[i64; 3]>,
}

I don't think we can copy the passed in statx pointer (which we assume the user gave us a pointer with the same representations as libc::statx) directly and create a stat64 object out of this. For example, stat64.st_dev relies on statx.stx_dev_major and statx.stx_dev_minor, which we anyways we need to use libc::makedev to properly create the dev_t value.

Also, Metadata is wrapper around FileAttr, which on unix it's represented as:

pub struct FileAttr {
        stat: stat64,
        statx_extra_fields: Option<StatxExtraFields>,
}

So all the stat64 and statx extra information are copied separately as a result to populate FileAttr.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see, they're different structs, I missed that when skimming the impl. Maybe we can re-use try_statx?

My main worry here is that it seems pretty easy for the two implementations to diverge, which is probably not great -- at best, it's missing features for one of them, at worst unsoundness if we assume something is always initialized (but isn't actually).

I suppose most (all?) of the fields are just integers though so maybe the zero-init makes it mostly harmless...

@asder8215 asder8215 Jun 21, 2026

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think a shared function can be created for both try_statx and this Metadata::from_statx since they both are just creating a FileAttr at the end of the day. I can modify my implementation of FileAttr::from_statx to take in a libc::statx value, do the same thing in populating a stat64 and StatxExtraFields, and return a FileAttr struct out of this.

If that sounds good to you, should I mark the FileAttr::from_statx function unsafe?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it has a safety precondition, then yes. I think it would.

@Mark-Simulacrum Mark-Simulacrum added I-libs-api-nominated Nominated for discussion during a libs-api team meeting. and removed I-libs-nominated Nominated for discussion during a libs team meeting. labels Jun 21, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

I-libs-api-nominated Nominated for discussion during a libs-api team meeting. O-linux Operating system: Linux S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. T-libs Relevant to the library team, which will review and decide on the PR/issue.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants