More metadata#52

Merged
danigm merged 8 commits into danigm:master from l1yefeng:more-metadata
May 26, 2025

@l1yefeng
Contributor

Mainly, extract more data from the metadata section, store it using a new struct `MetadataItem`, and change `EpubDoc::metadata` from `HashMap<String, Vec<String>>` to `Vec<MetadataItem>`.

Auxiliary information in the metadata, beyond the values themselves, is often helpful. For example, a reading system may want to use `lang` to render a metadata value in the right font, or to access a creator's `file-as` and `role`.
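To make the shape of the change concrete, here is a minimal sketch of what an item carrying a value plus auxiliary information could look like. The field names (`property`, `value`, `lang`, `refinements`), the `Refinement` type, and the sample data are assumptions for illustration, not the definitions merged in this PR.

```rust
/// Hypothetical refinement attached to a metadata item,
/// e.g. `file-as` or `role` for a creator. Names are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct Refinement {
    pub property: String,
    pub value: String,
}

/// Hypothetical sketch of one metadata entry: the value plus the
/// auxiliary information a reading system may want.
#[derive(Debug, Clone, PartialEq)]
pub struct MetadataItem {
    pub property: String,
    pub value: String,
    /// Language tag, useful for picking a font when rendering `value`.
    pub lang: Option<String>,
    pub refinements: Vec<Refinement>,
}

/// Builds an illustrative `creator` item (made-up sample data).
pub fn sample_creator() -> MetadataItem {
    MetadataItem {
        property: "creator".to_string(),
        value: "Sample Author".to_string(),
        lang: Some("en".to_string()),
        refinements: vec![
            Refinement {
                property: "file-as".to_string(),
                value: "Author, Sample".to_string(),
            },
            Refinement {
                property: "role".to_string(),
                value: "aut".to_string(),
            },
        ],
    }
}
```

With the old `HashMap<String, Vec<String>>` shape, only `"Sample Author"` would have been reachable; the language and the two refinements would have had nowhere to live.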

Addresses #26

P.S. The first commit in this thread is only code formatting, so it is probably easier to compare against master modulo that commit. Reviews and suggestions are appreciated.

l1yefeng and others added 8 commits May 24, 2025 18:33
Struct `MetadataItem` is introduced mainly to support more information
than the data value (e.g., refinements such as "file-as").

The reason to change from HashMap to Vec isn't significant.
Vec is slightly easier to construct and does reflect the original
element better, preserving the data items' order.

In this commit, metadata does not have more or less info.
The incomplete construction for EPUB2 will be completed next,
and the implementation for EPUB3 following that.
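The trade-off between the two container shapes mentioned in this commit message can be sketched as follows. The reduced `MetadataItem`, the helper names `as_map` and `values_of`, and the sample data are hypothetical; the sketch only illustrates that a `Vec` keeps document order while a lookup by property stays a short iterator chain.

```rust
use std::collections::HashMap;

/// Reduced, hypothetical item: just a property name and its value.
#[derive(Debug, Clone, PartialEq)]
pub struct MetadataItem {
    pub property: String,
    pub value: String,
}

fn item(property: &str, value: &str) -> MetadataItem {
    MetadataItem {
        property: property.to_string(),
        value: value.to_string(),
    }
}

/// Illustrative sample in "document order".
pub fn sample() -> Vec<MetadataItem> {
    vec![
        item("title", "Example Book"),
        item("creator", "First Author"),
        item("creator", "Second Author"),
    ]
}

/// Old shape: values grouped by property. The relative order of
/// different properties is not preserved by a HashMap.
pub fn as_map(items: &[MetadataItem]) -> HashMap<String, Vec<String>> {
    let mut map: HashMap<String, Vec<String>> = HashMap::new();
    for it in items {
        map.entry(it.property.clone())
            .or_default()
            .push(it.value.clone());
    }
    map
}

/// New shape: all values of one property, in document order.
pub fn values_of<'a>(items: &'a [MetadataItem], property: &str) -> Vec<&'a str> {
    items
        .iter()
        .filter(|it| it.property == property)
        .map(|it| it.value.as_str())
        .collect()
}
```

The lookup is linear rather than hashed, which is unlikely to matter for the handful of entries a typical metadata section contains.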

Finish the parsing of metadata. One design decision made here is to
reject as little data as possible, so that it supports EPUB files well
even if they use features that do not match the file's declared EPUB version.
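One way to read the "reject as little as possible" decision is that the parser accepts both EPUB2-style `opf:` attributes and EPUB3-style `<meta refines="...">` elements without checking the declared version. The sketch below works on a hypothetical, already-tokenized `RawElement` instead of real XML; every type and function name in it is an assumption and it is not the code from this PR.

```rust
/// Hypothetical, already-tokenized child element of `<metadata>`.
pub struct RawElement {
    pub name: String,
    pub attrs: Vec<(String, String)>,
    pub text: String,
}

/// Hypothetical item shape; refinements are (property, value) pairs.
#[derive(Debug, Clone, PartialEq)]
pub struct MetadataItem {
    pub id: Option<String>,
    pub property: String,
    pub value: String,
    pub refinements: Vec<(String, String)>,
}

/// Convenience constructor for building sample input.
pub fn raw(name: &str, attrs: &[(&str, &str)], text: &str) -> RawElement {
    RawElement {
        name: name.to_string(),
        attrs: attrs.iter().map(|&(k, v)| (k.to_string(), v.to_string())).collect(),
        text: text.to_string(),
    }
}

/// Collects items without looking at the EPUB version at all.
pub fn collect_metadata(elements: &[RawElement]) -> Vec<MetadataItem> {
    let mut items: Vec<MetadataItem> = Vec::new();
    // (target id, property, value) of EPUB3-style refinements, resolved last.
    let mut pending: Vec<(String, String, String)> = Vec::new();

    for el in elements {
        let attr = |key: &str| {
            el.attrs
                .iter()
                .find(|(k, _)| k.as_str() == key)
                .map(|(_, v)| v.clone())
        };
        if el.name == "meta" {
            if let (Some(target), Some(property)) = (attr("refines"), attr("property")) {
                // EPUB3-style refinement of another element.
                let target = target.trim_start_matches('#').to_string();
                pending.push((target, property, el.text.clone()));
            } else if let (Some(name), Some(content)) = (attr("name"), attr("content")) {
                // EPUB2-style <meta name="..." content="...">.
                items.push(MetadataItem { id: attr("id"), property: name, value: content, refinements: Vec::new() });
            } else if let Some(property) = attr("property") {
                // EPUB3-style primary <meta property="...">text</meta>.
                items.push(MetadataItem { id: attr("id"), property, value: el.text.clone(), refinements: Vec::new() });
            }
            continue;
        }
        // Any other element: keep it, and turn EPUB2-style `opf:*`
        // attributes into refinements.
        let refinements = el
            .attrs
            .iter()
            .filter_map(|(k, v)| k.strip_prefix("opf:").map(|p| (p.to_string(), v.clone())))
            .collect();
        let property = el.name.strip_prefix("dc:").unwrap_or(el.name.as_str()).to_string();
        items.push(MetadataItem { id: attr("id"), property, value: el.text.clone(), refinements });
    }

    for (target, property, value) in pending {
        if let Some(it) = items.iter_mut().find(|it| it.id.as_deref() == Some(target.as_str())) {
            it.refinements.push((property, value));
        }
    }
    items
}
```

In this sketch a single creator may carry an `opf:file-as` attribute and also be the target of a `refines` element; both end up as refinements on the same item.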

Word "value" replaces "text" to represent property values.
`mdata()` used to return the value. It now returns the item (reference)
where the value can still be easily accessed via `item.value`.

A similar method `refinement()` is added to `MetadataItem` to query a
refinement.
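The two lookups described in this commit message might look roughly like the sketch below. `Doc` is a stand-in for `EpubDoc`, and the struct fields and exact signatures are assumptions; only the names `mdata`, `refinement`, and `item.value` come from the commit message.

```rust
/// Hypothetical refinement type; field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct MetadataRefinement {
    pub property: String,
    pub value: String,
}

/// Hypothetical item type; field names are assumptions.
#[derive(Debug, Clone, PartialEq)]
pub struct MetadataItem {
    pub property: String,
    pub value: String,
    pub refined: Vec<MetadataRefinement>,
}

impl MetadataItem {
    /// Returns the first refinement with the given property, if any.
    pub fn refinement(&self, property: &str) -> Option<&MetadataRefinement> {
        self.refined.iter().find(|r| r.property == property)
    }
}

/// Stand-in for `EpubDoc`, holding only the metadata list.
pub struct Doc {
    pub metadata: Vec<MetadataItem>,
}

impl Doc {
    /// Returns the first item with the given property, if any.
    /// The value remains reachable as `item.value`.
    pub fn mdata(&self, property: &str) -> Option<&MetadataItem> {
        self.metadata.iter().find(|item| item.property == property)
    }
}

/// Illustrative document with one creator (made-up sample data).
pub fn sample_doc() -> Doc {
    Doc {
        metadata: vec![MetadataItem {
            property: "creator".to_string(),
            value: "Sample Author".to_string(),
            refined: vec![MetadataRefinement {
                property: "file-as".to_string(),
                value: "Author, Sample".to_string(),
            }],
        }],
    }
}
```

A caller that previously used the returned string directly would, under these assumptions, write something like `doc.mdata("creator").map(|item| item.value.as_str())` instead.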
Owner

@danigm danigm left a comment

Looks good to me. The only thing I have to say is that the `fill_metadata` function is getting more and more complex, so it may be worth splitting it up to reduce the nesting depth.

@l1yefeng
Contributor Author

> The only thing that I have to say is that the fill_metadata function is getting more and more complex, so maybe it's worth to split to reduce the depth.

I agree. When doing that, it is perhaps worth converting it into an `extract_metadata` function that does not mutate `EpubDoc`, so that it can be tested more easily. Test cases could then be added without needing an actual EPUB archive. However, I would prefer that to be done in a separate cleanup PR.
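The refactoring idea in this comment can be sketched as a pure function that turns parsed input into items, with the mutating method reduced to a thin wrapper. The input type (a slice of name/text pairs), the `Doc` stand-in for `EpubDoc`, and both signatures are assumptions made for illustration; nothing like this is part of the merged PR.

```rust
/// Reduced, hypothetical item type.
#[derive(Debug, Clone, PartialEq)]
pub struct MetadataItem {
    pub property: String,
    pub value: String,
}

/// Stand-in for `EpubDoc`, holding only the metadata list.
pub struct Doc {
    pub metadata: Vec<MetadataItem>,
}

/// Pure function: (element name, text) pairs in, items out.
/// It touches no archive and no document state, so a unit test
/// can call it with hand-written input.
pub fn extract_metadata(elements: &[(&str, &str)]) -> Vec<MetadataItem> {
    elements
        .iter()
        .map(|&(name, text)| MetadataItem {
            property: name.strip_prefix("dc:").unwrap_or(name).to_string(),
            value: text.trim().to_string(),
        })
        .collect()
}

impl Doc {
    /// Thin mutating wrapper that keeps the existing call site shape.
    pub fn fill_metadata(&mut self, elements: &[(&str, &str)]) {
        self.metadata = extract_metadata(elements);
    }
}
```

Keeping the wrapper means callers inside the document type would not need to change, while the parsing logic itself becomes testable in isolation.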

@danigm danigm merged commit 84221b0 into danigm:master May 26, 2025
6 checks passed