Experimental dataset JSON GET API returns multi-valued fields as arrays #12488
stevenwinship wants to merge 3 commits into
  // Add metadata value to aggregation, suppress array when multiples not allowed
  JsonArray valArray = vals.build();
- return (valArray.size() != 1) ? valArray : valArray.get(0);
+ return (dfType.isAllowMultiples()) ? valArray : valArray.get(0);
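The effect of the changed line can be illustrated with a minimal sketch. It is not the OREMap code: plain Java lists stand in for the JSON array, and the class and method names (`FieldShapeSketch`, `oldShape`, `newShape`) are hypothetical. It only shows that the old rule picks the output shape from the number of values, while the new rule picks it from whether the field type allows multiples.

```java
import java.util.List;

// Hypothetical stand-in for the serialization decision in the diff above.
// Plain Java collections replace the JSON builder types used in the real code.
final class FieldShapeSketch {

    // Old rule: the shape depends on how many values happen to be present,
    // so a multi-valued field with one entry is emitted as a bare value.
    static Object oldShape(List<String> values) {
        return (values.size() != 1) ? values : values.get(0);
    }

    // New rule: the shape depends only on the field type.
    // Assumes at least one value is present when multiples are not allowed.
    static Object newShape(List<String> values, boolean allowMultiples) {
        return allowMultiples ? values : values.get(0);
    }

    public static void main(String[] args) {
        List<String> oneSubject = List.of("Medicine, Health and Life Sciences");
        System.out.println(oldShape(oneSubject));          // bare value
        System.out.println(newShape(oneSubject, true));    // list with one entry
        System.out.println(newShape(List.of("My Dataset"), false)); // bare value
    }
}
```

With the new rule, a client can decide how to read a field from its definition alone, without inspecting the value first.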
If you make changes to the OREMap, you need to update the version as noted at
. This could also break archiving and tools such as DVUploader that can read archival bags to restore datasets. Hopefully those are robust enough to this change, but they presumably have code to parse single values that will now be obsolete.
I will update the version (even though the output really doesn't change) and check DVUploader. DVUploader and anyone using the API should have no issue with the array, since an array is already returned whenever a field has more than one entry.
It looks like DVUploader expects an array. When it finds a String, it converts it to an array, so I believe this code is not affected by this change.
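The pattern described here, accepting either a single value or an array and always working with a list, might look roughly like the sketch below. This is not DVUploader's actual code; the class and method names are hypothetical, and already-parsed Java objects stand in for JSON values.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical consumer-side helper: tolerate both the old output (bare value)
// and the new output (always an array) by normalizing to a list.
final class ValueNormalizerSketch {

    static List<String> asList(Object value) {
        List<String> out = new ArrayList<>();
        if (value instanceof List<?>) {
            // Already an array: copy each entry as a string.
            for (Object item : (List<?>) value) {
                out.add(String.valueOf(item));
            }
        } else if (value != null) {
            // Bare value: wrap it in a single-entry list.
            out.add(String.valueOf(value));
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(asList("Medicine, Health and Life Sciences"));
        System.out.println(asList(List.of("Medicine, Health and Life Sciences")));
    }
}
```

A reader written this way gives the same result for output produced before and after this change.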
Thanks for checking. Thinking some more, I think in most of the places where we read the ORE map, we use JSON-LD tools and canonicalize the format before parsing, which probably handles the conversion already. (That would probably be a best practice for JSON-LD in general.)
190d915 to b8a67ac
There are two affected endpoints. The version-specific one returns JSON-LD (affected by this PR) or JSON (not affected), depending on which MIME type the user requests.
qqmyers left a comment
This looks OK. I edited the release note to indicate that there are two API calls and that one is affected only if JSON-LD is requested.
In QA, I think someone should verify that the PUT will accept the new formatting (so we maintain round-tripping). I think it will, because the PUT uses a JSON-LD library that standardizes what we actually try to parse. (The 'same' JSON-LD can be written with many variants of the @context, so standardizing, or completely parsing as JSON-LD, is necessary for inputs.)
What this PR does / why we need it: For the API call GET /api/datasets/{id}/metadata, the value of a multi-valued field is not returned as an array when only one entry exists. Clients must handle both a single value and an array for the same field, which makes parsing the JSON more complex.
Which issue(s) this PR closes: #9495
Special notes for your reviewer: Since these fields are already returned as arrays when more than one entry exists, clients already have to parse arrays, so there is no break in backward compatibility.
Suggestions on how to test this: Create one dataset with a single subject and another with more than one subject. Examine the JSON output to confirm that the subject value, e.g. "subject": ["Medicine, Health and Life Sciences"], is always wrapped in []. This applies to every field for which DatasetFieldType.isAllowMultiples() is true (for reference, "title" does not allow multiples).
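The manual check above could also be expressed as a small assertion over the parsed response. The sketch below is only an illustration under stated assumptions: the response is assumed to be already parsed into a `Map`, the set of multi-valued field names is supplied by the tester, and the class and method names are hypothetical rather than part of Dataverse.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical QA helper: report multi-valued fields whose value is not a list.
final class ArrayShapeCheckSketch {

    static List<String> fieldsNotArrays(Map<String, Object> metadata,
                                        Set<String> multiValuedFields) {
        List<String> offenders = new ArrayList<>();
        for (String field : multiValuedFields) {
            Object value = metadata.get(field);
            // Absent fields are ignored; present ones must be lists.
            if (value != null && !(value instanceof List<?>)) {
                offenders.add(field);
            }
        }
        return offenders;
    }

    public static void main(String[] args) {
        Map<String, Object> oldStyle = Map.<String, Object>of(
                "title", "My Dataset",
                "subject", "Medicine, Health and Life Sciences");
        Map<String, Object> newStyle = Map.<String, Object>of(
                "title", "My Dataset",
                "subject", List.of("Medicine, Health and Life Sciences"));
        System.out.println(fieldsNotArrays(oldStyle, Set.of("subject")));
        System.out.println(fieldsNotArrays(newStyle, Set.of("subject")));
    }
}
```

An empty result for both the one-subject and the multi-subject dataset would indicate the behavior this PR aims for.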
Does this PR introduce a user interface change? If mockups are available, please link/include them here:
Is there a release notes update needed for this change?: Included
Additional documentation: