refactor: ♻️ update shared primary keys#106
Conversation
| description="The unique identifier of the participant.", | ||
| constraints=sp.ConstraintsProperties(required=True), | ||
| ) | ||
| event_field = sp.FieldProperties( |
There was a problem hiding this comment.
This is still the unique name of the event that we get from the raw data. There is also a numeric event ID, but we would need to look that up separately using another API request.
As for the relationship between events and visits, my understanding is that some events belong to a visit and some don't. So I don't know if adding a visit_id field to all resources makes sense because sometimes this will be empty. There are also forms with consult or week in the name, so that could be equivalent to visit. Maybe event is a broader category (any interaction with the participant) and visit is a particular type of interaction.
There was a problem hiding this comment.
A visit ID should never be empty tho 🤔 Or at least in very few exceptions, like things collected between visits like the adverse events, but that will still have date fields that can line up with the visits. At the least, there should be a visit date.
There was a problem hiding this comment.
What field does this relate to? Would we rename from another field? If so, would be good to put as a comment above the field variable.
There was a problem hiding this comment.
This is from the raw redcap_event_name field, which is given for each row of the raw data. Event is the concept REDCap forms are organised around, visit is one kind of event in this study.
Visit ID can be missing if the row doesn't come from an instrument/form that is associated with a visit. I found about 24 forms like this. I looked at the fields for each to see if there is a date. There are date fields in each (e.g., contact_date, date_dietary_deviation_1, date_diet_consult_1), and some are also clearly associated with a study week or with another type of event (e.g., a consultation or a group meeting).
So we could pull out a date from each form, but it could be a bit faffy because we would need to check which field has the date in each. And then for forms that are associated with a visit, we would need to find the correct visit (based on the form name, event name, or field names) and then find the date for that visit. What do you think?
| ) | ||
| default_fields = [event_field, center_field] | ||
| primary_key = ["event"] | ||
| submission_field = sp.FieldProperties( |
There was a problem hiding this comment.
This is renamed from redcap_repeat_instance. Not all forms can repeat (= be submitted multiple times for the same participant and the same event), but when they do repeat, this number tells apart the submissions. I put it as a string to match the other IDs.
We can either add this everywhere and set it to 1 where the form doesn't repeat, or add it only for forms that can repeat.
There was a problem hiding this comment.
Yea, I think only for forms that repeat.
There was a problem hiding this comment.
Okay, I made it like that!
I included downloading the list of repeating instruments in the metadata download, so I can check which ones repeat.
| constraints=sp.ConstraintsProperties(required=True), | ||
| ) | ||
| default_fields = [participant_field, event_field, submission_field, center_field] | ||
| primary_key = ["participant_id", "event_id", "submission_id"] |
There was a problem hiding this comment.
I didn't add an equivalent to redcap_repeat_instrument here because that should be the resource itself. When we transform the resources (like with VAS), we derive an extra PK component from the original instrument/form name, so that should maintain the distinction between rows coming from different instruments/forms.
| @@ -234,8 +241,15 @@ def _form_to_resource( | |||
| enum=["Copenhagen", "Aarhus", "Odense"], | |||
| ), | |||
| ) | |||
There was a problem hiding this comment.
This field still needs to be more clear what it actually is. How does it relate to visit? Could you try to expand the description to make it clearer what it means? Maybe we can also rename it something clearer 🤔
There was a problem hiding this comment.
My understanding is that an event is a collection of forms/instruments filled in at the same time. I think an event usually corresponds to a patient interaction (e.g., visit, consultation, group meeting, phone call), but it can also be more abstract (e.g., randomisation). So all visits are events but not all events are visits. I think this is why we initially decided not to rename event to visit, but there may be a better name.
@K-Beicher, is this correct?
| description="The unique identifier of the participant.", | ||
| constraints=sp.ConstraintsProperties(required=True), | ||
| ) | ||
| event_field = sp.FieldProperties( |
There was a problem hiding this comment.
A visit ID should never be empty tho 🤔 Or at least in very few exceptions, like things collected between visits like the adverse events, but that will still have date fields that can line up with the visits. At the least, there should be a visit date.
| ) | ||
| default_fields = [event_field, center_field] | ||
| primary_key = ["event"] | ||
| submission_field = sp.FieldProperties( |
There was a problem hiding this comment.
Yea, I think only for forms that repeat.
| description="The unique identifier of the participant.", | ||
| constraints=sp.ConstraintsProperties(required=True), | ||
| ) | ||
| event_field = sp.FieldProperties( |
There was a problem hiding this comment.
What field does this relate to? Would we rename from another field? If so, would be good to put as a comment above the field variable.
There was a problem hiding this comment.
Updated all metadata when I downloaded the list of repeating instruments
There was a problem hiding this comment.
If we add more requests, it will probably be worth generalising some logic here.
Description
This PR updates the primary keys common to all resources.
Based on my understanding of the situation and our discussions, but very open to comments.
I tried to include all my assumptions.
Closes #95
This PR needs an in-depth review.
Checklist
just run-all