Load pk data #540

Open
PavanLomati wants to merge 2 commits into humanpred:main from PavanLomati:load_pk_data

@PavanLomati
Contributor

…provides a standardized workflow to load, classify, clean, and preprocess pharmacokinetic (PK) data from multiple file formats prior to NCA analysis.

Key Features

- Automatic detection of concentration and dose datasets
- Flexible column mapping via regex patterns
- Support for both separate and combined datasets
- BLQ handling with interpolation (linear pre-Cmax, log-linear post-Cmax)
- Decimal precision standardization
- Minimal assumptions about input data structure
PavanLomati requested a review from billdenney on May 6, 2026 12:45
Comment thread DESCRIPTION
cowplot,
ggplot2,
ggtibble,
haven,
Member

Can this work by importing only rio, without haven?

Comment thread DESCRIPTION
withr
withr,
rio,
zoo,
Member

I'd like to keep imports to the minimum required. If possible, please omit zoo and janitor.

Comment thread R/load_pk_data.R
#' Streamlines loading, cleaning, and standardisation of PK data from multiple
#' file formats (XPT, XLSX, XLS, CSV, TXT, SAS7BDAT).
#'
#' @param path Character. Directory containing PK files. Default: \code{getwd()}.
Member

Please do not default this to getwd(); relying on the working directory is typically not reproducible.
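
A minimal sketch of what this request could look like, assuming a hypothetical helper named `list_pk_files()` that is not part of the PR. The `path` argument has no default, so the caller must name the data directory explicitly, and the extension list mirrors the one in the PR's documentation.

```r
# Hypothetical helper (not the PR's implementation): list candidate PK files
# in a directory that the caller must supply explicitly. There is deliberately
# no getwd() default.
list_pk_files <- function(path,
                          extensions = c("xpt", "xlsx", "xls", "csv", "txt", "sas7bdat")) {
  if (missing(path)) {
    stop("`path` must be supplied explicitly")
  }
  if (!dir.exists(path)) {
    stop("`path` is not an existing directory: ", path)
  }
  # Regex that matches any of the extensions at the end of the file name
  pattern <- paste0("\\.(", paste(extensions, collapse = "|"), ")$")
  list.files(path, pattern = pattern, ignore.case = TRUE, full.names = TRUE)
}
```

Calling `list_pk_files()` with no argument stops with an error instead of silently reading whatever directory the session happens to be in.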

Comment thread R/load_pk_data.R
#' \code{c("xpt","xlsx","xls","csv","txt","sas7bdat")}.
#' @param patterns Named list. Regex patterns for PK column roles.
#' See \code{\link{get_pk_patterns}}.
#' @param decimal_control Logical. Apply smart decimal formatting? Default \code{TRUE}.
Member

Please define what "smart decimal formatting" means.

Comment thread R/load_pk_data.R
#' @param patterns Named list. Regex patterns for PK column roles.
#' See \code{\link{get_pk_patterns}}.
#' @param decimal_control Logical. Apply smart decimal formatting? Default \code{TRUE}.
#' @param blq_handling Logical. Apply BLQ interpolation? Default \code{TRUE}.
Member

All BLQ handling should occur within PKNCA. The only part that may happen before PKNCA is conversion (e.g., the text "BLQ", "BLOQ", "BQL", etc. may be converted to 0 for the user). If that is the goal, please change "interpolation" to "conversion" and give a more detailed example of what it means in the details section of this code.
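
A minimal sketch of conversion without interpolation, for illustration only. The function name `convert_conc_text()` and both token sets are assumptions, not the PR's code. BLQ text becomes 0, and no-data text becomes NA, following a later comment in this review that "na", "nr", "" and "nd" are not BLQ. Nothing is imputed or interpolated, so PKNCA still sees the BLQ observations.

```r
# Hypothetical helper: convert a text concentration column to numeric.
# Assumed token sets; matching ignores case and surrounding whitespace.
convert_conc_text <- function(x,
                              blq_tokens = c("blq", "bloq", "bql", "lloq"),
                              no_data_tokens = c("na", "nr", "", "nd")) {
  key <- tolower(trimws(as.character(x)))
  # Numeric-looking text parses normally; any other text becomes NA here
  out <- suppressWarnings(as.numeric(key))
  # BLQ markers are converted to 0 so downstream BLQ rules can act on them
  out[key %in% blq_tokens] <- 0
  # No-data markers stay missing rather than being treated as BLQ
  out[key %in% no_data_tokens] <- NA_real_
  out
}

convert_conc_text(c("1.5", "BLQ", "ND", " bql ", ""))
# 1.5 0 NA 0 NA
```

In this sketch, text that is neither numeric nor a listed token also becomes NA. A real loader might prefer to warn about such values.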

Comment thread R/load_pk_data.R
#' detected. Warns if duplicate time values suggest multiple subjects.
#'
#' @keywords internal
auto_create_subject_id <- function(df, verbose = FALSE) {
Member

No subject column should be needed (or created) if no subject is given. Please remove this.

Comment thread R/load_pk_data.R
time_col <- get_mapped_column(data, "time")
conc_col <- get_mapped_column(data, "conc")

blq_strings <- c("blq", "bloq", "bql", "lloq", "na", "nr", "",
Member

"na", "nr", "", and "nd" are no-data indicators, not BLQ. They should be set to NA, not to 0.

Comment thread R/load_pk_data.R
#' Interpolate BLQ Values for a Single Subject
#'
#' @keywords internal
interpolate_subject <- function(sub_df, time_col, conc_col, verbose) {
Member

Please do not interpolate BLQ values. They are important for the PKNCA calculations and are used within the calculation methods based on user preferences there. Please keep the data loading scripts focused on loading the data only.

Comment thread R/load_pk_data.R
#' ensuring the metadata actually persists.
#'
#' @keywords internal
decimal_formatter <- function(df, col_max_map, verbose) {
Member

Please do not reformat the columns, only load the data and categorize the type of data.

Comment thread R/load_pk_data.R
# =============================================================================
# 10. Usage Example (wrapped in if (FALSE) so it never auto-runs)
# =============================================================================
if (FALSE) {
Member

Please do not put examples in the code. Please put this into a vignette.

2 participants