Dependence aware tests for `ppc*ecdf` by florence-bockting · Pull Request #428 · stan-dev/bayesplot

florence-bockting · 2026-03-04T08:43:59Z

Description

The current approach in ppc_loo_pit_ecdf and ppc_pit_ecdf assumes independence of LOO-PIT values, which is not valid (Marhuenda et al., 2005). The corresponding graphical test yields an envelope that is too wide, reducing the test's ability to reveal model miscalibration.
Tesso & Vehtari (2026, see preprint) propose three testing procedures that can handle any dependent uniform values and provide an updated graphical representation that uses color coding to indicate influential regions or the most influential points of the ECDF. This PR implements the new approach (method = "correlated") alongside the previous approach (method = "independent").

TODOs

updated ppc_loo_pit_ecdf() function in ppc-loo.R
updated ppc_pit_ecdf() and ppc_pit_ecdf_grouped() functions in ppc-distributions.R
add unit tests and visual regression tests
update documentation
suggested deprecation path for the old method:
- P1: old method (default) and new method available via method argument
- P2: new method (default) and old method available via method argument
- P3: old method is removed


florence-bockting · 2026-04-07T04:40:44Z

As mentioned in the PR description, the idea is to migrate in multiple stages.
The first stage is non-breaking: users can keep running their existing code unchanged but are informed that the behavior will change in the future. Later stages change the default method and are therefore breaking.

In the first stage, users are currently informed as follows:

# using the "original" code provides a message about the change
> ppc_loo_pit_ecdf(y, yrep, lw)
ℹ In the next major release, the default `method` will change to 'correlated'.
• To silence this message, explicitly set `method = 'independent'` or `method = 'correlated'`.

# explicit use of "independent" method provides information that this method is superseded
> ppc_loo_pit_ecdf(y, yrep, lw, method = "independent")
The 'independent' method is superseded by the 'correlated' method.
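
The stage-one messaging described above could be sketched roughly as follows. This is a minimal illustration only: `resolve_method()` is a hypothetical helper that is not part of the PR, and base `message()` stands in for whatever messaging mechanism bayesplot actually uses.

```r
# Hypothetical helper (not from the PR): resolves `method` and emits the
# stage-one messages quoted above.
resolve_method <- function(method = NULL) {
  if (is.null(method)) {
    # No explicit choice: keep the old default but announce the upcoming change.
    message(
      "In the next major release, the default `method` will change to 'correlated'.\n",
      "To silence this message, explicitly set `method = 'independent'` or ",
      "`method = 'correlated'`."
    )
    return("independent")
  }
  method <- match.arg(method, choices = c("independent", "correlated"))
  if (method == "independent") {
    # Explicit use of the old method: point to its successor.
    message("The 'independent' method is superseded by the 'correlated' method.")
  }
  method
}

resolve_method()               # announces the default change, returns "independent"
resolve_method("independent")  # notes that the method is superseded
resolve_method("correlated")   # silent
```

Keeping the resolution in one place means the later stages (switching the default, then removing the old method) would only need to touch this helper.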

jgabry

Here are a few comments and questions. So far I've only looked at some of the code, so I may have more, but I thought I would give you these now so we could start discussing.

jgabry · 2026-04-08T17:26:49Z

R/ppc-loo.R

+
+      test <- match.arg(test %||% "POT", choices = c("POT", "PRIT", "PIET"))
+      alpha <- 1 - prob
+      gamma <- gamma %||% 0


The doc says "If NULL, automatically determined based on p-value" but this just sets it to 0 if NULL. Is that intentional?

Doesn't this mean that right now by default we highlight every positive contribution point/segment whenever p < alpha? Maybe that's what we want, but that's more than what's documented, right? Or am I misunderstanding? (quite possibly!)

florence-bockting · 2026-04-10

> The doc says "If NULL, automatically determined based on p-value" but this just sets it to 0 if NULL. Is that intentional?

Thanks, this was wrong in the documentation. I updated it: "If NULL (default), gamma is set to 0, and thus all suspicious points are flagged."

jgabry · 2026-04-08T17:32:20Z

R/ppc-loo.R

+      linewidth <- linewidth %||% 0.3
+      color <- color %||% c(ecdf = "grey60", highlight = "red")
+      help_text <- help_text %||% TRUE
+      pareto_pit <- pareto_pit %||% is.null(pit) && test %in% c("POT", "PIET")


I think R will parse this as

(pareto_pit %||% is.null(pit)) && (test %in% c("POT", "PIET"))

and not as

pareto_pit %||% (is.null(pit) && test %in% c("POT", "PIET"))

So if we have

pareto_pit = TRUE
pit = NULL
test = "PRIT"

then I think this will evaluate as TRUE && FALSE -> FALSE.

Do we not want this instead?

pareto_pit <- pareto_pit %||% (is.null(pit) && test %in% c("POT", "PIET"))

This is a bit confusing for me, so I could definitely be wrong!
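
The precedence question can be checked directly. The sketch below defines `%||%` locally (an assumption made so the example does not depend on rlang or a recent R version) and compares the two groupings for the values given above.

```r
# Null-default operator, defined locally for a self-contained example.
`%||%` <- function(x, y) if (is.null(x)) y else x

pareto_pit <- TRUE
pit <- NULL
test <- "PRIT"

# %||% and %in% bind more tightly than &&, so this is
# (pareto_pit %||% is.null(pit)) && (test %in% c("POT", "PIET")).
unparenthesized <- pareto_pit %||% is.null(pit) && test %in% c("POT", "PIET")

# With explicit parentheses, the user-supplied value is respected.
parenthesized <- pareto_pit %||% (is.null(pit) && test %in% c("POT", "PIET"))

print(unparenthesized)  # FALSE
print(parenthesized)    # TRUE
```

All `%any%` operators share one precedence level above the comparison and logical operators, which is why the user's explicit `pareto_pit = TRUE` is overridden in the unparenthesized form.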

florence-bockting · 2026-04-10

Thank you for catching this! I fixed the expression and added corresponding unit tests.
Furthermore, I noticed that I need to handle the case where a user provides both pareto_pit = TRUE and a non-NULL pit. I added an error with the following message:

`pareto_pit = TRUE` cannot be used together with a non-`NULL` `pit` value. Set either `pareto_pit = FALSE` or `pit = NULL`.
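
A check of this kind might look like the sketch below. The function name `check_pareto_pit()` and its signature are hypothetical; only the error text is taken from the comment above, and the PR may implement the check differently.

```r
# Hypothetical validator (name and signature are illustrative, not from the PR).
check_pareto_pit <- function(pareto_pit = NULL, pit = NULL) {
  # Reject the contradictory combination: Pareto-smoothed PIT values are
  # requested, but precomputed PIT values were also supplied.
  if (isTRUE(pareto_pit) && !is.null(pit)) {
    stop(
      "`pareto_pit = TRUE` cannot be used together with a non-`NULL` `pit` value. ",
      "Set either `pareto_pit = FALSE` or `pit = NULL`.",
      call. = FALSE
    )
  }
  invisible(TRUE)
}

check_pareto_pit(pareto_pit = TRUE, pit = NULL)          # passes silently
check_pareto_pit(pareto_pit = FALSE, pit = c(0.2, 0.7))  # passes silently
```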

jgabry · 2026-04-08T17:46:35Z

R/ppc-loo.R

+  y_label <- if (plot_diff) "ECDF difference" else "ECDF"
+
+  if (method == "correlated") {
+    test_res <- posterior::uniformity_test(pit = pit, test = test)


This assumes that a pvalue and pointwise are always returned, but can't uniformity_test error in some cases? E.g.

posterior::uniformity_test(c(0.2, 0.8), "POT")
posterior::uniformity_test(c(0.25, 0.5, 0.75), "POT")
posterior::uniformity_test(0.5, "POT")

How should we handle this? Or are we unlikely to encounter it in the wild, so to speak?

florence-bockting · 2026-04-10

Regarding uniformity_test: "POT" and "PRIT" are based on a truncated Cauchy combination test (CCT). That is, they error when no p-values below 0.5 exist; I am not aware of any condition under which they would return NA. "PIET" is based on an untruncated version of the CCT and can return pvalue = NA under special conditions (which results in an error later on in ppc_*_ecdf).

Below are some edge cases I worked out (I don't know how likely they are to occur "in the wild"):

# fails when test = "PIET" as pvalue=NA
ppc_loo_pit_ecdf(pit = c(0, 1, 0.5), method = "correlated", test = "PIET")
# works when test="POT" or "PRIT" OR method="independent"
ppc_loo_pit_ecdf(pit = c(0, 1, 0.5), method = "correlated", test = "POT")
ppc_loo_pit_ecdf(pit = c(0, 1, 0.5), method = "independent")

# fails when test "POT" or "PRIT" as no pvalues < 0.5 exist
ppc_loo_pit_ecdf(pit = c(0.8, 0.5, 0.3, 0.1), method = "correlated", test = "POT")
# works when test="PIET" OR method="independent"
ppc_loo_pit_ecdf(pit = c(0.8, 0.5, 0.3, 0.1), method = "correlated", test = "PIET")
ppc_loo_pit_ecdf(pit = c(0.8, 0.5, 0.3, 0.1), method = "independent")

Now, I am uncertain what a clean recommendation to users would be from a methodological point of view (@avehtari):

- changing the test type (e.g., from POT to PIET or the other way around)
- allowing the truncation behavior of the test to be changed when it fails
- something else
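
For readers unfamiliar with the failure mode described above, here is a rough sketch of a Cauchy combination of p-values. The untruncated form is the standard one; the truncation rule (keep only p-values below 0.5, then average over the retained values) is an assumption made for illustration and may differ from what `posterior::uniformity_test()` actually does. The function name `cauchy_combination()` is hypothetical.

```r
# Illustrative Cauchy combination of p-values (hypothetical helper).
cauchy_combination <- function(p, truncate = FALSE) {
  if (truncate) {
    # Assumed truncation rule: keep only p-values below 0.5.
    p <- p[p < 0.5]
    if (length(p) == 0) {
      # Mirrors the situation described above: nothing is left to combine.
      stop("No p-values below 0.5; the truncated statistic is undefined.")
    }
  }
  # Map each p-value to a standard Cauchy variate and average them.
  stat <- mean(tan((0.5 - p) * pi))
  # Upper-tail probability of a standard Cauchy at the averaged statistic.
  0.5 - atan(stat) / pi
}

cauchy_combination(c(0.01, 0.20, 0.60))                   # untruncated
cauchy_combination(c(0.01, 0.20, 0.60), truncate = TRUE)  # drops 0.60
```

Under this assumed rule, an input with no p-values below 0.5 leaves nothing to combine, which is one way the error discussed in this thread could arise.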


florence-bockting added 30 commits February 20, 2026 22:39

add helper functions for correlation-aware uniformity test

3e0e568

incorporate in ppc_loo_pit_ecdf correlation-aware uniformity test

c02215c

add tests for updated ppc_loo_pit_ecdf function

0d58176

formatting and efficiency improvements

ee2a3fc

improve description and computation of uniformity tests for pcc_loo_pit_ecdf

583b3e2

add unit-tests for dependence-aware uniformity tests

317a753

do not remove NA when using sort() in pot_test

3afa4ac

move unittest for uniformity tests from test-ppc-loo.R to test-helpers-ppc.R

c8e63fd

improve descripting and naming of function to compute shapley values

609f48f

improve descripting and naming of function to compute shapley values

bbfedc5

add unittest for compute_shapley_values

18f6539

improve computation of Cauchy combination test and std. Cauchy values

2ca91a3

make variable namings more descriptive

d6a18ce

add unittests for Cauchy combination test

8b55c2f

improve efficiency of influential_points_idx

da23cdf

add unittests for influential_points_idx

32a4087

update unittests for ppc_loo_pit_ecdf

7ca5b9b

remove influential_points_idx function

a66358f

remove unittests for influential_points_idx

d62cd06

remove infl_points_only argument, add linewith, color arguments, and improve input validation checks

753053f

minor fixes in plotting behavior (fontsize, linestyle) and input validation

1ae2595

adjust tests to new ppc_loo_pit_ecdf implementation

270f67e

add compute_cauchy function

7489701

transition from -qcauchy(x) to compute_cauchy(x)

22048bd

add unittests for compute_cauchy

8180cb7

minor adjustments for regression-tests

703b9f0

change svg snapshots for regression tests

7f6fb12

improve documentation for arguments linewidth, color in ppc_loo_pit_ecdf

11086f9

update PPC-loo.Rd file with new arguments

5321ac0

add tests for plot_diff=TRUE and method=correlated

478e96e

Merge branch 'master' into dependence-aware-LOO-PIT

c38b4e2

florence-bockting mentioned this pull request Mar 26, 2026

Vignette for updated ppc_loo_pit_ecdf and ppc_pit_ecdf incl. correlated method #513

Open

Florence Bockting and others added 10 commits March 26, 2026 12:24

Merge branch 'dependence-aware-LOO-PIT' of https://github.com/florence-bockting/bayesplot into dependence-aware-LOO-PIT

27f6206

fix: remove custom files from gitignore

cf3f604

chore: update .gitignore to match master

496c435

Merge branch 'master' into dependence-aware-LOO-PIT

1315164

build: update minimum posterior version

5d969d9

feat: update ppc-pit-ecdf-grouped to support correlated method

c2d8d1c

Merge branch 'dependence-aware-LOO-PIT' of https://github.com/florence-bockting/bayesplot into dependence-aware-LOO-PIT

a3d7b13

tests: fix monkey-patching

e086c9c

tests: check for vectorized in monkey-patch

990ab98

style: clean-up code

ebed5b8

florence-bockting changed the title ~~Dependence aware tests for ppc_loo_pit_ecdf~~ Dependence aware tests for ppc_*_ecdf Apr 5, 2026

florence-bockting changed the title ~~Dependence aware tests for ppc_*_ecdf~~ Dependence aware tests for ppc*ecdf Apr 5, 2026

Florence Bockting added 3 commits April 5, 2026 10:28

docs: update documentation

ba836cc

fix: update expected test output

4f0e866

fix: update x-label in loo-pit, independent plot

5e3d077

Florence Bockting and others added 2 commits April 7, 2026 07:56

fix: adjust test to modified x-label

954b2ef

Merge branch 'master' into dependence-aware-LOO-PIT

3b73f35

florence-bockting marked this pull request as ready for review April 7, 2026 05:46

florence-bockting requested review from avehtari and jgabry April 7, 2026 05:47

florence-bockting mentioned this pull request Apr 7, 2026

Vignette for use of ppc*pit_ecdf() with new method argument #527

Draft

1 task

jgabry reviewed Apr 8, 2026


Florence Bockting and others added 5 commits April 10, 2026 12:06

docs: fix description of 'gamma' and add special condition to 'pareto_pit'

e91bde2

fix: update condition for 'pareto_pit', improve input checks, and add corresponding unittests

0afafbe

Merge branch 'dependence-aware-LOO-PIT' of https://github.com/florence-bockting/bayesplot into dependence-aware-LOO-PIT

ed7d6fa

Merge branch 'master' into dependence-aware-LOO-PIT

2cd5577

fix: merge description

2eac7c2

florence-bockting wants to merge 157 commits into stan-dev:master from florence-bockting:dependence-aware-LOO-PIT

florence-bockting replied to each of the three review comments on Apr 10, 2026.


3 participants
