Skip to content

Create new kfold.print method#342

Open
florence-bockting wants to merge 4 commits intostan-dev:masterfrom
florence-bockting:update-loo-print
Open

Create new kfold.print method#342
florence-bockting wants to merge 4 commits intostan-dev:masterfrom
florence-bockting:update-loo-print

Conversation

@florence-bockting
Copy link
Copy Markdown

@florence-bockting florence-bockting commented Mar 27, 2026

Description

In brms (PR#1869), we updated the kfold function such that it now also returns pareto-k diagnostics.

This PR suggests a new kfold.print method that prints additionally to the loo.print output information about pareto-k diagnostics.

TODO

  • add kfold.print method to support pareto-k diagnostics if they exist
  • add test data and unittests to ensure that pareto-k diagnostics are printed correctly with the updated method
  • add unittest that ensures that traditional behavior of kfold.print reduces to loo.print if no diagnostics$pareto_k in kfold object exists

@codecov-commenter
Copy link
Copy Markdown

codecov-commenter commented Mar 27, 2026

Codecov Report

❌ Patch coverage is 91.66667% with 1 line in your changes missing coverage. Please review.
✅ Project coverage is 92.80%. Comparing base (03a6932) to head (9bd9ff0).
⚠️ Report is 52 commits behind head on master.

Files with missing lines Patch % Lines
R/print.R 91.66% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #342      +/-   ##
==========================================
+ Coverage   92.78%   92.80%   +0.02%     
==========================================
  Files          31       31              
  Lines        2992     3004      +12     
==========================================
+ Hits         2776     2788      +12     
  Misses        216      216              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@florence-bockting
Copy link
Copy Markdown
Author

The current behavior shows the following:

If all pareto-k are acceptable:

Based on 10-fold cross-validation.

           Estimate   SE
elpd_kfold   -284.7 10.0
p_kfold         2.4  0.6
kfoldic       569.3 20.1
------

All Pareto k estimates are good (k < 0.7).
See help('pareto-k-diagnostic') for details.

If some pareto-k are problematic:

Based on 10-fold cross-validation.

           Estimate     SE
elpd_kfold  -5521.0  713.1
p_kfold       318.5   97.9
kfoldic     11042.0 1426.3
------

Pareto k diagnostic values:
                         Count Pct.    Min. ESS
(-Inf, 0.7]   (good)     249   95.0%   <NA>    
   (0.7, 1]   (bad)        6    2.3%   <NA>    
   (1, Inf)   (very bad)   7    2.7%   <NA>    
See help('pareto-k-diagnostic') for details.

If no pareto-k diagnostics exist in kfold output structure:

Based on 10-fold cross-validation.

           Estimate     SE
elpd_kfold  -5521.0  713.1
p_kfold       318.5   97.9
kfoldic     11042.0 1426.3

As we have only pareto-k information in the diagnostics the Min. ESS column is always <NA>. Shall we somehow communicate this further to the user @avehtari ? I didn't want to change the default structure of the pareto-k-table, therefore I didn't consider further the option to simply delete this column.

@florence-bockting florence-bockting marked this pull request as ready for review March 27, 2026 13:10
@florence-bockting
Copy link
Copy Markdown
Author

florence-bockting commented Mar 27, 2026

I don't really understand what the issue with the failing R-CMD-check for ubuntu-latest is. Do you have any idea what the problem on my side can be?

@avehtari
Copy link
Copy Markdown
Member

As we have only pareto-k information in the diagnostics the Min. ESS column is always .

Or we could add the pointwise ESS's to diagnostics

@jgabry
Copy link
Copy Markdown
Member

jgabry commented Mar 27, 2026

I don't really understand what the issue with the failing R-CMD-check for ubuntu-latest is. Do you have any idea what the problem on my side can be?

I think there's a bug in r-devel. I'm seeing this with cmdstanr too. I bet it will be fixed soon.

test_that("print.loo supports kfold with pareto-k diagnostics - calibrated", {
kfold1 <- readRDS("data-for-tests/kfold-calibrated.Rds")

expect_output(print(kfold1), "All Pareto k estimates are good")
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could we use expect_snapshot here to test the entire output since it should be deterministic, right? Same with the others below. I know that the existing print tests don't do that, but they were written a long time ago and maybe we should update them to do that too? (not in this PR but at some point)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants