Add check_bfmi function #500

jsocolar · 2021-05-15T02:56:53Z

Submission Checklist

Run unit tests
Declare copyright holder and agree to license (see below)

Summary

Added function check_bfmi, which takes as input the output of sampler_diagnostics, computes the estimated Bayesian fraction of missing information (E-BFMI) for each chain, and prints a message if any chain has E-BFMI less than 0.3. Note that this uses the threshold of 0.3, consistent with cmdstan's diagnose.cpp (see line 154), but different from the 0.2 threshold used in rstan's check_energy function (see line 259).
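
For readers unfamiliar with the quantity, the check described in this summary can be sketched roughly as follows. This is a minimal Python illustration, not the R code in this PR. It assumes the commonly cited definition of E-BFMI (the sum of squared differences between successive energy__ values divided by the sum of squared deviations from the chain mean), and the names `ebfmi` and `low_ebfmi_chains` are hypothetical. The implementations in cmdstan, rstan, and cmdstanr may normalize slightly differently.

```python
def ebfmi(energy):
    """Estimated BFMI for one chain's energy__ draws (hypothetical helper)."""
    n = len(energy)
    if n < 2:
        raise ValueError("need at least two draws")
    mean = sum(energy) / n
    # Sum of squared changes in energy between successive draws.
    numer = sum((energy[i] - energy[i - 1]) ** 2 for i in range(1, n))
    # Sum of squared deviations from the chain mean.
    # A constant chain gives a zero denominator; this sketch does not handle it.
    denom = sum((e - mean) ** 2 for e in energy)
    return numer / denom


def low_ebfmi_chains(chains, threshold=0.3):
    """Return the indices of chains whose E-BFMI is below the threshold."""
    return [i for i, energy in enumerate(chains) if ebfmi(energy) < threshold]


# Made-up energy values, one list per chain.
chains = [
    [1.0, 3.0, 1.0, 3.0, 1.0, 3.0],  # energy changes a lot between draws
    [1.0, 1.1, 1.2, 1.3, 1.4, 1.5],  # energy drifts slowly
]

print(low_ebfmi_chains(chains))                 # threshold 0.3
print(low_ebfmi_chains(chains, threshold=0.2))  # threshold 0.2
```

With these toy values the second chain has an E-BFMI of about 0.29, so it is flagged at the 0.3 threshold but would pass at 0.2, which is the practical difference between the two thresholds mentioned above.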

Copyright and Licensing

Please list the copyright holder for the work you are submitting
(this will be you or your assignee, such as a university or company):
Jacob B. Socolar

By submitting this pull request, the copyright holder is agreeing to
license the submitted work under the following licenses:

Code: BSD 3-clause (https://opensource.org/licenses/BSD-3-Clause)

jsocolar · 2021-05-15T02:57:47Z

See also https://discourse.mc-stan.org/t/ebfmi-in-cmdstanr/22509

codecov-commenter · 2021-05-15T10:30:42Z

Codecov Report

Merging #500 (7d57394) into master (95da08a) will decrease coverage by 1.13%.
The diff coverage is 96.00%.

❗ Current head 7d57394 differs from pull request most recent head 0297c96. Consider uploading reports for the commit 0297c96 to get more accurate results

@@            Coverage Diff             @@
##           master     #500      +/-   ##
==========================================
- Coverage   93.11%   91.97%   -1.14%     
==========================================
  Files          12       12              
  Lines        3237     3214      -23     
==========================================
- Hits         3014     2956      -58     
- Misses        223      258      +35

Impacted Files   Coverage Δ
R/utils.R        89.63% <95.00%> (-0.65%)   ⬇️
R/csv.R          98.21% <100.00%> (-0.45%)  ⬇️
R/fit.R          98.28% <100.00%> (+0.01%)  ⬆️
R/zzz.R          75.00% <0.00%> (-6.82%)    ⬇️
R/install.R      64.68% <0.00%> (-5.35%)    ⬇️
R/run.R          94.08% <0.00%> (-1.65%)    ⬇️
R/model.R        92.64% <0.00%> (-0.42%)    ⬇️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 95da08a...0297c96.

jgabry

Thanks @jsocolar! I made a few small suggestions in review comments, but this looks good. Besides those suggestions we just need a test in https://github.com/stan-dev/cmdstanr/blob/master/tests/testthat/test-utils.R.

We also need to decide how to use this function (but that doesn't necessarily have to be decided before merging this). Currently this is implemented in utils.R but then not used anywhere so there won't actually be any warnings thrown. One option is to add this check to all the places we do the divergence and treedepth checks (that probably makes sense). We could also combine the divergence, treedepth and ebfmi checks into a fit$diagnose() method. Thoughts? Also tagging @rok-cesnovar to get his opinion.

rok-cesnovar · 2021-05-15T17:11:49Z

> One option is to add this check to all the places we do the divergence and treedepth checks (that probably makes sense).

For now, I think we should add it to the two places where the other checks are used. In fact, I just pushed a commit that does that.

We do still need fit$diagnose(), yes, but that should be done once this is in. See my comment in the issue: #205 (comment)

rok-cesnovar · 2021-05-15T17:16:39Z

And thanks Jacob for working on this!

jgabry · 2021-05-15T17:25:15Z

@rok-cesnovar Any idea why tests are failing? Currently I don't think anything in this PR should affect the tests that used to be passing. I can look into it if it's mysterious to you too.

rok-cesnovar · 2021-05-15T18:31:02Z

> Any idea why tests are failing?

One cause of the failures is the edge case of iter_sampling = 1:

cmdstanr_example("logistic", iter_sampling = 1)

I think one issue is var(x), which is NA for a single draw? This is a weird edge case, and I am not sure what the E-BFMI should be there.

> Currently I don't think anything in this PR should affect the tests that used to be passing.

The check will run after sampling in all tests (where validate_csv isn't manually set to FALSE).

jgabry · 2021-05-15T18:35:31Z

> This is a weird edge case and I am not sure what ebfmi should be in that case.

Good point. I don't think we should compute it for 1 iteration.
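
To see why a single iteration is degenerate, here is a small Python sketch (not the package's R code). It assumes E-BFMI is defined as the sum of squared successive energy differences over the sum of squared deviations from the mean, and `ebfmi_terms` is a hypothetical helper that returns the two terms separately.

```python
def ebfmi_terms(energy):
    """Return (numerator, denominator) of the assumed E-BFMI ratio for one chain."""
    n = len(energy)
    mean = sum(energy) / n
    numer = sum((energy[i] - energy[i - 1]) ** 2 for i in range(1, n))
    denom = sum((e - mean) ** 2 for e in energy)
    return numer, denom


# One draw: there are no successive differences and no spread around the
# mean, so both terms are zero and the ratio 0/0 is undefined.
print(ebfmi_terms([5.0]))

# Two distinct draws: under this definition the ratio is the same constant
# whatever the two values are, so it says nothing about the sampler.
numer, denom = ebfmi_terms([1.0, 4.0])
print(numer / denom)
```

Under this assumed definition the numerator and denominator are both zero for one draw, and two draws always give the same ratio, which is consistent with skipping the check for very short chains.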

jsocolar · 2021-05-16T03:26:11Z

Thanks a bunch guys, especially for spotting the iter_sampling = 1 issue (I was mystified about the checks!). I may not have time to take care of this tomorrow (Sunday), but Monday should be doable.

rok-cesnovar · 2021-05-16T08:06:42Z

No rush at all! Thanks!

jgabry · 2021-05-17T01:15:45Z

Just a heads up that I merged in master and fixed a conflict introduced by a different PR that was merged.

jsocolar · 2021-05-20T22:03:44Z

This now passes checks (again) locally.

rok-cesnovar · 2021-05-21T16:19:23Z

> have a function to compute the diagnostic and another function that inputs the result of that and throws warnings if necessary

I like this one.

jgabry · 2021-05-21T16:24:52Z

> would you prefer a fix so that check_ebfmi doesn't error in this case, or a fix so that diagnose doesn't run if there are no iterations to diagnose?

I think it's fine to just not run any of the diagnostics if there are no iterations (we can't diverge or hit max treedepth either in that case). @rok-cesnovar?

rok-cesnovar · 2021-05-21T16:28:19Z

Agreed. No sampler diagnostics => no checks.

jgabry · 2021-05-21T17:04:00Z

R/utils.R

+    )
+    if (any(ebfmi < ebfmi_threshold)) {
+      message(paste0(sum(ebfmi < ebfmi_threshold), " of ", length(ebfmi), " chains had energy-based Bayesian fraction ",
+      "of missing information (E-BFMI) less than ", ebfmi_threshold, ", which may indicate poor exploration of the ", 


Since we last discussed this @martinmodrak made a suggestion in

https://discourse.mc-stan.org/t/issues-with-e-bfmi-missing-doc-confusing-name-etc/22553/17

that we just use the name E-BFMI and don't bother with the full "energy-based Bayesian fraction of missing information", which we all agree is confusing. I think ultimately Martin is probably right, but for this PR I'm ok with either leaving it as is or changing it.

jgabry · 2021-05-21T17:13:22Z

> The fourth test failure is mysterious to me as I cannot reproduce locally.

Agree it's mysterious. I have a theory about it that I'm looking into. Will report back shortly.

jgabry · 2021-05-21T17:24:43Z

> Agree it's mysterious. I have a theory about it that I'm looking into. Will report back shortly.

Still mysterious. I thought perhaps it was due to the recently released R 4.1.0, but I just installed it and still can't reproduce the error locally.

rok-cesnovar · 2021-05-25T10:29:17Z

The tests should be fixed now. The issues were the handling of the check when no samples were output, and a test that had a hard-coded order of the sampler diagnostic names. The order has changed because we now retrieve energy__ earlier when we run the checks (previously we only read in treedepth__ and divergent__). Sorry for the force-push, I botched the first commit.

rok-cesnovar · 2021-07-19T12:54:33Z

I want to work on the fit$diagnose() stuff in the not too distant future and decided to finish the last few things missing here. I also did a bit of cleanup. @jsocolar hopefully you don't mind.

Changes:

Pulled out the actual computation into its own function. That function warns if the E-BFMI cannot be computed for any reason (no energy__ column, fewer than 3 values, NAs in the column).
check_ebfmi additionally emits a message if the computed values are below the provided threshold.
The check_ebfmi and ebfmi functions require the input to be in one of the posterior draws formats (when used with sampler diagnostics it always will be). That way we avoid the conversion that was there until now.
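
The split between computing and checking described in these changes could look roughly like the sketch below. It is written in Python for illustration only and is not the R implementation. The function names mirror the ones mentioned above, but the signatures, the warning texts, the use of `None` for "not computable", and the zero-variance guard are all assumptions of this sketch.

```python
import math
import warnings


def ebfmi(energy):
    """Return E-BFMI for one chain, or None (after a warning) if not computable."""
    if energy is None:
        warnings.warn("E-BFMI not computed: no energy__ values found.")
        return None
    if len(energy) < 3:
        warnings.warn("E-BFMI not computed: fewer than 3 draws.")
        return None
    if any(math.isnan(e) for e in energy):
        warnings.warn("E-BFMI not computed: energy__ contains NaN.")
        return None
    mean = sum(energy) / len(energy)
    numer = sum((b - a) ** 2 for a, b in zip(energy, energy[1:]))
    denom = sum((e - mean) ** 2 for e in energy)
    if denom == 0:
        # Extra guard added for this sketch: constant energy would give 0/0.
        warnings.warn("E-BFMI not computed: energy__ has zero variance.")
        return None
    return numer / denom


def check_ebfmi(chains, threshold=0.3):
    """Compute E-BFMI per chain and print a message if any value is low."""
    values = [ebfmi(energy) for energy in chains]
    computed = [v for v in values if v is not None]
    n_low = sum(v < threshold for v in computed)
    if n_low > 0:
        print(f"{n_low} of {len(computed)} chains had an E-BFMI less than {threshold}.")
    return values
```

Keeping the computation separate means a caller can obtain the raw values without triggering the threshold message, while the check function stays a thin wrapper around it.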

This is ready for another look. No rush, I know it's vacation season. I can work on fit$diagnose() on top of this branch anyway, now that it's close to the finish line.

jgabry · 2021-11-02T15:24:50Z

Sorry for the delay on this. I just pushed a few minor edits and I think this is ready to go.

But I'm wondering if we should wait to merge it until we have #505 so that we have a resource for users when they see these warnings. At the moment I think only a tiny subset of users will have any idea what to do when they see an E-BFMI warning. Not saying we should definitely wait, just wondering out loud.

rok-cesnovar · 2021-11-02T15:56:03Z

Definitely fine with waiting. If this becomes the last thing left for 1.0 then we can re-think.

jgabry · 2021-11-04T21:13:18Z

Going to leave this open for now, but we probably won't need to actually merge this since all of @jsocolar's commits are included in #585.

Also, @jsocolar, in the branch for PR #585 I'm going to add you to the DESCRIPTION file as a contributor. I realized you're not listed, but I think we should list you, both for code contributions (ebfmi, future #565 code, probably other stuff you've done that I don't recall at the moment) and especially for contributing to discussions about the development of the package, which can be just as important as actually contributing code. (If you'd rather not be included, let me know and I can undo it.)

Add check_bfmi function

e70b844

jgabry requested changes May 15, 2021

run check_bfmi after sampling

fdd6d73

jgabry reviewed May 15, 2021

jgabry reviewed May 15, 2021

jgabry added 2 commits May 16, 2021 19:11

merge in master and fix conflict in utils.R

430d631

Merge branch 'master' into master

a571e97

jsocolar and others added 11 commits May 20, 2021 13:23

call var as stats::var

635b66e

Co-authored-by: Jonah Gabry <[email protected]>

better messages and error handling

9d0f895

brought the return inside the if statement

d5411b1

better message

7d7d602

fixed stupid error message typo

3050048

changed fmi to ebfmi everywhere for consistency

fe16508

formatting for consistency

5726402

added tests

039fc28

resolved merge conflict

383e144

better error messages

9634ad7

cleaning up typos from prev commit

d1ba154

realized the ebfmi is meaningless for chains of length 2

1c5cb62

jgabry reviewed May 21, 2021

fix testing issues

5e8e250

rok-cesnovar and others added 6 commits May 25, 2021 13:15

remove niterations call

e130563

Apply suggestions from code review

b1c43a1

Co-authored-by: Jonah Gabry <[email protected]>

cleanup

1a95d76

pull ebfmi compute out, fix tests

4079b59

Merge branch 'master' into master-jsocolar

adaffd6

updated NEWS.md

1cef6c1

rok-cesnovar added 4 commits July 21, 2021 12:00

Merge branch 'master' into master

6be57c9

Merge branch 'master' into master

38be933

Update test-utils.R

c33ebeb

Merge remote-tracking branch 'origin/master' into master-jsocolar

53719ec

# Conflicts: # NEWS.md

rok-cesnovar requested a review from jgabry October 14, 2021 19:09

rok-cesnovar and others added 2 commits November 2, 2021 09:38

Merge remote-tracking branch 'origin/master' into master-jsocolar

1f6bd1d

# Conflicts: # NEWS.md

Merge branch 'master' into pr/500

9c27923

jgabry mentioned this pull request Nov 2, 2021

Replace recommendations in warning messages with link to website #505

Closed

minor edits

0297c96

jgabry mentioned this pull request Nov 4, 2021

New method summarizing sampler diagnostics and warnings #585

Merged

2 tasks

jgabry merged commit 0297c96 into stan-dev:master Mar 15, 2022
