Skip to content

Add check_bfmi function #500

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 32 commits into from
Mar 15, 2022
Merged

Add check_bfmi function #500

merged 32 commits into from
Mar 15, 2022

Conversation

jsocolar
Copy link
Contributor

@jsocolar jsocolar commented May 15, 2021

Submission Checklist

  • Run unit tests
  • Declare copyright holder and agree to license (see below)

Summary

Added function check_bfmi, which takes as input the output of sampler_diagnostics, computes the estimated Bayesian fraction of missing information (E-BFMI) for each chain, and prints a message if any chain has E-BFMI less than 0.3. Note that this uses the threshold of 0.3, consistent with cmdstan's diagnose.cpp (see line 154), but different from the 0.2 threshold used in rstan's check_energy function (see line 259).

Copyright and Licensing

Please list the copyright holder for the work you are submitting
(this will be you or your assignee, such as a university or company):
Jacob B. Socolar

By submitting this pull request, the copyright holder is agreeing to
license the submitted work under the following licenses:

@jsocolar
Copy link
Contributor Author

@codecov-commenter
Copy link

codecov-commenter commented May 15, 2021

Codecov Report

Merging #500 (7d57394) into master (95da08a) will decrease coverage by 1.13%.
The diff coverage is 96.00%.

❗ Current head 7d57394 differs from pull request most recent head 0297c96. Consider uploading reports for the commit 0297c96 to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##           master     #500      +/-   ##
==========================================
- Coverage   93.11%   91.97%   -1.14%     
==========================================
  Files          12       12              
  Lines        3237     3214      -23     
==========================================
- Hits         3014     2956      -58     
- Misses        223      258      +35     
Impacted Files Coverage Δ
R/utils.R 89.63% <95.00%> (-0.65%) ⬇️
R/csv.R 98.21% <100.00%> (-0.45%) ⬇️
R/fit.R 98.28% <100.00%> (+0.01%) ⬆️
R/zzz.R 75.00% <0.00%> (-6.82%) ⬇️
R/install.R 64.68% <0.00%> (-5.35%) ⬇️
R/run.R 94.08% <0.00%> (-1.65%) ⬇️
R/model.R 92.64% <0.00%> (-0.42%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 95da08a...0297c96. Read the comment docs.

Copy link
Member

@jgabry jgabry left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @jsocolar! I made a few small suggestions in review comments, but this looks good. Besides those suggestions we just need a test in https://github.com/stan-dev/cmdstanr/blob/master/tests/testthat/test-utils.R.

We also need to decide how to use this function (but that doesn't necessarily have to be decided before merging this). Currently this is implemented in utils.R but then not used anywhere so there won't actually be any warnings thrown. One option is to add this check to all the places we do the divergence and treedepth checks (that probably makes sense). We could also combine the divergence, treedepth and ebfmi checks into a fit$diagnose() method. Thoughts? Also tagging @rok-cesnovar to get his opinion.

@rok-cesnovar
Copy link
Member

One option is to add this check to all the places we do the divergence and treedepth checks (that probably makes sense).

For now, I think we should add it to the two places where other checks are used. In fact I just pushed a commit that does that.

But then we need to do the fit$diagnose() yes, but that should be done once this is in. See my comment in the issue: #205 (comment)

@rok-cesnovar
Copy link
Member

And thanks Jacob for working on this!

@jgabry
Copy link
Member

jgabry commented May 15, 2021

@rok-cesnovar Any idea why tests are failing? Currently I don't think anything in this PR should affect the tests that used to be passing. I can look into it if it's mysterious to you too.

@rok-cesnovar
Copy link
Member

rok-cesnovar commented May 15, 2021

Any idea why tests are failing?

One cause of failures is the edge case of iter_sampling = 1

cmdstanr_example("logistic", iter_sampling = 1)

I think one issue is var(x) in that case? This is a weird edge case and I am not sure what ebfmi should be in that case.

Currently I don't think anything in this PR should affect the tests that used to be passing.

It will run after sampling in all tests (where validate_csv isnt manually set to FALSE).

@jgabry
Copy link
Member

jgabry commented May 15, 2021

This is a weird edge case and I am not sure what ebfmi should be in that case.

Good point. I don't think we should compute it for 1 iteration.

@jsocolar
Copy link
Contributor Author

Thanks a bunch guys, especially for spotting the iter_sampling = 1 issue (I was mystified about the checks!). I may not have time to take care of this tomorrow (Sunday), but Monday should be doable.

@rok-cesnovar
Copy link
Member

No rush at all! Thanks!

@jgabry
Copy link
Member

jgabry commented May 17, 2021

Just a heads up that I merged in master and fixed a conflict introduced by a different PR that was merged.

@jsocolar
Copy link
Contributor Author

This now passes checks (again) locally

@rok-cesnovar
Copy link
Member

have a function to compute the diagnostic and another function that inputs the result of that and throws warnings if necessary

I like this one.

@jgabry
Copy link
Member

jgabry commented May 21, 2021

would you prefer a fix so that check_ebfmi doesn't error in this case, or a fix so that diagnose doesn't run if there are no iterations to diagnose?

I think it's fine to just not run any of the diagnostics if there are no iterations (we can't diverge or hit max treedepth either in that case). @rok-cesnovar?

@rok-cesnovar
Copy link
Member

Agreed. No sampler diagnostics => no checks.

R/utils.R Outdated
)
if (any(ebfmi < ebfmi_threshold)) {
message(paste0(sum(ebfmi < ebfmi_threshold), " of ", length(ebfmi), " chains had energy-based Bayesian fraction ",
"of missing information (E-BFMI) less than ", ebfmi_threshold, ", which may indicate poor exploration of the ",
Copy link
Member

@jgabry jgabry May 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we last discussed this @martinmodrak made a suggestion in

https://discourse.mc-stan.org/t/issues-with-e-bfmi-missing-doc-confusing-name-etc/22553/17

that we just use the name E-BFMI and don't bother with the full "energy-based Bayesian fraction of missing information", which we all agree is confusing. I think ultimately Martin is probably right, but for this PR I'm ok with either leaving it as or changing it.

@jgabry
Copy link
Member

jgabry commented May 21, 2021

The fourth test failure is mysterious to me as I cannot reproduce locally.

Agree it's mysterious. I have a theory about it that I'm looking into. Will report back shortly.

@jgabry
Copy link
Member

jgabry commented May 21, 2021

Agree it's mysterious. I have a theory about it that I'm looking into. Will report back shortly.

Still mysterious. I thought perhaps it was due to the recently released R 4.1.0, but I just installed it and still can't reproduce the error locally.

@rok-cesnovar
Copy link
Member

The tests should be fixed now. The issues were handling the check when no samples were output and a test that had hard coded order of the sampler diagnostics names (the order has now changed because we retrieve energy__ earlier if we run the checks (previously we only read in treedepth__ and divergences__). Sorry for the force-push, botched the first commit.

@rok-cesnovar
Copy link
Member

I want to work on the fit$diagnose() stuff in the not to distant future and decided to finish the last few things missing here. Also did a bit of a cleanup. @jsocolar hopefully you dont mind.

Changes:

  • pulled out the actual computation. That function will warn if ebfmi can not be computed for any of the reasons (no energy__ column, less than 3 values, NAs in the column)

  • check_ebfmi additionally messages if the computed values are below the provided threshold.

  • the check_ebfmi and ebfmi functions require that the input is in one of the posterior draws formats (if used with sampler diagnostics it always will be). That way we avoid the convert that was there now.

This is ready for another look. No rush, I know its vacation season. I can work on the fit$diagnose() on top of this branch anyways now that its close to the finish line.

@rok-cesnovar rok-cesnovar requested a review from jgabry October 14, 2021 19:09
@jgabry
Copy link
Member

jgabry commented Nov 2, 2021

Sorry for the delay on this. I just pushed a few minor edits and I think this is ready to go.

But I'm wondering if we should wait to merge it until we have #505 so that we have a resource for users when they see these warnings. At the moment I think only a tiny subset of users will have any idea what to do when they see an E-BFMI warning. Not saying we should definitely wait, just wondering out loud.

@rok-cesnovar
Copy link
Member

Definitely fine with waiting. If this becomes the last thing left for 1.0 then we can re-think.

@jgabry
Copy link
Member

jgabry commented Nov 4, 2021

Going to leave this open for now, but we probably won't need to actually merge this since all of @jsocolar's commits are included in #585.

Also, @jsocolar in the branch for PR #585 I'm going to add you to the DESCRIPTION file as a contributor. I realized you're not listed but I think we should list you, both for code contributions (ebfmi , future #565 code, probably other stuff you've done that I don't recall at the moment) and especially for contributing to discussions about the development of the package, which can be just as important as actually contributing code. (If you'd rather not be included let me know and I can undo it.)

@jgabry jgabry merged commit 0297c96 into stan-dev:master Mar 15, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants