-
Notifications
You must be signed in to change notification settings - Fork 3.8k
GH-46636: [R] Fix evaluation of external objects not in global environment in case_when()
#46667
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
I don't know the cause of the "Dev" workflow failure, it just says the job was cancelled at some point. Edit: looks related to #46668 |
Thanks for the PR @etiennebacher, much appreciated! OK, this is a bit long-winded, but I'm a bit unsure about something so wanted to spell out my reasoning.
Source of the issue:
Or, are there no unintended consequences likely (as cc @nealrichardson - is this good to go as-is? |
Thanks for the detailed reasoning. I should have probably explained a bit more what I had tried. My first attempt was indeed to use
This doesn't make any sense to me, I tried to play with the
(it's important to do local first, and then global). Here's the output of > waldo::compare(mask_global, mask_local)
`old$.data$x$.:xp:.` is <pointer: 0x000001b2a2bee6d0>
`new$.data$x$.:xp:.` is <pointer: 0x000001b2a2bee330>
`old$.data$x$schema$.:xp:.` is <pointer: 0x000001b2a2bee7d0>
`new$.data$x$schema$.:xp:.` is <pointer: 0x000001b2a2bee3d0>
`old$.data$x$schema$fields[[1]]$.:xp:.` is <pointer: 0x000001b2a2bee350>
`new$.data$x$schema$fields[[1]]$.:xp:.` is <pointer: 0x000001b2a2bee650>
`old$.data$x$schema$fields[[1]]$type$.:xp:.` is <pointer: 0x000001b2a2bee750>
`new$.data$x$schema$fields[[1]]$type$.:xp:.` is <pointer: 0x000001b2a2bee870>
`parent.env(old$.env)` is <env:global>
`parent.env(new$.env)` is <env:0x000001b2a6293438>
`parent.env(old$.top_env)` is <env:global>
`parent.env(new$.top_env)` is <env:0x000001b2a6293438> I see that the parent environment is different, but I can't figure out where this could be fixed. I'm open to suggestions on this, I agree that modifying |
Thanks for the extra context there, that's super helpful! I agree it should probably be in the test suite if anything else could affect it though there's always weird edge cases. I'll give it a few days so other folks get the chance to give it a look in case I'm missing anything here, but if not, happy to merge. Thanks again! |
As long as the new test fails without the fix and it passes with it, I guess so? I don't understand how passing the |
Cool, just checking there wasn't anything "obvious" I was missing. I'll approve then. Cheers @etiennebacher for fixing this! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fantastic, thanks for fixing!
After merging your PR, Conbench analyzed the 4 benchmarking runs that have been run so far on merge-commit 835992c. There were 67 benchmark results with an error:
There were no benchmark performance regressions. 🎉 The full Conbench report has more details. It also includes information about 4 possible false positives for unstable benchmarks that are known to sometimes produce them. |
…environment in `case_when()` (apache#46667) ### Rationale for this change When a script is called in an environment that isn't the global environment (for instance with `source("my-script.R", local = new.env())`, `case_when()` would fail to detect external objects used in conditions. This PR fixes this behavior. Fixes apache#46636 ### What changes are included in this PR? When evaluating expressions in `dplyr` functions, `eval_tidy()` now takes into account `mask` as an environment where it should look for external objects. @ thisisnic suggested in apache#46636 that the bug might be due to https://github.com/apache/arrow/blob/main/r/R/dplyr-funcs-conditional.R#L116 but I couldn't find a way to fix it there. ### Are these changes tested? I added a test for this scenario. I ensured it failed before the change and succeeds after. ### Are there any user-facing changes? There is one user-facing, non-breaking change, illustrated both in the related issue and in the new test. * GitHub Issue: apache#46636 Authored-by: etiennebacher <[email protected]> Signed-off-by: Nic Crane <[email protected]>
Rationale for this change
When a script is called in an environment that isn't the global environment (for instance with
source("my-script.R", local = new.env())
,case_when()
would fail to detect external objects used in conditions.This PR fixes this behavior.
Fixes #46636
What changes are included in this PR?
When evaluating expressions in
dplyr
functions,eval_tidy()
now takes into accountmask
as an environment where it should look for external objects.@thisisnic suggested in #46636 that the bug might be due to https://github.com/apache/arrow/blob/main/r/R/dplyr-funcs-conditional.R#L116 but I couldn't find a way to fix it there.
Are these changes tested?
I added a test for this scenario. I ensured it failed before the change and succeeds after.
Are there any user-facing changes?
There is one user-facing, non-breaking change, illustrated both in the related issue and in the new test.
case_when()
sometimes doesn't find existing objects in the environment #46636