[ENH] Minor fixes #833

Merged
merged 6 commits into from
May 20, 2021

Conversation

samukweku
Collaborator

@samukweku samukweku commented May 16, 2021

PR Description

Please describe the changes proposed in the pull request:

  • coalesce now returns the entire dataframe, along with a new column (if target_column_name is not None).
  • uses the bfill(1).ffill(1) syntax, which is faster than reduce(combine_first).
  • adds a default_value parameter to fill any nulls that remain after coalescing.
  • updates update_where to use pd.eval for string-like conditions.
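
For readers skimming the description, here is a rough sketch of the behaviour these bullets describe. It is not the code in this PR: coalesce_sketch and update_where_sketch are hypothetical names, the coalesce part uses only bfill(axis=1) and takes the first column rather than the full bfill(1).ffill(1) chain, the fallback of overwriting the first listed column when no target name is given is an assumption, and string conditions go through DataFrame.eval (pandas' DataFrame-scoped wrapper around pd.eval) rather than pd.eval directly.

```python
import pandas as pd


def coalesce_sketch(df, column_names, target_column_name=None, default_value=None):
    """Hypothetical stand-in for coalesce; not pyjanitor's actual code."""
    # Back-fill along the columns axis, so the first selected column
    # holds the first non-null value found in each row.
    first_non_null = df[list(column_names)].bfill(axis=1).iloc[:, 0]
    if default_value is not None:
        # Fill rows where every selected column was null.
        first_non_null = first_non_null.fillna(default_value)
    # Assumption: with no target name, overwrite the first listed column.
    target = target_column_name or column_names[0]
    # assign returns a new dataframe, so the input is left untouched.
    return df.assign(**{target: first_non_null})


def update_where_sketch(df, conditions, target_column_name, target_val):
    """Hypothetical stand-in for update_where; not pyjanitor's actual code."""
    df = df.copy()
    if isinstance(conditions, str):
        # DataFrame.eval resolves bare column names against df and
        # returns a boolean Series for a comparison expression.
        conditions = df.eval(conditions)
    df.loc[conditions, target_column_name] = target_val
    return df


df = pd.DataFrame({"s1": [1.0, None, None], "s2": [None, 2.0, None]})
coalesced = coalesce_sketch(
    df, ["s1", "s2"], target_column_name="s3", default_value=0
)
updated = update_where_sketch(coalesced, "s3 == 0", "s3", -1.0)
```

In this sketch, coalesced keeps s1 and s2 and gains s3, which should match df.s1.combine_first(df.s2).fillna(0) for this input.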

This PR resolves #831 .

PR Checklist

Please ensure that you have done the following:

  1. PR in from a fork off your branch. Do not PR from <your_username>:dev, but rather from <your_username>:<feature-branch_name>.
  2. If you're not on the contributors list, add yourself to AUTHORS.rst.
  3. Add a line to CHANGELOG.md under the latest version header (i.e. the one that is "on deck") describing the contribution.
    • Do use some discretion here; if there are multiple PRs that are related, keep them in a single line.

Automatic checks

There will be automatic checks run on the PR. These include:

  • Building a preview of the docs on Netlify
  • Automatically linting the code
  • Making sure the code is documented
  • Making sure that all tests are passed
  • Making sure that code coverage doesn't go down.

Relevant Reviewers

Please tag maintainers to review.

@samukweku samukweku self-assigned this May 16, 2021
@codecov-commenter

codecov-commenter commented May 16, 2021

Codecov Report

Merging #833 (2990439) into dev (93203b8) will increase coverage by 0.07%.
The diff coverage is 100.00%.

❗ Current head 2990439 differs from pull request most recent head 828e015. Consider uploading reports for the commit 828e015 to get more accurate results

@@            Coverage Diff             @@
##              dev     #833      +/-   ##
==========================================
+ Coverage   95.26%   95.34%   +0.07%     
==========================================
  Files          19       19              
  Lines        1966     1975       +9     
==========================================
+ Hits         1873     1883      +10     
+ Misses         93       92       -1     

Comment on lines +71 to +75
expected = df.assign(s3=df.s1.combine_first(df.s2).fillna(0))
result = df.coalesce(
    ["s1", "s2"], target_column_name="s3", default_value=0
)
assert_frame_equal(result, expected)
Member

This is a much nicer expression of the test!

Comment on lines -24 to +26
- def test_update_where_query():
-     """Test that function works with pandas query-style string expression."""
-     df = pd.DataFrame(
+ @pytest.fixture
+ def df():
+     return pd.DataFrame(
Member

Very nice refactor!

Member

@ericmjl ericmjl left a comment

Excellent work, @samukweku! Thank you for handling these improvements. I am approving! Will let it simmer for one or two more days before we merge.

@hectormz hectormz merged commit cdfab2e into dev May 20, 2021
@samukweku samukweku deleted the minor_fixes branch February 6, 2022 02:38
Development

Successfully merging this pull request may close these issues.

[BUG] Coalesce returns a Series, instead of a dataframe
4 participants