Skip to content

[ENH] add preserve_position kwarg to deconcatenate_column #478 #484

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 11 commits into from
Jul 21, 2019

Conversation

shandou
Copy link
Collaborator

@shandou shandou commented Jul 20, 2019

PR Description

This PR adds a boolean kwarg preserve_position to pyjanitor function deconcatenate_column. The default is set to False to avoid introducing breaking changes.

When preserve_position=True
df.columns change from [... id ...] into [... col1, col2, ...]
When preserve_position=False (default)
df.columns change from [... id ...] into [... id ... col1, col2]

This PR resolves #478.

PR Checklist

Please ensure that you have done the following:

  1. PR in from a fork off your branch. Do not PR from <your_username>:master, but rather from <your_username>:<branch_name>.
  1. If you're not on the contributors list, add yourself to AUTHORS.rst.

Quick Check

To do a very quick check that everything is correct, follow these steps below:

  • Run the command make check from pyjanitor's top-level directory. This will automatically run:
    • black formatting
    • pycodestyle checking
    • running the test suite
    • docs build

Once done, please check off the check-box above.

If make check does not work for you, you can execute the commands listed in the Makefile individually.

Code Changes

If you are adding code changes, please ensure the following:

  • Ensure that you have added tests.
  • Run all tests ($ pytest .) locally on your machine.
    • Check to ensure that test coverage covers the lines of code that you have added.
    • Ensure that all tests pass.

Documentation Changes

If you are adding documentation changes, please ensure the following:

  • Build the docs locally.
  • View the docs to check that it renders correctly.

Relevant Reviewers

Please tag maintainers to review.

@review-notebook-app
Copy link

Check out this pull request on ReviewNB: https://app.reviewnb.com/ericmjl/pyjanitor/pull/484

You'll be able to see visual diffs and write comments on notebook cells. Powered by ReviewNB.

@codecov
Copy link

codecov bot commented Jul 20, 2019

Codecov Report

Merging #484 into dev will increase coverage by 0.09%.
The diff coverage is 100%.

@@            Coverage Diff             @@
##              dev     #484      +/-   ##
==========================================
+ Coverage   92.52%   92.61%   +0.09%     
==========================================
  Files           9        9              
  Lines         789      799      +10     
==========================================
+ Hits          730      740      +10     
  Misses         59       59

@shandou
Copy link
Collaborator Author

shandou commented Jul 20, 2019

Oh wow, my apologies--yet to find my way in testing and not yet sure about what is causing the reduction in testing coverage and what remedies I should make.

@ericmjl also have additional questions regarding tests:

  1. Where is the dataframe for testing originally generated and where in the repo can I see how it is loaded for all the pieces in test folder?
  2. When writing tests, what should I be more aware of regarding test coverage?
  3. Is it possible/okay to only test function that involves new changes? (I had been using make test yesterday that ran all the tests; this may not have been necessary?)

I added tests but not certain if my approach is proper. Will dig deeper into TDD but if you have recommendations/suggestions, please let me know. Thanks!!

Copy link
Member

@ericmjl ericmjl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR, @shandou! I've given my review over the code, and I think it's in the right frame of mind, just needs a few touches on the structure of the test and on docstrings. After that, I think it'll be ready to merge.

@ericmjl
Copy link
Member

ericmjl commented Jul 20, 2019

Regarding your questions on tests, here’s my answers:

Where is the dataframe for testing originally generated and where in the repo can I see how it is loaded for all the pieces in test folder?

It is found in tests/conftest.py. All of the test fixtures (i.e. “constants”) that are shared across modules are found there.

When writing tests, what should I be more aware of regarding test coverage?

I would focus on getting the core functionality as well-tested as you can. I’m not worried about coverage as much as getting the test expressing the right thing.

Is it possible/okay to only test function that involves new changes? (I had been using make test yesterday that ran all the tests; this may not have been necessary?)

Yes! To do so, at the command line, run:

$ pytest -k test_myfunction_name  # replace with the correct function name

The -k flag can be discovered by running

$ pytest -h

@shandou
Copy link
Collaborator Author

shandou commented Jul 20, 2019

@ericmjl I made the changes and docs render correctly. When you have a chance please review and let me know if anything else could be improved. Also, thank you so much for all the suggestions and answers!

Another side note about the notebook-based examples on docs:
Because the code cells currently do not allow horizontal scrolling, when code lines run long (it is a layout issue; not because those lines have run over the 80 character limit) they can get chopped off. Are there options to control overflow behaviors for the code cells?

@ericmjl ericmjl changed the title [ENH] Deconcat columns enh #478 [ENH] add preserve_position kwarg to deconcatenate_column #478 Jul 21, 2019
@ericmjl
Copy link
Member

ericmjl commented Jul 21, 2019

Thanks @shandou!

On the issue:

Because the code cells currently do not allow horizontal scrolling, when code lines run long (it is a layout issue; not because those lines have run over the 80 character limit) they can get chopped off. Are there options to control overflow behaviors for the code cells?

Let’s bring this up in a separate GitHub issue, and continue the conversation there.

@@ -0,0 +1,29 @@
import pytest
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’m so sorry, I should have made this clearer in the docs. This test should be in test_deconcatenate_column.py, because all tests for a particular function go in a single .py file. I don’t want to hold up this PR, so I’ll let itgo, and PR back the correct changes into dev, so don’t worry about it 😄.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, sorry about that! (now I know what to do if similar situations are to happen in future). Thanks a lot!
Also, the entire CI process feels like magic~ I'll check if in the scipy talk you have talked about how this was set up, and how the stacks behind it was chosen.

@ericmjl ericmjl merged commit 2ff6499 into pyjanitor-devs:dev Jul 21, 2019
@ericmjl
Copy link
Member

ericmjl commented Jul 21, 2019

Thanks for the PR, @shandou!

@shandou shandou deleted the deconcat_columns_ENH-#478 branch July 21, 2019 01:29
@shandou
Copy link
Collaborator Author

shandou commented Jul 21, 2019

Thanks @shandou!

On the issue:

Because the code cells currently do not allow horizontal scrolling, when code lines run long (it is a layout issue; not because those lines have run over the 80 character limit) they can get chopped off. Are there options to control overflow behaviors for the code cells?

Let’s bring this up in a separate GitHub issue, and continue the conversation there.

@ericmjl Correction: Just checked again and the code cells actually allow horizontal scroll if I do a mouse click on them. So I think we are good (I think users who want to copy/paste code snippets would apply a mouse click) :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[ENH] deconcatenate_column to preserve the original column location
2 participants