[ENH] Add reset index flag to row_to_name function #849

fireddd · 2021-07-20T04:08:59Z

PR Description

Please describe the changes proposed in the pull request:

Add reset index flag to row_to_name function

This PR resolves #285.

PR Checklist

Please ensure that you have done the following:

[x ] PR in from a fork off your branch. Do not PR from <your_username>:dev, but rather from <your_username>:<feature-branch_name>.

[x ] If you're not on the contributors list, add yourself to AUTHORS.rst.

[x ] Add a line to CHANGELOG.md under the latest version header (i.e. the one that is "on deck") describing the contribution.
- Do use some discretion here; if there are multiple PRs that are related, keep them in a single line.

Automatic checks

There will be automatic checks run on the PR. These include:

Building a preview of the docs on Netlify
Automatically linting the code
Making sure the code is documented
Making sure that all tests are passed
Making sure that code coverage doesn't go down.

Relevant Reviewers

Please tag maintainers to review.

janitor/functions.py

ericmjl

@fireddd thanks for your work!

A few things I hope you could fix, please.

The first one is a bit nitpicky, but it's in the interest of keeping the codebase uniform - could you replace all tabs with spaces? I noticed your PR has some tabs and some spaces mixed in, and in the interest of keeping the codebase uniform, I think it might be better to use all spaces.

Secondly, could you add an associated test for the code? This simply guarantees that if the pandas' implementation changes, we get an early warning system in place.

Thanks for handling this PR, much appreciated!

fireddd · 2021-07-20T14:08:54Z

1-> Sure, I will fix the spaces and tabs mixup.

2-> Sorry I am not getting your point. What exactly is needed in the test? The default behavoiur of the tests will change so I missed that in my PR. Thanks for pointing out

fireddd · 2021-07-21T04:29:47Z

@ericmjl I have tried to address the indentation issue. Can you let me know if now it's alright?

ericmjl · 2021-07-21T11:35:36Z

Thanks @fireddd! Yes, spaces/indentation looks correct to me.

For tests, we usually request that any code changes be accompanied by a test function that goes into the test suite.

The pattern usually should be not too difficult to follow, but if you're having issues let us know.

Regarding your question, we use pytest fixtures to obtain the data frames necessary. The reference docs are here: https://docs.pytest.org/en/6.2.x/fixture.html.

I'm in a bit of a rush to type my reply, so if there's something unclear still, firstly, pardon my brevity, and secondly, be sure to ask here!

ericmjl · 2021-07-21T11:36:28Z

Btw, looks like code style checks are failing; could you run black locally and push up?

fireddd · 2021-07-21T12:49:39Z

How can I run these tests locally?

ericmjl · 2021-07-21T13:27:51Z

The dev docs should be able to give you a starting point! (We should have them in our official documentation site.)

fireddd · 2021-07-21T18:46:25Z

I was following this doc-> https://pyjanitor.readthedocs.io/CONTRIBUTION_TYPES.html#code-pattern-guidelines
Tried to run all the tests.
fireddd@LAPTOP-AFM6T731:/mnt/c/Users/shubh/Desktop/Learn Git/pyjanitor$ make test
Running test suite...
pytest --cov-report html
Traceback (most recent call last):
File "/usr/lib/python2.7/dist-packages/_pytest/config.py", line 365, in _importconftest
mod = conftestpath.pyimport()
File "/usr/lib/python2.7/dist-packages/py/_path/local.py", line 668, in pyimport
import(modname)
File "/mnt/c/Users/shubh/Desktop/Learn Git/pyjanitor/tests/conftest.py", line 17
"Bell__Chart": [1.234_523_45, 2.456_234, 3.234_612_5] * 3,
^
SyntaxError: invalid syntax
ERROR: could not load /mnt/c/Users/shubh/Desktop/Learn Git/pyjanitor/tests/conftest.py

Makefile:22: recipe for target 'test' failed
make: *** [test] Error 4
Could this be because i have pyt
fireddd@LAPTOP-AFM6T731:/mnt/c/Users/shubh/Desktop/Learn Git/pyjanitor$ python --version
Python 2.7.15+
fireddd@LAPTOP-AFM6T731:/mnt/c/Users/shubh/Desktop/Learn Git/pyjanitor$ python3 --version
Python 3.6.7
I am using a jupyter notebook that sets by defaults to python3 for dev but for some reason i can't use terminal there so i was trying to use windows bash for testing this

ericmjl · 2021-07-22T11:44:57Z

@fireddd please hang tight! There's another PR open (#851) that is infrastructural in nature that reinstates code coverage. It will give you a way of running tests and get feedback on which lines of code the tests are covering, without needing to run them locally.

In the meantime, you might want to check your PATH environment variable. (If you're unsure about what that is, I wrote a blog post to consolidate what I've learned.)

fireddd · 2021-07-22T19:49:35Z

tried fixing the terminal in my jupyter notebook, was able to do it.
PS C:\Users\shubh\Desktop\Learn Git\pyjanitor> python --version
Python 3.7.1
Installed make as i am on windows.
Following this https://github.com/pyjanitor-devs/pyjanitor/blob/89e3bd4ffe735f16f657ec73da0627e948c8b158/CONTRIBUTING.rst

However still facing some issues:
I am trying to see if i can understand the errors and am trying to fix them.
Can you test now?

fireddd

Now I think this exercise is complete

codecov · 2021-07-23T10:08:45Z

Codecov Report

Merging #849 (3e99789) into dev (e76db2c) will increase coverage by 0.03%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##              dev     #849      +/-   ##
==========================================
+ Coverage   94.86%   94.89%   +0.03%     
==========================================
  Files          19       19              
  Lines        2065     2078      +13     
==========================================
+ Hits         1959     1972      +13     
  Misses        106      106

fireddd · 2021-07-23T10:43:12Z

this is kindoff annoying now :(

ericmjl · 2021-07-23T11:25:07Z

@fireddd thanks for all of your hard work here! Just a few more things and I think we might be able to pull it over the finish line.

Firstly, can you merge in the latest dev branch? We just merged in a PR that I did that helps with some infrastructural matters. That may bump up the code coverage on your PR branch up to the newly set baseline from that PR.

Secondly, could you check that the lines of code that you added are covered by the test suite?

Code test coverage is something we don't compromise on because it is one (even if imperfect) measure of code quality. Ensuring that all newly contributed lines of code are covered by a test makes maintenance much easier.

I might write docs about this at some point, but on "how to write the test", they should be small and easy to understand while accurately replicating real-world usage of the function being tested.

fireddd · 2021-07-23T17:11:05Z

I have added the tests regarding the functionality that i added. I think that should cover the code that i wrote regarding resetting the index. Let's bring this home :)

ericmjl

This is very good work, @fireddd! We have one final thing pertaining to the tests that you've provided. They're in the same mould as the other tests in the same file, but I think we have an opportunity here to radically improve the quality of the test suite, starting with your contribution 😄. Please see the review pointer I made.

Other than that, I have to say that I enjoyed reviewing the PR. It was small in scope, thus making it easy to follow. Though there may have been frustrations with the CI system, we have them there to ensure code quality and maintainability in the long run. Thanks for being patient with it, I know it can be intimidating for a newcomer to the library, but it's also a great learning opportunity on how to use automation to maintain standards. Hope you also learned something along the way!

Final thing, I am going to ask the dev team for one more pair of eyes on the PR to ensure that it's not just me who knows what's going on here. This too is important for maintenance. Hang tight as there may be one or two more review pointers coming your way.

tests/functions/test_row_to_names.py

janitor/functions.py

Co-authored-by: Eric Ma <[email protected]>

… into 285_row_to_names

ericmjl

Wonderful work, @fireddd - approving! Let's get one more pair of eyes on the PR from the team before we merge.

jk3587 · 2021-07-25T14:57:58Z

Before this change, the default is not reset index = True. Wondering why we are making this default to True? This would be a breaking change for those who rely on the index to join to other dataframes.

fireddd · 2021-07-25T15:32:08Z

Before this change, the default is not reset index = True. Wondering why we are making this default to True? This would be a breaking change for those who rely on the index to join to other dataframes.

In the original issue mentioned that having resetting the index by default is a good idea( personally agreeing with it)
#285 ->
In both these cases, when rows are deleted post a column name change operation, there is a good chance of having a bad index.
Sometimes it's exactly what you need, but most of the time it's probably bad news. 🐈
However you have also raised a valid point. @ericmjl

ericmjl

@fireddd one final change requested based on @jk3587's comment about default behaviours. I think the warning message suggested in my review should help disambiguate what a user needs to do to prepare for a breaking change.

ericmjl · 2021-07-25T22:03:39Z

janitor/functions.py

@@ -2289,6 +2293,7 @@ def row_to_names(
    row_number: int = None,
    remove_row: bool = False,
    remove_rows_above: bool = False,
+    reset_index: bool = True,


@fireddd I think @jk3587 brings up a good point and that we should set the default value of reset_index to False.

This isn't because I disagree with the reasons that you stated. It's just that I think we should avoid making breaking changes without warning users.

The best course of action here, then, is to set the value to False and raise a warning for users that says something like:

warnings.warn( "The function row_to_names will, in the official 1.0 release, " "change its behaviour to reset the dataframe's index by default. " "You can prepare for this change right now by explicitly setting " "`reset_index=True` when calling on `row_to_names`." )

(If you copy/paste the message verbatim, please preserve the spacing just before the quotes!)

ericmjl · 2021-07-25T22:12:22Z

@jk3587 could you also mark your review as "approved"? Once I see the final changes from @fireddd, I'd like to invite you to do the honors of hitting the big green button 😄.

ericmjl

@fireddd we're good to go here.

@jk3587 please do the honors of hitting the big green button after you've reviewed!

fireddd added 4 commits July 20, 2021 09:18

Add reset index flag to row_to_name function

205c149

Fix indentation

bc0a48c

Update on indentation fix

57c33d4

Final indentation fix

d9401d2

ericmjl reviewed Jul 20, 2021

View reviewed changes

janitor/functions.py Outdated Show resolved Hide resolved

Indentation fix on docstring

80d5639

ericmjl requested changes Jul 20, 2021

View reviewed changes

fireddd added 2 commits July 21, 2021 09:47

Merge branch 'dev' into 285_row_to_names

42eb45c

Updated indentation fixes

e170aed

fireddd added 3 commits July 22, 2021 00:00

minor fixes for definition

8d25f6e

Minor length fixes

d2712a0

updated definition

922681a

more error fixes

3b9ecd3

fireddd added 2 commits July 23, 2021 01:31

black fixes

654bbf1

Added test for the reset_index functionality of row_to_names

fb99a52

fireddd commented Jul 23, 2021

View reviewed changes

Fix more errors

e90c69e

Merge branch 'dev' into 285_row_to_names

ef454bc

fireddd requested a review from ericmjl July 23, 2021 21:08

fireddd changed the title ~~Add reset index flag to row_to_name function~~ [ENH] Add reset index flag to row_to_name function Jul 23, 2021

ericmjl requested changes Jul 24, 2021

View reviewed changes

tests/functions/test_row_to_names.py Show resolved Hide resolved

janitor/functions.py Outdated Show resolved Hide resolved

fireddd and others added 3 commits July 25, 2021 00:07

update the parameters according to sphinx-style

13f3e38

Co-authored-by: Eric Ma <[email protected]>

Update the tests for row_to_names function

3257eff

Merge branch '285_row_to_names' of https://github.com/fireddd/pyjanitor…

9b1f7aa

… into 285_row_to_names

fireddd requested a review from ericmjl July 24, 2021 21:03

Fix typo in testing

b299e8d

ericmjl approved these changes Jul 25, 2021

View reviewed changes

ericmjl requested review from samukweku, nvamsikrishna05, jk3587 and loganthomas July 25, 2021 14:41

ericmjl requested changes Jul 25, 2021

View reviewed changes

fireddd mentioned this pull request Jul 25, 2021

Clean up tests for row_to_name function #853

Open

fireddd added 2 commits July 31, 2021 13:36

revert the default behaviour of the function

752e1af

Correct the default value in documentation

76fd44e

fireddd requested a review from ericmjl July 31, 2021 08:09

fireddd added 2 commits July 31, 2021 14:01

Fix test while resetting

b77becc

fix the tests

3e99789

ericmjl approved these changes Jul 31, 2021

View reviewed changes

jk3587 approved these changes Jul 31, 2021

View reviewed changes

jk3587 merged commit 6040418 into pyjanitor-devs:dev Jul 31, 2021

[ENH] Add reset index flag to row_to_name function #849

[ENH] Add reset index flag to row_to_name function #849

Uh oh!

Conversation

fireddd commented Jul 20, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

PR Description

PR Checklist

Automatic checks

Relevant Reviewers

Uh oh!

Uh oh!

ericmjl left a comment

Choose a reason for hiding this comment

Uh oh!

fireddd commented Jul 20, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

fireddd commented Jul 21, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ericmjl commented Jul 21, 2021

Uh oh!

ericmjl commented Jul 21, 2021

Uh oh!

fireddd commented Jul 21, 2021

Uh oh!

ericmjl commented Jul 21, 2021

Uh oh!

fireddd commented Jul 21, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ericmjl commented Jul 22, 2021

Uh oh!

fireddd commented Jul 22, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

fireddd left a comment

Choose a reason for hiding this comment

Uh oh!

codecov bot commented Jul 23, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

fireddd commented Jul 23, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ericmjl commented Jul 23, 2021

Uh oh!

fireddd commented Jul 23, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ericmjl left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

ericmjl left a comment

Choose a reason for hiding this comment

Uh oh!

jk3587 commented Jul 25, 2021

Uh oh!

fireddd commented Jul 25, 2021 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

ericmjl left a comment

Choose a reason for hiding this comment

Uh oh!

ericmjl Jul 25, 2021

Choose a reason for hiding this comment

Uh oh!

ericmjl commented Jul 25, 2021

Uh oh!

ericmjl left a comment

Choose a reason for hiding this comment

Uh oh!

Uh oh!

fireddd commented Jul 20, 2021 •

edited

Loading

fireddd commented Jul 20, 2021 •

edited

Loading

fireddd commented Jul 21, 2021 •

edited

Loading

fireddd commented Jul 21, 2021 •

edited

Loading

fireddd commented Jul 22, 2021 •

edited

Loading

codecov bot commented Jul 23, 2021 •

edited

Loading

fireddd commented Jul 23, 2021 •

edited

Loading

fireddd commented Jul 23, 2021 •

edited

Loading

fireddd commented Jul 25, 2021 •

edited

Loading