Skip to content

Add checker for nested min/max #7550

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

osherdp
Copy link
Contributor

@osherdp osherdp commented Oct 1, 2022

  • Write a good description on what the PR does.
  • Create a news fragment with towncrier create <IssueNumber>.<type> which will be
    included in the changelog. <type> can be one of: new_check, removed_check, extension,
    false_positive, false_negative, bugfix, other, internal. If necessary you can write
    details or offer examples on how the new change is supposed to work.

Type of Changes

Type
✨ New feature

Description

This adds a new checker (not active by default) which identifies usages similar to min(<arg1>, min(<arg2>, <arg3>)) and suggests using a simplified form of min(<arg1>, <arg2>, <arg3>). Same goes for max usage.

The logic is as follows: it detects calls to either min or max functions, and whenever one of their arguments is that same function, it emits the message.

Closes #7546

@coveralls
Copy link

coveralls commented Oct 1, 2022

Pull Request Test Coverage Report for Build 3510087842

  • 40 of 40 (100.0%) changed or added relevant lines in 1 file are covered.
  • 103 unchanged lines in 9 files lost coverage.
  • Overall coverage increased (+0.02%) to 95.428%

Files with Coverage Reduction New Missed Lines %
pylint/init.py 1 97.3%
pylint/pyreverse/main.py 2 92.86%
pylint/checkers/refactoring/refactoring_checker.py 3 98.0%
pylint/checkers/utils.py 7 95.1%
pylint/epylint.py 7 84.13%
pylint/checkers/stdlib.py 8 96.67%
pylint/checkers/base/basic_checker.py 9 97.85%
pylint/checkers/variables.py 32 97.43%
pylint/checkers/typecheck.py 34 96.05%
Totals Coverage Status
Change from base Build 3455918345: 0.02%
Covered Lines: 17471
Relevant Lines: 18308

💛 - Coveralls

@osherdp osherdp force-pushed the feature/nested-min-max-checker branch 2 times, most recently from 79d4ebd to 5dea2aa Compare October 1, 2022 14:46
@@ -0,0 +1,54 @@
# Licensed under the GPL: https://www.gnu.org/licenses/old-licenses/gpl-2.0.html
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can become part of stdlib.py. I don't think it needs to be an extension.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does it mean it will be activated by default?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better if we keep the file as it is before adding to stdlib (I think the base checker is an example of a checker dispathed across multiple files, the checker name is the same in all files).

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn't the splitting across multiple files make pylint slower? There are more functional calls, class initialisations etc. If the file isn't too large we can combine them right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It make everything harder to maintain though. Clearly differentiating checks help tremendously when keeping clean code because it's possible to understand everything about a 50 lines checker, but it's clearly impossible to understand and thus cleanup a 2000 lines stdlib checker behemoth if a function, constant (or more interestingly for performances: condition or calculation) become useless.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, okay. So how would you want to add this then?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we want to make this a default check we can add a new checker (just move the file in pylint/checker) and make it a default checker (register it) with a new name or with an existing checker name.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changed it into a new checker!
Let me know which existing checker name I should use if you have any preference

@Pierre-Sassoulas Pierre-Sassoulas added the Enhancement ✨ Improvement to a component label Oct 1, 2022
@Pierre-Sassoulas Pierre-Sassoulas added this to the 2.16.0 milestone Oct 1, 2022
Copy link
Member

@Pierre-Sassoulas Pierre-Sassoulas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would make the message clearer by stating the actual replacing code. (This could be useful if we want to automatically fix in the future too).

@osherdp osherdp force-pushed the feature/nested-min-max-checker branch 2 times, most recently from 404d124 to e3060e2 Compare October 1, 2022 19:15
@DanielNoord
Copy link
Collaborator

@osherdp Please don't force-push your branch. It makes reviewing more difficult. We will squash merge your changes into the main branch anyway so the commit history in this branch doesn't need to be perfect 😄

@mbyrnepr2
Copy link
Member

I think the doc CI is failing because of missing good/bad examples for this new checker. Existing docs live here: https://github.com/PyCQA/pylint/tree/main/doc/data/messages

Copy link
Collaborator

@DanielNoord DanielNoord left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for moving the checker. Could you also add a good.py and bad.py example?

)

@classmethod
def get_redundant_calls(cls, node: nodes.Call) -> nodes.NodeNG:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
def get_redundant_calls(cls, node: nodes.Call) -> nodes.NodeNG:
def get_redundant_calls(cls, node: nodes.Call) -> list[nodes.Call]:

while len(redundant_calls) > 0:
for i, arg in enumerate(node.args):
if arg in redundant_calls:
node.args = node.args[:i] + arg.args + node.args[i + 1 :]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's not reassign anything on node itself. That changes the ast representation and is a side effect that might not work well with future checkers.

@DanielNoord
Copy link
Collaborator

Hi @osherdp are you still interested in working on this PR?

@osherdp
Copy link
Contributor Author

osherdp commented Nov 10, 2022

Hi @osherdp are you still interested in working on this PR?

Unfortunately I cannot find time to invest in it 😞
If anyone like to take over this work I wouldn't mind

@Pierre-Sassoulas Pierre-Sassoulas added the Needs take over 🛎️ Orignal implementer went away but the code could be salvaged. label Nov 11, 2022
@DanielNoord DanielNoord changed the title Adding optional checker for nested min/max Add checker for nested min/max Nov 13, 2022
@Pierre-Sassoulas
Copy link
Member

Merged upstream/main to get the primer's result.

@DanielNoord
Copy link
Collaborator

I think tests should now pass.

min(min(min(1, 2), 3), 4) # [nested-min-max, nested-min-max]

Raises twice which could be avoided by doing some name and inference look up on the parent of a nodes.Call. However, that would slow down the checker and the message isn't incorrect: there are two simplify-able messages.

Copy link
Member

@Pierre-Sassoulas Pierre-Sassoulas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM ! I'd suggest doing some documentation about the design choices. But feel free to disregard.

@DanielNoord DanielNoord removed the Needs take over 🛎️ Orignal implementer went away but the code could be salvaged. label Nov 13, 2022
@Pierre-Sassoulas
Copy link
Member

I forgot to wait for the primer's results.

osherdp and others added 5 commits November 13, 2022 20:16
This adds a new checker (not active by default) which identifies usages
similar to ``min(<arg1>, min(<arg2>, <arg3>))`` and suggests using a
simplified form of ``min(<arg1>, <arg2>, <arg3>)``. Same goes for
``max`` usage.

The logic is as follows: it detects calls to either ``min`` or ``max``
functions, and whenever one of their arguments is that same function, it
emits the message.
Update pylint/checkers/nested_min_max.py
@Pierre-Sassoulas Pierre-Sassoulas force-pushed the feature/nested-min-max-checker branch from 6abf7b0 to f035c32 Compare November 13, 2022 19:16
@Pierre-Sassoulas Pierre-Sassoulas dismissed DanielNoord’s stale review November 13, 2022 19:17

Took over the MR so approve automatically.

@github-actions

This comment has been minimized.

@Pierre-Sassoulas
Copy link
Member

I'm wondering how to make max_len = max(max(self.adj.len(x) for x in fmt_values), header_colwidth) better ? max_len = max(self.adj.len(x) for x in fmt_values) + [header_colwidth]) ? Is there another way ? I feel like it might be a false positive ?

@DanielNoord
Copy link
Collaborator

So, that would necessitate stopping the warning when there is a Comprehension inside the max call right?

@Pierre-Sassoulas
Copy link
Member

Maybe, but I'm not even sure it's a false positive tbh.

@DanielNoord
Copy link
Collaborator

I think the only code that works is:
max_len = max([self.adj.len(x) for x in fmt_values] + [header_colwidth])

That doesn't seem very nice to read, but that might just be personal preference.

@Pierre-Sassoulas
Copy link
Member

I have the intuition that this is better performance-wise as there is only one max call. If that's the case it would not be a false positive. But the solution is not super obvious I had to think about for longer than what pylint user are ready to invest imo.

@DanielNoord
Copy link
Collaborator

So, fix? Or keep it as a feature? 😄

@Pierre-Sassoulas
Copy link
Member

I was counting on you @jacobtylerwalls @mbyrnepr2 or @osherdp to take the decision for me 😄

Let's vote !

This is a false positive, we need to fix: : 👀
Let's keep it as a feature: 🚀

@nickdrozd
Copy link
Contributor

The proposed fix uses one less max call, but adds a list allocation. That could be slower, especially in an inner loop.

@Pierre-Sassoulas Pierre-Sassoulas added the Needs decision 🔒 Needs a decision before implemention or rejection label Nov 17, 2022
@mbyrnepr2
Copy link
Member

I voted in favour of fixing the false positive; the broken code (it is broken, right?) suggested by the pandas primer result. So perhaps we are not on the same page. I don't necessarily think suggesting the list approach is the solution to go for. Perhaps we need to suppress instead when there is a comprehension like Daniël mentioned.

@Pierre-Sassoulas
Copy link
Member

@nickdrozd your comment seems to indicate that you should have voted for considering it a false positive instead ?

@nickdrozd
Copy link
Contributor

@Pierre-Sassoulas Yes, I got the options backwards. I want it to NOT suggest rewriting max(max(self.adj.len(x) for x in fmt_values), header_colwidth) to get rid of the extra max. I can imagine being confronted with this warning and being very irritated by it!

That said, there is one way make this snippet IMO a little easier to read:

max(header_colwidth, max(self.adj.len(x) for x in fmt_values))

So there are nested max calls, but they're not too close to each other. Is there any way to suggest that?

@Pierre-Sassoulas
Copy link
Member

The council has spoken unanimousely : it's a false positive.

@clavedeluna clavedeluna removed the Needs decision 🔒 Needs a decision before implemention or rejection label Nov 17, 2022
@DanielNoord
Copy link
Collaborator

For now, I'm going to change the PR to not warn on the pattern anymore. The more intricate suggestion would require more time than I or the original author seems to have for this PR.

I just want to get this in a minimal viable state and merge as I still think it would be useful in its minimal state!

@DanielNoord DanielNoord requested review from Pierre-Sassoulas and removed request for Pierre-Sassoulas November 20, 2022 22:41
@github-actions
Copy link
Contributor

🤖 According to the primer, this change has no effect on the checked open source code. 🤖🎉

This comment was generated for commit 98b3408

@Pierre-Sassoulas Pierre-Sassoulas merged commit bb5781c into pylint-dev:main Nov 21, 2022
@Pierre-Sassoulas
Copy link
Member

Thank you for the initial implementation @osherdp, and congratulation on becoming a pylint contributor. Thank you for taking over @DanielNoord !

@cdce8p
Copy link
Member

cdce8p commented Nov 21, 2022

@Pierre-Sassoulas Please check the commit message before merging. I was just looking through the git log and saw that a new extension was added when in fact it's activated by default. I know these things are sometimes easy to miss 😅 bb5781c

@Pierre-Sassoulas
Copy link
Member

Good catch, that's my bad. I was thinking that we always add new things as extension now. I don't think it's fixable at this point :(

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Enhancement ✨ Improvement to a component
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Suggest >2 arguments for min/max functions
9 participants