-
Notifications
You must be signed in to change notification settings - Fork 86
Classification tools: use binary search in numpy case #760
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Codecov ReportBase: 79.88% // Head: 79.90% // Increases project coverage by
Additional details and impacted files@@ Coverage Diff @@
## master #760 +/- ##
==========================================
+ Coverage 79.88% 79.90% +0.02%
==========================================
Files 19 19
Lines 4170 4175 +5
==========================================
+ Hits 3331 3336 +5
Misses 839 839
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. ☔ View full report at Codecov. |
Comparing the benchmarking on Numpy case, we got:
When the number of bins is 1, the perf is almost the same |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@thuydotm LGTM...should we make a new issue for CuPy case?
Fixes #758
Internally, most of our classification tools call to
_bin()
. In both NumPy and CuPy cases, it classifies data into different bins by looping through all the bins sequentially. This PR makes change to the way we look for correct bin of a value by using binary search.