
RF: Rearranging matmul order and using hermitian flag in ICC_rep_anova #3453


Merged 8 commits into nipy:master on Apr 21, 2022

Conversation

@kristoferm94 (Contributor) commented Apr 21, 2022

Summary

See #3450

List of changes proposed in this PR (pull-request)

Calculation of predicted_Y refactored such that:

  • numpy.dot is replaced with the @ operator
  • The pseudo-inverse of X.T @ X is now calculated with hermitian=True (pinv uses a more efficient algorithm when the input matrix is assumed to be Hermitian, which is always the case for X.T @ X in this function).
  • The matrix multiplications are regrouped right to left, so we no longer form large intermediate matrices such as the full n × n projection matrix; see the sketch after this list.
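
To make the reordering concrete, here is a minimal sketch (the shapes and random data are illustrative assumptions, not the actual ICC_rep_anova inputs). Grouping right to left keeps every intermediate at most (n, k), while the original left-to-right order materializes the full (n, n) projection matrix:

import numpy as np
from numpy.linalg import pinv

rng = np.random.default_rng(0)
n, k = 500, 60                      # illustrative: tall design matrix, n >> k
X = rng.standard_normal((n, k))
y = rng.standard_normal(n)

# Old order: X @ pinv(X.T @ X) @ X.T builds the (n, n) hat matrix first.
before = np.dot(np.dot(np.dot(X, pinv(np.dot(X.T, X))), X.T), y)

# New order: X.T @ y is a k-vector, so nothing larger than (n, k) is formed,
# and pinv exploits the symmetry of X.T @ X when hermitian=True.
after = X @ (pinv(X.T @ X, hermitian=True) @ (X.T @ y))

assert np.allclose(before, after)   # same result, cheaper to compute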

@effigies (Member) left a comment

Nice! I ran some quick tests just to see what this looks like. The reordering seems to dominate, but both changes seem well worth making.

@@ -114,7 +114,7 @@ def ICC_rep_anova(Y):
     X = hstack([x, x0])

     # Sum Square Error
-    predicted_Y = dot(dot(dot(X, pinv(dot(X.T, X))), X.T), Y.flatten("F"))
+    predicted_Y = X @ (pinv(X.T @ X, hermitian=True) @ (X.T @ Y.flatten("F")))

For a simulated Y of shape 50 x 10, I get:

In []: %timeit dot(dot(dot(X, pinv(dot(X.T, X))), X.T), Y.flatten("F"))
1.42 ms ± 128 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

Convert to matmul:

In []: %timeit (X @ pinv(X.T @ X) @ X.T) @ Y.flatten("F")
1.4 ms ± 188 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

No change. Reorder multiplications:

In []: %timeit X @ (pinv(X.T @ X) @ (X.T @ Y.flatten("F")))
838 µs ± 43.3 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

41% speedup. Use the hermitian optimization:

In []: %timeit X @ (pinv(X.T @ X, hermitian=True) @ (X.T @ Y.flatten("F")))
785 µs ± 17.8 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

A further 6% relative speedup, for a total of 45%.
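
For anyone who wants to reproduce the comparison, here is a self-contained sketch; the random Y and the ICC-style design matrix are assumptions standing in for the real data, and absolute timings will vary by machine:

import numpy as np
from numpy.linalg import pinv
from timeit import timeit

rng = np.random.default_rng(0)
nb_subjects, nb_conditions = 50, 10     # matches the simulated 50 x 10 Y above
Y = rng.standard_normal((nb_subjects, nb_conditions))

# A design matrix in the spirit of ICC_rep_anova: one indicator column per
# condition plus one per subject (rank-deficient, hence pinv rather than inv).
x = np.kron(np.eye(nb_conditions), np.ones((nb_subjects, 1)))
x0 = np.tile(np.eye(nb_subjects), (nb_conditions, 1))
X = np.hstack([x, x0])

old = lambda: np.dot(np.dot(np.dot(X, pinv(np.dot(X.T, X))), X.T), Y.flatten("F"))
new = lambda: X @ (pinv(X.T @ X, hermitian=True) @ (X.T @ Y.flatten("F")))

assert np.allclose(old(), new())        # both orders predict the same Y
for label, fn in [("old", old), ("new", new)]:
    print(label, timeit(fn, number=1000) / 1000, "s per call")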

@effigies (Member) commented

I think you are going to want to remove the dot import from the top of the file, since it's no longer used. Otherwise looks good to me.

codecov bot commented Apr 21, 2022

Codecov Report

Merging #3453 (9cb58c3) into master (8d065ca) will not change coverage.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master    #3453   +/-   ##
=======================================
  Coverage   65.24%   65.24%           
=======================================
  Files         309      309           
  Lines       40833    40833           
  Branches     5376     5376           
=======================================
  Hits        26641    26641           
  Misses      13118    13118           
  Partials     1074     1074           
Flag        Coverage Δ
unittests   65.02% <100.00%> (ø)

Flags with carried forward coverage won't be shown.

Impacted Files             Coverage Δ
nipype/algorithms/icc.py   57.53% <100.00%> (ø)

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 8d065ca...9cb58c3.

@kristoferm94 (Contributor, Author) commented

I removed the numpy.dot import.

@effigies merged commit 3dc858e into nipy:master on Apr 21, 2022.
@effigies (Member) commented

Thanks!
