
[REF] Cache nodes in workflow to speed up construction, other optimizations #3331


Merged
merged 11 commits, Apr 30, 2021

Conversation

@HippocampusGirl (Contributor) commented Apr 30, 2021

Summary

I'm working with fMRIPrep and large data sets (via the ENIGMA consortium). In my testing, Nipype workflows with more than a few hundred thousand nodes can be prohibitively slow to build, so that it can take days before processing starts. The time is spent specifically in the functions _generate_flatgraph and generate_expanded_graph. While my previous pull requests #3184 and #3260 fixed some bottlenecks, I recently had the time to do more profiling and address a few more.
I understand that Pydra is just around the corner, and that switching to it may eventually solve my performance troubles. However, the small changes outlined here already reduce the time before processing starts by approximately half (in my testing with a workflow of 75k nodes).
I have tried to divide up the changes into independent commits, so that you can decide which ones may be interesting for inclusion.

Here is my profile before starting out. Please note that the absolute durations are not interpretable, because the profiler itself slows execution considerably.
(profile screenshot: 2021-04-30_13-04)

Here is the profile with all proposed changes. Note that _create_flat_graph has become the dark gray box without caption on the bottom right.
(profile screenshot: 2021-04-30_14-16)

List of changes proposed in this PR

  • The workflow._has_node function is called for every new connection, and currently iterates through all nodes in the workflow and its nested workflows, comparing each node to the wanted_node. We can speed this up because networkx graphs are dictionaries and therefore support the `in` operator. This reduces the scaling behavior from linear to near-constant time, but only in the best-case scenario that the wanted_node is directly in the workflow.
    Otherwise, we still need to iterate to find all nested workflows. To overcome this last limitation, we can cache the nested workflows in a set, and update it whenever the graph changes.

    def _has_node(self, wanted_node):
        for node in self._graph.nodes():
            if wanted_node == node:
                return True
            if isinstance(node, Workflow):
                if node._has_node(wanted_node):
                    return True
        return False
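The caching idea can be sketched with a minimal, self-contained stand-in class (the class and attribute names here are illustrative, not the exact code merged in this PR):

```python
import networkx as nx


class MiniWorkflow:
    """Toy stand-in for nipype's Workflow, illustrating the node cache."""

    def __init__(self):
        self._graph = nx.DiGraph()
        # cache of nested workflows, refreshed on every add/remove
        self._nested_workflows = set()

    def add_node(self, node):
        self._graph.add_node(node)
        if isinstance(node, MiniWorkflow):
            self._nested_workflows.add(node)

    def _has_node(self, wanted_node):
        # networkx stores nodes as dict keys, so `in` is an O(1) hash lookup
        if wanted_node in self._graph:
            return True
        # only recurse into the cached nested workflows, instead of
        # scanning every node to discover which ones are workflows
        return any(wf._has_node(wanted_node) for wf in self._nested_workflows)
```

With this layout, the common case (the node is in the top-level graph) never iterates at all, and the fallback only touches the nested workflows.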

  • The _merge_graphs function uses np.unique on a list of strings. We can use the built-in set to avoid casting to a NumPy array. np.unique additionally sorts the input array, which we really don't need here.

    if len(np.unique(ids)) != len(ids):
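For example, the same duplicate check can be done with a plain set, which neither copies the list into a NumPy array nor sorts it:

```python
ids = ["func", "anat", "func"]

# set() deduplicates by hashing; no NumPy round-trip, no sorting
has_duplicates = len(set(ids)) != len(ids)
print(has_duplicates)  # True
```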

  • Use sets wherever order is not important and we just need to check whether a string or a node is contained or not, again reducing linear scaling to a constant-time look-up.

    if destname not in connected_ports[destnode]:
        connected_ports[destnode] += [destname]
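A sketch of the set-based version, assuming connected_ports maps each destination node to its connected port names (names are illustrative):

```python
from collections import defaultdict

connected_ports = defaultdict(set)

destnode, destname = "smooth", "in_file"

# set.add is idempotent and membership is an O(1) hash lookup,
# replacing the linear `not in <list>` scan plus list append
connected_ports[destnode].add(destname)
connected_ports[destnode].add(destname)  # re-adding is a no-op

print(destname in connected_ports[destnode])  # True
print(len(connected_ports[destnode]))         # 1
```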

  • Remove a topological sort that may not be necessary. Please correct me if I'm wrong :-)

    nodes = list(nx.topological_sort(self._graph))
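To illustrate why the sort can be dropped here: the loop body only adds/removes nodes and rewires their connections independently, so plain node iteration visits the same nodes without the extra ordering pass (toy example, not the PR code):

```python
import networkx as nx

g = nx.DiGraph([("a", "b"), ("b", "c"), ("a", "c")])

# topological_sort walks every edge to compute a dependency ordering ...
ordered = list(nx.topological_sort(g))

# ... but when each node is processed independently of the others,
# plain iteration is enough and skips that O(V + E) pass
unordered = list(g.nodes())

print(sorted(ordered) == sorted(unordered))  # True: same nodes either way
```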

  • Avoid re-compiling regular expressions in a loop by using Python's built-in functions

    if re.match(
        src_id + r"((\.[a-z](I\.[a-z])?|J)\d+)?$", node.itername
    ):
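One way to realize this (a sketch, not the exact merged code): handle the variable prefix with str.startswith, then apply a single constant suffix pattern, which only needs to be compiled once:

```python
import re

# the suffix part of the pattern never changes across loop iterations,
# so it can be compiled once (or hit re's internal pattern cache)
SUFFIX = re.compile(r"((\.[a-z](I\.[a-z])?|J)\d+)?$")


def matches(src_id, itername):
    # check the variable prefix with a plain string comparison ...
    if not itername.startswith(src_id):
        return False
    # ... and reuse the one precompiled regex for the constant suffix
    return SUFFIX.match(itername[len(src_id):]) is not None


print(matches("realign", "realign.a0"))  # True
print(matches("realign", "smooth.a0"))   # False
```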

Acknowledgment

  • (Mandatory) I acknowledge that this contribution will be available under the Apache 2 license.

- Networkx represents graphs as a dictionary where the nodes are the
  keys. As such, we can use the built-in __contains__ function for fast
  lookup
- We can also iterate the graphs directly without constructing a
  NodeView using self._graph.nodes()
- Also avoids casting the list to np.array
- Update every time we connect/disconnect or add/remove a node
- Keep track of which nodes are workflows and which are not
- As a result, we do not need to iterate all nodes to determine the
  result of `_has_node`, we can use O(1) set operations
- Do not need to loop over entries
- Faster O(1) __contains__
- Faster operations for update and contains
- The function is only adding/removing nodes and adjusting their
  connections
- As such, there are no serial dependencies, and we can iterate in any
  order
- The only thing that changes about the regular expression is the prefix
- We can also detect the prefix with startswith, and then use the same
  regular expression across the loop
- This means that Python can cache the compiled regex internally, and we
  save some time
codecov bot commented Apr 30, 2021

Codecov Report

Merging #3331 (0c485d3) into master (54f5029) will increase coverage by 3.26%.
The diff coverage is 92.68%.


@@            Coverage Diff             @@
##           master    #3331      +/-   ##
==========================================
+ Coverage   65.05%   68.31%   +3.26%     
==========================================
  Files         302      302              
  Lines       39953    48781    +8828     
  Branches     5287     7224    +1937     
==========================================
+ Hits        25992    33327    +7335     
- Misses      12896    14234    +1338     
- Partials     1065     1220     +155     
Flag Coverage Δ
unittests 68.16% <92.68%> (+3.17%) ⬆️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
nipype/pipeline/engine/utils.py 76.10% <75.00%> (+1.61%) ⬆️
nipype/pipeline/engine/workflows.py 70.75% <96.96%> (+5.86%) ⬆️
nipype/utils/logger.py 81.60% <0.00%> (-3.01%) ⬇️
nipype/utils/onetime.py 81.81% <0.00%> (-2.80%) ⬇️
nipype/pipeline/plugins/base.py 57.89% <0.00%> (-1.84%) ⬇️
nipype/interfaces/niftyseg/label_fusion.py 55.42% <0.00%> (-1.72%) ⬇️
nipype/interfaces/mrtrix3/preprocess.py 83.72% <0.00%> (-1.58%) ⬇️
nipype/interfaces/diffusion_toolkit/dti.py 61.64% <0.00%> (-1.57%) ⬇️
nipype/pipeline/plugins/legacymultiproc.py 67.64% <0.00%> (-1.48%) ⬇️
nipype/interfaces/freesurfer/base.py 75.18% <0.00%> (-1.44%) ⬇️
... and 101 more

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 54f5029...0c485d3.

@effigies (Member) left a comment

Awesome. This all looks correct. A few thoughts, but nothing blocking. Looking at the style check, it looks like black released a new version that affects a lot of files, so it should not block this PR, either.

@effigies (Member) left a comment

Two last thoughts, but otherwise this LGTM.

- Apply suggestions from code review

Co-authored-by: Chris Markiewicz <[email protected]>
@HippocampusGirl (Contributor, Author) commented

Thank you for your quick feedback! I always feel like I'm learning a lot from your code reviews.

@effigies effigies changed the title [REF] Performance for large workflows [REF] Cache nodes in workflow to speed up construction, other optimizations Apr 30, 2021
@effigies (Member) commented

Thank you!

@effigies effigies merged commit c9e7097 into nipy:master Apr 30, 2021
@HippocampusGirl HippocampusGirl deleted the has-node-performance branch April 30, 2021 20:42
@effigies effigies added this to the 1.6.1 milestone May 1, 2021