
Performance issues with rdflib.compare #2528

Open
@william-vw

We're having performance issues with rdflib.compare. While the test files contain a lot of blank nodes, they are relatively small (50, 75, and 100 lines); even at that size, the run time increases by orders of magnitude as the files grow. (Our actual files have close to 6000 lines.)

Check out the repo with the test code here: https://github.com/william-vw/rdflib_compare_test

On my machine, I get the following times:

file: test1.ttl
isomorphic: 0.06714916229248047
canonical: 0.03405618667602539

file: test2.ttl
isomorphic: 3.1934521198272705
canonical: 1.6439540386199951

file: test3.ttl
isomorphic: 17.75542116165161
canonical: 8.694658994674683

I based the comparison on this example (for testing purposes, I'm comparing a graph with itself).
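For readers who don't want to open the repo, here is a minimal sketch of how such a measurement could look. It is not the exact test code from the repository: the helper names `time_call` and `compare_with_itself` are made up for illustration, and it assumes the files are Turtle and that the two timed calls are `rdflib.compare.isomorphic` and `rdflib.compare.to_canonical_graph`.

```python
import time


def time_call(fn, *args):
    """Run fn(*args) and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - start


def compare_with_itself(path):
    """Time an isomorphism check and a canonicalization for one Turtle file.

    Hypothetical helper: the same file is parsed twice so that a graph is
    compared with a copy of itself, as described above.
    """
    # Imported here so that time_call stays usable without rdflib installed.
    from rdflib import Graph
    from rdflib.compare import isomorphic, to_canonical_graph

    g1 = Graph()
    g1.parse(path, format="turtle")
    g2 = Graph()
    g2.parse(path, format="turtle")

    same, t_iso = time_call(isomorphic, g1, g2)
    _, t_canon = time_call(to_canonical_graph, g1)
    return same, t_iso, t_canon
```

Calling `compare_with_itself("test1.ttl")` from the repo's directory should return `True` together with the two timings for that file.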
