Extend dataset summary to create stats for each node/edge type #7203

Merged: 5 commits, Apr 24, 2023

kjkozlowski
Contributor

Example output for heterogeneous data:

FakeHeteroDataset (#graphs=20):
+------------+----------+----------+
|            |   #nodes |   #edges |
|------------+----------+----------|
| mean       |   3945.4 |  49149.2 |
| std        |    247.8 |   3481.2 |
| min        |   3300   |  40440   |
| quantile25 |   3849   |  47016.8 |
| median     |   3974   |  49820   |
| quantile75 |   4129   |  51367.8 |
| max        |   4207   |  53761   |
+------------+----------+----------+
Number of nodes per node type:
+------------+--------+--------+--------+--------+
|            |    #v0 |    #v1 |    #v2 |    #v3 |
|------------+--------+--------+--------+--------|
| mean       |  980.5 |  994.8 |  944.7 | 1025.4 |
| std        |  140.7 |  123.8 |  123.9 |  123.7 |
| min        |  765   |  770   |  754   |  753   |
| quantile25 |  862.2 |  895   |  863   |  936.2 |
| median     |  979   | 1005   |  915   | 1067   |
| quantile75 | 1068.8 | 1064.8 | 1036   | 1103.2 |
| max        | 1235   | 1197   | 1201   | 1225   |
+------------+--------+--------+--------+--------+
Number of edges per edge type:
+------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+
|            |   #('v0', 'e0', 'v2') |   #('v2', 'e0', 'v0') |   #('v1', 'e0', 'v0') |   #('v3', 'e0', 'v0') |   #('v1', 'e0', 'v3') |
|------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------|
| mean       |                9749.5 |                9400.5 |                9897   |               10203   |                9899.2 |
| std        |                1395.8 |                1229.9 |                1230.2 |                1230.6 |                1228.8 |
| min        |                7613   |                7502   |                7650   |                7498   |                7657   |
| quantile25 |                8577.2 |                8589.5 |                8899.8 |                9315.2 |                8901.5 |
| median     |                9732   |                9108   |               10014   |               10611   |                9978   |
| quantile75 |               10624.5 |               10307.8 |               10584.8 |               10971.5 |               10597   |
| max        |               12278   |               11935   |               11909   |               12186   |               11908   |
+------------+-----------------------+-----------------------+-----------------------+-----------------------+-----------------------+

For datasets that do not define node_types or edge_types, or when the from_dataset method is called with per_type_breakdown=False, the output is the same as before this change.
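For readers who want to see how the per-type rows in the tables above could be derived, here is a minimal standard-library sketch. It is not the code from this PR: the helper names `quantile`, `summarize`, and `per_type_stats` are hypothetical, each graph is assumed to be a plain dict mapping a type name to a count, and the use of the sample standard deviation and linear-interpolation quantiles is an assumption about how the numbers are computed.

```python
import math
from statistics import mean, stdev


def quantile(sorted_vals, q):
    """Linear-interpolation quantile of an already sorted list (assumed method)."""
    pos = q * (len(sorted_vals) - 1)
    lo, hi = math.floor(pos), math.ceil(pos)
    return sorted_vals[lo] + (pos - lo) * (sorted_vals[hi] - sorted_vals[lo])


def summarize(values):
    """Return the seven statistics shown in each table column."""
    vals = sorted(values)
    return {
        "mean": mean(vals),
        # Sample standard deviation; defined as 0.0 for a single graph.
        "std": stdev(vals) if len(vals) > 1 else 0.0,
        "min": vals[0],
        "quantile25": quantile(vals, 0.25),
        "median": quantile(vals, 0.5),
        "quantile75": quantile(vals, 0.75),
        "max": vals[-1],
    }


def per_type_stats(graphs):
    """Summarize counts per type across graphs.

    `graphs` is a list of dicts such as {"v0": 980, "v1": 995}; a type that
    is missing from a graph is counted as 0 for that graph.
    """
    types = sorted({t for g in graphs for t in g})
    return {t: summarize([g.get(t, 0) for g in graphs]) for t in types}


# Hypothetical node counts for three small graphs.
example_graphs = [
    {"v0": 10, "v1": 20},
    {"v0": 30, "v1": 40},
    {"v0": 20, "v1": 60},
]
example_stats = per_type_stats(example_graphs)
for type_name, stats in example_stats.items():
    print(type_name, stats)
```

The same function works for edge types if the dict keys are `(source, relation, destination)` tuples converted to strings, since it only needs keys that can be sorted and hashed.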

codecov bot commented Apr 18, 2023

Codecov Report

Merging #7203 (c4b2d49) into master (b07a84f) will decrease coverage by 0.43%.
The diff coverage is 100.00%.

❗ Current head c4b2d49 differs from the pull request's most recent head deb655c. Consider uploading reports for commit deb655c to get more accurate results.

@@            Coverage Diff             @@
##           master    #7203      +/-   ##
==========================================
- Coverage   92.03%   91.61%   -0.43%     
==========================================
  Files         437      437              
  Lines       24111    24140      +29     
==========================================
- Hits        22191    22116      -75     
- Misses       1920     2024     +104     
| Impacted Files | Coverage Δ |
|---|---|
| torch_geometric/data/summary.py | 98.75% <100.00%> (+0.83%) ⬆️ |

... and 20 files with indirect coverage changes

@kjkozlowski kjkozlowski force-pushed the krzysztofk/extend_dataset_summary branch 5 times, most recently from 286eb3a to 1890d6a Compare April 20, 2023 09:15
@kjkozlowski kjkozlowski force-pushed the krzysztofk/extend_dataset_summary branch from 1890d6a to 15cf788 Compare April 20, 2023 12:39
@rusty1s rusty1s changed the title extend dataset summary to create stats for each node/edge type, it is useful for hetero data Extend dataset summary to create stats for each node/edge type Apr 24, 2023
@rusty1s rusty1s enabled auto-merge (squash) April 24, 2023 10:32
@rusty1s rusty1s merged commit edbf8fc into pyg-team:master Apr 24, 2023
2 participants