
[Feature] Separate graph rewriting and constant folding #9

Open
@daquexian

Description


For op fusion (like fusing conv and bn), we have implemented a "small onnxruntime" in tensor.h. This increases the workload (the more fusions we want to do, the more ops we need to implement) and causes many problems (#6, #8, onnx/onnx#2677). However, as we know, onnx itself is not designed to execute onnx ops, so it is unwise to spend effort maintaining an "embedded runtime" when onnxruntime already exists.

In my opinion, we should drop the "embedded runtime". Instead, we should only rewrite the graph, and then call the onnxruntime library to fold the constants. In this way, we will no longer need tensor.h or any other tensor library in the optimizer.
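Concretely, folding could look something like the sketch below (Python for illustration; `fold_constant_node` is a hypothetical helper, not an existing onnx-optimizer API, and the FLOAT-only typing is a simplifying assumption): wrap a node whose inputs are all initializers in a one-node model and let onnxruntime evaluate it.

```python
# A sketch only: fold one node whose inputs are all initializers by
# wrapping it in a one-node model and evaluating it with onnxruntime.
# fold_constant_node is a hypothetical name, not an existing API.
from onnx import helper, numpy_helper, TensorProto
import onnxruntime as ort

def fold_constant_node(node, initializers):
    # Assume float tensors with unknown shape; a real pass would carry
    # the actual element types and shapes through.
    outputs = [helper.make_tensor_value_info(name, TensorProto.FLOAT, None)
               for name in node.output]
    graph = helper.make_graph(
        [node], "fold", inputs=[], outputs=outputs,
        initializer=[initializers[name] for name in node.input])
    model = helper.make_model(
        graph, opset_imports=[helper.make_opsetid("", 13)])
    sess = ort.InferenceSession(model.SerializeToString())
    # No feeds needed: all inputs are baked in as initializers.
    results = sess.run(None, {})
    # Turn each result back into an initializer for the outer graph.
    return [numpy_helper.from_array(arr, name)
            for arr, name in zip(results, node.output)]
```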

For example, to fuse Add(Add(x, 1), 2), instead of computing the result of Add(1, 2) in onnx-optimizer itself, we can simply rewrite the graph to Add(x, Add(1, 2)) and call onnxruntime to fold Add(1, 2) into 3, as sketched below.
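For that example, the rewrite itself is purely structural, something along these lines (again a sketch; the tensor names t, c_sum, etc. are made up):

```python
# A sketch of the rewrite: Add(Add(x, 1), 2) -> Add(x, Add(1, 2)).
# The optimizer performs no arithmetic; it only reassociates the graph,
# leaving the now-constant inner Add for onnxruntime to fold.
import numpy as np
from onnx import helper, numpy_helper, TensorProto

x = helper.make_tensor_value_info("x", TensorProto.FLOAT, [1])
y = helper.make_tensor_value_info("y", TensorProto.FLOAT, [1])
c1 = numpy_helper.from_array(np.array([1.0], dtype=np.float32), "c1")
c2 = numpy_helper.from_array(np.array([2.0], dtype=np.float32), "c2")

# Before: y = Add(Add(x, c1), c2)
before = [
    helper.make_node("Add", ["x", "c1"], ["t"]),
    helper.make_node("Add", ["t", "c2"], ["y"]),
]

# After: y = Add(x, Add(c1, c2)). Add(c1, c2) has only initializer
# inputs, so a helper like fold_constant_node above can replace it
# with a single initializer holding 3.
after = [
    helper.make_node("Add", ["c1", "c2"], ["c_sum"]),
    helper.make_node("Add", ["x", "c_sum"], ["y"]),
]

graph = helper.make_graph(after, "rewritten", [x], [y], initializer=[c1, c2])
model = helper.make_model(graph, opset_imports=[helper.make_opsetid("", 13)])
```

After folding, the graph is back to a single Add(x, 3), and no tensor math is ever performed inside the optimizer itself.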

This is also the approach taken by TensorFlow's built-in optimizer (Grappler).
