Description
While fixing #56919, I did some analysis of the use patterns of DominatedUses.of
.
It is not clear that it is worth improving DominatedUses.of
, but there is potential to make the query more efficient by using fewer data structures in _compute
. The hottest place is the call to Setlet.add
, where the LinkedHashSet lookup shows up as ~5% of compile time in pathological methods with thousands of back-to-back array accesses.
for (HInstruction current in source.usedBy) {
if (!seen.add(current)) continue;
Statistic on the use of DominatedUses.of
for a large program.
DominatedUses.of(source, dominator)
is a query that finds all the uses of source
that are a use by an instruction that is dominated by dominator
.
For -O4
, most (83%) of queries have a single input.
Input size | count | fraction |
---|---|---|
all | 6569860 | 1.000 |
1 | 5491527 | .836 |
2 | 463784 | .071 |
3 | 173455 | .026 |
4 | 92028 | .014 |
5 | 47292 | .007 |
6–10 | 115160 | .017 |
11–100 | 172607 | .026 |
>100 | 13999 | .002 |
Most (93%) of queries give an empty result set.
Result size | count | fraction |
---|---|---|
all | 6639133 | 1.000 |
0 | 6203211 | .934 |
1 | 196187 | .030 |
2 | 48190 | .007 |
3 | 30531 | .005 |
4 | 13832 | .002 |
5 | 8636 | .001 |
6–10 | 23992 | .004 |
11–100 | 43502 | .007 |
>100 | 1779 | .000 |
This is not surprising since the query is used to find places where we 'know more' about a value, and to learn something about a value, one of the uses of the value is the operation that tests or checks the value.
Excluding single-input queries the distributions are:
Input size | count | fraction |
---|---|---|
all | 1078325 | 1.000 |
2 | 463784 | .430 |
3 | 173455 | .161 |
4 | 92028 | .085 |
5 | 47292 | .044 |
6–10 | 115160 | .107 |
11–100 | 172607 | .160 |
>100 | 13999 | .013 |
Result size | count | fraction |
---|---|---|
all | 1078325 | 1.000 |
0 | 711676 | .660 |
1 | 196187 | .182 |
2 | 48190 | .045 |
3 | 30531 | .028 |
4 | 13832 | .013 |
5 | 8636 | .008 |
6..10 | 23992 | .022 |
11..100 | 43502 | .040 |
>100 | 1779 | .002 |
Over all queries there are a total of 16055198 inputs, of which 2194037, or 13.7%, find their way into a result. If we exclude the 5491527 single-input queries, then of the 10563671 inputs, 20.8% are in a result.
Larger queries are more expensive since they dominance-test each element of the input. While queries with more than 100 inputs are just 0.2% of queries, they are 24.2% of the work, and queries with more than 10 inputs (4.2% of the queries) are 72.3% of the work.
Input size | tests | fraction |
---|---|---|
all | 10563671 | 1.000 |
2 | 927568 | .088 |
3 | 520365 | .049 |
4 | 368112 | .035 |
5 | 236460 | .022 |
6..10 | 866205 | .082 |
11..100 | 5083281 | .481 |
>100 | 2561680 | .242 |
To summarize:
- Most single-input queries can easily be avoided at the call-site (accounting for about a third of the dominance tests)
- Of the remaining queries, large queries account for most of the work, and maintaining the hash sets (
seen
andusers
) is a large component of that work.
The above results are for -O4
. Results for -O2
are similar. There are 12% more queries, but almost all are single-input queries for the implicit checks at method entry.