Clarify filter fields usage in javadocs #14660
Open
Description
To resolve #14427
Update documentation for Lucene Monitor to make clear that filter fields should appear only in the query metadata and incoming documents, and should not appear in the query itself.
Problem
This is described in more detail in #14427 -- a rough summary is given below:
When TermFilteredPresearcher builds a presearcher query, it separates out filter fields into their own clause (with BooleanClause.Occur.FILTER), ANDed with a clause for all other terms. If a stored monitor query is a conjunction of fields that includes the filter field, the filter field may be chosen as the indexed field for that query in the monitor's index. In that case the presearcher query would need to search on the filter field in order to match. However, because the filter field is deliberately omitted from the terms clause, the terms clause never contains that field and can never match the stored monitor query. The filter clause does match the query, but since the two clauses are ANDed together, the stored query does not match the presearcher query as a whole. It is therefore not returned from presearching, and so never matches the document.
Supporting this case would potentially cause more queries to run than before. It could also be argued that allowing the filter field to float freely in the stored query (where it could have, for example, a NOT applied to it) would lead to inconsistency.
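The failure mode can be illustrated with a small toy model. This is a sketch in plain Java, not Lucene's actual implementation: the class FilterFieldSketch, the method presearcherMatches, and the field names "language" and "text" are all hypothetical, and the model assumes each stored query is indexed under exactly one term.

```java
import java.util.Map;
import java.util.Set;

// Toy model only: mimics the shape "(terms clause) AND (filter clause)".
// It does not use or reproduce any Lucene classes.
class FilterFieldSketch {

    /**
     * @param indexedField field of the single term the stored query was indexed under (assumption)
     * @param indexedTerm  value of that term
     * @param metadata     the stored query's metadata
     * @param doc          the incoming document, as field -> value
     * @param filterFields the configured filter fields
     */
    static boolean presearcherMatches(String indexedField, String indexedTerm,
                                      Map<String, String> metadata,
                                      Map<String, String> doc,
                                      Set<String> filterFields) {
        // Terms clause: built from the document's fields, with filter fields left out on purpose,
        // so a query indexed under a filter field can never satisfy it.
        boolean termsClause = !filterFields.contains(indexedField)
                && indexedTerm.equals(doc.get(indexedField));

        // Filter clause: each filter field present in the document must equal
        // the value held in the stored query's metadata.
        boolean filterClause = true;
        for (String field : filterFields) {
            String docValue = doc.get(field);
            if (docValue != null && !docValue.equals(metadata.get(field))) {
                filterClause = false;
            }
        }

        // The two clauses are ANDed together.
        return termsClause && filterClause;
    }

    public static void main(String[] args) {
        Set<String> filterFields = Set.of("language");
        Map<String, String> doc = Map.of("language", "en", "text", "lucene");
        Map<String, String> metadata = Map.of("language", "en");

        // Stored query "+language:en +text:lucene", indexed under the filter field's term.
        System.out.println(presearcherMatches("language", "en", metadata, doc, filterFields)); // false

        // Stored query "text:lucene", with the filter value only in metadata.
        System.out.println(presearcherMatches("text", "lucene", metadata, doc, filterFields)); // true
    }
}
```

In this model the first stored query is lost even though its filter clause passes, because the terms clause cannot see the filter field. The second stored query keeps the filter value in metadata only and is selected as expected.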
Solution
Side-stepping the larger question of whether there is a path forward with filter fields appearing in the query directly, this PR updates the documentation around filter fields to clarify that a user should not have the filter field(s) appear in the stored query itself, only in its metadata.
For reasons described in the linked issue (#14427), I think supporting filter fields in the query would introduce a performance penalty, so it seems preferable to leave that behavior as-is and augment the documentation to help users avoid problems with it.
Merge Request
If this PR gets merged, can you please use my [email protected] email address for the squash+merge? Thank you.