Clarify filter fields usage in javadocs #14660

Open: 1 commit to be merged into base branch main

bjacobowitz (Contributor)

Description

To resolve #14427

Update documentation for Lucene Monitor to make clear that filter fields should appear only in the query metadata and incoming documents, and should not appear in the query itself.

Problem

This is described in more detail in #14427 -- a rough summary is given below:

When TermFilteredPresearcher builds a presearcher query, it separates filter fields into their own clause (with BooleanClause.Occur.FILTER), which is ANDed with a clause containing all other terms. If a stored monitor query is a conjunction that includes the filter field, the filter field may be chosen as the indexed field for that query's document in the monitor's index, in which case the presearcher query would need to search on the filter field in order to match. However, because the filter field is deliberately omitted from the terms clause, the terms clause can never match that stored monitor query. The filter clause does match it, but since the two clauses are ANDed together, the presearcher query as a whole does not match, so the stored query is not returned from presearching (and therefore never matches the document).
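
To make the failure mode concrete, here is a minimal sketch in plain Java that models the two ANDed clauses with sets of `field:value` strings. It is a simplified model, not Lucene code: the class `PresearcherModel`, its methods, and the example `language` and `text` fields are all hypothetical names chosen for illustration, and the real presearcher works on an index rather than on sets.

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Simplified model only; this is NOT Lucene code. Terms are "field:value" strings.
final class PresearcherModel {

  // Returns the field part of a "field:value" term.
  static String fieldOf(String term) {
    return term.substring(0, term.indexOf(':'));
  }

  /**
   * Models presearching one stored query against one incoming document.
   *
   * @param indexedTerms terms extracted from the stored query and indexed for it
   * @param metadataTerms filter-field terms taken from the stored query's metadata
   * @param docTerms terms of the incoming document
   * @param filterFields names of the configured filter fields
   */
  static boolean presearch(
      Set<String> indexedTerms,
      Set<String> metadataTerms,
      Set<String> docTerms,
      Set<String> filterFields) {
    // Split the document's terms: filter fields go to their own clause,
    // everything else goes to the terms clause.
    Set<String> filterClause = new HashSet<>();
    Set<String> termsClause = new HashSet<>();
    for (String term : docTerms) {
      if (filterFields.contains(fieldOf(term))) {
        filterClause.add(term);
      } else {
        termsClause.add(term);
      }
    }

    // The filter clause can see everything indexed for the stored query.
    Set<String> entryTerms = new HashSet<>(indexedTerms);
    entryTerms.addAll(metadataTerms);
    boolean filterMatches =
        filterClause.isEmpty() || !Collections.disjoint(filterClause, entryTerms);

    // The terms clause never contains filter-field terms, so it cannot hit
    // a stored query whose only indexed term is in a filter field.
    boolean termsMatch = !Collections.disjoint(termsClause, indexedTerms);

    // The two clauses are ANDed together.
    return filterMatches && termsMatch;
  }

  public static void main(String[] args) {
    Set<String> filterFields = Set.of("language");
    Set<String> doc = Set.of("language:en", "text:foo");

    // Filter field appears in the query and was chosen as the indexed term.
    System.out.println(
        presearch(Set.of("language:en"), Set.of("language:en"), doc, filterFields)); // false

    // Filter field appears only in the metadata; a text term is indexed.
    System.out.println(
        presearch(Set.of("text:foo"), Set.of("language:en"), doc, filterFields)); // true
  }
}
```

In the first call the filter clause matches but the terms clause does not, so the conjunction fails and the stored query is skipped. In the second call the filter value lives only in the metadata, both clauses match, and the stored query is selected.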

Supporting filter fields in the stored query could cause more queries to run than before, and allowing the filter field to float freely in the stored query (where it could, for example, have a NOT applied to it) would arguably lead to inconsistency.

Solution

Side-stepping the larger question of whether there is a path forward with filter fields appearing in the query directly, this PR updates the documentation around filter fields to clarify that a user should not have the filter field(s) appear in the stored query itself, only in its metadata.

For reasons described in the linked issue (#14427), I think supporting filter fields in the query would introduce a performance penalty, so it seems preferable to leave that behavior as-is and augment the documentation to help users avoid problems with it.

Merge Request

If this PR gets merged, can you please use my [email protected] email address for the squash+merge? Thank you.

This PR does not have an entry in lucene/CHANGES.txt. Consider adding one. If the PR doesn't need a changelog entry, then add the skip-changelog-check label to it and you will stop receiving this reminder on future updates to the PR.

This PR has not had activity in the past 2 weeks, labeling it as stale. If the PR is waiting for review, notify the [email protected] list. Thank you for your contribution!

github-actions bot added the Stale label on May 28, 2025
Successfully merging this pull request may close these issues.

Monitor TermFilteredPresearcher does not return stored query if it contains filter field