Inconsistent Highlighting Behavior with exists Query in Elasticsearch 8.5 vs 8.14 #124533

Open
@sxh-lsc

Elasticsearch Version

8.5

Installed Plugins

No response

Java Version

bundled

OS Version

Linux 5.15.0-113-generic #123~20.04.1-Ubuntu SUTC 2024 x86_64 x86_64 x86_64 GNU/Linux

Problem Description

Highlighting behaves differently between Elasticsearch 8.5 and 8.14 when the request contains an exists query. With the same index mapping and the same request body, 8.14 returns highlight fragments, while 8.5 returns none as long as the exists clause is present.
Is this discrepancy caused by a bug that was fixed in a later version? If so, which version contains the fix? Any guidance on resolving this inconsistency would be greatly appreciated.

Steps to Reproduce

I created the same index on both 8.5 and 8.14 with the following mapping:

{
    "mappings": {
        "properties": {
            "deleted_at": {
                "type": "date"
            },
            "customized_description": {
                "properties": {
                    "internvl_ft": {
                        "type": "text"
                    }
                }
            }
        }
    }
}

My query body looks like this:

{
    "query": {
        "bool": {
            "must_not": [
                {
                    "match": {
                        "customized_description.internvl_ft": {
                            "query": "pig"
                        }
                    }
                },
                {
                    "exists": {
                        "field": "deleted_at"
                    }
                }
            ],
            "must": [
                {
                    "match": {
                        "customized_description.internvl_ft": {
                            "query": "dog"
                        }
                    }
                }
            ]
        }
    },
    "highlight": {
        "require_field_match": true,
        "fields": {
            "customized_description.internvl_ft": {
                "number_of_fragments": 0,
                "post_tags": [
                    "\"</b>\""
                ],
                "pre_tags": [
                    "\"<b __toBeReplaced__>\""
                ]
            }
        }
    },
    "size": 2
}

On 8.14 this works as expected, and the response looks like this:

        "hits": [
            {
                "_index": "test-index",
                "_id": "test_id",
                "_score": null,
                "_source": {
                    "customized_description": {
                        "internvl_ft": "this is a picture about a black dog"
                    }
                },
                "highlight": {
                    "customized_description.internvl_ft": [
                        "this is a picture about a black \"<b __toBeReplaced__>\"dog\"</b>\""
                    ]
                }
            }
        ]

However, on 8.5 the response does not contain any highlight information.
If I remove this part of the query:

                {
                    "exists": {
                        "field": "deleted_at"
                    }
                }

then 8.5 responds with the highlight.
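
To make the comparison easier to repeat, here is a minimal Python sketch that builds both variants of the request body above (with and without the exists clause) and counts how many hits in a search response carry a highlight. The names `build_search_body`, `hits_with_highlight`, and `FIELD` are hypothetical helpers for this report, not part of Elasticsearch or any client library, and the sketch does not send anything to a cluster.

```python
import json

# Field name taken from the mapping in this report.
FIELD = "customized_description.internvl_ft"


def build_search_body(include_exists: bool) -> dict:
    """Build the search body from this report.

    include_exists=False drops the exists clause from must_not,
    which is the only difference between the two variants.
    """
    must_not = [{"match": {FIELD: {"query": "pig"}}}]
    if include_exists:
        must_not.append({"exists": {"field": "deleted_at"}})
    return {
        "query": {
            "bool": {
                "must_not": must_not,
                "must": [{"match": {FIELD: {"query": "dog"}}}],
            }
        },
        "highlight": {
            "require_field_match": True,
            "fields": {
                FIELD: {
                    "number_of_fragments": 0,
                    "post_tags": ["\"</b>\""],
                    "pre_tags": ["\"<b __toBeReplaced__>\""],
                }
            },
        },
        "size": 2,
    }


def hits_with_highlight(response: dict) -> int:
    """Count hits in a parsed search response that have a highlight for FIELD."""
    hits = response.get("hits", {}).get("hits", [])
    return sum(1 for hit in hits if FIELD in hit.get("highlight", {}))


if __name__ == "__main__":
    # Print both bodies so they can be pasted into a _search request.
    for include_exists in (True, False):
        print(json.dumps(build_search_body(include_exists), indent=4))
```

Sending each body to the `_search` endpoint of the same index on 8.5 and 8.14 and passing the parsed JSON responses to `hits_with_highlight` should show whether the exists clause is what makes the highlight disappear.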

Logs (if relevant)

No response
