ESQL: Replace grouping by DateFormat with DateTrunc #129277

Open
wants to merge 7 commits into base: main

kanoshiou
Contributor

Optimize date grouping with formatting in ESQL

This PR optimizes the performance of date grouping operations in ESQL by automatically converting DATE_FORMAT in GROUP BY clauses to more efficient DATE_TRUNC operations. The optimization:

  • Automatically detects and converts DATE_FORMAT patterns to equivalent DATE_TRUNC intervals
  • Moves date formatting from the grouping phase to a subsequent EVAL phase
  • Handles timezone and DST transitions correctly
  • Supports various time intervals from nanoseconds to years
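
The pattern detection described above can be sketched roughly as follows. This is an illustration only, not the PR's implementation: the `LEVELS` table and the `trunc_unit` helper are hypothetical names, the pattern letters follow Java-style date patterns, and the sketch only handles the year-to-second hierarchy. It also assumes the formatted string is unambiguous (fields separated or fixed-width), which a real rule would have to verify.

```python
from collections import Counter
from typing import Optional

# Hypothetical hierarchy, coarsest first: (pattern letter, DATE_TRUNC unit).
LEVELS = [
    ("y", "year"), ("M", "month"), ("d", "day"),
    ("H", "hour"), ("m", "minute"), ("s", "second"),
]


def trunc_unit(pattern: str) -> Optional[str]:
    """Return the unit whose truncation groups like the pattern, or None."""
    counts = Counter()
    in_literal = False
    for ch in pattern:
        if ch == "'":
            in_literal = not in_literal  # text inside quotes is literal
        elif not in_literal and ch.isalpha():
            counts[ch] += 1

    known = {letter for letter, _ in LEVELS}
    if not counts or not set(counts) <= known:
        return None  # unsupported letter (week, quarter, ...): skip the rewrite
    if counts["y"] == 2:
        return None  # two-digit year merges centuries, so it is not a truncation

    finest = None
    seen_gap = False
    for letter, unit in LEVELS:
        if letter in counts:
            if seen_gap:
                return None  # e.g. "MM" or "yyyy-dd": a coarser field is missing
            finest = unit
        else:
            seen_gap = True
    return finest


print(trunc_unit("yyyy-MM"))  # month
print(trunc_unit("MM"))       # None
```

The gap check matters because a pattern such as `MM` groups January of every year together, which no single truncation interval can reproduce.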

Example optimization:

FROM test
| STATS avg = AVG(salary) BY date = DATE_FORMAT("yyyy-MM", hire_date)

becomes:

FROM test
| STATS avg = AVG(salary) BY date1 = DATE_TRUNC(1 month, hire_date) 
| EVAL date = DATE_FORMAT("yyyy-MM", date1) 
| KEEP avg, date
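
Why the two queries return the same groups can be checked with a small stand-alone sketch. It is not ES|QL and not code from this PR: the `rows` data and the `avg_by` helper are made up for illustration, and it uses naive datetimes, so it says nothing about the timezone and DST handling mentioned above.

```python
from collections import defaultdict
from datetime import datetime

# Made-up (hire_date, salary) rows.
rows = [
    (datetime(2020, 1, 5, 9, 30), 100),
    (datetime(2020, 1, 20, 17, 0), 300),
    (datetime(2020, 2, 1, 0, 0), 50),
]


def avg_by(key_fn):
    """Average salary per group, where key_fn maps a hire_date to its key."""
    groups = defaultdict(list)
    for hire_date, salary in rows:
        groups[key_fn(hire_date)].append(salary)
    return {key: sum(vals) / len(vals) for key, vals in groups.items()}


# Original query: format every row, then group by the string.
direct = avg_by(lambda d: d.strftime("%Y-%m"))

# Rewritten query: group by the truncated date, then format once per group.
truncated = avg_by(
    lambda d: d.replace(day=1, hour=0, minute=0, second=0, microsecond=0)
)
rewritten = {key.strftime("%Y-%m"): avg for key, avg in truncated.items()}

print(direct == rewritten)  # True
```

In the rewritten form the formatting runs once per group instead of once per row, which is where the saving comes from.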

Optimized plan:

Project[[avg{r}#7, date{r}#4]]
\_Eval[[$$SUM$avg$0{r$}#21 / $$COUNT$avg$1{r$}#22 AS avg#7, DATEFORMAT([79 79 79 79 2d 4d 4d][KEYWORD],$$DATE_FORMAT("yy>$date$0{r$}#20) AS date#4]]
  \_Limit[1000[INTEGER],false]
    \_Aggregate[[$$DATE_FORMAT("yy>$date$0{r$}#20],[SUM(salary{f}#14,true[BOOLEAN]) AS $$SUM$avg$0#21, COUNT(salary{f}#14,true[BOOLEAN]) AS $$COUNT$avg$1#22, $$DATE_FORMAT("yy>$date$0{r$}#20]]
      \_Eval[[DATETRUNC(P1M[DATE_PERIOD],hire_date{f}#16) AS $$DATE_FORMAT("yy>$date$0#20]]
        \_EsRelation[test][_meta_field{f}#15, emp_no{f}#9, first_name{f}#10, g..]

Closes #114772

@elasticsearchmachine added the v9.1.0, needs:triage, and external-contributor labels on Jun 11, 2025
@ivancea added the Team:Analytics and :Analytics/ES|QL labels and removed the needs:triage label on Jun 19, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

Successfully merging this pull request may close these issues.

ESQL: optimize date grouping with formatting