Open
Description
When running a query that uses OFFSET and LIMIT clauses together, we see output is incorrect at times and is inconsistent across different runs. If only OFFSET is used, the results are correct in all runs I tried.
Your Environment
- Presto version used: Latest master
- Storage (HDFS/S3/GCS..): Local filesystem
- Data source and connector used: hive
- Deployment (Cloud or On-prem):
- Pastebin link to the complete debug logs:
Expected Behavior
The OFFSET clause should skip the required rows whether or not LIMIT clause is present in the query.
Current Behavior
When a combination of OFFSET and LIMIT clause are used, query returns inconsistent results
Possible Solution
Steps to Reproduce
- Create an external table on local with below DDL -
CREATE TABLE hive.default.stutos ( "accountid_key" varchar, "accountid" varchar, "applicationid" varchar, "app_name" varchar, "app_environment" varchar, "endpoint_name" varchar, "endpointid" varchar, "type" varchar, "linkedtoidentity" varchar, "status" varchar, "email" varchar, "gecos" varchar, "last_revalidation_by_serial" varchar, "last_revalidation_date" varchar, "last_access_date" varchar, "rec_status" varchar, "rec_source" varchar, "rec_created_date" varchar, "rec_updated_date" varchar, "rec_deleted_date" varchar, "active_riskscore" varchar, "transactional_rule" varchar, "priority" varchar, "multiple_infractions" varchar, "server_id" varchar, "account_name" varchar, "is_password_less" varchar, "is_mfa_enabled" varchar, "failedlogincount" varchar, "changepwdcount" varchar, "lastpwdchangetimestamp" varchar, "pwdexpirationdate" varchar, "usage" varchar, "description" varchar, "certificationtimestamp" varchar, "lastrevalidationdate" varchar, "functional_role" varchar, "is_privileged_acct" varchar, "is_functional_acct" varchar, "activation_date" varchar, "deactivation_date" varchar, "expiration_date" varchar, "lastlogintimestamp" varchar, "acct_owner" varchar, "rplex" varchar, "system_name" varchar, "ip_address" varchar, "pwd_change_date" varchar, "phr_change_date" varchar, "acct_owner_email" varchar, "revoke_date" varchar, "last_acc_login_date" varchar, "log_date" varchar ) WITH ( format = 'CSV', external_location = 'file:///Users/pratyakshsharma/Downloads/offset_test' );
- Use the attached file as data source
- Run below queries -
select accountid from hive.default.stutos ORDER BY accountid OFFSET 0 limit 1000; - this will give inconsistent results as shown in the screenshots attached.
select accountid from hive.default.stutos ORDER BY accountid OFFSET <n>; replace n with any number and the results are always correct when LIMIT clause is absent
Screenshots (if appropriate)






Context
Incorrect results can affect production workloads and reports where presto is being used.
Metadata
Metadata
Assignees
Type
Projects
Status
👀 Review