Thread: Tuning a query with ORDER BY and LIMIT

Tuning a query with ORDER BY and LIMIT

From
"Dirschel, Steve"
Date:

I am fairly new to tuning Postgres queries.  I have a long background tuning Oracle queries.

 

Postgres version 10.11

 

Here is the DDL for the index the query is using:

 

create index workflow_execution_initial_ui_tabs

    on workflow_execution (workflow_id asc, status asc, result asc, completed_datetime desc);

 

 

explain (analyze, verbose, costs, buffers, timing, summary, hashes)

select * from workflow_execution

where workflow_id = 14560 and

      status = 'COMPLETED' and

      result in ('SUCCEEDED','REEXECUTED','ABORTED','DISCONTINUED','FAILED','PARTIAL_SUCCESS')

order by completed_datetime desc limit 50;

 

--

Limit  (cost=56394.91..56395.04 rows=50 width=1676) (actual time=3400.608..3400.622 rows=50 loops=1)

  Output: execution_id, state_machine_id, workflow_id, started_datetime, completed_datetime, status, execution_context_s3_arn, ol_version, created_datetime, updated_datetime, deleted_millis, acquisition_channel_id, correlation_id, result, state_machine_execution_arn, created_by_id, updated_by_id, acquired_gcs_s3_object, sqs_trigger_id, trigger_message, acquired_gcs_s3_object_uuid, api_trigger_id, scheduled_trigger_id, notification_trigger_workflow_id, acquired_object_name, subscription_guid

  Buffers: shared hit=142368

  ->  Sort  (cost=56394.91..56432.71 rows=15118 width=1676) (actual time=3400.607..3400.615 rows=50 loops=1)

        Output: execution_id, state_machine_id, workflow_id, started_datetime, completed_datetime, status, execution_context_s3_arn, ol_version, created_datetime, updated_datetime, deleted_millis, acquisition_channel_id, correlation_id, result, state_machine_execution_arn, created_by_id, updated_by_id, acquired_gcs_s3_object, sqs_trigger_id, trigger_message, acquired_gcs_s3_object_uuid, api_trigger_id, scheduled_trigger_id, notification_trigger_workflow_id, acquired_object_name, subscription_guid

        Sort Key: workflow_execution.completed_datetime DESC

        Sort Method: top-N heapsort  Memory: 125kB

        Buffers: shared hit=142368

        ->  Index Scan using workflow_execution_initial_ui_tabs on workflow.workflow_execution  (cost=0.69..55892.70 rows=15118 width=1676) (actual time=0.038..2258.579 rows=2634718 loops=1)

              Output: execution_id, state_machine_id, workflow_id, started_datetime, completed_datetime, status, execution_context_s3_arn, ol_version, created_datetime, updated_datetime, deleted_millis, acquisition_channel_id, correlation_id, result, state_machine_execution_arn, created_by_id, updated_by_id, acquired_gcs_s3_object, sqs_trigger_id, trigger_message, acquired_gcs_s3_object_uuid, api_trigger_id, scheduled_trigger_id, notification_trigger_workflow_id, acquired_object_name, subscription_guid

              Index Cond: ((workflow_execution.workflow_id = 14560) AND ((workflow_execution.status)::text = 'COMPLETED'::text) AND ((workflow_execution.result)::text = ANY ('{SUCCEEDED,REEXECUTED,ABORTED,DISCONTINUED,FAILED,PARTIAL_SUCCESS}'::text[])))

              Buffers: shared hit=142368

Planning time: 0.217 ms

Execution time: 3400.656 ms

 

With Oracle, for a query like this, since the index is on the 3 columns matching the WHERE clause and the ORDER BY column is in the 4th position, Oracle would be able to scan that index and stop as soon as it finds the first 50 matching rows.  But as you can see above, Postgres finds 2,634,718 rows matching the WHERE clause, sorts them all, and only then returns the first 50 rows.

 

I wondered whether the IN clause on result was causing the issue, so I ran the query with a single result = value and saw the same behavior:

 

explain (analyze, verbose, costs, buffers, timing, summary, hashes)

select * from workflow_execution

where workflow_id = 14560 and

      status = 'COMPLETED' and

      result = 'SUCCEEDED'

order by completed_datetime desc limit 50;

 

Limit  (cost=54268.09..54268.22 rows=50 width=1676) (actual time=3372.453..3372.467 rows=50 loops=1)

  Output: execution_id, state_machine_id, workflow_id, started_datetime, completed_datetime, status, execution_context_s3_arn, ol_version, created_datetime, updated_datetime, deleted_millis, acquisition_channel_id, correlation_id, result, state_machine_execution_arn, created_by_id, updated_by_id, acquired_gcs_s3_object, sqs_trigger_id, trigger_message, acquired_gcs_s3_object_uuid, api_trigger_id, scheduled_trigger_id, notification_trigger_workflow_id, acquired_object_name, subscription_guid

  Buffers: shared hit=140313

  ->  Sort  (cost=54268.09..54304.46 rows=14547 width=1676) (actual time=3372.452..3372.460 rows=50 loops=1)

        Output: execution_id, state_machine_id, workflow_id, started_datetime, completed_datetime, status, execution_context_s3_arn, ol_version, created_datetime, updated_datetime, deleted_millis, acquisition_channel_id, correlation_id, result, state_machine_execution_arn, created_by_id, updated_by_id, acquired_gcs_s3_object, sqs_trigger_id, trigger_message, acquired_gcs_s3_object_uuid, api_trigger_id, scheduled_trigger_id, notification_trigger_workflow_id, acquired_object_name, subscription_guid

        Sort Key: workflow_execution.completed_datetime DESC

        Sort Method: top-N heapsort  Memory: 125kB

        Buffers: shared hit=140313

        ->  Index Scan using workflow_execution_initial_ui_tabs on workflow.workflow_execution  (cost=0.69..53784.85 rows=14547 width=1676) (actual time=0.034..2238.867 rows=2616747 loops=1)

              Output: execution_id, state_machine_id, workflow_id, started_datetime, completed_datetime, status, execution_context_s3_arn, ol_version, created_datetime, updated_datetime, deleted_millis, acquisition_channel_id, correlation_id, result, state_machine_execution_arn, created_by_id, updated_by_id, acquired_gcs_s3_object, sqs_trigger_id, trigger_message, acquired_gcs_s3_object_uuid, api_trigger_id, scheduled_trigger_id, notification_trigger_workflow_id, acquired_object_name, subscription_guid

              Index Cond: ((workflow_execution.workflow_id = 14560) AND ((workflow_execution.status)::text = 'COMPLETED'::text) AND ((workflow_execution.result)::text = 'SUCCEEDED'::text))

              Buffers: shared hit=140313

Planning time: 0.264 ms

Execution time: 3372.511 ms

 

 

Is Postgres unable to optimize this query the way Oracle does?  Or is it possible, but we are running on too old a version?

 

Thanks in advance for any input.

 

Steve

 

 

This e-mail is for the sole use of the intended recipient and contains information that may be privileged and/or confidential. If you are not an intended recipient, please notify the sender by return e-mail and delete this e-mail and any attachments. Certain required legal entity disclosures can be accessed on our website: https://www.thomsonreuters.com/en/resources/disclosures.html

Re: Tuning a query with ORDER BY and LIMIT

From
Michael van der Kolff
Date:
What do you see when you remove the LIMIT clause? It may be possible to rewrite this using ROW_NUMBER.
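For reference, a sketch of that ROW_NUMBER rewrite (untested; table and column names taken from the original query, and whether the planner handles it any better will depend on version and statistics):

```sql
-- Hypothetical rewrite: a window function instead of ORDER BY ... LIMIT.
SELECT *
FROM (
    SELECT we.*,
           row_number() OVER (ORDER BY completed_datetime DESC) AS rn
    FROM workflow_execution we
    WHERE workflow_id = 14560
      AND status = 'COMPLETED'
      AND result IN ('SUCCEEDED', 'REEXECUTED', 'ABORTED',
                     'DISCONTINUED', 'FAILED', 'PARTIAL_SUCCESS')
) sub
WHERE rn <= 50;  -- keep only the 50 newest rows
```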

--Michael

On Thu, Jun 23, 2022 at 5:39 AM Dirschel, Steve <steve.dirschel@thomsonreuters.com> wrote:

> [full quote of the original message trimmed]

Re: Tuning a query with ORDER BY and LIMIT

From
"Peter J. Holzer"
Date:
On 2022-06-22 19:39:33 +0000, Dirschel, Steve wrote:
> Postgres version 10.11
>
> Here is the DDL for the index the query is using:
>
> create index workflow_execution_initial_ui_tabs
>     on workflow_execution (workflow_id asc, status asc, result asc,
> completed_datetime desc);
>
> explain (analyze, verbose, costs, buffers, timing, summary, hashes)
> select * from workflow_execution
> where workflow_id = 14560 and
>       status = 'COMPLETED' and
>       result in
> ('SUCCEEDED','REEXECUTED','ABORTED','DISCONTINUED','FAILED','PARTIAL_SUCCESS')
> order by completed_datetime desc limit 50;
[...]
> With Oracle for a query like this since the index is on the 3 columns matching
> the WHERE clause and the ORDER BY clause is in the 4^th position Oracle would
> be able to scan that index and as soon as it finds the first matching 50 rows.
> But as you can see above Postgres is finding 2,634,718 matching rows for the
> WHERE clause , sorts them, and then returns the first 50 rows.

The index cannot be used for sorting, since the column used for sorting
isn't in the first position in the index. That's just how btree indexes
work and Oracle will have the same limitation. What would be possible is
to use an index only scan (returning 2,634,718 matching results), sort
that to find the 50 newest entries and retrieve only those from the
table. That should be faster since the index contains only 4 of 28 (if I
counted correctly) columns and should be quite a bit smaller. It's
possible that Oracle does this. But I'm not sure whether you could tell
that from the execution plan.
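That two-step approach can be sketched in SQL. This assumes execution_id is the primary key and that the index is extended to cover it — both assumptions, not something shown in the thread (and INCLUDE requires PostgreSQL 11+):

```sql
-- Hypothetical covering index so the inner scan can be index-only:
-- create index workflow_execution_initial_ui_tabs_cov
--     on workflow_execution (workflow_id, status, result, completed_datetime desc)
--     include (execution_id);

-- Find the 50 newest matches using only index columns, then join back
-- to the table for the full rows.
SELECT we.*
FROM (
    SELECT execution_id
    FROM workflow_execution
    WHERE workflow_id = 14560
      AND status = 'COMPLETED'
      AND result IN ('SUCCEEDED', 'REEXECUTED', 'ABORTED',
                     'DISCONTINUED', 'FAILED', 'PARTIAL_SUCCESS')
    ORDER BY completed_datetime DESC
    LIMIT 50
) top50
JOIN workflow_execution we USING (execution_id)
ORDER BY we.completed_datetime DESC;
```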

        hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"


Re: Tuning a query with ORDER BY and LIMIT

From
"Peter J. Holzer"
Date:
On 2022-06-22 19:39:33 +0000, Dirschel, Steve wrote:
> create index workflow_execution_initial_ui_tabs
>     on workflow_execution (workflow_id asc, status asc, result asc,
> completed_datetime desc);
[...]
> explain (analyze, verbose, costs, buffers, timing, summary, hashes)
> select * from workflow_execution
> where workflow_id = 14560 and
>       status = 'COMPLETED' and
>       result = 'SUCCEEDED'
> order by completed_datetime desc limit 50;

This query should actually be able to use the index, since the leading
columns in the index are all compared to single values. So you can
just jump to the first matching index entry and then read the next 50 entries.

> Is Postgres unable to optimize the query similar to Oracle?  Is it possible
> this is possible but we are running on too old of a version?

PostgreSQL 10 is quite old, so that's a possibility.

        hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"


Re: Tuning a query with ORDER BY and LIMIT

From
"Peter J. Holzer"
Date:
On 2022-06-22 23:48:37 +0200, Peter J. Holzer wrote:
> On 2022-06-22 19:39:33 +0000, Dirschel, Steve wrote:
> > Postgres version 10.11
> >
> > Here is the DDL for the index the query is using:
> >
> > create index workflow_execution_initial_ui_tabs
> >     on workflow_execution (workflow_id asc, status asc, result asc,
> > completed_datetime desc);
> >
> > explain (analyze, verbose, costs, buffers, timing, summary, hashes)
> > select * from workflow_execution
> > where workflow_id = 14560 and
> >       status = 'COMPLETED' and
> >       result in
> > ('SUCCEEDED','REEXECUTED','ABORTED','DISCONTINUED','FAILED','PARTIAL_SUCCESS')
> > order by completed_datetime desc limit 50;
> [...]
> The index cannot be used for sorting, since the column used for sorting
> isn't in the first position in the index.

compared to a single value

>That's just how btree indexes work and Oracle will have the same
>limitation. What would be possible is to use an index only scan
>(returning 2,634,718 matching results), sort that to find the 50 newest
>entries and retrieve only those from the table. That should be faster
>since the index contains only 4 of 28 (if I counted correctly) columns
>and should be quite a bit smaller.

Another - better - optimization would be to fetch the first 50 results
for each of the 6 possible values of result, then choose the 50 largest
of those. That sounds tricky to generalize, though.

        hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"


Re: Tuning a query with ORDER BY and LIMIT

From
"Peter J. Holzer"
Date:
On 2022-06-23 00:19:19 +0200, Peter J. Holzer wrote:
> On 2022-06-22 23:48:37 +0200, Peter J. Holzer wrote:
> > The index cannot be used for sorting, since the column used for sorting
> > isn't in the first position in the index.
>
> compared to a single value
  ^ not

E-Mail really needs a "Supersedes" header.

        hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"


Re: Tuning a query with ORDER BY and LIMIT

From
Tom Lane
Date:
"Peter J. Holzer" <hjp-pgsql@hjp.at> writes:
> On 2022-06-22 19:39:33 +0000, Dirschel, Steve wrote:
>> create index workflow_execution_initial_ui_tabs
>> on workflow_execution (workflow_id asc, status asc, result asc,
>> completed_datetime desc);
> [...]
>> explain (analyze, verbose, costs, buffers, timing, summary, hashes)
>> select * from workflow_execution
>> where workflow_id = 14560 and
>> status = 'COMPLETED' and
>> result = 'SUCCEEDED'
>> order by completed_datetime desc limit 50;

> This query should actually be able to use the index, since the first
> columns in the index are all compared to single values. So you can
> just jump to the first matching index entry and then read the next 50 entries.

Yeah, that sure looks like it should work.  Since we've not been
shown the whole table's DDL, it's hard to guess at why it isn't;
maybe there's a data type issue?
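One way to check for such a type mismatch is a catalog query like the following (a generic diagnostic, not something from this thread — compare the declared types against the literals the query uses, e.g. varchar vs. text, timestamp vs. timestamptz):

```sql
-- Show the declared types of the indexed columns.
SELECT a.attname,
       format_type(a.atttypid, a.atttypmod) AS column_type
FROM pg_attribute a
WHERE a.attrelid = 'workflow_execution'::regclass
  AND a.attname IN ('workflow_id', 'status', 'result', 'completed_datetime');
```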

>> Is Postgres unable to optimize the query similar to Oracle?  Is it possible
>> this is possible but we are running on too old of a version?

> PostgreSQL 10 is quite old, so that's a possibility.

That's worked for many ages.

            regards, tom lane



Re: Tuning a query with ORDER BY and LIMIT

From
Jeff Janes
Date:
On Wed, Jun 22, 2022 at 3:39 PM Dirschel, Steve <steve.dirschel@thomsonreuters.com> wrote:
 

explain (analyze, verbose, costs, buffers, timing, summary, hashes)


'hashes', what's that?  Are you using community PostgreSQL, or someone's fork?
 

With Oracle for a query like this since the index is on the 3 columns matching the WHERE clause and the ORDER BY clause is in the 4th position Oracle would be able to scan that index and as soon as it finds the first matching 50 rows. 


Can you show the plan from Oracle?  I'm not an expert at reading their plans, but certainly they are easier to attempt to read if you can at least see them.
 

I was questioning if the result IN clause was causing the issue so I ran the query with result = and see the same results:


I can't reproduce this at all.  With simple equality it absolutely uses the index for ordering in that case, even in v10.

Cheers,

Jeff

Re: Tuning a query with ORDER BY and LIMIT

From
Jeff Janes
Date:
On Wed, Jun 22, 2022 at 6:19 PM Peter J. Holzer <hjp-pgsql@hjp.at> wrote:

>That's just how btree indexes work and Oracle will have the same
>limitation. What would be possible is to use an index only scan
>(returning 2,634,718 matching results), sort that to find the 50 newest
>entries and retrieve only those from the table. That should be faster
>since the index contains only 4 of 28 (if I counted correctly) columns
>and should be quite a bit smaller.

Another - better - optimization would be to fetch the first 50 results
for each of the 6 possible values of result, then choose the 50 largest
of those. That sounds tricky to generalize, though.

You don't even need to read 50 from each of the 6 branches.  If you use a merge append operation, you would only need to read 55 rows: 50 to be returned, plus one non-returned row from each branch other than the one returning the last row.  I suspect this may be what Oracle is doing.  With some trickery, you can get PostgreSQL to do the same thing.

(select * from foo where a=4 order by b)
union all 
(select * from foo where a=7 order by b)
order by b
limit 50

                                             QUERY PLAN                                              
-----------------------------------------------------------------------------------------------------
 Limit  (cost=0.86..131.25 rows=50 width=8)
   ->  Merge Append  (cost=0.86..26079.86 rows=10000 width=8)
         Sort Key: foo.b
         ->  Index Only Scan using foo_a_b_idx on foo  (cost=0.42..12939.92 rows=5000 width=8)
               Index Cond: (a = 4)
         ->  Index Only Scan using foo_a_b_idx on foo foo_1  (cost=0.42..12939.92 rows=5000 width=8)
               Index Cond: (a = 7)
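Applied to the original query, that trick would look something like this (a sketch, untested; one branch per result value so each branch can walk the index in completed_datetime order):

```sql
(SELECT * FROM workflow_execution
  WHERE workflow_id = 14560 AND status = 'COMPLETED'
    AND result = 'SUCCEEDED'
  ORDER BY completed_datetime DESC)
UNION ALL
(SELECT * FROM workflow_execution
  WHERE workflow_id = 14560 AND status = 'COMPLETED'
    AND result = 'REEXECUTED'
  ORDER BY completed_datetime DESC)
-- ... one more branch each for 'ABORTED', 'DISCONTINUED',
--     'FAILED', and 'PARTIAL_SUCCESS' ...
ORDER BY completed_datetime DESC
LIMIT 50;
```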

Cheers,

Jeff

Re: Tuning a query with ORDER BY and LIMIT

From
"Peter J. Holzer"
Date:
On 2022-06-22 23:10:25 -0400, Jeff Janes wrote:
> On Wed, Jun 22, 2022 at 6:19 PM Peter J. Holzer <hjp-pgsql@hjp.at> wrote:
>     >That's just how btree indexes work and Oracle will have the same
>     >limitation. What would be possible is to use an index only scan
>     >(returning 2,634,718 matching results), sort that to find the 50 newest
>     >entries and retrieve only those from the table. That should be faster
>     >since the index contains only 4 of 28 (if I counted correctly) columns
>     >and should be quite a bit smaller.
>
>     Another - better - optimization would be to fetch the first 50 results
>     for each of the 6 possible values of result, then choose the 50 largest
>     of those. That sounds tricky to generalize, though.
>
>
> You don't even need to read 50 from each of the 6 branches.  If you use a merge
> append operation, you would need to read  55 rows.  50 to be returned, and one
> non-returned from each branch other than the one returning the last row.

Yes, but that means a lot of jumping around in the index.

        hp

--
   _  | Peter J. Holzer    | Story must make more sense than reality.
|_|_) |                    |
| |   | hjp@hjp.at         |    -- Charles Stross, "Creative writing
__/   | http://www.hjp.at/ |       challenge!"
