Re: Improving Performance of Query ~ Filter by A, Sort by B - Mailing list pgsql-performance

From Lincoln Swaine-Moore
Subject Re: Improving Performance of Query ~ Filter by A, Sort by B
Date
Msg-id CABcidkLEZ=WifDQJLtPFaTeFLYTFkEbXDhbFLGVk5FzTVDSbbw@mail.gmail.com
Whole thread Raw
In response to Re: Improving Performance of Query ~ Filter by A, Sort by B  (legrand legrand <legrand_legrand@hotmail.com>)
Responses Re: Improving Performance of Query ~ Filter by A, Sort by B  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-performance
Thanks for looking into this!

Here's the result (I turned off the timeout and got it to finish):

EXPLAIN ANALYZE
SELECT "a"."id"
FROM  a_partition1 "a"
WHERE "a"."parent_id" IN (
    49188,14816,14758,8402
)
ORDER BY "a"."tmstmp" DESC
LIMIT 20;

                                        QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=0.43..5710.03 rows=20 width=12) (actual time=1141878.105..1142350.296 rows=20 loops=1)
   ->  Index Scan Backward using a_tmstmp_idx1 on a_partition1 a  (cost=0.43..1662350.21 rows=5823 width=12) (actual time=1141878.103..1142350.274 rows=20 loops=1)
         Filter: (parent_id = ANY ('{49188,14816,14758,8402}'::integer[]))
         Rows Removed by Filter: 7931478
 Planning time: 0.122 ms
 Execution time: 1142350.336 ms
(6 rows)
(Note: I've chosen parent_ids that I know are associated with the part_key 1, but the query plan was the same with the 4 parent_ids in your query.)

Looks like it's using the filter in the same way as the query on the parent table, so seems be a problem beyond the partitioning.

And as soon as I cut it back to 3 parent_ids, jumps to a query plan using a_parent_id_idx1 again:

EXPLAIN ANALYZE
SELECT "a"."id"
FROM  a_partition1 "a"
WHERE "a"."parent_id" IN (
    19948,21436,41220
)
ORDER BY "a"."tmstmp" DESC
LIMIT 20;

                             QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=5004.57..5004.62 rows=20 width=12) (actual time=36.329..36.341 rows=20 loops=1)
   ->  Sort  (cost=5004.57..5015.49 rows=4367 width=12) (actual time=36.328..36.332 rows=20 loops=1)
         Sort Key: tmstmp DESC
         Sort Method: top-N heapsort  Memory: 26kB
         ->  Index Scan using a_parent_id_idx1 on a_partition1 a  (cost=0.43..4888.37 rows=4367 width=12) (actual time=5.581..36.270 rows=50 loops=1)
               Index Cond: (parent_id = ANY ('{19948,21436,41220}'::integer[]))
 Planning time: 0.117 ms
 Execution time: 36.379 ms
(8 rows)


Thanks again for your help!




On Wed, Jul 11, 2018 at 5:41 PM, legrand legrand <legrand_legrand@hotmail.com> wrote:
Hello,

I have tested it with release 11 and limit 20 is pushed to each partition
when using index on tmstmp.

Could you tell us what is the result of your query applyed to one partition

EXPLAIN ANALYZE
SELECT "a"."id"
FROM  a_partition1 "a"
WHERE "a"."parent_id" IN (
    34226,24506,40987,27162
)
ORDER BY "a"."tmstmp" DESC
LIMIT 20;

May be that limit 20 is not pushed to partitions in your version ?
Regards
PAscal





--
Sent from: http://www.postgresql-archive.org/PostgreSQL-performance-f2050081.html




--
Lincoln Swaine-Moore

pgsql-performance by date:

Previous
From: legrand legrand
Date:
Subject: Re: Improving Performance of Query ~ Filter by A, Sort by B
Next
From: Tom Lane
Date:
Subject: Re: Improving Performance of Query ~ Filter by A, Sort by B