Re: partitioned table and ORDER BY indexed_field DESC LIMIT 1 - Mailing list pgsql-performance

From Mark Kirkwood
Subject Re: partitioned table and ORDER BY indexed_field DESC LIMIT 1
Date
Msg-id 4732788A.6050405@paradise.net.nz
Whole thread Raw
In response to Re: partitioned table and ORDER BY indexed_field DESC LIMIT 1  (Gregory Stark <stark@enterprisedb.com>)
Responses Re: partitioned table and ORDER BY indexed_field DESC LIMIT 1  (Gregory Stark <stark@enterprisedb.com>)
List pgsql-performance
Gregory Stark wrote:
> "Mark Kirkwood" <markir@paradise.net.nz> writes:
>
>
>> I spent today looking at getting this patch into a self contained state.
>> Working against HEAD I'm getting bogged down in the PathKeyItem to
>> PathKey/EquivalenceClass/EquivalenceMember(s) change. So I figured I'd divide
>> and conquer to some extent, and initially provide a patch:
>>
>> - against 8.2.(5)
>> - self contained  (i.e no mystery functions)
>>
>
> That would be helpful for me. It would include the bits I'm looking for.
>
>
>> The next step would be to update to to HEAD. That would hopefully provide some
>> useful material for others working on this.
>>
>
> If that's not too much work then that would be great but if it's a lot of work
> then it may not be worth it if I'm planning to only take certain bits. On the
> other hand if it's good then we might just want to take it wholesale and then
> add to it.
>
>

Here is a (somewhat hurried) self-contained version of the patch under
discussion. It applies to 8.2.5 and the resultant code compiles and
runs. I've left in some unneeded parallel stuff (PathLocus struct),
which I can weed out in a subsequent version if desired. I also removed
the 'cdb ' from  most of the function names and (I  hope) any Greenplum
copyrights.

I discovered that the patch solves a slightly different problem... it
pulls up index scans as a viable path choice, (but not for the DESC
case) but does not push down the LIMIT to the child tables ... so the
actual performance improvement is zero - however hopefully the patch
provides useful raw material to help.

e.g - using the examine schema from the OP email - but removing the DESC
from the query:

part=# set enable_seqscan=off;
SET
part=# explain SELECT * FROM n_traf ORDER BY date_time LIMIT 1;
                                                                   QUERY
PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=198367.14..198367.15 rows=1 width=20)
   ->  Sort  (cost=198367.14..200870.92 rows=1001510 width=20)
         Sort Key: public.n_traf.date_time
         ->  Result  (cost=0.00..57464.92 rows=1001510 width=20)
               ->  Append  (cost=0.00..57464.92 rows=1001510 width=20)
                     ->  Index Scan using n_traf_date_time_login_id on
n_traf  (cost=0.00..66.90 rows=1510 width=20)
                     ->  Index Scan using
n_traf_y2007m01_date_time_login_id on n_traf_y2007m01 n_traf
(cost=0.00..4748.38 rows=83043 width=20)
                     ->  Index Scan using
n_traf_y2007m02_date_time_login_id on n_traf_y2007m02 n_traf
(cost=0.00..4772.60 rows=83274 width=20)
                     ->  Index Scan using
n_traf_y2007m03_date_time_login_id on n_traf_y2007m03 n_traf
(cost=0.00..4782.12 rows=83330 width=20)
                     ->  Index Scan using
n_traf_y2007m04_date_time_login_id on n_traf_y2007m04 n_traf
(cost=0.00..4818.29 rows=83609 width=20)
                     ->  Index Scan using
n_traf_y2007m05_date_time_login_id on n_traf_y2007m05 n_traf
(cost=0.00..4721.85 rows=82830 width=20)
                     ->  Index Scan using
n_traf_y2007m06_date_time_login_id on n_traf_y2007m06 n_traf
(cost=0.00..4766.56 rows=83357 width=20)
                     ->  Index Scan using
n_traf_y2007m07_date_time_login_id on n_traf_y2007m07 n_traf
(cost=0.00..4800.44 rows=83548 width=20)
                     ->  Index Scan using
n_traf_y2007m08_date_time_login_id on n_traf_y2007m08 n_traf
(cost=0.00..4787.55 rows=83248 width=20)
                     ->  Index Scan using
n_traf_y2007m09_date_time_login_id on n_traf_y2007m09 n_traf
(cost=0.00..4830.67 rows=83389 width=20)
                     ->  Index Scan using
n_traf_y2007m10_date_time_login_id on n_traf_y2007m10 n_traf
(cost=0.00..4795.78 rows=82993 width=20)
                     ->  Index Scan using
n_traf_y2007m11_date_time_login_id on n_traf_y2007m11 n_traf
(cost=0.00..4754.26 rows=83351 width=20)
                     ->  Index Scan using
n_traf_y2007m12_date_time_login_id on n_traf_y2007m12 n_traf
(cost=0.00..4819.51 rows=84028 width=20)
(18 rows)







Attachment

pgsql-performance by date:

Previous
From: "Guillaume Smet"
Date:
Subject: Re: Estimation problem with a LIKE clause containing a /
Next
From: "Luke Lonergan"
Date:
Subject: Re: partitioned table and ORDER BY indexed_field DESC LIMIT 1