Re: Ordered Partitioned Table Scans - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: Ordered Partitioned Table Scans
Date
Msg-id CAOBaU_aJ86P=rpc5t4_rghqL4iXhPgZFYvK-Gs=+Eyt0KwsHMw@mail.gmail.com
Whole thread Raw
In response to Re: Ordered Partitioned Table Scans  (David Rowley <david.rowley@2ndquadrant.com>)
Responses Re: Ordered Partitioned Table Scans
List pgsql-hackers
On Wed, Dec 19, 2018 at 3:01 PM David Rowley
<david.rowley@2ndquadrant.com> wrote:
>
> On Thu, 20 Dec 2018 at 01:58, Julien Rouhaud <rjuju123@gmail.com> wrote:
> >
> > The multi-level partitioning case is another
> > thing that would need to be handled for instance (and that's the main
> > reason I couldn't submit a new patch when I was working on it), and
> > I'm definitely not arguing to cover it in this patch.
>
> As far as I'm aware, the multi-level partitioning should work just
> fine with the current patch. I added code for that a while ago. There
> are regression tests to exercise it. I'm not aware of any cases where
> it does not work.

Sorry to come back this late.  What I was mentioning about
sub-partitioning is when a whole partition hierarchy is natively
ordered, we could avoid the generate merge appends.  But unless I'm
missing something with your patch, that won't happen.

Considering

CREATE TABLE nested (id1 integer, id2 integer, val text) PARTITION BY
LIST (id1);

CREATE TABLE nested_1 PARTITION OF nested FOR VALUES IN (1) PARTITION
BY RANGE (id2);
CREATE TABLE nested_1_1 PARTITION OF nested_1 FOR VALUES FROM (1) TO (100000);
CREATE TABLE nested_1_2 PARTITION OF nested_1 FOR VALUES FROM (100000)
TO (200000);
CREATE TABLE nested_1_3 PARTITION OF nested_1 FOR VALUES FROM (200000)
TO (300000);

CREATE TABLE nested_2 PARTITION OF nested FOR VALUES IN (2) PARTITION
BY RANGE (id2);
CREATE TABLE nested_2_1 PARTITION OF nested_2 FOR VALUES FROM (1) TO (100000);
CREATE TABLE nested_2_2 PARTITION OF nested_2 FOR VALUES FROM (100000)
TO (200000);
CREATE TABLE nested_2_3 PARTITION OF nested_2 FOR VALUES FROM (200000)
TO (300000);

CREATE INDEX ON nested(id1, id2);

ISTM that a query like
SELECT * FROM nested ORDER BY 1, 2;
could simply append all the partitions in the right order (or generate
a tree of ordered appends), but:

                            QUERY PLAN
-------------------------------------------------------------------
 Append
   ->  Merge Append
         Sort Key: nested_1_1.id1, nested_1_1.id2
         ->  Index Scan using nested_1_1_id1_id2_idx on nested_1_1
         ->  Index Scan using nested_1_2_id1_id2_idx on nested_1_2
         ->  Index Scan using nested_1_3_id1_id2_idx on nested_1_3
   ->  Merge Append
         Sort Key: nested_2_1.id1, nested_2_1.id2
         ->  Index Scan using nested_2_1_id1_id2_idx on nested_2_1
         ->  Index Scan using nested_2_2_id1_id2_idx on nested_2_2
         ->  Index Scan using nested_2_3_id1_id2_idx on nested_2_3
(11 rows)


Also, a query like
SELECT * FROM nested_1 ORDER BY 1, 2;
could generate an append path, since the first column is guaranteed to
be identical in all partitions, but instead:

                         QUERY PLAN
-------------------------------------------------------------
 Merge Append
   Sort Key: nested_1_1.id1, nested_1_1.id2
   ->  Index Scan using nested_1_1_id1_id2_idx on nested_1_1
   ->  Index Scan using nested_1_2_id1_id2_idx on nested_1_2
   ->  Index Scan using nested_1_3_id1_id2_idx on nested_1_3
(5 rows)

and of course

# EXPLAIN (costs off) SELECT * FROM nested_1 ORDER BY 2;
             QUERY PLAN
------------------------------------
 Sort
   Sort Key: nested_1_1.id2
   ->  Append
         ->  Seq Scan on nested_1_1
         ->  Seq Scan on nested_1_2
         ->  Seq Scan on nested_1_3
(6 rows)

I admit that I didn't re-read the whole thread, so maybe I'm missing
something (if that's the case my apologies, and feel free to point me
any relevant discussion).  I'm just trying to make sure that we don't
miss some cases, as those seems possible and useful to handle.  Or is
that out of the perimeter?


pgsql-hackers by date:

Previous
From: Stephen Frost
Date:
Subject: Re: Special role for subscriptions
Next
From: Tom Lane
Date:
Subject: Re: CPU costs of random_zipfian in pgbench