Re: Ordered Partitioned Table Scans - Mailing list pgsql-hackers

From Julien Rouhaud
Subject Re: Ordered Partitioned Table Scans
Date
Msg-id CAOBaU_YRys5PG8v2WWPe9njn3CGQdwc_b-biTez7H2BVg7cYDA@mail.gmail.com
Whole thread Raw
In response to Re: Ordered Partitioned Table Scans  (David Rowley <david.rowley@2ndquadrant.com>)
Responses Re: Ordered Partitioned Table Scans  (David Rowley <david.rowley@2ndquadrant.com>)
List pgsql-hackers
On Wed, Dec 19, 2018 at 1:18 PM David Rowley
<david.rowley@2ndquadrant.com> wrote:
>
> On Wed, 19 Dec 2018 at 23:25, Julien Rouhaud <rjuju123@gmail.com> wrote:
> >
> > I see.  But since for now the optimisation will only be done
> > considering all partitions, I still think that it's better to store a
> > bool flag in the PartitionDesc to describe if it's natively ordered or
> > not, and therefore also handle the case for
> > non-intervleaved-multi-datums list partitioning.  It won't add much
> > overhead and will benefit way more cases.
>
> I'm not really in favour of adding a flag there only to remove it
> again once we can more easily determine the pruned partitions.
> Remember the flag, because it's stored in the relation cache, must be
> set accounting for all partitions. As soon as we want to add smarts
> for pruned partitions, then the flag becomes completely useless for
> everything.

I don't see why we should drop this flag.  If we know that the
partitions are naturally ordered, they'll still be ordered after some
partitions have been prune, so we can skip later checks if we already
have the information.  The only remaining cases this flag doesn't
cover are:

- partitions are naturally ordered but there's a default partition.
We could store this information and later check if the default
partition has been pruned or not
- partitions are not naturally ordered, but become naturally ordered
if enough partitions are pruned.  I may be wrong but that doesn't seem
like a very frequent use case to me  I'd imagine that in a lot of
cases either almost no partition are prune (or at least not enough so
that the remaining one are ordered), or all but one partition is
pruned),.  So keeping a low overhead for the
almost-no-pruned-partition with naturally ordered partitions case
still seems like a good idea to me.

>  If covering all cases in the first hit is your aim then
> the way to go is to add the live_parts field to RelOptInfo in this
> patch rather than in Amit's patch in [1].  I'd much rather add the
> pruned partitions smarts as part of another effort.  The most likely
> cases to benefit from this are already covered by the current patch;
> range partitioned tables.

Covering all cases is definitely not my goal here, just grabbing the
low hanging fruits.  The multi-level partitioning case is another
thing that would need to be handled for instance (and that's the main
reason I couldn't submit a new patch when I was working on it), and
I'm definitely not arguing to cover it in this patch.  That being
said, I'll try to have a look at this patch too, but as I said I have
a lot of catch-up to do in this part of the code, so I'm afraid that
I'll not be super efficient.


pgsql-hackers by date:

Previous
From: Massimo Fidanza
Date:
Subject: GraalVM
Next
From: David Rowley
Date:
Subject: Re: Ordered Partitioned Table Scans