Re: Ordered Partitioned Table Scans - Mailing list pgsql-hackers

From Amit Langote
Subject Re: Ordered Partitioned Table Scans
Date
Msg-id 6a5caa48-b6be-0c8a-4cdf-ef0fe3342da7@lab.ntt.co.jp
Whole thread Raw
In response to Ordered Partitioned Table Scans  (David Rowley <david.rowley@2ndquadrant.com>)
Responses Re: Ordered Partitioned Table Scans  (David Rowley <david.rowley@2ndquadrant.com>)
List pgsql-hackers
On 2018/10/26 11:50, David Rowley wrote:
> RANGE partitioning of time-series data is quite a common range to use
> partitioning, and such tables tend to grow fairly large.  I thought
> since we always store RANGE partitioned tables in the PartitionDesc in
> ascending range order that it might be useful to make use of this and
> when the required pathkeys match the order of the range, then we could
> make use of an Append node instead of uselessly using a MergeAppend,
> since the MergeAppend will just exhaust each subplan one at a time, in
> order.
> 
> It does not seem very hard to implement this and it does not add much
> in the way of additional processing to the planner.
> 
> Performance wise it seems to give a good boost to getting sorted
> results from a partitioned table. I performed a quick test just on my
> laptop with:
> 
> Setup:
> CREATE TABLE partbench (id BIGINT NOT NULL, i1 INT NOT NULL, i2 INT
> NOT NULL, i3 INT NOT NULL, i4 INT NOT NULL, i5 INT NOT NULL) PARTITION
> BY RANGE (id);
> select 'CREATE TABLE partbench' || x::text || ' PARTITION OF partbench
> FOR VALUES FROM (' || (x*100000)::text || ') TO (' ||
> ((x+1)*100000)::text || ');' from generate_Series(0,299) x;
> \gexec
> \o
> INSERT INTO partbench SELECT x,1,2,3,4,5 from generate_Series(0,29999999) x;
> create index on partbench (id);
> vacuum analyze;
> 
> Test:
> select * from partbench order by id limit 1 offset 29999999;
> 
> Results Patched:
> 
> Time: 4234.807 ms (00:04.235)
> Time: 4237.928 ms (00:04.238)
> Time: 4241.289 ms (00:04.241)
> Time: 4234.030 ms (00:04.234)
> Time: 4244.197 ms (00:04.244)
> Time: 4266.000 ms (00:04.266)
> 
> Unpatched:
> 
> Time: 5917.288 ms (00:05.917)
> Time: 5937.775 ms (00:05.938)
> Time: 5911.146 ms (00:05.911)
> Time: 5906.881 ms (00:05.907)
> Time: 5918.309 ms (00:05.918)
> 
> (about 39% faster)
> 
> The implementation is fairly simple. One thing I don't like about is
> I'd rather build_partition_pathkeys() performed all the checks to know
> if the partition should support a natural pathkey, but as of now, I
> have the calling code ensuring that there are no sub-partitioned
> tables. These could cause tuples to be output in the wrong order.
> 
> Does this idea seem like something we'd want?

Definitely!  Thanks for creating the patch.

I recall Ronan Dunklau and Julien Rouhaud had proposed a patch for this
last year, but the partitioning-related planning code hadn't advanced then
as much as it has today, so they sort of postponed working on it.
Eventually their patch was returned with feedback last November.  Here's
the link to their email in case you wanted to read some comments their
proposal and patch got, although some of them might be obsolete.

https://www.postgresql.org/message-id/2401607.SfZhPQhbS4%40ronan_laptop

Thanks,
Amit



pgsql-hackers by date:

Previous
From: Chapman Flack
Date:
Subject: Re: PostgreSQL vs SQL/XML Standards
Next
From: Chapman Flack
Date:
Subject: Re: PostgreSQL vs SQL/XML Standards