Re: A problem about partitionwise join - Mailing list pgsql-hackers

From Richard Guo
Subject Re: A problem about partitionwise join
Date
Msg-id CAMbWs48sQ78BoQ4JkY=FHLYtjiiD-5cK2QtQG8WkkRb8y6ZOLg@mail.gmail.com
Whole thread Raw
In response to Re: A problem about partitionwise join  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: A problem about partitionwise join
List pgsql-hackers

On Fri, May 3, 2024 at 9:31 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, May 3, 2024 at 7:47 AM Richard Guo <guofenglinux@gmail.com> wrote:
> I think one concern regarding performance cost is that the function
> exprs_known_equal() would be called O(N^2) times, where N is the number
> of partition key expressions.  But I think this might not be a problem.
> The number of a joinrel's partition key expressions would only be equal
> to the join degree, since each base relation within the join contributes
> only one partition key expression, according to
> set_joinrel_partition_key_exprs().  This number would not scale with the
> number of partitions.  But I have not measured the performance in
> practice by running benchmarks.  Maybe I'm just wrong.

I don't know, but I do think you should do some benchmarking and see
if you can find cases where this regresses performance. In my opinion,
this feature is worth having only if it's basically free. There's lots
of things we could do in the planner that would give better (perhaps
much better) plans in certain cases, but which we don't do because in
all other cases we'd pay a certain number of CPU cycles to have them
and it just doesn't make sense given how often we'd actually get a
benefit. This might be another such case.

Thank you for the suggestion.  In order to obtain a rough estimation of
how this patch affects planning time, I did the following benchmarking:

* create a partitioned table with 3 keys and 1000 partitions, which
looks like

       Partitioned table "public.t1_parted"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 a      | integer |           |          |
 b      | integer |           |          |
 c      | integer |           |          |
 d      | integer |           |          |
Partition key: RANGE (a, b, c)
Number of partitions: 1000 (Use \d+ to list them.)

* compose a query involving 5-way joins of this partitioned table, which
looks like:

select * from t1_parted t1
 natural join t1_parted t2
 natural join t1_parted t3
 natural join t1_parted t4
 natural join t1_parted t5
where t1.b = 1 and t1.c = 2;

This query is composed in such a way that it could actually generate
partitionwise join, because there exist equi-join condition for each
pair of matching partition keys; but currently on master it is not able
to generate partitionwise join, because of the filters 't1.b = 1 and
t1.c = 2', which is the issue fixed by this patch.

* run this query 5 times with enable_partitionwise_join set to on, and
collect the average planning time on master and on patched.

To ensure fairness, on master, a little hack is required to enable the
generation of partitionwise join for this query.  This allows us to
eliminate any potential impact on planning partitionwise joins and
evaluate the effects of this patch accurately.

Below is what I got on my local machine.

-- on master

  measurement  | average  | maximum  | minimum  | std_dev | std_dev_as_perc_of_avg
---------------+----------+----------+----------+---------+------------------------
 planning time | 30355.07 | 33148.47 | 29020.82 | 1681.23 | 5.54%


-- on patched

  measurement  | average  | maximum  | minimum  | std_dev | std_dev_as_perc_of_avg
---------------+----------+----------+----------+---------+------------------------
 planning time | 30600.00 | 33523.23 | 28680.75 | 1861.90 | 6.08%


-- without partitionwise join

  measurement  | average | maximum | minimum | std_dev | std_dev_as_perc_of_avg
---------------+---------+---------+---------+---------+------------------------
 planning time | 4840.18 | 5184.05 | 4528.87 | 299.98  | 6.20%


So it seems that the planning time is not significantly affected by this
patch, particularly when compared to the impact caused by partitionwise
join.

BTW, I was using Ashutosh's script [1] for setting up the benchmarking.
I find the script very handy.

[1] https://www.postgresql.org/message-id/flat/CAExHW5s%3DbCLMMq8n_bN6iU%2BPjau0DS3z_6Dn6iLE69ESmsPMJQ%40mail.gmail.com

Thanks
Richard

pgsql-hackers by date:

Previous
From: "Hayato Kuroda (Fujitsu)"
Date:
Subject: RE: Slow catchup of 2PC (twophase) transactions on replica in LR
Next
From: Bertrand Drouvot
Date:
Subject: Re: Synchronizing slots from primary to standby