Re: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning - Mailing list pgsql-hackers

From Ashutosh Bapat
Subject Re: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning
Date
Msg-id CAExHW5vNk4f8VSTnizL-avREYhPgocEtH3E7MCEJgp=R33ChyQ@mail.gmail.com
Whole thread Raw
In response to Re: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning  (Amit Langote <amitlangote09@gmail.com>)
Responses Re: Reducing memory consumed by RestrictInfo list translations in partitionwise join planning
List pgsql-hackers
Hi Amit,


On Tue, Mar 18, 2025 at 4:02 PM Amit Langote <amitlangote09@gmail.com> wrote:
>
> I ran a couple of benchmarks of my own as follows.
>
> cat benchmark_amit.sh
> for p in 0 16 32 64 128 256 512 1024; do
>   echo -ne "$p\t";
>   pgbench -i --partitions=$p > /dev/null 2>&1
>   pgbench -n -T 30 -f /tmp/query.sql | grep latency | awk '{print $4}';
> done
>
> For a 3-way join:
> cat /tmp/query.sql
> select * from pgbench_accounts t1, pgbench_accounts t2,
> pgbench_accounts t3 where t2.aid = t1.aid AND t3.aid = t2.aid;
>
> nparts  master      patched     %change
> 0       35.508      36.066      1.571
> 16      66.79       67.704      1.368
> 32      67.774      68.179      0.598
> 64      51.023      50.471     -1.082
> 128     56.4        55.759     -1.137
> 256     70.134      68.401     -2.471
> 512     120.621     113.552    -5.861
> 1024    405.339     312.726    -22.848
>
> For a 6-way jon
> cat /tmp/query.sql
> select * from pgbench_accounts t1, pgbench_accounts t2,
> pgbench_accounts t3, pgbench_accounts t4, pgbench_accounts t5,
> pgbench_accounts t6 where t2.aid = t1.aid AND t3.aid = t2.aid and
> t4.aid = t3.aid and t5.aid = t4.aid and t6.aid = t5.aid;
>
> nparts  master      patched     %change
> 0       66.144      64.932     -1.832
> 16      100.874     100.491    -0.380
> 32      104.645     104.536    -0.104
> 64      114.415     109.193    -4.564
> 128     145.422     130.458    -10.290
> 256     273.761     209.919    -23.320
> 512     1359.896    616.295    -54.681
> 1024    7183.765    2857.086   -60.229
>
> -60% means 60% reduction in latency due to the patch.
>
> As others have already found, performance improvements become
> substantial as both partition count and join depth increase. The patch
> seems to have minimal impact at low partition counts and low join
> complexity -- base cases (e.g., 0–32 partitions, 3-way joins) are
> essentially unchanged, which is good to see.

I assume these are with enable_partitionwise_join = off since it's not
enabled in the query.

With lower number of partitions and joins the execution time will be
substantial compared to planning time hence the changes in execution
time will affect the changes in latency. So I agree that the
difference in numbers for lower number of partitions and joins is
noise. That agrees with my results posted a few emails before.

At a higher number of partitions and joins, the planning time
dominates the latency. Hence any variation in the planning time
dominates a variation in the latency. Thus the improvements seen here
are due to improvements in the planning time. Again that agrees with
my results in the previous email.

>
> I haven’t measured memory increase from the patch, but let me know if
> that has already been evaluated and shown not to be a showstopper.

There's a slight increase in memory consumption because of the hash
table but it's very minimal.

Here are memory numbers in kb (presented in the same format as before)
with pwj disabled
 num_parts |            2             |            3             |
        4             |              5

-----------+--------------------------+--------------------------+---------------------------+-----------------------------
         0 | s=0 md=15 pd=15          | s=0 md=21 pd=21          | s=0
md=27 pd=27           | s=0 md=33 pd=33
        10 | s=0 md=218 pd=218        | s=-20 md=455 pd=475      |
s=-20 md=868 pd=888       | s=-20 md=1697 pd=1717
       100 | s=-20 md=1824 pd=1844    | s=-80 md=3718 pd=3798    |
s=-160 md=6400 pd=6560    | s=-320 md=10233 pd=10553
       500 | s=-160 md=9395 pd=9555   | s=-320 md=20216 pd=20536 |
s=-640 md=35735 pd=36375  | s=-1280 md=60808 pd=62088
      1000 | s=-320 md=19862 pd=20182 | s=-640 md=45739 pd=46379 |
s=-1280 md=84210 pd=85490 | s=-2561 md=149740 pd=152301

each column is s=difference in memory consumed, md = memory consumed
without patch, pd = memory consumed with patch. -ve difference shows
increase in memory consumption.

with pwj enabled
 num_parts |            2             |             3              |
           4              |               5

-----------+--------------------------+----------------------------+-----------------------------+-------------------------------
         0 | s=0 md=15 pd=15          | s=0 md=21 pd=21            |
s=0 md=27 pd=27             | s=0 md=33 pd=33
        10 | s=0 md=365 pd=365        | s=-20 md=1198 pd=1218      |
s=-20 md=3571 pd=3591       | s=-20 md=10426 pd=10446
       100 | s=-21 md=3337 pd=3358    | s=-80 md=11237 pd=11317    |
s=-160 md=33845 pd=34005    | s=-320 md=99502 pd=99822
       500 | s=-160 md=17206 pd=17366 | s=-320 md=60096 pd=60416   |
s=-640 md=183306 pd=183946  | s=-1280 md=556705 pd=557985
      1000 | s=-320 md=36119 pd=36439 | s=-640 md=131457 pd=132097 |
s=-1280 md=404809 pd=406089 | s=-2561 md=1263664 pd=1266225

%wise this is 3-4% maximum.
>
> I also noticed using perf that create_join_clause() is a hotspot when
> running without the patch, especially at high partition counts (> 500)
> and the more join relations.

That's an interesting observation. Any hotspot in planning would show
up as hotspot in total execution time. So expected.

>
> Let me know if my methodology seems off or if the results look reasonable.

Thanks for benchmarking using a different method. The results agree
with my results.
--
Best Wishes,
Ashutosh Bapat



pgsql-hackers by date:

Previous
From: Amit Kapila
Date:
Subject: Re: Adding a '--clean-publisher-objects' option to 'pg_createsubscriber' utility.
Next
From: John Naylor
Date:
Subject: Re: CRC32C Parallel Computation Optimization on ARM