Re: [PoC] Reducing planning time when tables have many partitions - Mailing list pgsql-hackers
From | Ashutosh Bapat |
---|---|
Subject | Re: [PoC] Reducing planning time when tables have many partitions |
Date | |
Msg-id | CAExHW5tdgM_hbFm5c_grqVnQGxidK6rLdJLEbnUfs_OYKD88pw@mail.gmail.com Whole thread Raw |
In response to | Re: [PoC] Reducing planning time when tables have many partitions (David Rowley <dgrowleyml@gmail.com>) |
Responses |
Re: [PoC] Reducing planning time when tables have many partitions
|
List | pgsql-hackers |
Hi David, On Fri, Apr 4, 2025 at 11:34 AM David Rowley <dgrowleyml@gmail.com> wrote: > > I also did some benchmarking using the attached script. I've attached > the results of running that on my AMD Zen2 machine. See the end of the > script for the CREATE TABLE statement for loading that into postgres. > > The results look pretty good. v37 came out slightly faster than v36, > either noise or because of dispose_eclass_member_iterator() removal. Here are my benchmarking results Planning time: columns correspond to number of joins, rows to number of partitions, each cell is a triplet (s, md, pd) where s is improvement as percentage of planning time without patch (higher the better) md and pd are standard deviation in planning time with and without patch respectively as % of respective averages. planning time improvement with PWJ=off num_parts | 2 | 3 | 4 | 5 -----------+------------------------------+----------------------------+----------------------------+---------------------------- 0 | s=-4.43% md=16.72% pd=16.41% | s=-2.93% md=5.33% pd=5.27% | s=-0.10% md=4.28% pd=4.49% | s=-2.60% md=4.80% pd=4.28% 10 | s=1.32% md=9.93% pd=9.13% | s=2.70% md=1.90% pd=1.90% | s=4.53% md=1.55% pd=1.59% | s=4.96% md=0.99% pd=0.94% 100 | s=29.15% md=3.96% pd=4.66% | s=38.11% md=0.43% pd=1.22% | s=44.17% md=1.19% pd=1.21% | s=43.97% md=0.37% pd=0.27% 500 | s=63.12% md=1.39% pd=3.80% | s=69.57% md=1.76% pd=0.73% | s=71.73% md=0.88% pd=0.81% | s=66.08% md=0.72% pd=0.57% 1000 | s=76.33% md=0.82% pd=1.72% | s=80.37% md=0.30% pd=0.82% | s=75.30% md=1.23% pd=0.64% | s=67.06% md=0.83% pd=0.19% (5 rows) planning time improvement with PWJ=on num_parts | 2 | 3 | 4 | 5 -----------+----------------------------+----------------------------+----------------------------+---------------------------- 0 | s=-2.08% md=5.87% pd=6.16% | s=-2.22% md=4.70% pd=5.29% | s=-1.77% md=5.40% pd=4.23% | s=-3.96% md=3.96% pd=3.89% 10 | s=-0.93% md=3.34% pd=2.89% | s=0.06% md=0.96% pd=0.52% | s=2.09% md=0.43% pd=0.60% | s=2.03% md=0.39% pd=0.66% 100 | s=20.31% md=1.70% pd=1.19% | s=16.98% md=1.28% pd=1.70% | s=13.35% md=0.32% pd=0.77% | s=14.12% md=1.19% pd=0.43% 500 | s=51.98% md=3.12% pd=4.25% | s=50.85% md=0.45% pd=0.48% | s=47.27% md=0.16% pd=0.82% | s=40.60% md=0.30% pd=0.57% 1000 | s=67.34% md=1.67% pd=1.16% | s=69.54% md=0.20% pd=0.44% | s=61.31% md=1.13% pd=0.63% | s=54.66% md=0.38% pd=0.57% (5 rows) The deviations are mostly within noise range so the results are reliable. There are some cells, corresponding to lower number of partitions and join, which show regression in planning time but that's within noise range. I think that can be ignored. For a higher number of partitions and joins the improvements are impressive. planning memory improvement with PWJ=off num_parts | 2 | 3 | 4 | 5 -----------+----------------------------------+----------------------------------+----------------------------------+---------------------------------- 0 | s=0.00%, mm=15 kB, pm=15 kB | s=0.00%, mm=21 kB, pm=21 kB | s=0.00%, mm=27 kB, pm=27 kB | s=0.00%, mm=33 kB, pm=33 kB 10 | s=-0.46%, mm=218 kB, pm=219 kB | s=-0.44%, mm=455 kB, pm=457 kB | s=-0.35%, mm=868 kB, pm=871 kB | s=-0.24%, mm=1697 kB, pm=1701 kB 100 | s=-0.88%, mm=1824 kB, pm=1840 kB | s=-0.62%, mm=3718 kB, pm=3741 kB | s=-0.50%, mm=6400 kB, pm=6432 kB | s=-0.38%, mm=10233 kB, pm=10 MB 500 | s=-0.83%, mm=9395 kB, pm=9473 kB | s=-0.56%, mm=20 MB, pm=20 MB | s=-0.44%, mm=35 MB, pm=35 MB | s=-0.30%, mm=59 MB, pm=60 MB 1000 | s=-0.79%, mm=19 MB, pm=20 MB | s=-0.49%, mm=45 MB, pm=45 MB | s=-0.37%, mm=82 MB, pm=83 MB | s=-0.24%, mm=146 MB, pm=147 MB (5 rows) planning memory improvement with PWJ=on num_parts | 2 | 3 | 4 | 5 -----------+----------------------------------+----------------------------------+----------------------------------+---------------------------------- 0 | s=0.00%, mm=15 kB, pm=15 kB | s=0.00%, mm=21 kB, pm=21 kB | s=0.00%, mm=27 kB, pm=27 kB | s=0.00%, mm=33 kB, pm=33 kB 10 | s=-0.55%, mm=365 kB, pm=367 kB | s=-0.25%, mm=1198 kB, pm=1201 kB | s=-0.08%, mm=3571 kB, pm=3574 kB | s=-0.04%, mm=10 MB, pm=10 MB 100 | s=-0.48%, mm=3337 kB, pm=3353 kB | s=-0.20%, mm=11 MB, pm=11 MB | s=-0.09%, mm=33 MB, pm=33 MB | s=-0.04%, mm=97 MB, pm=97 MB 500 | s=-0.45%, mm=17 MB, pm=17 MB | s=-0.19%, mm=59 MB, pm=59 MB | s=-0.08%, mm=179 MB, pm=179 MB | s=-0.03%, mm=544 MB, pm=544 MB 1000 | s=-0.43%, mm=35 MB, pm=35 MB | s=-0.17%, mm=128 MB, pm=129 MB | s=-0.08%, mm=395 MB, pm=396 MB | s=-0.03%, mm=1234 MB, pm=1234 MB (5 rows) The memory profile too is impressive. There's almost no impact on memory consumption. The increase in memory consumption is acceptable given the significant improvements in planning time. I have not reviewed patches though. I haven't measured if the patches improve performance of simple scans with thousands of partitions. Have you tried measuring that? -- Best Wishes, Ashutosh Bapat
pgsql-hackers by date: